1 Introduction

Internal model control has been studied extensively for linear systems and has been shown to possess good robustness against disturbances. Nonlinear systems are ubiquitous in industrial processes, so the development of internal model control for nonlinear models has attracted great attention [1, 2]. For nonlinear systems, whether continuous or discrete, it is difficult to establish internal models because mathematical models of the original systems are imperfect. Internal model control cannot be implemented without an inverse model, which hinders the application of the method to control problems of nonlinear systems [3]. Hence, it is pressing to find an approach that obtains inverse functions effectively from the information and knowledge of the original system.

In recent years, the emergence of intelligent learning algorithms has encouraged the development of nonlinear control without precise process models. Neural networks, as early such algorithms, have been applied to the internal model control of nonlinear systems and have played an important role [4, 5]. Support vector machines (SVM), introduced by Vapnik, are a newer methodology in the area of nonlinear modeling. Whereas neural networks suffer from problems such as the existence of many local minima and the choice of the number of hidden units [6], SVM lead to convex optimization problems based on sound theoretical principles, up to the determination of a few additional tuning parameters, and often provide better generalization performance than neural networks [7, 8]. The SVM model is determined by solving a convex quadratic programming problem in dual space. Least squares SVM (LS-SVM) replace the inequality constraints with equality constraints and the epsilon-insensitive cost function with a sum-of-squared-errors term; this reformulation greatly simplifies the problem, reducing the solution to a Karush-Kuhn-Tucker (KKT) linear system [9, 10]. However, the errors in SVM and LS-SVM account only for noise in the output variables, which is unreasonable since the input variables may also be polluted by noise. Reference [11] proposed the total least squares method to deal with this problem, and the normal least squares support vector machine of reference [12] accounts for noise in the input variables based on the total least squares method.

In this paper, we consider a novel LS-SVM with general errors that include the noise of both input and output variables, and we introduce an iterative algorithm to solve it. We then propose internal model control based on this LS-SVM for MIMO nonlinear discrete systems. The LS-SVM approximates the inverse system from input-output data of the original system. Cascading the inverse model with the original system yields a pseudo-linear internal model, and the internal model controller is designed by cascading a filter with the inverse of the pseudo-linear system. We focus on the robustness of internal model control under disturbing signals and parameter variations. By comparison, inverse control is a simple open-loop method whose ability to handle disturbances and parameter variations is limited. Simulations show that the internal model control strategy based on LS-SVM is effective and performs well.

This paper is organized as follows. In Sect. 2, we describe the class of MIMO nonlinear discrete systems considered, which helps to model an inverse system, and the control method based on the inverse system is also introduced. In Sect. 3, the LS-SVM with general errors is given, an iterative algorithm is used to solve it, and the inverse model is then approximated by the novel LS-SVM based on measured data. In Sect. 4, we present the internal model control approach and analyze the closed-loop system. In Sect. 5, simulations illustrate the robustness of the internal model control system based on LS-SVM, compared with the inverse system method proposed in the past. Section 6 concludes the paper.

2 System description and the control based on inverse system

We are interested in reversible MIMO nonlinear discrete systems. This kind of system is described by the following discrete nonlinear input-output model, \(\Upsigma: {\bf u}(k)\rightarrow {\bf y}(k),\)

$$ F({\bf y}(k+\alpha),{\bf y}(k+\alpha-1), \ldots, {\bf y}(k+\alpha-p), {\bf u}(k),{\bf u}(k-1),\ldots, {\bf u}(k-q))=0, $$
(1)

where \({\bf y}=(y_{1},\ldots,y_{n})\in R^{n}, {\bf u}=(u_{1},\ldots,u_{m})\in R^{m}, {\bf y}(k+\alpha-p)=(y_{1}(k+\alpha_{1}-p_{1}),y_{2}(k+\alpha_{2}-p_{2}),\ldots,y_{n}(k+\alpha_{n}-p_{n})), {\bf u}(k-q)=(u_{1}(k-q_{1}),u_{2}(k-q_{2}),\ldots, u_{m}(k-q_{m})),\) \(\alpha\) denotes the relative delays of the outputs with respect to the inputs, q denotes the input delays, and \(max\{p_{1},p_{2}, \ldots,p_{n}\}\) denotes the order of the system.

The inverse system of the described system \(\Upsigma\) is expressed in the formula, \(\Upsigma^{'}:{\bf y}(k) \rightarrow {\bf u}(k)\)

$$ {\bf u}(k)=F^{-1}({\bf y}(k+\alpha),{\bf y}(k+\alpha-1),\ldots, {\bf y}(k+\alpha-p),{\bf u}(k-1),\ldots, {\bf u}(k-q)). $$
(2)

Denote \(\varphi(k)={\bf y}(k+\alpha),\) with components \(\varphi_{1}(k)=y_{1}(k+\alpha_{1}),\ldots, \varphi_{n}(k)=y_{n}(k+\alpha_{n}).\) Then \( {\mathbf{y}}(k + \alpha - 1) = z^{{ - 1}} \varphi (k),{\mathbf{y}}(k + \alpha - 2) = z^{{ - 2}} \varphi (k), \ldots ,{\mathbf{y}}(k + \alpha - p) = z^{{ - p}} \varphi (k). \) For a reversible MIMO nonlinear discrete system, the α-th order inverse system [13] is expressed as follows, \(\Upsigma^{''}:\varphi(k) \rightarrow {\bf u}(k),\)

$$ \begin{aligned} {\bf u}(k)&=F^{-1}(\varphi(k),{\bf y}(k+\alpha-1),\ldots, {\bf y}(k+\alpha-p),{\bf u}(k-1),\ldots, {\bf u}(k-q))\\ &=F^{-1}(\varphi(k),z^{-1}\varphi(k),\ldots,z^{-p}\varphi(k), {\bf u}(k-1),\ldots, {\bf u}(k-q)).\\ \end{aligned} $$
(3)

The formula with input \(\varphi(k)\) and output u(k) is the expression of the inverse. Cascading the inverse system with the original system builds a composite system, as shown in Fig. 1. The composite system is a pseudo-linear system with the following decoupled transfer function,

$$ G_{{ij}} (z) = {\frac{{y_{i} (z)}}{{\varphi _{j} (z)}}} = \left\{ {\begin{array}{*{20}c} {z^{{ - \alpha _{i} }} , \quad i = j.} \hfill \\ {0, \quad i \ne j.} \hfill \\ \end{array} } \right. $$
(4)

Nonlinear coupling still exists inside the composite system, but in terms of the transfer function the system exhibits a standard linear relationship; that is, the original MIMO system is decoupled into independent single-input single-output pseudo-linear subsystems.

Fig. 1

A MIMO nonlinear system is decoupled into single-input single-output α-th order pseudo-linear systems

3 The inverse model based on LS-SVM

3.1 LS-SVM for nonlinear function estimation

Given a training data set of M points \(\{{\bf x}_{i},y_{i}\}^{M}_{i=1}\) with input data \({\bf x}_{i}\in R^{n_{1}}\) and output data \(y_{i}\in R,\) one considers the following optimization problem [14] in primal space,

$$ \left\{ \begin{array}{ll} min &J_{1}={\frac{1}{2}}\|{\bf w}\|^{2}+{\frac{\gamma}{2}}\sum\nolimits^{M}_{i=1}e^{2}_{i}\\ \hbox{s.t. }& y_{i}={\bf w}^{{\rm T}}\phi({\bf x}_{i})+b+e_{i},\quad i=1,\ldots,M. \end{array} \right. $$
(5)

The function \(\phi:R^{n_{1}}\rightarrow R^{n_{2}}\) maps the input space into a high-dimensional (possibly infinite-dimensional) feature space. The weight vector w, the error variables \(e_{i}\in R\) and the bias term \(b\in R\) belong to the primal space. The cost function \(J_{1}\) consists of a fitting error and a regularization term, whose relative importance is determined by the positive real constant γ; a smaller γ value helps avoid overfitting noisy data. The regression model of LS-SVM in primal space is \(f({\bf x})={\bf w}^{{\rm T}}\phi({\bf x})+b.\) Since the weight vector may be infinite dimensional, computing w directly from (5) is in general impossible; therefore one computes the model in the dual space instead. The Lagrange function constructed for problem (5) is

$$ L_{1}({\bf w},e_{i},\beta_{i},b)=J_{1}+\sum^{M}_{i=1}\beta_{i}[y_{i}-{\bf w}^{{\rm T}}\phi({\bf x}_{i})-b-e_{i}]. $$
(6)

\(\beta_{i}, i=1,\ldots,M,\) are the Lagrange multipliers. According to the KKT conditions [15], the first-order derivatives of \(L_{1}\) are set to zero, namely

$$ {\frac{\partial L}{\partial {\bf w}}}=0;\quad{\frac{\partial L} {\partial e_{i}}}=0;\quad{\frac{\partial L}{\partial \beta_{i}}}=0;\quad{\frac{\partial L}{\partial b}}=0. $$
(7)

This yields the following equations,

$$ \begin{aligned} {\bf w}&=\sum^{M}_{i=1}\beta_{i}\phi({\bf x}_{i});\quad\sum^{M}_{i=1}\beta_{i}=0;\\ \beta_{i}&=\gamma e_{i};\quad{\bf w}^{{\rm T}}\phi({\bf x}_{i})+b+e_{i}-y_{i}=0. \end{aligned} $$
(8)

According to Mercer's condition, one can choose a kernel \(K(\cdot,\cdot)\) such that \(K({\bf x}_{1},{\bf x}_{2})=\phi({\bf x}_{1})^{{\rm T}}\phi({\bf x}_{2}).\) After elimination of w and \(e_{i},\) the optimization problem leads to the following linear system,

$$ \left[ \begin{array}{llll} 0& 1 & \cdots & 1\\ 1& K({\bf x}_{1},{\bf x}_{1})+{\frac{1}{\gamma}} & \cdots & K({\bf x}_{1},{\bf x}_{M})\\ \vdots& \vdots & \ddots & \vdots\\ 1& K({\bf x}_{M},{\bf x}_{1}) & \cdots & K({\bf x}_{M},{\bf x}_{M})+{\frac{1}{\gamma}}\\ \end{array} \right] \left[ \begin{array}{l} b\\ \beta_{1}\\ \vdots\\ \beta_{M} \end{array} \right] = \left[ \begin{array}{l} 0\\ y_{1}\\ \vdots\\ y_{M} \end{array} \right] . $$
(9)

The linear equation can be rewritten as:

$$ \left[ \begin{array}{ll} {\bf A} & {\bf I}\\ {\bf I}^{{\rm T}} & 0 \\ \end{array} \right] \left[ \begin{array}{ll} \varvec{\beta}\\ b \end{array} \right] = \left[ \begin{array}{ll} {\bf y}\\ 0 \end{array} \right], $$
(10)

where \({\bf A}={\bf K}+\varvec{\theta}, {\bf K}={(K_{ij})_{M\times M}}, \varvec{\theta}=diag({\frac{1}{\gamma}},\ldots,{\frac{1} {\gamma}}), {\bf I}=(1;1;\ldots;1), \varvec{\beta}=(\beta_{1};\beta_{2};\ldots;\beta_{M}),\) and \({\bf y}=(y_{1};y_{2}; \ldots;y_{M}).\) We focus on an RBF kernel with parameters γ and σ2 in this paper. \(\beta_{i}\) and b are the solution of the linear system, and the LS-SVM model at x becomes

$$ y({\bf x})=\sum^{M}_{i=1}\beta_{i}K({\bf x},{\bf x}_{i})+b. $$
(11)
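To make the dual computation concrete, the following sketch builds the kernel matrix, solves the KKT linear system (9), and evaluates the model (11). The RBF kernel form and all function names and parameter values are illustrative assumptions of this sketch, not taken from the paper.

```python
import numpy as np

def rbf_kernel(X, Y, sigma2):
    """RBF kernel matrix: K[i, j] = exp(-||x_i - y_j||^2 / sigma2)."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sigma2)

def lssvm_fit(X, y, gamma, sigma2):
    """Solve the KKT linear system (9) for the bias b and multipliers beta."""
    M = X.shape[0]
    K = rbf_kernel(X, X, sigma2)
    A = np.zeros((M + 1, M + 1))
    A[0, 1:] = 1.0                       # constraint row: sum_i beta_i = 0
    A[1:, 0] = 1.0                       # bias column
    A[1:, 1:] = K + np.eye(M) / gamma    # K + theta
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[1:], sol[0]               # beta, b

def lssvm_predict(Xtr, beta, b, Xte, sigma2):
    """Evaluate the dual model (11): y(x) = sum_i beta_i K(x, x_i) + b."""
    return rbf_kernel(Xte, Xtr, sigma2) @ beta + b
```

On a smooth one-dimensional target, the fitted multipliers satisfy \(\sum_{i}\beta_{i}=0\) up to rounding, as required by the second equation in (8).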

The error term in the optimization problem of LS-SVM measures only the noise of the output variables: it takes the errors between expected outputs and predictions as the empirical errors and minimizes their sum of squares. Noise in the input variables, however, remains. We therefore describe the empirical errors [12] in the feature space as follows,

$$ \frac{y_{i}-{\bf w}^{{\rm T}}\phi({\bf x}_{i})-b}{\sqrt{1+\parallel {\bf w}\parallel^{2}}}=e_{i}, \quad i=1,\ldots,M. $$
(12)

The revised optimization problem in primal space is:

$$ \left\{ \begin{array}{ll} min &J_{2}={\frac{1}{2}}\|{\bf w}\|^{2}+{\frac{\gamma}{2}}\sum\nolimits^{M}_{i=1}e^{2}_{i}\\ \hbox{s.t. }& (y_{i}-{\bf w}^{{\rm T}}\phi({\bf x}_{i})-b)/\sqrt{1+\parallel {\bf w}\parallel^{2}}=e_{i},\quad i=1,\ldots,M. \end{array} \right. $$
(13)

We now discuss an iterative learning algorithm to solve the optimization problem (13) in the dual space. At the t-th iteration, the Lagrange function is written as:

$$ L^{(t)}_{2}= {\frac{1}{2}}\|{\bf w}^{(t)}\|^{2}+{\frac{\gamma} {2}}\sum^{M}_{i=1}(e^{(t)})^{2}_{i}+\sum^{M}_{i=1}\beta^{(t)}_{i}\left(\frac{y_{i}-({\bf w}^{(t)})^{{\rm T}}\phi({\bf x}_{i})-b^{(t)}}{\sqrt{1+\parallel {\bf w}^{(t-1)}\parallel^{2}}}-e^{(t)}_{i}\right). $$
(14)

According to the KKT conditions, setting the partial derivatives of the Lagrange function to zero yields

$$ \begin{aligned} {\frac{\partial L^{(t)}_{2}}{\partial {\bf w}^{(t)}}}&=0\Rightarrow {\bf w}^{(t)}=\sum^{M}_{i=1}{\frac{\beta_{i}^{(t)}}{\sqrt{1+\parallel {\bf w}^{(t-1)}\parallel^{2}}}}\phi({\bf x}_{i});\\ {\frac{\partial L^{(t)}_{2}}{\partial e_{i}^{(t)}}}&=0\Rightarrow e^{(t)}_{i}={\frac{1}{\gamma}}\beta^{(t)}_{i};\\ {\frac{\partial L^{(t)}_{2}}{\partial \beta_{i}^{(t)}}}&=0\Rightarrow\frac{y_{i}-({\bf w}^{(t)})^{{\rm T}}\phi({\bf x}_{i})-b^{(t)}}{\sqrt{1+\parallel {\bf w}^{(t-1)}\parallel^{2}}}=e^{(t)}_{i};\\ {\frac{\partial L^{(t)}_{2}}{\partial b^{(t)}}}&=0\Rightarrow\sum^{M}_{i=1}{\frac{\beta^{(t)}_{i}}{\sqrt{1+\parallel {\bf w}^{(t-1)}\parallel^{2}}}}=0.\\ \end{aligned} $$
(15)

We have the equation

$$ \sum^{M}_{i=1}\frac{\beta_{i}^{(t)}K({\bf x}_{i},{\bf x}_{j})}{\sqrt{1+\parallel {\bf w}^{(t-1)}\parallel^{2}}}+{\frac{1} {\gamma}}\beta^{(t)}_{j} \sqrt{1+\parallel {\bf w}^{(t-1)}\parallel^{2}}+b^{(t)}=y_{j}, \quad j=1,\ldots,M. $$
(16)

Defining \(\alpha_{i}^{(t)}={\frac{\beta_{i}^{(t)}}{\sqrt{1+\parallel {\bf w}^{(t-1)}\parallel^{2}}}}, i=1,\ldots,M,\) we obtain the equations

$$ \sum^{M}_{i=1}\alpha_{i}^{(t)}K({\bf x}_{i},{\bf x}_{j})+{\frac{1} {\gamma}}\alpha^{(t)}_{j}(1+\parallel {\bf w}^{(t-1)}\parallel^{2})+b^{(t)}=y_{j}, \quad j=1,\ldots,M, $$
(17)
$$ \sum^{M}_{i=1}\alpha_{i}^{(t)}=0. $$
(18)

So we get the linear equation:

$$ \left[ \begin{array}{ll} {\bf A}+\parallel {\bf w}^{(t-1)}\parallel^{2}\varvec{\theta} & {\bf I}\\ {\bf I}^{{\rm T}} & 0\\ \end{array} \right] \left[ \begin{array}{l} \varvec{\alpha}^{(t)}\\ b^{(t)} \end{array} \right] = \left[ \begin{array}{l} {\bf y}\\ 0 \end{array} \right] . $$
(19)

where still \({\bf A}={\bf K}+\varvec{\theta}\) and \(\varvec{\alpha}^{(t)}=(\alpha^{(t)}_{1};\alpha^{(t)}_{2};\ldots;\alpha^{(t)}_{M}).\) To solve Eq. (19), we iteratively update \(\varvec{\alpha}^{(t)}, b^{(t)}\) and \(\parallel {\bf w}^{(t-1)}\parallel^{2}\) until we find the solution of the optimization problem (13).

Lemma

Let A be an invertible matrix. For given matrices A, U, V and D, define \({\bf B} = {\bf D} - {\bf V}{\bf A}^{-1}{\bf U};\) then the inverse matrix is

$$ \left[ \begin{array}{ll} {\bf A} & {\bf U}\\ {\bf V} & {\bf D} \end{array} \right] ^{-1} = \left[ \begin{array}{ll} {\bf A}^{-1}+{\bf A}^{-1}{\bf U}{\bf B}^{-1}{\bf V}{\bf A}^{-1} & -{\bf A}^{-1}{\bf U}{\bf B}^{-1}\\ -{\bf B}^{-1}{\bf V}{\bf A}^{-1} &{\bf B}^{-1} \end{array} \right] $$
(20)

In particular, if \({\bf U}={\bf I}, {\bf V}={\bf U}^{{\rm T}}\) and \({\bf D}=0\) hold, we have

$$ \left[ \begin{array}{ll} {\bf A} & {\bf I}\\ {\bf I}^{{\rm T}} & 0 \end{array} \right] ^{-1} = \left[ \begin{array}{ll} {\bf A}^{-1}-\frac{{\bf A}^{-1}{\bf I}{\bf I}^{{\rm T}}{\bf A}^{-1}}{{\bf I}^{{\rm T}}{\bf A}^{-1}{\bf I}} &\frac{{\bf A}^{-1}{\bf I}}{{\bf I}^{{\rm T}}{\bf A}^{-1}{\bf I}}\\ \frac{{\bf I}^{{\rm T}}{\bf A}^{-1}}{{\bf I}^{{\rm T}}{\bf A}^{-1}{\bf I}} &{\frac{-1}{{\bf I}^{\rm T}{\bf A}^{-1}{\bf I}}} \end{array} \right] $$
(21)
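The bordered-matrix inverse in the lemma is easy to check numerically. In the sketch below, the matrix size, the random seed and the way A is made positive definite are arbitrary choices for illustration.

```python
import numpy as np

# Numeric check of (21) for a random symmetric positive definite A
# and the all-ones vector I (denoted `one` here).
rng = np.random.default_rng(1)
M = 5
Q = rng.standard_normal((M, M))
A = Q @ Q.T + M * np.eye(M)          # symmetric positive definite
one = np.ones((M, 1))

Ainv = np.linalg.inv(A)
s = (one.T @ Ainv @ one).item()      # the scalar I^T A^{-1} I

blockinv = np.block([
    [Ainv - (Ainv @ one @ one.T @ Ainv) / s, (Ainv @ one) / s],
    [(one.T @ Ainv) / s,                     np.array([[-1.0 / s]])],
])

Psi = np.block([[A, one], [one.T, np.zeros((1, 1))]])
assert np.allclose(blockinv @ Psi, np.eye(M + 1))
```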

The original coefficient matrix of the LS-SVM linear system is invertible, and the revised matrix remains invertible since it has the same rank as the original one. We use dedicated symbols for the original matrix and the revised matrix, respectively:

$$ \Uppsi= \left[ \begin{array}{ll} {\bf A} & {\bf I}\\ {\bf I}^{{\rm T}} & 0\\ \end{array} \right], $$
(22)
$$ \Uppsi^{(t)}= \left[ \begin{array}{ll} {\bf A}+\parallel {\bf w}^{(t-1)}\parallel^{2}\varvec{\theta} & {\bf I}\\ {\bf I}^{{\rm T}} & 0 \end{array} \right]. $$
(23)

According to the linear Eq. (19), our goal is to compute \((\Uppsi^{(t)})^{-1}.\) By the lemma and formula (21), this requires \(({\bf A}+\parallel {\bf w}^{(t-1)}\parallel^{2}\varvec{\theta})^{-1}.\) Since A is symmetric and positive definite, there exists an orthogonal matrix P with \({\bf P}^{{\rm T}}={\bf P}^{-1}\) such that \({\bf A}={\bf P}^{{\rm T}}\varvec{\Uplambda} {\bf P},\) where \(\varvec{\Uplambda}=diag(\lambda_{1},\lambda_{2},\ldots,\lambda_{M})\) and \(\lambda_{1},\lambda_{2},\ldots, \lambda_{M}\) are the positive eigenvalues of A. So we have

$$ {\bf A}+\parallel {\bf w}^{(t-1)}\parallel^{2}\varvec{\theta}={\bf P}^{{\rm T}}diag \left\{\lambda_{1}+\frac{\parallel {\bf w}^{(t-1)}\parallel^{2}}{\gamma},\ldots,\lambda_{M}+\frac{\parallel {\bf w}^{(t-1)}\parallel^{2}}{\gamma}\right\}{\bf P} $$
(24)
$$ ({\bf A}+\parallel {\bf w}^{(t-1)}\parallel^{2}\varvec{\theta})^{-1}={\bf P}^{{\rm T}}diag\left\{\frac{1}{\lambda_{1}+\parallel {{{\bf w}}^{(t-1)}}\parallel^{2}/\gamma},\ldots,\frac{1}{\lambda_{M}+\parallel {{{\bf w}}^{(t-1)}}\parallel^{2}/\gamma}\right\}{\bf P} $$
(25)

Having derived the inverse of \({\bf A}+\parallel {\bf w}^{(t-1)}\parallel^{2}\varvec{\theta},\) we update \((\Uppsi^{(t)})^{-1}\) using formula (21) and find the solution of (19). The iterative updating algorithm can be summarized as follows.

Algorithm 1

Iterative updating learning algorithm for solving the novel LS-SVM.

1. Set the parameters of the LS-SVM. Find the orthogonal matrix P and the diagonal matrix \(\varvec{\Uplambda}\) such that \({\bf A}={\bf P}^{{\rm T}}\varvec{\Uplambda}{\bf P}.\) Store P and \(\varvec{\Uplambda}.\)

2. Compute the inverse of \(\Uppsi\) using formula (21) and solve problem (10). Record the LS-SVM solution as \(\varvec{\beta}^{(0)}, b^{(0)}.\) Let t = 1, set \(\varvec{\alpha}^{(0)}=\varvec{\beta}^{(0)},\) and compute \(\parallel {\bf w}^{(0)} \parallel^{2}=(\varvec{\alpha}^{(0)})^{{\rm T}} {\bf K} \varvec{\alpha}^{(0)}\).

3. Compute \(({\bf A}+\parallel {\bf w}^{(t-1)}\parallel^{2}\varvec{\theta})^{-1}\) using formula (25), then compute \((\Uppsi^{(t)})^{-1}\) using (21).

4. Obtain the solution of the novel LS-SVM (13) by multiplying \((\Uppsi^{(t)})^{-1}\) with the right-hand side of (19). Record the solution as \(\varvec{\alpha}^{(t)}\) and \(b^{(t)},\) and compute \(\parallel {\bf w}^{(t)} \parallel^{2}=(\varvec{\alpha}^{(t)})^{{\rm T}} {\bf K} \varvec{\alpha}^{(t)}.\)

5. If the stop condition \(\eta=\parallel \frac{\sqrt{\parallel{\bf w}^{(t)}\parallel^{2}}}{\parallel{\bf w}^{(t)}\parallel^{2}}-\frac{\sqrt{\parallel{\bf w}^{(t-1)}\parallel^{2}}} {\parallel{\bf w}^{(t-1)} \parallel^{2}} \parallel < \zeta\) holds for a given positive number \(\zeta,\) go to step 6; otherwise, set t = t + 1 and go to step 3.

6. Let \(\beta_{i}=\alpha_{i}^{(t)}\) and \(b=b^{(t)}.\) Since \({\bf w}^{(t)}=\sum^{M}_{i=1}\alpha^{(t)}_{i}\phi({\bf x}_{i}),\) the output of the novel LS-SVM is \(y({\bf x})=\sum^{M}_{i=1}\beta_{i}K({\bf x},{\bf x}_{i})+b.\)
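The whole loop can be sketched compactly. The sketch below takes a precomputed kernel matrix, inverts \({\bf A}+\parallel {\bf w}\parallel^{2}\varvec{\theta}\) through the eigendecomposition of A (the diagonal entries of the inverse being \(1/(\lambda_{i}+\parallel {\bf w}\parallel^{2}/\gamma)\)), and enforces the constraint \(\sum_{i}\alpha_{i}=0\) directly rather than through the full bordered inverse (21); all parameter values are illustrative assumptions.

```python
import numpy as np

def iterative_lssvm(K, y, gamma, zeta=1e-6, max_iter=50):
    """Sketch of Algorithm 1 for the LS-SVM with general errors (13).

    K is a precomputed kernel matrix; the function returns the kernel
    weights alpha and bias b of y(x) = sum_i alpha_i K(x, x_i) + b.
    """
    M = K.shape[0]
    A = K + np.eye(M) / gamma
    lam, P = np.linalg.eigh(A)            # A = P diag(lam) P^T, lam > 0
    one = np.ones(M)

    def solve(w2):
        # (A + w2 * theta)^{-1} has eigenvalues 1 / (lam_i + w2 / gamma)
        Binv = P @ np.diag(1.0 / (lam + w2 / gamma)) @ P.T
        b = (one @ Binv @ y) / (one @ Binv @ one)  # makes sum_i alpha_i = 0
        return Binv @ (y - b), b

    alpha, b = solve(0.0)                 # step 2: plain LS-SVM as the start
    w2 = alpha @ K @ alpha                # ||w||^2 = alpha^T K alpha
    for _ in range(max_iter):             # steps 3-5
        alpha, b = solve(w2)
        w2_new = alpha @ K @ alpha
        if abs(1.0 / np.sqrt(w2_new) - 1.0 / np.sqrt(w2)) < zeta:
            break
        w2 = w2_new
    return alpha, b
```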

3.2 α-th order inverse model based on LS-SVM

For the described nonlinear discrete system, the α-th order inverse system is expressed by formula (3). Neither the precise mathematical model of the original system nor the explicit expression of u(k) can be obtained. Given this imperfect model, we adopt the novel LS-SVM to approximate the inverse model from input-output data acquired from the original system. Because LS-SVM can only estimate single-output functions, identifying an object with multiple outputs requires learning each subsystem separately; the number of subsystems equals the number of output variables.

Algorithm 2

Inverse model approaching algorithm based on the novel LS-SVM.

1. Select a proper excitation signal, such as white noise.

2. Obtain input-output data of the original system by applying the excitation signal as the input. Sort the data into training and testing samples of the form \(\{S_{i},u_{i}\}^{M}_{i=1}.\)

3. Choose parameters γ and σ2 and train the novel LS-SVM to acquire the inverse submodel of each decoupled subsystem.

4. Test the generalization ability of the inverse submodels on the testing data.

5. Assemble all inverse submodels into the inverse model of the original system, exploiting the decoupling of the subsystems.

After acquiring this inverse model of the original MIMO system, cascading the inverse model with the original system constitutes the pseudo-linear system, which by itself realizes a simple open-loop control based on the inverse system.

4 The internal model control of the MIMO nonlinear discrete system

The internal model control for discrete processes has the following properties [16].

Property 1

Stability Criterion. When the internal model is exact, stability of both controller and plant is sufficient for overall system stability.

Property 2

Perfect Controller. Under the assumptions that the internal model is perfect and that both the plant and the controller are stable, and in the absence of disturbances, perfect control is achieved when the controller is the inverse of the internal model.

The pseudo-linear system obtained by cascading the inverse model with the original system is essentially linearized; the basic diagram of internal model control based on LS-SVM is shown in Fig. 2. \(G_{m}(z)\) denotes the pulse transfer function matrix of the internal model, G(z) the controlled plant, \(G_{c}(z)\) the pulse transfer function matrix of the internal model controller, D(z) the disturbance function, R(z) the input function, and Y(z) the output function. The internal model control strategy provides feedback control for nonlinear systems. One usually chooses a diagonal matrix built from the relative orders of the independent subsystems as the transfer function of the internal model, namely \(G_{m}(z)=diag\{z^{-\alpha_{1}},z^{-\alpha_{2}},\ldots,z^{-\alpha_{n}}\}.\) Since the actual composite system G(z) may contain a modeling error, the pseudo-linear system can be assumed to be \(G(z)=G_{m}(z)(1+h_{m}(z)),\) where \(h_{m}(z)\) denotes the unmodeled error function, assumed linear and bounded. The MIMO internal model control is illustrated in Fig. 3.

Fig. 2

The diagram of internal model control based on LS-SVM

Fig. 3

The MIMO internal model control system

The internal model controller is denoted by \(G_{f}(z),\) the product of a robust filter F(z) and \(G_{c}(z).\) We still let \(G_{c}(z)=G_{m}^{-1}(z).\) The robust filter F(z) serves to reduce the sensitivity of the internal model control system; reference [16] offers a detailed introduction to its design. The internal model controller can thus be rewritten as \(G_{f}(z)=F(z)G_{m}^{-1}(z).\) Property 1 requires that the plant and the controller be input-output stable for the control system to be stable. For the decoupled linear system \(G_{m}(z)=diag\{z^{-\alpha_{1}},z^{-\alpha_{2}},\ldots,z^{-\alpha_{n}}\},\) we take \(G_{m}^{-1}(z)=diag\{1,1,\ldots,1\}\) in order to keep the controller stable. Using the following filter

$$ F(z) = diag\left\{ {{\frac{{1 - l_{1} }}{{1 - l_{1} z^{{ - \alpha _{1} }} }}}, \ldots ,{\frac{{1 - l_{n} }}{{1 - l_{n} z^{{ - \alpha _{n} }} }}}} \right\}, $$
(26)

the output of the closed-loop system can be described as

$$ Y(z)={\frac{G_{f}(z)G(z)R(z)+(1-G_{f}(z)G_{m}(z))D(z)} {1+G_{f}(z)(G(z)-G_{m}(z))}}, $$
(27)

and the error is

$$ \begin{aligned} E(z)&=Y(z)-R(z)\\ &={\frac{(G_{f}(z)G_{m}(z)-1)R(z)+(1-G_{f}(z)G_{m}(z))D(z)} {1+G_{f}(z)(G(z)-G_{m}(z))}}. \end{aligned} $$
(28)
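A scalar sanity check of (27) and (28): with an exact internal model \(G(z)=G_{m}(z)=z^{-1}\) and the first-order filter above, the steady-state error vanishes even under a step disturbance, since F(1) = 1 makes \(1-G_{f}(z)G_{m}(z)\) vanish at z = 1. The plant, the signals and the causal realization of \(F(z)G_{m}^{-1}(z)\) in the sketch below are assumptions chosen for illustration.

```python
import numpy as np

# Minimal simulation of the IMC loop behind (27)-(28).
l = 0.5
N = 120
r = np.ones(N)                                   # step reference
d = np.where(np.arange(N) >= 60, 0.1, 0.0)       # step disturbance at k = 60

u = np.zeros(N)
y = np.zeros(N)
for k in range(1, N):
    y[k] = u[k - 1] + d[k]              # plant: y(k) = u(k-1) + d(k)
    y_m = u[k - 1]                      # internal model output
    v = r[k] - (y[k] - y_m)             # feedback of the model mismatch
    u[k] = l * u[k - 1] + (1 - l) * v   # filter F(z), realized causally

steady_error = abs(y[-1] - r[-1])       # tends to zero despite the disturbance
```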

5 Simulation

In this section, we illustrate the performance of internal model control based on LS-SVM on a multivariable, nonlinear and strongly coupled plant. The discrete model used in the simulation is as follows,

$$ \begin{aligned} y_{1}(k)&={\frac{0.6y_{1}(k-1)\sin(y_{1}(k-2))}{1+y_{1}^{2}(k-1)+y_{2}^{2}(k-2)}}\\ &\quad+0.3u_{1}(k-2)+u_{1}(k-1)+0.2u_{2}(k-2),\\ y_{2}(k)&={\frac{0.5y_{2}(k-1)\cos(y_{2}(k-2))}{1+y_{2}^{2}(k-1)+y_{1}^{2}(k-2)}}\\ &\quad+0.4u_{2}(k-2)+u_{2}(k-1)+0.5u_{1}(k-2). \end{aligned} $$
(29)

Suppose that the precise mathematical model of the original system is unknown and that the system is reversible. Here \(\alpha_{1}=1, \alpha_{2}=1, m=2, n=2, p_{1}=1, p_{2}=1, q_{1}=2, q_{2}=2.\)
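For reproducibility, the plant (29) can be implemented directly and driven by white noise to generate the input-output data; the zero initial conditions and the sequence-based indexing are implementation choices of this sketch.

```python
import numpy as np

def plant_step(y1, y2, u1, u2, k):
    """One step of the plant (29); the arguments are time-indexed sequences."""
    y1_k = (0.6 * y1[k - 1] * np.sin(y1[k - 2])) \
        / (1 + y1[k - 1] ** 2 + y2[k - 2] ** 2) \
        + 0.3 * u1[k - 2] + u1[k - 1] + 0.2 * u2[k - 2]
    y2_k = (0.5 * y2[k - 1] * np.cos(y2[k - 2])) \
        / (1 + y2[k - 1] ** 2 + y1[k - 2] ** 2) \
        + 0.4 * u2[k - 2] + u2[k - 1] + 0.5 * u1[k - 2]
    return y1_k, y2_k

def simulate(u1, u2):
    """Run (29) from zero initial conditions under input sequences u1, u2."""
    N = len(u1)
    y1, y2 = np.zeros(N), np.zeros(N)
    for k in range(2, N):
        y1[k], y2[k] = plant_step(y1, y2, u1, u2, k)
    return y1, y2
```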

5.1 The internal model control and the open loop control based on inverse system

White noise sequences are applied to the two input channels, and the model above is used to produce 1000 groups of data; 500 groups are used for training and the other 500 for testing. The fitting factors of each group are \(S_{1}=\{y_{1}(k),y_{1}(k-1),y_{1}(k-2),u_{1}(k-2),y_{2}(k-2),u_{2}(k-2)\}\) and \(S_{2}=\{y_{2}(k),y_{2}(k-1),y_{2}(k-2),u_{2}(k-2),y_{1}(k-2),u_{1}(k-2)\},\) respectively. With the RBF kernel function, γ = 1000 and σ2 = 30 are selected, and the inverse submodels are obtained by the novel LS-SVM. The root mean square error index is \(RMSE=\sqrt{{\frac{\sum_{i=1}^{n}x_{i}^{2}}{n}}}.\) The RMSEs of the inverse submodels on the testing data are 0.136 and 0.096. Fig. 4 shows the testing curves and error curves of the approximated inverse models.

Fig. 4

Testing curve and error curve of the inverse model approximated by the novel LS-SVM

Since the subsystems are decoupled and linear, all inverse submodels are assembled into the inverse model of the original system. As the relative orders of the system are \(\alpha_{1}=\alpha_{2}=1,\) the internal model \(G_{m}(z)\) is taken as \(diag\{{\frac{1}{z}},{\frac{1}{z}}\}.\) The simple controller is \(G_{f}(z)=G_{m}^{-1}(z)=diag\{1,1\}\); with the filter \(F(z)=diag\{{\frac{1-l_{i}}{1-l_{i}z^{-1}}}\},\) \(0\leq l_{i} \leq 1, i=1,2,\) the internal model controller becomes \(G_{f}(z)=F(z)G_{m}^{-1}(z).\) We set \(l_{1}=l_{2}=0.5\) in the simulation.

By cascading the inverse system, the multivariable coupled system is decoupled into two pseudo-linear systems. The performance of the open-loop system based on inverse control under a given square-wave reference is shown in Fig. 5. Compared with the open-loop system based only on inverse control under the same reference, the internal model control system achieves good tracking of the square wave. To reduce jitter, we introduce a filter into the simple internal model control system. The tracking curves in the figure show that the open-loop system based on inverse control suffers larger errors and more jitter than the internal model control system. We then use a mixture of sine waves with different frequencies as the reference signal. The tracking performances of the open-loop system and the internal model control system for this reference are shown in Fig. 6, which illustrates a unit-delay tracking of the reference signal. The internal model control system again tracks better than the open-loop system.

Fig. 5

System trajectories under a given square-wave reference. The first subfigure shows the curves of open-loop control, the second the curves of simple internal model control, and the third the curves of internal model control with a filter

Fig. 6

System trajectories under a mixed reference signal. The first subfigure shows the curves of open-loop control, the second the curves of internal model control with a filter

5.2 Robustness of disturbance rejection

At k = 100 and k = 110, the two decoupled subsystems are disturbed, respectively, by an external step signal with amplitude 0.1. The square-wave response of the internal model control system can be observed in Fig. 7: the simple internal model control strategy has good robustness, suppressing the step disturbance and resuming tracking of the reference signal after a short jitter, although the jitter lasts a little longer than expected. After adding a filter, the duration of the jitter is reduced. When the open-loop system is disturbed by the same step signal, it cannot follow the reference signal and deviates from it greatly, which shows weak robustness; the disturbances lead to large steady-state errors in the open-loop system.

Fig. 7

System trajectories with an external step disturbance. The first subfigure shows the curves of open-loop control, the second the curves of simple internal model control, and the third the curves of internal model control with a filter

5.3 Robustness of variable parameters

The parameters of the nonlinear system are now varied, so that the original nonlinear system becomes the following,

$$ \begin{aligned} y_{1}(k)&={\frac{0.8y_{1}(k-1)\sin(y_{1}(k-2))}{1+y_{1}^{2}(k-1)+y_{2}^{2}(k-2)}}\\ &\quad+0.6u_{1}(k-2)+u_{1}(k-1)+0.6u_{2}(k-2),\\ y_{2}(k)&={\frac{0.6y_{2}(k-1)\cos(y_{2}(k-2))}{1+y_{2}^{2}(k-1)+y_{1}^{2}(k-2)}}\\ &\quad+0.6u_{2}(k-2)+u_{2}(k-1)+0.6u_{1}(k-2). \end{aligned} $$
(30)

When the parameters of the nonlinear system vary as described in (30), this is equivalent to a mismatch between the plant and the model. Simulation results are shown in Fig. 8. The simple internal model control can still track the reference, but with a large jitter; with a filter, the jitter is reduced. The parameter variation, however, induces much larger oscillations in the open-loop system.

Fig. 8

System trajectories with parameters varying. The first subfigure shows the curves of open-loop control, the second the curves of simple internal model control, and the third the curves of internal model control with a filter

6 Conclusion

In this study, we first introduce a novel LS-SVM that accounts for noise in both input and output variables, and then present internal model control based on this LS-SVM for MIMO nonlinear discrete systems. The proposed method overcomes the problem of the imperfect mathematical model of the original system and identifies the inverse model accurately from input-output data. The advantage of internal model control is its excellent robustness with respect to disturbance signals and model mismatch, as illustrated in the simulations.