1 Introduction

In recent decades, parameter identification techniques have attracted extensive interest in system modeling, signal processing, control system design and fault diagnosis [5, 18, 24, 34]. In system modeling, block-oriented models (B-O-M) have become one of the most popular nonlinear modeling approaches because they can capture the behavior of practical systems and provide a model structure that is convenient for control system analysis. B-O-M consist of interconnected linear dynamic submodels and static nonlinear submodels, a combination that is flexible to use and easy to understand. The simplest and most widely used combinations are the Hammerstein system (H-S) and the Wiener system (W-S). Their extensions are the Wiener–Hammerstein system (W–H-S) and the Hammerstein–Wiener system (H–W-S), which offer higher modeling capability than the H-S and W-S. This paper focuses on the identification and modeling of the W–H-S depicted in Fig. 1, where \(L_{1}\) is the first linear subsystem, the middle friction nonlinearity is denoted by \(f(\cdot )\), and the third block is the second linear subsystem \(L_{2}\).

Fig. 1 Wiener–Hammerstein systems with friction

Many efforts have been made on the parameter identification and system modeling of W–H-S [6, 11, 20, 26, 30]. Giordano et al. [6] developed a modified fractional approach that splits the poles and zeros between the two linear subsystems of the W–H-S by transforming the discrete problem into a continuous one, and estimated the initial values of the system by the best linear approximation method and the Newton algorithm. Mzyk et al. [20] used a kernel-type nonparametric estimate for the static nonlinear part of the W–H-S and a kernel-correlation estimator for the linear dynamic part. Dong et al. [2] applied a two-step estimation method to obtain the parameters of a micro-positioning stage with a piezoelectric actuator, based on a W–H-S with hysteresis nonlinearity. In Schoukens et al. [23], the W–H-S is approximated by the best linear approximation technique, and the model parameters are estimated by the L–M approach. Under some mild assumptions, these estimation schemes are effective. However, in the aforementioned works the middle nonlinear block of the W–H-S is a linear combination of basis functions (e.g., [20, 23, 30]) or a rational fraction function (e.g., [6]). When the practical system is strongly nonlinear, such a W–H-S may not capture the system dynamics effectively, or may not be valid at all. Therefore, when modeling complex nonlinear systems such as servo systems and X–Y positioning tables, it is necessary to replace this nonlinear block by an appropriate nonlinear element. In [32], Zhang et al. used a switched nonlinear autoregressive moving average eXogenous model for the turntable servo system, with the system parameters estimated by the particle swarm optimization algorithm.
A cascade model for the servo system is presented in [31], in which particle swarm optimization is used to identify the system parameters; cross-validation shows good correspondence between the real equipment and the identified model. Na et al. [22] used improved Lyapunov–Krasovskii functions to handle unknown time delays, together with a novel high-order neural network to model the unknown dead zone and other nonlinearities of a robotic servo system, and then achieved tracking control by developing an adaptive controller. In the above literature, the servo system is modeled by a global approximation method, a partial approximation method or a simple cascade model, which may give unsatisfactory modeling performance because of the complex dynamics of the servo system. Inspired by [2], this paper models the servo system by a W–H-S with a novel friction submodel.

In recent years, the multi-innovation identification idea has proven simple and effective both in theory [17, 25] and in engineering applications [14, 33]. Its basic idea is to expand the scalar innovation into a multi-innovation vector or matrix built from several groups of data, so that the estimator makes fuller use of current and past system information, which can improve the estimation accuracy and convergence speed of the identification algorithm. Multi-innovation theory is a natural extension of scalar innovation: it uses only the available system information and needs no additional design to improve the identification performance. Inspired by this idea, a new reframed identification approach based on intermediary step updating is designed for the identification of Wiener–Hammerstein systems with friction and the modeling of the turntable servo system. The major contributions of this paper are as follows:

(1) A new reframed multi-innovation least-squares (RMILS) scheme is proposed to address the difficulty caused by the multi-innovation length p in the conventional MILS method by using intermediary step updating. The multi-innovation update is transformed into p sub-innovation updates, which avoids the matrix inversion required in the MILS gain computation. Moreover, the consistency proof of the presented estimator shows that the parameter estimates converge to the desired values.

(2) According to the nonlinear characteristics of the turntable servo system, the friction nonlinearity is chosen as the nonlinear submodel rather than a linear combination of basis functions or a rational fraction function, which is more suitable for modeling the turntable servo system.

(3) The efficacy of the developed method is checked on the identification and modeling of the turntable servo system based on the W–H-S with friction nonlinearity. Numerical verification and experimental analysis support the practical use of the presented algorithm.

The remainder of this paper is given as follows. The W–H-S with friction is introduced in Sect. 2. Section 3 derives the RMILS approach. The convergence analysis of the proposed algorithm is given in Sect. 4. Section 5 provides the numerical example and experiment, and discussion is given in Sect. 6. Section 7 shows some conclusions.

2 System Description and Identification Model

The structure of the W–H-S with friction nonlinearity used to model the turntable servo system is displayed in Fig. 1, where u(t) and y(t) denote the input and output of the system, x(t) and v(t) are the outputs of the linear subsystem \(L_{1}\) and of the friction block, respectively, and d(t) represents the noise signal. \(L_{1}\) and \(L_{2}\) can be expressed as

$$\begin{aligned} x(t)= & {} \sum _{i=1}^{n_{a}} a_{i} u(t-i)-\sum _{j=1}^{n_{b}} b_{j} x(t-j), \end{aligned}$$
(1)
$$\begin{aligned} y(t)= & {} \sum _{m=1}^{n_{c}} c_{m} v(t-m)-\sum _{n=1}^{n_{d}} d_{n} y(t-n)+d(t). \end{aligned}$$
(2)

Assume that the linear dynamic subsystems \(L_{1}\) and \(L_{2}\) are stable, \(n_{a}\), \(n_{b}\), \(n_{c}\) and \(n_{d}\) are known. \(a_{1}, \ldots ,a_{n_a},\;b_{1},\ldots ,b_{n_b},\;c_{1},\ldots ,c_{n_c}\) and \(d_{1},\ldots ,d_{n_d}\) are unknown, \(u(t)=0, x(t)=0, v(t)=0\) and \(y(t)=0\) for \(t\le 0\). Let both \(a_{1}=1, c_{1}=1\) for the unique description of identification model [3, 12].
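The difference equations (1) and (2) share the same recursive form, so one routine can simulate either block in the noise-free case. The following is a minimal sketch under the zero initial conditions stated above; the function name `simulate_linear` and the numerical coefficients are hypothetical and chosen only for illustration.

```python
import numpy as np

def simulate_linear(inp, num, den):
    """Run out(t) = sum_i num[i-1]*inp(t-i) - sum_j den[j-1]*out(t-j).

    Samples before the start of the record are taken as zero, matching the
    zero initial conditions assumed in the text. Noise is not included.
    """
    inp = np.asarray(inp, dtype=float)
    out = np.zeros_like(inp)
    for t in range(len(inp)):
        acc = 0.0
        for i, coef in enumerate(num, start=1):
            if t - i >= 0:
                acc += coef * inp[t - i]
        for j, coef in enumerate(den, start=1):
            if t - j >= 0:
                acc -= coef * out[t - j]
        out[t] = acc
    return out

# Hypothetical coefficients (a_1 = 1 as required for a unique description).
u = np.array([1.0, 0.0, 0.0, 0.0])
x = simulate_linear(u, num=[1.0, 0.5], den=[0.2])
```

For the impulse input used here, `x` is the impulse response of the hypothetical block; the same call with `v` as input and the coefficients \(c_{m}\), \(d_{n}\) would give the noise-free part of (2).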

Identifying the parameters of discontinuous or piecewise continuous friction models is time-consuming [1]. Moreover, these models may be problematic for smooth control actions [21]. In this paper, a recently proposed continuously differentiable friction model [19, 29] is applied, which is described by the following parameterized form:

$$\begin{aligned} v(t)&=\gamma _{1}\tanh (\gamma _{2}\dot{x}(t))-\gamma _{1}\tanh (\gamma _{3}\dot{x}(t))+\gamma _{4}\tanh (\gamma _{5}\dot{x}(t))+\gamma _{6}\dot{x}(t), \end{aligned}$$
(3)

where \(\gamma _{i}, i=1,2,\ldots ,6\) represent positive parameters.

Compared with other friction models (e.g., Coulomb, LuGre and GMS), the friction model (3) is continuously differentiable, which makes it better suited to parameter identification. It captures the following major behaviors discussed in friction modeling: (1) \(\gamma _{1}\) and \(\gamma _{4}\) denote the coefficients of the static friction. (2) \(\tanh (\gamma _{2}\dot{x})-\tanh (\gamma _{3}\dot{x})\) captures the so-called Stribeck effect. (3) \(\gamma _{4}\tanh (\gamma _{5}\dot{x})\) dominates the Coulomb friction. (4) The viscous dissipation term is given by \(\gamma _{6}\dot{x}\). For more information on the friction model (3), the reader can refer to [19].
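As an illustration of the shape of (3), the sketch below evaluates the model on a small velocity grid. The function name and the values of \(\gamma _{1},\ldots ,\gamma _{6}\) are assumptions made here for demonstration and are not taken from the paper.

```python
import numpy as np

def friction_continuous(x_dot, gamma):
    """Continuously differentiable friction model, Eq. (3)."""
    g1, g2, g3, g4, g5, g6 = gamma
    x_dot = np.asarray(x_dot, dtype=float)
    stribeck = g1 * (np.tanh(g2 * x_dot) - np.tanh(g3 * x_dot))
    coulomb = g4 * np.tanh(g5 * x_dot)
    viscous = g6 * x_dot
    return stribeck + coulomb + viscous

# Hypothetical positive parameters, for illustration only.
gamma = (0.25, 100.0, 10.0, 0.1, 100.0, 0.01)
velocity = np.linspace(-1.0, 1.0, 5)
force = friction_continuous(velocity, gamma)
```

Because every term is an odd function of the velocity, the computed force is antisymmetric about zero velocity and vanishes there, with no discontinuity at the origin.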

To model the turntable servo system, and based on the properties of the function \(\tanh (\cdot )\), the input–output characteristic of Eq. (3) can be described by the following equation:

$$\begin{aligned} v(t)=\gamma _{6}x(t)-\gamma _{6}x(t-1)+\beta _{1}\hbox {sign}(x(t)-x(t-1))-\beta _{2}\tanh (x(t)-x(t-1)), \end{aligned}$$
(4)

where \(\beta _{1},\beta _{2}\) are positive parameters, \(\hbox {sign}(\cdot )\) denotes the sign function.
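The discrete form (4) depends only on the increment \(x(t)-x(t-1)\), which can be sketched as follows. The function name and the parameter values are hypothetical, and the sample before the start of the record is assumed to be zero.

```python
import numpy as np

def friction_discrete(x, gamma6, beta1, beta2):
    """Discrete friction characteristic, Eq. (4), applied to a whole record."""
    x = np.asarray(x, dtype=float)
    dx = np.diff(x, prepend=0.0)  # x(t) - x(t-1), with x = 0 before the record
    return gamma6 * dx + beta1 * np.sign(dx) - beta2 * np.tanh(dx)

# Hypothetical positive parameters and a short test signal.
x_demo = np.array([0.0, 1.0, 1.0, 0.5])
v_demo = friction_discrete(x_demo, gamma6=0.05, beta1=0.7, beta2=0.2)
```

Whenever the increment is zero, all three terms vanish, so the output is zero while \(x\) stays constant.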

Substituting (1) and (4) into (2) and applying the key-term separation principle [27] yields

$$\begin{aligned} y(t)&=c_{1}\gamma _{6}\sum _{i=1}^{n_{a}} a_{i} u(t-i-1)-c_{1}\gamma _{6}\sum _{j=1}^{n_{b}} b_{j} x(t-j-1)-c_{1}\gamma _{6}x(t-2)\nonumber \\&\quad +c_{1}\beta _{1}\hbox {sign}(x(t-1)-x(t-2))-c_{1}\beta _{2}\tanh (x(t-1)-x(t-2))\nonumber \\&\quad +\sum _{m=2}^{n_{c}} c_{m} v(t-m)-\sum _{n=1}^{n_{d}} d_{n} y(t-n)+d(t). \end{aligned}$$
(5)

In order to obtain the identification model, (5) can be equivalently written as

$$\begin{aligned} y(t)=\varphi ^T(t)\theta +d(t), \end{aligned}$$
(6)

where the data vector \(\varphi (t)\) and the vector \(\theta \) are listed as follows:

$$\begin{aligned} \varphi (t)&=[u(t-2),u(t-3),\ldots ,u(t-n_{a}-1),-x(t-2),-x(t-3),\ldots ,\\&\quad -x(t-n_{b}-1),\hbox {sign}(x(t-1)-x(t-2)), -\tanh (x(t-1)-x(t-2)),\\&\quad v(t-2),v(t-3),\ldots ,v(t-n_{c}),-y(t-1),-y(t-2),\ldots ,-y(t-n_{d})]^\mathrm{T}, \end{aligned}$$

and

$$\begin{aligned} \begin{array}{ll} \theta =&{}[c_{1}a_{1}\gamma _{6},\ldots ,c_{1}a_{n_a}\gamma _{6},c_{1}b_{1}\gamma _{6}+c_{1}\gamma _{6},c_{1}b_{2}\gamma _{6},\ldots c_{1}b_{n_b}\gamma _{6}, c_{1}\beta _{1},c_{1}\beta _{2},\\ &{}c_{2},c_{3},\ldots ,c_{n_c}, d_{1},d_{2},\ldots ,d_{n_d}]^\mathrm{T}, \end{array} \end{aligned}$$

where

$$\begin{aligned} \begin{array}{lll} \gamma _{6}=c_{1}a_{1}\gamma _{6},a_{2}=c_{1}a_{2}\gamma _{6}/\gamma _{6},\ldots ,\ a_{n_a}=c_{1}a_{n_a}\gamma _{6}/\gamma _{6}, b_{1}=(c_{1}b_{1}\gamma _{6}+c_{1}\gamma _{6})/\gamma _{6} -1,\\ b_{2}=c_{1}b_{2}\gamma _{6}/\gamma _{6},\ldots ,b_{n_b}=c_{1}b_{n_b}\gamma _{6}/\gamma _{6}, \beta _{1}=c_{1}\beta _{1},\beta _{2}=c_{1}\beta _{2}. \end{array} \end{aligned}$$
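The regression form (6) and the parameter recovery relations can be checked numerically. The sketch below assumes \(n_{a}=n_{b}=n_{c}=n_{d}=2\), a noise-free system (\(d(t)=0\)) and hypothetical parameter values; the helper `past` and all variable names are introduced here for illustration only.

```python
import numpy as np

# Hypothetical parameters (a_1 = c_1 = 1), for illustration only.
a, b = [1.0, 0.5], [0.3, 0.1]
c, d = [1.0, 0.2], [0.15, 0.4]
gamma6, beta1, beta2 = 0.05, 0.7, 0.2

def past(sig, t, k):
    """Return sig(t-k), or zero before the start of the record."""
    return sig[t - k] if t - k >= 0 else 0.0

rng = np.random.default_rng(0)
N = 50
u = rng.standard_normal(N)
x, v, y = np.zeros(N), np.zeros(N), np.zeros(N)
for t in range(N):
    # Eq. (1), Eq. (4) and the noise-free Eq. (2)
    x[t] = (a[0] * past(u, t, 1) + a[1] * past(u, t, 2)
            - b[0] * past(x, t, 1) - b[1] * past(x, t, 2))
    dx = x[t] - past(x, t, 1)
    v[t] = gamma6 * dx + beta1 * np.sign(dx) - beta2 * np.tanh(dx)
    y[t] = (c[0] * past(v, t, 1) + c[1] * past(v, t, 2)
            - d[0] * past(y, t, 1) - d[1] * past(y, t, 2))

def phi(t):
    """Data vector of Eq. (6) for n_a = n_b = n_c = n_d = 2."""
    dx1 = past(x, t, 1) - past(x, t, 2)
    return np.array([past(u, t, 2), past(u, t, 3),
                     -past(x, t, 2), -past(x, t, 3),
                     np.sign(dx1), -np.tanh(dx1),
                     past(v, t, 2),
                     -past(y, t, 1), -past(y, t, 2)])

theta = np.array([c[0] * a[0] * gamma6, c[0] * a[1] * gamma6,
                  c[0] * b[0] * gamma6 + c[0] * gamma6, c[0] * b[1] * gamma6,
                  c[0] * beta1, c[0] * beta2, c[1], d[0], d[1]])

# Largest mismatch between y(t) and phi(t)^T theta over the record
residual = max(abs(y[t] - phi(t) @ theta) for t in range(N))

# Recover the physical parameters from theta, using a_1 = c_1 = 1
g6_hat = theta[0]
recovered = [theta[1] / g6_hat, theta[2] / g6_hat - 1.0, theta[3] / g6_hat,
             theta[4], theta[5]]  # a_2, b_1, b_2, beta_1, beta_2
```

In this noise-free setting the residual is at rounding level, and the recovered values coincide with the parameters used to build \(\theta \).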

3 Reframed Multi-innovation Least-Squares Algorithm

In the field of system identification, recursive least squares method (RLS) is one of the most popular identification algorithms [8, 9]. The multi-innovation least-squares algorithm (MILS) is designed for the purpose of improving the convergence rate of the RLS [4, 28]. To display the advantage of the RMILS, the MILS will be briefly introduced.

For the identification model (6), defining and minimizing the quadratic criterion function \(J(\theta )=[y(t)-\varphi ^T(t)\theta ]^2\) yields the RLS method, in which \(e(t)=y(t)-\varphi ^T(t)\hat{\theta }(t-1)\) represents the scalar innovation [10]. The scalar innovation e(t) is expanded to the innovation vector (multi-innovation) \(\mathbf{E} (p,t)\) by using the newest p data: \(\mathbf{Y} (p,t)=[y(t),y(t-1),\ldots ,y(t-p+1)]^\mathrm{T}\), \(\mathbf{E} (p,t)=[e(t),e(t-1),\ldots ,e(t-p+1)]^\mathrm{T}\), \(\varvec{\phi }(p,t)=[\varphi (t),\varphi (t-1),\ldots ,\varphi (t-p+1)]\), where p represents the innovation length (\(p\ge 1\)).

Based on the innovation vector, MILS is written as follows:

$$\begin{aligned} \hat{\theta }(t)= & {} \hat{\theta }(t-1)+L(t)[\mathbf{Y} (p,t)-\varvec{\phi }^T(p,t)\hat{\theta }(t-1)], \end{aligned}$$
(7)
$$\begin{aligned} L(t)= & {} P(t-1)\varvec{\phi }(p,t)[I_{p}+\varvec{\phi }^T(p,t)P(t-1)\varvec{\phi }(p,t)]^{-1}, \end{aligned}$$
(8)
$$\begin{aligned} P(t)= & {} P(t-1)-L(t)\varvec{\phi }^T(p,t)P(t-1), P(0)=p_{0}I. \end{aligned}$$
(9)
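One MILS update according to (7)–(9) can be sketched as below. The function name, the test system and all numerical values are hypothetical; the regressors are assumed to be fully known here, so the auxiliary model is not needed in this illustration.

```python
import numpy as np

def mils_step(theta, P, Phi, Y):
    """One MILS update, Eqs. (7)-(9).

    theta: (n,) estimate, P: (n, n) covariance,
    Phi: (n, p) stacked regressors, Y: (p,) stacked outputs.
    """
    p = Phi.shape[1]
    S = np.eye(p) + Phi.T @ P @ Phi          # p x p matrix inverted in Eq. (8)
    # L = P Phi S^{-1}; solve() is used instead of an explicit inverse,
    # relying on the symmetry of P and S.
    L = np.linalg.solve(S, Phi.T @ P).T
    theta_new = theta + L @ (Y - Phi.T @ theta)   # Eq. (7)
    P_new = P - L @ Phi.T @ P                     # Eq. (9)
    return theta_new, P_new

# Usage on a hypothetical noise-free regression y(t) = phi(t)^T theta.
rng = np.random.default_rng(0)
n, p, N = 3, 2, 200
theta_true = np.array([0.5, -1.0, 2.0])
phis = rng.standard_normal((N, n))
ys = phis @ theta_true
theta_hat, P = np.ones(n) / 1e6, 1e6 * np.eye(n)
for t in range(p - 1, N):
    Phi = phis[t - p + 1:t + 1][::-1].T   # columns phi(t), ..., phi(t-p+1)
    Y = ys[t - p + 1:t + 1][::-1]         # entries y(t), ..., y(t-p+1)
    theta_hat, P = mils_step(theta_hat, P, Phi, Y)
```

The cost of each step is dominated by the \(p\times p\) linear solve, which is the operation that the reframed scheme of this section is designed to avoid.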

Note that (7), (8) and (9) involve the unknown vector \(\varphi (t)\) in \(\varvec{\phi }(p,t)\), because \(\varphi (t)\) contains the unmeasured variables \(x(t-i)\) and \(v(t-i)\). One solution is the auxiliary model method [15, 16], whose basic idea is to replace the unknown variables by their estimates, i.e., \(x(t-i)\) is replaced by \(\hat{x}(t-i)\), where \(\hat{x}(t-i)\) is the estimate of \(x(t-i)\). Then, the information vector \(\varphi (t)\) in (6) is replaced by \(\hat{\varphi }(t)\), so that \(\varvec{\phi }(p,t)\) also becomes known.

In the MILS method, the inverse of the \(p\times p\) matrix in Eq. (8) must be calculated at every step, which leads to a large amount of computation and may become numerically ill-conditioned. To tackle this problem, the multi-innovation update is divided into p sub-innovation updating steps by introducing intermediary step updating. The matrix inversion is thereby reduced to a scalar division, which decreases the computational burden of the algorithm and improves the identification performance of the method. The decomposition is described as follows.

For \(i=1,\ldots ,p\), the matrix to be inverted in (8) is replaced by the scalar \(1+\varphi ^T(t,i)P_{i-1}(t)\varphi (t,i)\). To this end, y(t) and \(\varphi (t)\) are re-indexed as \(y(t,i)\) and \(\varphi (t,i)\), and the RMILS is written as follows:

$$\begin{aligned} \hat{\theta }_{i}(t)= & {} \hat{\theta }_{i-1}(t)+L_{i}(t)e(t,i), \end{aligned}$$
(10)
$$\begin{aligned} e(t,i)= & {} y(t,i)-\varphi ^T(t,i)\hat{\theta }_{i-1}(t),\end{aligned}$$
(11)
$$\begin{aligned} L_{i}(t)= & {} P_{i-1}(t)\varphi (t,i)[1+\varphi ^T(t,i)P_{i-1}(t)\varphi (t,i)]^{-1},\end{aligned}$$
(12)
$$\begin{aligned} P_{i}(t)= & {} P_{i-1}(t)-L_{i}(t)\varphi ^T(t,i)P_{i-1}(t),\end{aligned}$$
(13)
$$\begin{aligned} P_{0}(t)= & {} P_{p}(t-1),L_{0}(t)=L_{p}(t-1),\hat{\theta }_{0}(t)=\hat{\theta }_{p}(t-1). \end{aligned}$$
(14)

Remark 1

Compared with the MILS method, the RMILS approach extracts as much system information as possible through p sub-innovation update steps under the same conditions, without computing a matrix inverse, which effectively improves the performance of the parameter identification. In the iterative algorithm [30], the parameter estimate is updated by using L sets of data at a time, and this is repeated k times. In the presented method, the parameter estimate is updated by using one set of data at a time, for i from 1 to p. For the same system, the innovation length p is in general much smaller than the number of iterations k.

In Eqs. (10)–(14), i represents the sub-innovation updating step, and \(\hat{\theta }_{0}(0)=\mathbf 1 /p_{0}\), where \(\mathbf 1 \) is a column vector of ones. When \(i=p\) is reached, the updating for the next time instant starts.
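The sub-innovation loop (10)–(14) can be sketched as follows. This is an illustrative reading rather than the authors' implementation: it assumes that column i of `Phi` holds \(\varphi (t,i)\) and that `Y[i]` holds \(y(t,i)\), and the function name and test data are hypothetical.

```python
import numpy as np

def rmils_step(theta, P, Phi, Y):
    """One RMILS time step made of p scalar sub-updates, Eqs. (10)-(14).

    theta: (n,) estimate, P: (n, n) symmetric covariance,
    Phi: (n, p) with column i = phi(t, i), Y: (p,) with Y[i] = y(t, i).
    """
    theta = theta.copy()
    P = P.copy()
    for i in range(Phi.shape[1]):
        phi = Phi[:, i]
        Pphi = P @ phi
        gain = Pphi / (1.0 + phi @ Pphi)             # Eq. (12): scalar division
        theta = theta + gain * (Y[i] - phi @ theta)  # Eqs. (10) and (11)
        P = P - np.outer(gain, Pphi)                 # Eq. (13), using P = P^T
    # The returned pair initializes the next time step, as in Eq. (14).
    return theta, P

# Usage on hypothetical random data.
rng = np.random.default_rng(0)
n, p = 3, 4
Phi = rng.standard_normal((n, p))
Y = rng.standard_normal(p)
theta0, P0 = np.zeros(n), 10.0 * np.eye(n)
theta1, P1 = rmils_step(theta0, P0, Phi, Y)
```

No matrix inverse or linear solve appears in the loop; each sub-update needs only matrix–vector products and one scalar division.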

4 Convergence Analysis

In this section, the convergence of the RMILS scheme is studied. Assume that \(\left\{ d(t),\mathscr {F}_{t}\right\} \) is a martingale difference sequence, in which the \(\sigma \)-algebra sequence \(\left\{ \mathscr {F}_{t}\right\} \) is generated by \(\left\{ d(t)\right\} \), and the noise \(\left\{ d(t)\right\} \) satisfies [7]

(A1) \(E[d(t)|\mathscr {F}_{t-1}]=0, \hbox {a.s.}\),

(A2) \(E[\Vert d(t)\Vert ^2|\mathscr {F}_{t-1}]=\sigma ^2_{d}(t)\le \sigma _{d}^2<\infty , \hbox {a.s.}\),

(A3) \(\limsup \limits _{t\rightarrow \infty } \frac{1}{t}\sum \limits _{i=1}^{t}\Vert d(i)\Vert ^2\le \sigma _{d}^2<\infty ,\hbox {a.s.}\).

Theorem 1

For the system (6) and the RMILS algorithm (10)–(14), assume that the input u(t) satisfies the persistent excitation condition \(\alpha I\le (1/t)\sum \nolimits _{j=1}^{t}\sum \nolimits _{i=1}^{p}\hat{\varphi }(j,i)\hat{\varphi }^T(j,i)\le \beta I\) for \(t\ge t_{0}\), and that (A1)–(A3) hold. Then the parameter estimation error \(\widetilde{\theta }(t)=\hat{\theta }_{p}(t)-\theta \) converges to zero, i.e., \(\lim \nolimits _{t\rightarrow \infty }\Vert \hat{\theta }_{p}(t)-\theta \Vert ^2=0 , \;\hbox {a.s.}\).

Proof

Define the parameter error vector \(\widetilde{\theta }_{i}(t)=\hat{\theta }_{i}(t)-\theta ,\; i=1,2,\ldots ,p\). Subtracting \(\theta \) from both sides of (10) yields

$$\begin{aligned} \widetilde{\theta }_{i}(t)&=\widetilde{\theta }_{i-1}(t)+P_{i}(t)\hat{\varphi }(t,i)[\varphi ^T(t,i)\theta -\hat{\varphi }^T(t,i)\hat{\theta }_{i-1}(t)]+P_{i}(t)\hat{\varphi }(t,i)d(t,i)\nonumber \\&=\widetilde{\theta }_{i-1}(t)+P_{i}(t)\hat{\varphi }(t,i)\{-\hat{\varphi }^T(t,i)\widetilde{\theta }_{i-1}(t)+[\varphi (t,i)-\hat{\varphi }(t,i)]^\mathrm{T}\theta +d(t,i)\}\nonumber \\&=\widetilde{\theta }_{i-1}(t)+P_{i}(t)\hat{\varphi }(t,i)[-\widetilde{y}(t,i)+\triangle (t)+d(t,i)], \end{aligned}$$
(15)

where \(\widetilde{y}(t,i)=\hat{\varphi }^T (t,i)\hat{\theta }_{i-1}(t)-\hat{\varphi }^T (t,i)\theta =\hat{\varphi }^T(t,i)\widetilde{\theta }_{i-1}(t)\), \(\triangle (t)=[\varphi (t,i)-\hat{\varphi }(t,i)]^\mathrm{T}\theta \).
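The gain in (15) is written as \(P_{i}(t)\hat{\varphi }(t,i)\) rather than in the form (12). The following short derivation is a sketch of how the two forms connect, assuming that the regressor in (12) and (13) is the auxiliary-model vector \(\hat{\varphi }(t,i)\). The arguments are dropped and the shorthand \(s=\hat{\varphi }^TP_{i-1}\hat{\varphi }\) is introduced here only for brevity:

$$\begin{aligned} P_{i}&=P_{i-1}-\frac{P_{i-1}\hat{\varphi }\hat{\varphi }^TP_{i-1}}{1+s}\quad \text {(from (12) and (13))},\\ P_{i}\hat{\varphi }&=P_{i-1}\hat{\varphi }-\frac{P_{i-1}\hat{\varphi }\,s}{1+s}=\frac{P_{i-1}\hat{\varphi }}{1+s}=L_{i},\\ \hat{\varphi }^TP_{i}\hat{\varphi }&=\frac{s}{1+s},\qquad 1-\hat{\varphi }^TP_{i}\hat{\varphi }=\frac{1}{1+s},\\ P_{i}\left( P_{i-1}^{-1}+\hat{\varphi }\hat{\varphi }^T\right)&=I-L_{i}\hat{\varphi }^T+L_{i}\hat{\varphi }^T=I. \end{aligned}$$

The last line gives \(P_{i}^{-1}=P_{i-1}^{-1}+\hat{\varphi }\hat{\varphi }^T\), which is the rank-one update of the inverse covariance on which the rest of the proof relies.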

Based on (15) and \(P_{i}^{-1}(t)=P_{i-1}^{-1}(t)+\hat{\varphi }(t,i)\hat{\varphi }^T(t,i)\), we obtain

$$\begin{aligned} \widetilde{\theta }_{i}^T(t)P_{i}^{-1}(t)\widetilde{\theta }_{i}(t)&=\{\widetilde{\theta }_{i-1}(t)+P_{i}(t)\hat{\varphi }(t,i)[-\widetilde{y}(t,i)+\triangle (t)+d(t,i)]\}^TP_{i}^{-1}(t)\nonumber \\&\quad \times \{\widetilde{\theta }_{i-1}(t)+P_{i}(t)\hat{\varphi }(t,i)[-\widetilde{y}(t,i)+\triangle (t)+d(t,i)]\}\nonumber \\&=\widetilde{\theta }_{i-1}^T(t)[P_{i-1}^{-1}(t)+\hat{\varphi }(t,i)\hat{\varphi }^T(t,i)]\widetilde{\theta }_{i-1}(t)+2\widetilde{y}(t,i)[-\widetilde{y}(t,i) \nonumber \\&\quad +\triangle (t)+d(t,i)]+\hat{\varphi }^T(t,i)P_{i}(t)\hat{\varphi }(t,i)[\widetilde{y}(t,i)^2+d(t,i)^2\nonumber \\&\quad +\triangle (t)^2-2\widetilde{y}(t,i)d(t,i)-2\widetilde{y}(t,i)\triangle (t)+2\triangle (t)d(t,i)]\nonumber \\&=\widetilde{\theta }_{i-1}^T(t)P_{i-1}^{-1}(t)\widetilde{\theta }_{i-1}(t)-[1-\hat{\varphi }^T(t,i)P_{i}(t)\hat{\varphi }(t,i)]\widetilde{y}(t,i)^2\nonumber \\&\quad +2[1-\hat{\varphi }^T(t,i)P_{i}(t)\hat{\varphi }(t,i)]\widetilde{y}(t,i)[\triangle (t)+d(t,i)]\nonumber \\&\quad +\hat{\varphi }^T(t,i)P_{i}(t)\hat{\varphi }(t,i) \times [d(t,i)^2+\triangle (t)^2+2\triangle (t)d(t,i)]. \end{aligned}$$
(16)

Assume that \(\{\triangle (t)\}\) is bounded, i.e., \(\triangle (t)^2\le \varepsilon <\infty \). Note that \(\widetilde{\theta }_{i-1}^T(t)P_{i-1}^{-1}(t)\widetilde{\theta }_{i-1}(t)\), \(\widetilde{y}(t,i)\), \(\hat{\varphi }^T(t,i)P_{i}(t)\hat{\varphi }(t,i)\) and \(\triangle (t)\) are uncorrelated with \(d(t,i)\), and that \(1-\hat{\varphi }^T(t,i)P_{i}(t)\hat{\varphi }(t,i)=[1+\hat{\varphi }^T(t,i)P_{i-1}(t)\hat{\varphi }(t,i)]^{-1}\ge 0\). Define the function \(W_{i}(t)=E[\widetilde{\theta }_{i}^T(t)P_{i}^{-1}(t)\widetilde{\theta }_{i}(t)]\). Taking the conditional expectation of both sides of (16) with respect to \(\mathscr {F}_{t-1}\) and using Assumptions (A1)–(A3), we have

$$\begin{aligned} W_{i}(t)\le W_{i-1}(t)+E\{\hat{\varphi }^T(t,i)P_{i}(t)\hat{\varphi }(t,i)[d(t,i)^2+\triangle (t)^2]\}. \end{aligned}$$
(17)

Summing (17) over i from 1 to p and using \(W_{0}(t)=W_{p}(t-1)\), which follows from (14), we obtain

$$\begin{aligned} W_{p}(t)&\le W_{p}(t-1)+\sum \limits _{i=1}^{p}E\{\hat{\varphi }^T(t,i)P_{i}(t)\hat{\varphi }(t,i)[\sigma _{d}^2+\varepsilon ]\}\nonumber \\&\le W_{p}(0)+E\left\{ \sum \limits _{i=1}^{p}\sum \limits _{j=1}^{t}\hat{\varphi }^T(j,i)P_{i}(j)\hat{\varphi }(j,i)[\sigma _{d}^2+\varepsilon ]\right\} . \end{aligned}$$
(18)

Since \(P_{i-1}^{-1}(t)=P_{i}^{-1}(t)-\hat{\varphi }(t,i)\hat{\varphi }^T(t,i)\), we have

$$\begin{aligned} |P_{i-1}^{-1}(t)|= & {} |P_{i}^{-1}(t)||I-P_{i}(t)\hat{\varphi }(t,i)\hat{\varphi }^T(t,i)|,\\ \hat{\varphi }^T(t,i)P_{i}(t)\hat{\varphi }(t,i)= & {} \frac{|P_{i}^{-1}(t)|-|P_{i-1}^{-1}(t)|}{|P_{i}^{-1}(t)|}=\int _{|P_{i-1}^{-1}(t)|}^{|P_{i}^{-1}(t)|}\frac{\text {d}x}{|P_{i}^{-1}(t)|}. \end{aligned}$$
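The determinant identity above can be checked numerically for a single rank-one update. The sketch below uses a hypothetical random positive definite matrix in place of \(P_{i-1}^{-1}(t)\) and a random vector in place of \(\hat{\varphi }(t,i)\).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))
P_prev_inv = A @ A.T + np.eye(n)         # stands in for P_{i-1}^{-1}(t)
phi = rng.standard_normal(n)             # stands in for the regressor
P_inv = P_prev_inv + np.outer(phi, phi)  # rank-one update of the inverse
P = np.linalg.inv(P_inv)

lhs = phi @ P @ phi
rhs = (np.linalg.det(P_inv) - np.linalg.det(P_prev_inv)) / np.linalg.det(P_inv)
```

Both sides agree to rounding error, and the common value lies between 0 and 1 because the determinant cannot decrease under a positive semidefinite rank-one update.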

Replacing t with j, and summing for i from 1 to p yield

$$\begin{aligned} \sum \limits _{i=1}^{p}\sum \limits _{j=1}^{t}\hat{\varphi }^T(j,i)P_{i}(j)\hat{\varphi }(j,i) \le \sum \limits _{i=1}^{p}\int _{|P_{i}^{-1}(0)|}^{|P_{i}^{-1}(t)|}\frac{\text {d}x}{x}=\sum \limits _{i=1}^{p}[\ln |P_{i}^{-1}(t)|+n\ln {p_{0}}]. \end{aligned}$$
(19)

Applying the persistent excitation condition \(\alpha I\le 1/t\sum _{i=1}^{p}\sum _{j=1}^{t}\hat{\varphi }(j,i)\hat{\varphi }^T(j,i) \le \beta I, t\ge t_{0}\) to \(P_{i}^{-1}(t)=\sum _{i=1}^{p}\sum _{j=1}^{t}\hat{\varphi }(j,i)\hat{\varphi }^T(j,i)+P_{i}^{-1}(0)\), substituting \(|P_{i}^{-1}(t)|\) into (19), we have

$$\begin{aligned} \sum \limits _{i=1}^{p}\sum \limits _{j=1}^{t}\hat{\varphi }^T(j,i)P_{i}(j)\hat{\varphi }(j,i)&\le \sum \limits _{i=1}^{p}[\ln |P_{i}^{-1}(t)|+n\ln p_{0}]\nonumber \\&\le pn\ln (\beta t+1/p_{0})+pn\ln p_{0}. \end{aligned}$$
(20)

According to the definition of \(W_{i}(t)\), we have

$$\begin{aligned} W_{p}(t)\ge p(\alpha t+1/p_{0})E[\Vert \widetilde{\theta }(t)\Vert ]^2. \end{aligned}$$
(21)

Since \(W_{p}(0)=E[\widetilde{\theta }^T(0)P^{-1}(0)\widetilde{\theta }(0)]=p n/p_{0}^2\), substituting (20), (21) into (18) yields

$$\begin{aligned}&p(\alpha t+1/p_{0})E[\Vert \widetilde{\theta }(t)\Vert ]^2\le W_{p}(t)\nonumber \\&\quad \le W_{p}(0)+E\left\{ \sum \limits _{i=1}^{p}\sum \limits _{j=1}^{t}\hat{\varphi }^T(j,i)P_{i}(j)\hat{\varphi }(j,i)[\sigma _{d}^2+\varepsilon ]\right\} \nonumber \\&\quad \le p n/p_{0}^2+[pn\ln (\beta t+1/p_{0})+pn\ln p_{0}][\sigma _{d}^2+\varepsilon ]. \end{aligned}$$
(22)

Dividing (22) by \(p(\alpha t+1/p_{0})\) and letting \(t\rightarrow \infty \), we have

$$\begin{aligned} \lim \limits _{t\rightarrow \infty }[\Vert \widetilde{\theta }(t)\Vert ]^2&\le \lim \limits _{t\rightarrow \infty } \frac{n/p_{0}^2+[n\ln (\beta t+1/p_{0})+n\ln p_{0}][\sigma _{d}^2+\varepsilon ]}{(\alpha t+1/p_{0})}=0. \end{aligned}$$

This completes the proof of Theorem 1. \(\square \)
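The final bound decays because its numerator grows only logarithmically in t while its denominator grows linearly. The sketch below evaluates the right-hand side of the last inequality for hypothetical constants; none of the values of \(\alpha \), \(\beta \), \(p_{0}\), \(\varepsilon \) or the noise variance is taken from the paper.

```python
import numpy as np

def error_bound(t, n, p0, alpha, beta, sigma2, eps):
    """Right-hand side of the final bound in the proof, as a function of t."""
    numerator = n / p0**2 + (n * np.log(beta * t + 1.0 / p0)
                             + n * np.log(p0)) * (sigma2 + eps)
    return numerator / (alpha * t + 1.0 / p0)

# Hypothetical constants, for illustration only.
times = np.array([1e2, 1e4, 1e6, 1e8])
bounds = error_bound(times, n=9, p0=1e6, alpha=0.5, beta=2.0,
                     sigma2=0.1, eps=0.01)
```

For these constants the bound decreases monotonically over the chosen time points, roughly like \(\ln t/t\).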

5 Illustrative Example and Application

5.1 Illustrative Example

Consider the following W–H-S with friction block:

  • \(L_{1}: x(t)=u(t-1)+2.2u(t-2)-1.0x(t-1)-0.5x(t-2)\),

  • \(L_{2}: y(t)=v(t-1)+0.2v(t-2)-0.15y(t-1)-0.4y(t-2)+d(t),\)

  • \(\hbox {Friction:}\; v(t)=0.05x(t)-0.05x(t-1)+0.7\hbox {sign}(x(t)-x(t-1))-0.2\tanh (x(t)-x(t-1))\).

As can be seen from the example above, the true parameter values are \(a_{1}=1\), \(a_{2}=2.2\), \(b_{1}=1.0\), \(b_{2}=0.5\), \(c_{1}=1\), \(c_{2}=0.2\), \(d_{1}=0.15\), \(d_{2}=0.4\), \(\gamma _{6}=0.05\), \(\beta _{1}=0.7\), \(\beta _{2}=0.2\). In the simulation, the input u(t) is a persistently exciting signal with mean \(\mu =0\) and variance \(\sigma ^2=1\). The white noise d(t) is uncorrelated with the input, and the signal-to-noise ratio is \(\hbox {SNR}=7.4620\). The sample length is \(N=1000\), the innovation length is \(p=4\), and the initial value is chosen as \(\hat{\theta }(0)=\mathbf 1 _{9}/10^2\), where \(\mathbf 1 _{9}\) is the nine-dimensional column vector of ones.

The MILS and RMILS algorithms are applied to estimate the parameter vector \(\theta \) of the W–H-S with friction, and the parameter convergence histories of both approaches are compared in Fig. 2a–c. The estimates (solid lines) move quickly toward their expected values (dashed black lines) during samples 0 to 500, and come close to the true values after about sample 800. To show the evolution of the identification, the identification errors (\(\delta =\Vert \hat{\theta }(t)-\theta \Vert /\Vert \theta \Vert \)) of the MILS algorithm (solid blue line) and the RMILS algorithm (solid red line) are shown in Fig. 2d. The estimation errors of both methods decrease rapidly in the initial stage of parameter estimation and settle at small steady values by about sample 1000. The results in Fig. 2 indicate that the proposed RMILS algorithm performs better than the MILS algorithm in both convergence speed and estimation precision.

Fig. 2 Estimation curves by both identification methods

To test the parameter estimates, model validation is carried out with the identification results of both algorithms, as shown in Fig. 3. The estimated model outputs (dashed red lines) capture the dynamics of the actual outputs (solid blue lines), which validates the usefulness of both algorithms. Compared with the MILS algorithm, the proposed algorithm gives better predictive performance with a smaller model error.

Fig. 3 Model validation for both algorithms

The following indices are used to quantitatively assess the estimator; smaller values indicate a better estimator. (1) Normalized mean squared error (NMSE), \(\hbox {NMSE}=\frac{1}{N'\sigma _{y}^2}\sum \nolimits _{t=1}^{N'}(y_{t}-\hat{y}_{t})^2\); (2) prediction error mean, \(\mu _\mathrm{e}=\frac{1}{N'}\sum \nolimits _{t=1}^{N'}e_\mathrm{sim}(t)\); (3) root mean square error, \(e_\mathrm{RMSe}=\sqrt{\frac{1}{N'}\sum \nolimits _{t=1}^{N'}e_\mathrm{sim}(t)^2}\), where \(N'\) represents the sample length, \(\sigma _{y}^2\) is the output variance and \(e_\mathrm{sim}(t)=y_{t}-\hat{y}_{t}\) is the prediction error [19].
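The three indices can be computed as in the following sketch. The function name and the toy data are hypothetical, and the output variance \(\sigma _{y}^2\) is assumed here to be the population variance of the measured output.

```python
import numpy as np

def error_indices(y, y_hat):
    """Return the prediction error mean, the RMS error and the NMSE."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    e_sim = y - y_hat                            # prediction error e_sim(t)
    return {
        "mean": e_sim.mean(),                    # mu_e
        "rmse": np.sqrt(np.mean(e_sim**2)),      # e_RMSe
        "nmse": np.mean(e_sim**2) / np.var(y),   # NMSE (population variance)
    }

# Toy data chosen so that the indices are easy to verify by hand.
indices = error_indices([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 2.0])
```

For the toy data the error sequence is (0, 0, 0, 2), which gives a mean of 0.5, an RMS error of 1 and, with output variance 1.25, an NMSE of 0.8.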

Based on the definitions of \(\mu _\mathrm{e},e_\mathrm{RMSe}\) and NMSE, the performance indices of the identification algorithms are listed in Table 1. The indices are small for both algorithms, which indicates that they produce accurate parameter estimates. Table 1 also shows that the RMILS method outperforms the MILS approach.

Table 1 Performance indices of the identification algorithms

5.2 Application

The simulation results show that the proposed algorithm has advantages over the MILS in convergence speed and estimation accuracy. To confirm the usefulness of the presented estimator, the RMILS is used to model a turntable servo system based on the W–H-S with friction. The turntable system used as the test rig is depicted in Fig. 4. In this experimental platform, the motor and the stabilized platform can each be regarded as a dynamic subsystem, and friction is present in both. In short, the turntable servo system has the structure of a W–H-S with friction, so this model can be used to describe it. In this rig, the motion of the x-axis (or y-axis) is driven by a permanent magnet synchronous motor (HC-UFS13). A digital signal processor (TMS3202812) is used as the controller and a motor drive card (MR-J2S-10A) serves as the driver; the resolution of the encoder is 800 divisions. The experimental results are displayed on a Pentium 2.8 GHz PC.

Fig. 4 Turntable servo system

Based on the structure of the turntable system, the Wiener–Hammerstein model can be written as follows:

(1) Linear submodel \(L_{1}\) for the motor system: \(x(t)=u(t-1)+A_{2}u(t-2)-B_{1}x(t-1)-B_{2}x(t-2),\)

(2) Friction model \(f(\cdot )\) for the actual nonlinearity: \(v(t)=\alpha _{1}x(t)-\alpha _{1}x(t-1)+\alpha _{2}\hbox {sign}(x(t)-x(t-1))-\alpha _{3}\tanh (x(t)-x(t-1)),\)

(3) Linear submodel \(L_{2}\) for the stabilized platform (load): \(y(t)=v(t-1)+C_{2}v(t-2)-D_{1}y(t-1)-D_{2}y(t-2)\).

In this experiment, a PID algorithm is employed as the controller, with parameters \(k_{p}=40, k_{i}=0.001,k_{d}=0.2\). The reference signal \(y_{d}=0.8\sin (2/5\pi t)\) is taken as the input signal. The sampling period is 0.01 s. The initial value of \(\theta \) is \(\theta _{0}=[0.04,0.06,0.044,0.0058,0.001,0.036,0.001,0.11,0.01]^\mathrm{T}\), the innovation length is \(p=4\), and the initial covariance matrix is \(P=10^6I\).

The parameter identification profiles of the MILS and RMILS algorithms are compared in Fig. 5. With both schemes the estimated parameters oscillate in the initial stage of parameter estimation and then approach steady values as time increases. The proposed approach gives smaller oscillations and faster convergence than the MILS algorithm. From Fig. 5, the corresponding estimated values are \(\hat{A}_{2}=0.8545\), \(\hat{B}_{1}=-0.9703\), \(\hat{B}_{2}=0.6861\), \(\hat{\alpha }_{1}=0.0436\), \(\hat{\alpha }_{2}=0.006\), \(\hat{\alpha }_{3}=0.0532\), \(\hat{C}_{2}=-0.7049\), \(\hat{D}_{1}=0.3246\) and \(\hat{D}_{2}=0.4869\).

Fig. 5 Comparative estimations by both approaches

To illustrate the superiority of the RMILS scheme, Fig. 6 shows the tracking outputs and errors of both algorithms based on the estimation results. The model outputs track the reference output, and the presented scheme gives better tracking performance than the MILS algorithm. The tracking results indicate that the W–H-S with friction can model the turntable servo system, and also demonstrate the advantage of the RMILS. Table 2 compares the model error indices of the RMILS scheme and the MILS algorithm. The values of \(\mu _\mathrm{e}, e_\mathrm{RMSe}\) and NMSE show that the developed method gives more accurate modeling results.

Fig. 6 Track output and error by both approaches

Table 2 Comparison of output error indices

6 Discussion

Compared with our previous work in [13], this paper differs in the following respects. Firstly, the purpose and the system characteristics are different: [13] considers the parameter estimation of Wiener–Hammerstein systems with backlash, in which the backlash is a piecewise nonlinearity. In this paper, the Wiener–Hammerstein system with friction is used to model the turntable servo system, the friction is a recently proposed continuously differentiable nonlinearity, and the parameters of the Wiener–Hammerstein model are identified from actual data of the turntable servo system rather than from simulated data as in [13]. Secondly, the starting point for improving the algorithm is different: the estimator in [13] is of gradient type and was designed to address the multi-innovation length problem of the gradient method, whereas the identifier in this paper is of least-squares type and is designed to address the inversion of the covariance matrix in the least-squares algorithm.

7 Conclusion

In this paper, an RMILS approach for estimating and modeling Wiener–Hammerstein systems with friction nonlinearity has been proposed by using multi-step updating. To obtain the identification model of the turntable servo system, the key-term separation principle is applied to the W–H model, whose blocks describe the corresponding components of the turntable servo system. The RMILS scheme then estimates the parameters of this model from the measured data of the turntable system. By decomposing the multi-innovation into p sub-innovations, the RMILS algorithm effectively handles the inversion of the covariance matrix. Compared with the MILS algorithm, the simulation example and the experimental results indicate the superiority of the proposed approach.