1 Introduction

Parameter estimation plays an important role in controller design [13, 14], because controller design for dynamic systems is usually established on the premise that the system parameters are known [19, 28]. Compared with linear systems, nonlinear systems are more prevalent in engineering practice [6, 10]; they can be roughly divided into four categories: Hammerstein systems [36], Wiener systems [18], Hammerstein–Wiener systems and Wiener–Hammerstein systems [2, 35]. Recently, many identification algorithms have been developed for these nonlinear systems, such as the stochastic gradient (SG) algorithms [39], the expectation maximization algorithms and the iterative algorithms [4]. The SG algorithm updates the parameter estimates using only the latest input–output data at each sampling instant and does not need to compute an inverse matrix; thus, it has a low computational load. Its variants include the multi-innovation stochastic gradient algorithms [16, 42] and the gradient-based iterative algorithms [12].

The idea of the gradient-based identification algorithms is first to determine the search direction and then to calculate the step size at each sampling instant. Although the computational effort of the SG algorithm is small, its convergence rate is slow because of its zigzag search directions. In general, there are two ways to improve the convergence rate. One is to obtain the optimal direction at each sampling instant. For example, for control problems with undetermined final time, Hussu provided a conjugate-gradient method [20]. The other is to compute a suitable step size at each sampling instant. For instance, Chen and Ding introduced a convergence index into a modified stochastic gradient (M-SG) algorithm to improve the convergence rate [7]. Ma et al. [30] studied a forgetting factor stochastic gradient (FF-SG) algorithm for Hammerstein systems with saturation and preload nonlinearities. Although the M-SG and FF-SG algorithms can improve the convergence rates, they also bring some issues, such as severe oscillation when the parameter estimates approach the true values [24].

One may ask whether it is feasible to develop a modified SG algorithm which can not only estimate the parameters quickly, but also decrease the variances of the estimation errors. To this end, the Aitken method is introduced in this paper. The Aitken method is a sequence acceleration technique, and it is efficient for accelerating sequences that converge linearly. For example, Pavaloiu et al. [34] studied an Aitken–Newton iterative method for nonlinear equations, which is more competitive than some optimization methods of the same convergence order. Bumbariu [5] developed an improved Aitken acceleration method that computes the solutions of nonlinear equations with fast convergence rates. The approaches proposed in this paper have the following features.

  1. Using the key term separation method, which transforms the complex Hammerstein system with piecewise linearity into a simplified regression model.

  2. Studying an FF-SG algorithm for this nonlinear system, which improves the convergence rate.

  3. Developing an Aitken-based SG algorithm, which has a quick convergence rate and small estimation error variances.

  4. Extending the proposed methods to identify Hammerstein systems with colored noise.

The remainder of this paper is organized as follows. Section 2 introduces the Hammerstein model. Section 3 presents some SG algorithms. Section 4 studies the Aitken-based SG algorithm for the piecewise linear system with colored noise. In Sect. 5, two illustrative examples are provided. Section 6 gives the conclusions of this paper and directions for future research.

2 The Hammerstein System with Piecewise Linearity

The piecewise linear system is a special kind of switching system which widely exists in engineering practice [27, 37]. Such a system can be used to model, or approximately describe, processes with different gains in different input intervals, e.g., flight control, circuit and biological systems [26, 31].

Consider the Hammerstein system with piecewise linearity as follows:

$$\begin{aligned} A(\zeta )y(\tau )= & {} B(\zeta )f(q(\tau ))+v(\tau ), \end{aligned}$$
(1)

where \(q(\tau )\) is the input, taken as a persistent excitation signal sequence with zero mean and unit variance, \(y(\tau )\) is the output, \(v(\tau )\) is a white noise with zero mean and variance \(\sigma ^2\), and the piecewise linearity \(f(q(\tau ))\), shown in Fig. 1, can be written as

$$\begin{aligned} f(q(\tau ))= \left\{ \begin{array}{ll} m_1q(\tau ), &{} \quad q(\tau )\geqslant 0,\\ m_2q(\tau ), &{} \quad q(\tau )<0,\\ \end{array}\right. \end{aligned}$$
(2)
Fig. 1 The piecewise linearity

where the corresponding segment slopes are \(m_1\) and \(m_2\).

The polynomials \(A(\zeta )\) and \(B(\zeta )\) are expressed as

$$\begin{aligned} A(\zeta )= & {} 1+a_1\zeta ^{-1}+a_2\zeta ^{-2}+\cdots +a_n\zeta ^{-n},\\ B(\zeta )= & {} b_0+b_1\zeta ^{-1}+\cdots +b_{n-1}\zeta ^{1-n}. \end{aligned}$$

Since the piecewise linearity is expressed by two equations, the Hammerstein system may be described by two models [3]; the considered Hammerstein model is then equivalent to a switching model [1]. It is well known that the identification of switching models is more challenging. In order to simplify the identification process, the key term separation method is introduced [8, 9].

Define a switching function,

$$\begin{aligned} s(\tau ):=s[q(\tau )]=\left\{ \begin{array}{ll}0, &{}\quad q(\tau )>0, \\ 1, &{}\quad q(\tau )\le 0. \end{array}\right. \end{aligned}$$

Then, the nonlinear part \(f(q(\tau ))\) of the input is written as

$$\begin{aligned} f(q(\tau ))= & {} m_1s(-q(\tau ))q(\tau )+m_2s(q(\tau ))q(\tau ). \end{aligned}$$
(3)

The nonlinear model can be written as

$$\begin{aligned} A(\zeta )y(\tau )= & {} B(\zeta )m_1s(-q(\tau ))q(\tau )+B(\zeta )m_2s(q(\tau ))q(\tau )+v(\tau ). \end{aligned}$$
(4)

Define the information vector \(\varvec{{\chi }}(\tau )\) and the parameter vector \(\varvec{{\xi }}\) as

$$\begin{aligned} \varvec{{\chi }}(\tau )= & {} [-y(\tau -1), -y(\tau -2), \ldots ,-y(\tau -n), s(-q(\tau ))q(\tau ),s(-q(\tau -1))q(\tau -1),\ldots ,\nonumber \\&\ s(-q(\tau -n+1))q(\tau -n+1),s(q(\tau ))q(\tau ),s(q(\tau -1))q(\tau -1),\ldots ,\nonumber \\&\ s(q(\tau -n+1))q(\tau -n+1)]^{\mathrm{T}}\in {\mathbb R}^{3n}, \end{aligned}$$
(5)
$$\begin{aligned} \varvec{{\xi }}= & {} [a_1, a_2, \ldots , a_n, b_0m_1, b_1m_1,\ldots ,b_{n-1}m_1,b_0m_2,b_1m_2,\ldots ,b_{n-1}m_2]^{\mathrm{T}}\in {\mathbb R}^{3n}. \end{aligned}$$
(6)

Then, the nonlinear model can be simplified as a regression model:

$$\begin{aligned} y(\tau )=\varvec{{\chi }}^{\mathrm{T}}(\tau )\varvec{{\xi }}+v(\tau ). \end{aligned}$$
(7)

The algorithms proposed in this paper are based on this identification model. Many identification methods are derived from the identification models of dynamical systems [29, 32, 33]; they can be used to estimate the parameters of bilinear systems [23, 38, 47, 48] and can be applied to fields such as chemical process control. From Eq. (7), it can be seen that the parameters can be estimated by all the traditional identification algorithms, though at the cost of heavy computational demands [17].
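To make the construction of the regression model (7) concrete, the following Python sketch forms the information vector \(\varvec{{\chi }}(\tau )\) of (5) from recorded input–output data. It is a minimal illustration assuming zero initial conditions for \(\tau \leqslant 0\); the function name build_chi and the array layout are ours, not part of the original formulation.

```python
import numpy as np

def s(q):
    """Switching function: s(q) = 0 for q > 0 and s(q) = 1 for q <= 0."""
    return 0.0 if q > 0 else 1.0

def build_chi(y, q, tau, n):
    """Form the information vector chi(tau) in R^{3n} as in Eq. (5).

    y, q are sequences indexed from 0; values at negative times are zero.
    """
    def at(seq, t):
        return seq[t] if t >= 0 else 0.0   # zero initial conditions
    past_y = [-at(y, tau - i) for i in range(1, n + 1)]
    pos_part = [s(-at(q, tau - i)) * at(q, tau - i) for i in range(n)]  # m_1 block
    neg_part = [s(at(q, tau - i)) * at(q, tau - i) for i in range(n)]   # m_2 block
    return np.array(past_y + pos_part + neg_part)
```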

Remark 1

In this paper, \(b_0\) is assumed to be equal to 1; otherwise, \(b_i\) cannot be separated from \(b_im_k\). The parameter estimates then take the form

$$\begin{aligned} \hat{\varvec{{\xi }}}=[\hat{a}_1, \hat{a}_2, \ldots , \hat{a}_n, \hat{m}_1, \hat{b}_1\hat{m}_1,\ldots ,\hat{b}_{n-1}\hat{m}_1,\hat{m}_2,\hat{b}_1\hat{m}_2,\ldots ,\hat{b}_{n-1}\hat{m}_2]^{\mathrm{T}}. \end{aligned}$$

Once the parameter estimates have been obtained, we first extract \(\hat{m}_1\) and \(\hat{m}_2\); then, based on \(\hat{m}_1\) and \(\hat{m}_2\), we obtain \(\hat{b}_i=\frac{\hat{b}_i\hat{m}_k}{\hat{m}_k}\), \(i=1,\ldots ,n-1\), \(k=1,2\).
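As a worked illustration of Remark 1, the following sketch recovers \(\hat{a}_i\), \(\hat{m}_k\) and \(\hat{b}_i\) from an estimated \(\hat{\varvec{{\xi }}}\). Averaging the two quotients obtained for \(k=1,2\) is our own optional refinement, not prescribed here.

```python
import numpy as np

def recover_parameters(xi_hat, n):
    """Split xi_hat (layout of Eq. (6), with b0 = 1) into a_i, m_1, m_2, b_i."""
    a_hat = xi_hat[:n]
    m1_hat = xi_hat[n]          # the entry b0*m1 equals m1 since b0 = 1
    m2_hat = xi_hat[2 * n]      # the entry b0*m2 equals m2
    # b_i = (b_i m_k) / m_k; averaging over k = 1, 2 reduces noise slightly.
    b_hat = 0.5 * (xi_hat[n + 1:2 * n] / m1_hat + xi_hat[2 * n + 1:] / m2_hat)
    return a_hat, m1_hat, m2_hat, b_hat
```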

3 Some Stochastic Gradient Algorithms

The SG algorithm can be implemented online, updating the parameters according to the latest input–output data [11]. Therefore, it has a low computational cost. However, it also has a slow convergence rate. In this section, some modified SG algorithms are investigated.

3.1 The Traditional Stochastic Gradient Algorithm

Define the cost function

$$\begin{aligned} J(\varvec{{\xi }})=[y(\tau )-\varvec{{\chi }}^{\mathrm{T}}(\tau )\varvec{{\xi }}]^2. \end{aligned}$$

Assume that the parameter estimate at time \(\tau -1\) is \(\hat{\varvec{{\xi }}}(\tau -1)\); the key of the SG algorithm is to get a better estimate \(\hat{\varvec{{\xi }}}(\tau )\) which satisfies

$$\begin{aligned} J(\hat{\varvec{{\xi }}}(\tau ))=[y(\tau )-\varvec{{\chi }}^{\mathrm{T}}(\tau )\hat{\varvec{{\xi }}}(\tau )]^2 \leqslant J(\hat{\varvec{{\xi }}}(\tau -1))=[y(\tau )-\varvec{{\chi }}^{\mathrm{T}}(\tau )\hat{\varvec{{\xi }}}(\tau -1)]^2. \end{aligned}$$
(8)

\(\hat{\varvec{{\xi }}}(\tau )\) is obtained based on \(\hat{\varvec{{\xi }}}(\tau -1)\) and is written as

$$\begin{aligned} \hat{\varvec{{\xi }}}(\tau )=\hat{\varvec{{\xi }}}(\tau -1)+\lambda (\tau )\varvec{{\chi }}(\tau )[y(\tau ) -\varvec{{\chi }}^{\mathrm{T}}(\tau )\hat{\varvec{{\xi }}}(\tau -1)]. \end{aligned}$$
(9)

To ensure that (8) holds, substituting (9) into (8) gives

$$\begin{aligned} J(\hat{\varvec{{\xi }}}(\tau ))=\big \{y(\tau )-\varvec{{\chi }}^{\mathrm{T}}(\tau )(\hat{\varvec{{\xi }}}(\tau -1) +\lambda (\tau )\varvec{{\chi }}(\tau )[y(\tau )-\varvec{{\chi }}^{\mathrm{T}}(\tau )\hat{\varvec{{\xi }}}(\tau -1)])\big \}^2. \end{aligned}$$

Keeping \(\hat{\varvec{{\xi }}}(\tau -1)\) fixed, define \(J(\lambda (\tau ))\) as

$$\begin{aligned} J(\lambda (\tau ))=\big \{y(\tau )-\varvec{{\chi }}^{\mathrm{T}}(\tau )(\hat{\varvec{{\xi }}}(\tau -1)+\lambda (\tau )\varvec{{\chi }}(\tau )[y(\tau ) -\varvec{{\chi }}^{\mathrm{T}}(\tau )\hat{\varvec{{\xi }}}(\tau -1)])\big \}^2. \end{aligned}$$

Let

$$\begin{aligned} \frac{\partial J(\lambda (\tau ))}{\partial \lambda (\tau )}=\frac{\partial \big [y(\tau )-\varvec{{\chi }}^{\mathrm{T}}(\tau )(\hat{\varvec{{\xi }}}(\tau -1) +\lambda (\tau )\varvec{{\chi }}(\tau )[y(\tau )-\varvec{{\chi }}^{\mathrm{T}}(\tau )\hat{\varvec{{\xi }}}(\tau -1)])\big ]^2}{\partial \lambda (\tau )}. \end{aligned}$$

Since \(J(\lambda (\tau ))=[1-\lambda (\tau )\varvec{{\chi }}^{\mathrm{T}}(\tau )\varvec{{\chi }}(\tau )]^2[y(\tau )-\varvec{{\chi }}^{\mathrm{T}}(\tau )\hat{\varvec{{\xi }}}(\tau -1)]^2\), setting the above derivative equal to zero gives

$$\begin{aligned} \lambda (\tau )=\frac{1}{\varvec{{\chi }}^{\mathrm{T}}(\tau )\varvec{{\chi }}(\tau )}. \end{aligned}$$

Then, we can get the steepest descent algorithm

$$\begin{aligned} \hat{\varvec{{\xi }}}(\tau )= & {} \hat{\varvec{{\xi }}}(\tau -1)+\frac{\varvec{{\chi }}(\tau )}{\varvec{{\chi }}^{\mathrm{T}}(\tau )\varvec{{\chi }}(\tau )}\big (y(\tau )-{\varvec{{\chi }}}^{\mathrm{T}}(\tau )\hat{\varvec{{\xi }}}(\tau -1)\big ). \end{aligned}$$
(10)

However, when \(\varvec{{\chi }}^{\mathrm{T}}(\tau )\varvec{{\chi }}(\tau )\) is small, the correction term \(\frac{\varvec{{\chi }}(\tau )}{\varvec{{\chi }}^{\mathrm{T}}(\tau ) \varvec{{\chi }}(\tau )}\big (y(\tau )-{\varvec{{\chi }}}^{\mathrm{T}}(\tau )\hat{\varvec{{\xi }}}(\tau -1)\big )\) would be large, which may cause the steepest descent algorithm to diverge. With this in mind, we define

$$\begin{aligned} \lambda (\tau )=\rho +\varvec{{\chi }}^{\mathrm{T}}(\tau )\varvec{{\chi }}(\tau ),\quad \rho \geqslant 0. \end{aligned}$$

Then, we get the projection algorithm

$$\begin{aligned} \hat{\varvec{{\xi }}}(\tau )= & {} \hat{\varvec{{\xi }}}(\tau -1)+\frac{\varvec{{\chi }}(\tau )}{\rho +\varvec{{\chi }}^{\mathrm{T}}(\tau )\varvec{{\chi }}(\tau )}\big (y(\tau )-{\varvec{{\chi }}}^{\mathrm{T}}(\tau )\hat{\varvec{{\xi }}}(\tau -1)\big ). \end{aligned}$$
(11)

Since \(\rho \) is a constant, the unchanged step size makes the estimates oscillate seriously when they approach the true values. In order to solve this problem, we replace \(\rho \) by \(\lambda (\tau -1)\). Then, the SG algorithm for estimating the parameter vector \(\varvec{{\xi }}\) is listed as follows,

$$\begin{aligned} \hat{\varvec{{\xi }}}(\tau )= & {} \hat{\varvec{{\xi }}}(\tau -1)+\frac{\varvec{{\chi }}(\tau )}{\lambda (\tau )}\big (y(\tau )-{\varvec{{\chi }}}^{\mathrm{T}}(\tau )\hat{\varvec{{\xi }}}(\tau -1)\big ), \end{aligned}$$
(12)
$$\begin{aligned} {\varvec{{\chi }}}(\tau )= & {} [-y(\tau -1), -y(\tau -2), \ldots ,-y(\tau -n), s(-q(\tau ))q(\tau ),s(-q(\tau -1))q(\tau -1),\ldots ,\nonumber \\&\ s(-q(\tau -n+1))q(\tau -n+1),s(q(\tau ))q(\tau ),s(q(\tau -1))q(\tau -1),\ldots ,\nonumber \\&\ s(q(\tau -n+1))q(\tau -n+1)]^{\mathrm{T}}. \end{aligned}$$
(13)
$$\begin{aligned} \lambda (\tau )= & {} \lambda (\tau -1)+\Vert {\varvec{{\chi }}}(\tau )\Vert ^{2},\quad \lambda (0)=1. \end{aligned}$$
(14)
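A minimal sketch of the SG recursion (12)–(14) is given below. It reuses build_chi from the sketch in Sect. 2; the zero initial estimate and the return of the whole estimate history are our illustrative choices.

```python
import numpy as np

def sg_estimate(y, q, n, T):
    """Traditional SG algorithm (12)-(14) for the model y = chi^T xi + v."""
    xi_hat = np.zeros(3 * n)      # illustrative initial estimate
    lam = 1.0                     # lambda(0) = 1
    history = []
    for tau in range(T):
        chi = build_chi(y, q, tau, n)          # information vector, Eq. (13)
        lam = lam + chi @ chi                  # step-size recursion, Eq. (14)
        innovation = y[tau] - chi @ xi_hat
        xi_hat = xi_hat + (chi / lam) * innovation   # update, Eq. (12)
        history.append(xi_hat.copy())
    return np.array(history)
```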

Remark 2

Although the traditional SG algorithm has a low computational cost, it also brings some challenging issues, e.g., a slow convergence rate, especially for systems with a large number of unknown parameters.

3.2 Two Modified Stochastic Gradient Algorithms

In order to increase the convergence rate, two modified SG algorithms for the Hammerstein system are developed in this subsection. A forgetting factor SG (FF-SG) algorithm is introduced first,

$$\begin{aligned} \hat{\varvec{{\xi }}}(\tau )= & {} \hat{\varvec{{\xi }}}(\tau -1)+\frac{\varvec{{\chi }}(\tau )}{\lambda (\tau )}\big (y(\tau )-{\varvec{{\chi }}}^{\mathrm{T}}(\tau )\hat{\varvec{{\xi }}}(\tau -1)\big ), \end{aligned}$$
(15)
$$\begin{aligned} {\varvec{{\chi }}}(\tau )= & {} [-y(\tau -1), -y(\tau -2), \ldots ,-y(\tau -n), s(-q(\tau ))q(\tau ),s(-q(\tau -1))q(\tau -1),\ldots ,\nonumber \\&\ s(-q(\tau -n+1))q(\tau -n+1),s(q(\tau ))q(\tau ),s(q(\tau -1))q(\tau -1),\ldots ,\nonumber \\&\ s(q(\tau -n+1))q(\tau -n+1)]^{\mathrm{T}}. \end{aligned}$$
(16)
$$\begin{aligned} \lambda (\tau )= & {} r \lambda (\tau -1)+\Vert {\varvec{{\chi }}}(\tau )\Vert ^{2},\ \lambda (0)=1, \ 0.8 \leqslant r \leqslant 1. \end{aligned}$$
(17)
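Relative to the SG sketch above, only the step-size recursion changes. A minimal FF-SG sketch, again reusing build_chi and with an illustrative choice r = 0.95 inside the stated interval, is as follows.

```python
import numpy as np

def ffsg_estimate(y, q, n, T, r=0.95):
    """FF-SG algorithm (15)-(17): SG with a forgetting factor 0.8 <= r <= 1."""
    xi_hat = np.zeros(3 * n)
    lam = 1.0
    history = []
    for tau in range(T):
        chi = build_chi(y, q, tau, n)
        lam = r * lam + chi @ chi   # Eq. (17): r < 1 keeps lambda smaller,
                                    # hence a larger step size than plain SG
        xi_hat = xi_hat + (chi / lam) * (y[tau] - chi @ xi_hat)
        history.append(xi_hat.copy())
    return np.array(history)
```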

Remark 3

The FF-SG algorithm introduces a forgetting factor r into the step-size recursion [15, 43, 46], which keeps \(\lambda (\tau )\) smaller and thus enlarges the step size at each sampling instant. Therefore, the FF-SG algorithm converges faster than the traditional SG algorithm.

Remark 4

Although the FF-SG algorithm can increase the convergence rate, it brings some challenges, such as large estimation error variances.

To make the variance of the estimation error smaller, another modified SG algorithm is studied in the following, termed the Aitken-based SG (A-SG) algorithm. Assume that the parameter estimate \(\hat{\varvec{{\xi }}}(\tau )\) converges to the true value \(\varvec{{\xi }}\), which means that

$$\begin{aligned} \lim _{\tau \rightarrow \infty }[\hat{\varvec{{\xi }}}(\tau )-{\varvec{{\xi }}}] =\lim _{\tau \rightarrow \infty }[\hat{\varvec{{\xi }}}(\tau -1)-{\varvec{{\xi }}}]. \end{aligned}$$
(18)

It is equivalent to

$$\begin{aligned} \lim _{\tau \rightarrow \infty }\frac{\hat{\xi }_{\varrho }(\tau )- {\xi }_{\varrho }}{\hat{\xi }_{\varrho }(\tau -1)- {\xi }_{\varrho }}={1}, \end{aligned}$$
(19)

where \({\xi }_{\varrho }\) is the \(\varrho \)th element of the parameter vector \({\varvec{{\xi }}}\), \(\varrho =1,2,\ldots ,3n\). When \(\tau \) is large enough, (19) can be written approximately as

$$\begin{aligned} \frac{\hat{\xi }_{\varrho }(\tau )-{\xi }_{\varrho }}{\hat{\xi }_{\varrho }(\tau -1)-{\xi }_{\varrho }} =\frac{\hat{\xi }_{\varrho }(\tau -1)-{\xi }_{\varrho }}{\hat{\xi }_{\varrho }(\tau -2)-{\xi }_{\varrho }}. \end{aligned}$$
(20)

From (19) and (20), it follows that

$$\begin{aligned} {[}\hat{\xi }_{\varrho }(\tau )+\hat{\xi }_{\varrho }(\tau -2)-2\hat{\xi }_{\varrho }(\tau -1)]{\xi }_{\varrho } = \hat{\xi }_{\varrho }(\tau )\hat{\xi }_{\varrho }(\tau -2)-\hat{\xi }^2_{\varrho }(\tau -1). \end{aligned}$$
(21)

Then, the Aitken accelerated iteration formula for \({\xi }_{\varrho }\) can be written as

$$\begin{aligned} {\xi }_{\varrho }=\hat{\xi }_{\varrho }(\tau -2) -\frac{\big (\hat{\xi }_{\varrho }(\tau -1)-\hat{\xi }_{\varrho }(\tau -2)\big )^2}{\hat{\xi }_{\varrho }(\tau ) +\hat{\xi }_{\varrho }(\tau -2)-2\hat{\xi }_{\varrho }(\tau -1)}. \end{aligned}$$
(22)
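To see why (22) accelerates a linearly convergent sequence, consider the following self-contained sketch. The sequence \(x_k=0.5+0.9^k\), with limit 0.5, is an illustrative example of ours, and the small-denominator guard is our own safeguard.

```python
def aitken(x0, x1, x2):
    """Aitken extrapolation (22) from three consecutive iterates."""
    denom = x2 + x0 - 2.0 * x1
    if abs(denom) < 1e-12:        # sequence has (numerically) converged already
        return x2
    return x0 - (x1 - x0) ** 2 / denom

xs = [0.5 + 0.9 ** k for k in range(12)]
print(xs[-1])                          # raw iterate: 0.8138..., error about 0.31
print(aitken(xs[-3], xs[-2], xs[-1]))  # extrapolated value: 0.5, exact here,
                                       # since the error decays geometrically
```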

However, the parameter vector \(\varvec{{\xi }}\) cannot be computed directly from Eq. (22), because (22) applies to a scalar rather than a vector. In order to apply it componentwise, the parameter vector \(\varvec{{\xi }}\) is rewritten as

$$\begin{aligned} \varvec{{\xi }}=[a_1, a_2, \ldots , a_n, m_1, b_1m_1,\ldots ,b_{n-1}m_1,m_2,b_1m_2,\ldots ,b_{n-1}m_2]^{\mathrm{T}}. \end{aligned}$$

Then, Eq. (21) is equivalent to the following 3n equations,

$$\begin{aligned}&[\hat{a}_{i}(\tau )+\hat{a}_{i}(\tau -2)-2\hat{a}_{i}(\tau -1)]{a}_{i}= \hat{a}_{i}(\tau )\hat{a}_{i}(\tau -2)-\hat{a}^2_{i}(\tau -1), \quad i=1,\ldots ,n,\\&[\hat{m}_k(\tau )+\hat{m}_k(\tau -2)-2\hat{m}_k(\tau -1)]m_k= \hat{m}_k(\tau )\hat{m}_k(\tau -2)-\hat{m}^2_k(\tau -1), \quad k=1,2, \\&[\hat{b}_{j}(\tau )\hat{m}_k(\tau )+\hat{b}_{j}(\tau -2)\hat{m}_k(\tau -2)-2\hat{b}_{j}(\tau -1)\hat{m}_k(\tau -1)]b_j m_k\\&\quad =\hat{b}_{j}(\tau )\hat{m}_k(\tau )\hat{b}_{j}(\tau -2)\hat{m}_k(\tau -2)-\hat{b}^2_{j}(\tau -1)\hat{m}^2_k(\tau -1), \quad j=1,\ldots ,n-1. \end{aligned}$$

Then, we have

$$\begin{aligned} {a}_{i}= & {} \hat{a}_{i}(\tau -2)-\frac{(\hat{a}_{i}(\tau -1)-\hat{a}_{i}(\tau -2))^2}{\hat{a}_{i}(\tau )+\hat{a}_{i}(\tau -2)-2\hat{a}_{i}(\tau -1)},\\ {m}_{k}= & {} \hat{m}_{k}(\tau -2)-\frac{(\hat{m}_{k}(\tau -1)-\hat{m}_{k}(\tau -2))^2}{\hat{m}_{k}(\tau )+\hat{m}_{k}(\tau -2)-2\hat{m}_{k}(\tau -1)},\\ b_j m_k= & {} \hat{b}_{j}(\tau -2)\hat{m}_k(\tau -2)\\&-\frac{(\hat{b}_{j}(\tau -1)\hat{m}_k(\tau -1)-\hat{b}_{j}(\tau -2)\hat{m}_k(\tau -2))^2}{\hat{b}_{j}(\tau )\hat{m}_k(\tau )+\hat{b}_{j}(\tau -2)\hat{m}_k(\tau -2)-2\hat{b}_{j}(\tau -1)\hat{m}_k(\tau -1)}. \end{aligned}$$

Define

$$\begin{aligned} \bar{a}_{i}(\tau )= & {} \hat{a}_{i}(\tau -2)-\frac{(\hat{a}_{i}(\tau -1)- \hat{a}_{i}(\tau -2))^2}{\hat{a}_{i}(\tau )+\hat{a}_{i}(\tau -2)-2\hat{a}_{i}(\tau -1)},\\ \bar{m}_{k}(\tau )= & {} \hat{m}_{k}(\tau -2)-\frac{(\hat{m}_{k}(\tau -1) -\hat{m}_{k}(\tau -2))^2}{\hat{m}_{k}(\tau )+\hat{m}_{k}(\tau -2)-2\hat{m}_{k} (\tau -1)},\\ \bar{b}_j(\tau ) \bar{m}_k(\tau )= & {} \hat{b}_{j}(\tau -2)\hat{m}_k(\tau -2)\\&-\frac{(\hat{b}_{j}(\tau -1)\hat{m}_k(\tau -1)-\hat{b}_{j}(\tau -2) \hat{m}_k(\tau -2))^2}{\hat{b}_{j}(\tau )\hat{m}_k(\tau )+\hat{b}_{j}(\tau -2) \hat{m}_k(\tau -2)-2\hat{b}_{j}(\tau -1)\hat{m}_k(\tau -1)}. \end{aligned}$$

The Aitken-based SG algorithm is obtained as follows,

$$\begin{aligned} \bar{a}_{i}(\tau )= & {} \hat{a}_{i}(\tau -2)-\frac{(\hat{a}_{i}(\tau -1) -\hat{a}_{i}(\tau -2))^2}{\hat{a}_{i}(\tau )+\hat{a}_{i}(\tau -2) -2\hat{a}_{i}(\tau -1)}, \end{aligned}$$
(23)
$$\begin{aligned} \bar{m}_{k}(\tau )= & {} \hat{m}_{k}(\tau -2) -\frac{(\hat{m}_{k}(\tau -1)-\hat{m}_{k}(\tau -2))^2}{\hat{m}_{k}(\tau ) +\hat{m}_{k}(\tau -2)-2\hat{m}_{k}(\tau -1)}, \end{aligned}$$
(24)
$$\begin{aligned} \bar{b}_j(\tau )\bar{m}_k(\tau )= & {} \hat{b}_{j}(\tau -2)\hat{m}_k(\tau -2)\nonumber \\&-\frac{(\hat{b}_{j}(\tau -1)\hat{m}_k(\tau -1)-\hat{b}_{j}(\tau -2) \hat{m}_k(\tau -2))^2}{\hat{b}_{j}(\tau )\hat{m}_k(\tau )+\hat{b}_{j}(\tau -2)\hat{m}_k(\tau -2) -2\hat{b}_{j}(\tau -1)\hat{m}_k(\tau -1)},\nonumber \\ \end{aligned}$$
(25)
$$\begin{aligned} \hat{\varvec{{\xi }}}(\tau )= & {} \hat{\varvec{{\xi }}}(\tau -1) +\frac{\varvec{{\chi }}(\tau )}{\lambda (\tau )}e(\tau ),\ \end{aligned}$$
(26)
$$\begin{aligned} e(\tau )= & {} y(\tau )-\varvec{{\chi }}^{\mathrm{T}}(\tau )\hat{\varvec{{\xi }}}(\tau -1), \end{aligned}$$
(27)
$$\begin{aligned} {\varvec{{\chi }}}(\tau )= & {} [-y(\tau -1), -y(\tau -2), \ldots ,-y(\tau -n), s(-q(\tau ))q(\tau ),\nonumber \\&s(-q(\tau -1))q(\tau -1),\ldots ,\nonumber \\&\ s(-q(\tau -n+1))q(\tau -n+1),s(q(\tau ))q(\tau ), s(q(\tau -1))q(\tau -1),\ldots ,\nonumber \\&\ s(q(\tau -n+1))q(\tau -n+1)]^{\mathrm{T}}, \end{aligned}$$
(28)
$$\begin{aligned} \lambda (\tau )= & {} \lambda (\tau -1)+\varvec{{\chi }}^{\mathrm{T}}(\tau )\varvec{{\chi }}(\tau ). \end{aligned}$$
(29)

The steps of the A-SG algorithm are listed as follows; a minimal implementation sketch is given after the list.

  1. To initialize: let \(\tau =1\), \(\hat{\varvec{{\xi }}}(0)=\mathbf{1}_{3n}/p_0\), \(p_0=10^6\) and \(\lambda (0)=1\).

  2. Let \(y(\tau )=0\), \(q(\tau )=0\) for \(\tau \leqslant 0\), and give an error tolerance \(\varepsilon \).

  3. Collect the input–output data \(\{q(\tau ), y(\tau )\}\).

  4. Form \({\varvec{{\chi }}}(\tau )\) by (28).

  5. Compute \(e(\tau )\) and \(\lambda (\tau )\) by (27) and (29), respectively.

  6. Update the estimation vector \(\hat{\varvec{{\xi }}}(\tau )\) by (26).

  7. Compute each estimate \(\bar{a}_{i}(\tau ), i=1,\ldots , n\), \(\bar{m}_{k}(\tau ), k=1,2\) and \(\bar{b}_j(\tau )\bar{m}_k(\tau ), j=1, \ldots , n-1\) by (23)–(25), and then form \(\bar{\varvec{{\xi }}}(\tau )\).

  8. Compare \(\bar{\varvec{{\xi }}}(\tau )\) and \(\bar{\varvec{{\xi }}}(\tau -1)\): if \(\Vert \bar{\varvec{{\xi }}}(\tau )-\bar{\varvec{{\xi }}}(\tau -1)\Vert \leqslant \varepsilon \), obtain \(\bar{\varvec{{\xi }}}(\tau )\) and go to the next step; otherwise, increase \(\tau \) by 1 and go to step 3.

  9. Compute \(\bar{m}_k(\tau )\) first, and then calculate \(\bar{b}_i(\tau )=\frac{\bar{b}_i(\tau )\bar{m}_k(\tau )}{\bar{m}_k(\tau )}\).
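The following sketch assembles steps 1–8 into code. It reuses build_chi and aitken from the earlier sketches; as in (25), the products \(b_jm_k\) are extrapolated directly as single entries, and the recovery of step 9 can then be done with recover_parameters from Sect. 2. The initialization follows step 1; everything else (names, stopping details) is illustrative.

```python
import numpy as np

def asg_estimate(y, q, n, T, eps=1e-6):
    """A-SG algorithm (23)-(29), with the stopping rule of step 8."""
    xi_hat = np.full(3 * n, 1e-6)   # step 1: xi_hat(0) = 1_{3n}/p0, p0 = 10^6
    lam = 1.0                       # step 1: lambda(0) = 1
    prev2 = xi_hat.copy()           # xi_hat(tau - 2)
    prev1 = xi_hat.copy()           # xi_hat(tau - 1)
    xi_bar_old = xi_hat.copy()
    for tau in range(T):
        chi = build_chi(y, q, tau, n)            # step 4, Eq. (28)
        lam = lam + chi @ chi                    # step 5, Eq. (29)
        e = y[tau] - chi @ xi_hat                # step 5, Eq. (27)
        prev2, prev1 = prev1, xi_hat.copy()
        xi_hat = xi_hat + (chi / lam) * e        # step 6, Eq. (26)
        # step 7: componentwise Aitken extrapolation, Eqs. (23)-(25);
        # the entries holding b_j m_k are extrapolated as products directly.
        xi_bar = np.array([aitken(prev2[j], prev1[j], xi_hat[j])
                           for j in range(3 * n)])
        # step 8: stop once consecutive extrapolated estimates agree
        if tau > 2 and np.linalg.norm(xi_bar - xi_bar_old) <= eps:
            break
        xi_bar_old = xi_bar
    return xi_bar
```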

Remark 5

The A-SG algorithm utilizes three consecutive parameter estimates to obtain an improved parameter estimate, rather than using a large step size to speed up the convergence. Therefore, the A-SG algorithm achieves a quick convergence rate together with small estimation error variances.

4 The Identification of the Hammerstein Piecewise Linear System with Colored Noise

In this section, the SG algorithms are developed to identify the Hammerstein system with colored noise, whose information vector contains unmeasurable noise variables.

4.1 Problem Description and Identification Model

Consider the Hammerstein piecewise linear system with colored noise as follows,

$$\begin{aligned} y(\tau )=\frac{B(\zeta )}{A(\zeta )}\bar{x}(\tau )+\frac{D(\zeta )}{A(\zeta )}v(\tau ), \end{aligned}$$
(30)

where \(D(\zeta ):=1+d_1\zeta ^{-1}+d_2\zeta ^{-2}+\cdots +d_{n_d}\zeta ^{-n_d}\), and the definitions of \(A(\zeta )\), \(B(\zeta )\) and the piecewise linearity are the same as those in Sect. 2.

By utilizing the key term separation technique, the system can be transformed into

$$\begin{aligned} A(\zeta )y(\tau )= & {} B(\zeta )m_1s(-q(\tau ))q(\tau ) +B(\zeta )m_2s(q(\tau ))q(\tau )+D(\zeta )v(\tau ). \end{aligned}$$
(31)

Then, the system is written as

$$\begin{aligned} y(\tau )= & {} -\sum \limits _{i=1}^{n}{{a_i}y(\tau -i)}+m_1s(-q(\tau ))q(\tau ) +\sum \limits _{i=1}^{n-1}{m_1b_is(-q(\tau -i))q(\tau -i)}\nonumber \\&+m_2s(q(\tau ))q(\tau )+\sum \limits _{i=1}^{n-1}{m_2b_is(q(\tau -i))q(\tau -i)} -\sum \limits _{i=1}^{n_d}{{d_i}v(\tau -i)}+v(\tau ).\nonumber \\ \end{aligned}$$
(32)

Define the information vector \(\varvec{{\psi }}(\tau )\) and the parameter vector \(\varvec{{\vartheta }}\) as

$$\begin{aligned} \varvec{{\psi }}(\tau ):= & {} [-y(\tau -1), -y(\tau -2), \ldots ,-y(\tau -n), s(-q(\tau ))q(\tau ),\nonumber \\&\ s(-q(\tau -1))q(\tau -1),\ldots ,s(-q(\tau -n+1))q(\tau -n+1),s(q(\tau ))q(\tau ),\nonumber \\&\ s(q(\tau -1))q(\tau -1),\ldots ,s(q(\tau -n+1))q(\tau -n+1),\nonumber \\&v(\tau -1),v(\tau -2),\ldots ,v(\tau -n_d)]^{\mathrm{T}}\in {\mathbb R}^{3n+n_d}, \end{aligned}$$
(33)
$$\begin{aligned} \varvec{{\vartheta }}:= & {} [a_1, a_2, \ldots , a_n, m_1, b_1m_1,\ldots ,b_{n-1}m_1,m_2,b_1m_2,\ldots ,b_{n-1}m_2,\nonumber \\&d_1, d_2, \ldots , d_{n_d}]^{\mathrm{T}}\in {\mathbb R}^{3n+n_d}. \end{aligned}$$
(34)

Then, the nonlinear system can be expressed as a simple form,

$$\begin{aligned} y(\tau )=\varvec{{\psi }}^{\mathrm{T}}(\tau )\varvec{{\vartheta }}+v(\tau ). \end{aligned}$$
(35)

4.2 The Aitken Stochastic Gradient Algorithm

Since the information vector of the Hammerstein piecewise linear system with colored noise contains the unmeasured noise variables \(v(\tau -i)\), we denote by \(\hat{v}(\tau )\) and \(\hat{\varvec{{\psi }}}(\tau )\) the estimates of \(v(\tau )\) and \(\varvec{{\psi }}(\tau )\) at time \(\tau \), respectively. Let \(\hat{\varvec{{\vartheta }}}(\tau )\) be the estimate of \(\varvec{{\vartheta }}\) at time \(\tau \) and define the innovation \(e(\tau )\) at time \(\tau \) as follows,

$$\begin{aligned} e(\tau ):=y(\tau )-{\hat{\varvec{{\psi }}}}^{\mathrm{T}}(\tau ){\hat{\varvec{{\vartheta }}}(\tau -1)}, \end{aligned}$$
(36)

where

$$\begin{aligned} \hat{\varvec{{\psi }}}(\tau )= & {} [-y(\tau -1), -y(\tau -2), \ldots , -y(\tau -n), s(-q(\tau ))q(\tau ),\nonumber \\&s(-q(\tau -1))q(\tau -1),\ldots , s(-q(\tau -n+1))q(\tau -n+1),s(q(\tau ))q(\tau ),\nonumber \\&s(q(\tau -1))q(\tau -1),\ldots , s(q(\tau -n+1))q(\tau -n+1),\nonumber \\&e(\tau -1),e(\tau -2),\ldots ,e(\tau -n_d)]^{\mathrm{T}}\in {\mathbb R}^{3n+n_d}. \end{aligned}$$
(37)

Remark 6

Since the information vector \(\varvec{{\psi }}(\tau )\) contains the unmeasurable variables \(v(\tau -i)\), their estimates \(e(\tau -i)\) can be used to replace these unknown noise variables \(v(\tau -i)\) in the information vector.

By using the Aitken accelerated iteration technique, the Aitken SG (A-SG) algorithm for the Hammerstein system with colored noise is developed as follows,

$$\begin{aligned} \bar{a}_{i}(\tau )= & {} \hat{a}_{i}(\tau -2)-\frac{(\hat{a}_{i}(\tau -1) -\hat{a}_{i}(\tau -2))^2}{\hat{a}_{i}(\tau )+\hat{a}_{i}(\tau -2) -2\hat{a}_{i}(\tau -1)}, \end{aligned}$$
(38)
$$\begin{aligned} \bar{m}_{k}(\tau )= & {} \hat{m}_{k}(\tau -2)-\frac{(\hat{m}_{k}(\tau -1) -\hat{m}_{k}(\tau -2))^2}{\hat{m}_{k}(\tau )+\hat{m}_{k}(\tau -2) -2\hat{m}_{k}(\tau -1)}, \end{aligned}$$
(39)
$$\begin{aligned} \bar{b}_j(\tau )\bar{m}_k(\tau )= & {} \hat{b}_{j}(\tau -2)\hat{m}_k(\tau -2)\nonumber \\&-\frac{(\hat{b}_{j}(\tau -1)\hat{m}_k(\tau -1) -\hat{b}_{j}(\tau -2)\hat{m}_k(\tau -2))^2}{\hat{b}_{j}(\tau )\hat{m}_k(\tau )+\hat{b}_{j}(\tau -2)\hat{m}_k(\tau -2) -2\hat{b}_{j}(\tau -1)\hat{m}_k(\tau -1)}, \end{aligned}$$
(40)
$$\begin{aligned} \bar{d}_{i}(\tau )= & {} \hat{d}_{i}(\tau -2)-\frac{(\hat{d}_{i}(\tau -1) -\hat{d}_{i}(\tau -2))^2}{\hat{d}_{i}(\tau )+\hat{d}_{i}(\tau -2) -2\hat{d}_{i}(\tau -1)}, \end{aligned}$$
(41)
$$\begin{aligned} \bar{\varvec{{\vartheta }}}(\tau )= & {} [\bar{a}_1(\tau ), \bar{a}_2(\tau ), \ldots , \bar{a}_n(\tau ), \bar{m}_1(\tau ), \bar{b}_1(\tau )\bar{m}_1(\tau ),\ldots , \bar{b}_{n-1}(\tau )\bar{m}_1(\tau ),\nonumber \\&\bar{m}_2(\tau ),\bar{b}_1(\tau )\bar{m}_2(\tau ),\ldots , \bar{b}_{n-1}(\tau )\bar{m}_2(\tau ),\bar{d}_1(\tau ),\bar{d}_2(\tau ),\ldots ,\bar{d}_{n_d}(\tau )]^{\mathrm{T}}, \end{aligned}$$
(42)
$$\begin{aligned} \hat{\varvec{{\vartheta }}}(\tau )= & {} \hat{\varvec{{\vartheta }}}(\tau -1) +\frac{\hat{\varvec{{\psi }}}(\tau )}{\lambda (\tau )}e(\tau ), \end{aligned}$$
(43)
$$\begin{aligned} e(\tau )= & {} y(\tau )-\hat{\varvec{{\psi }}}^{\mathrm{T}}(\tau )\hat{\varvec{{\vartheta }}}(\tau -1), \end{aligned}$$
(44)
$$\begin{aligned} \hat{\varvec{{\psi }}}(\tau )= & {} [-y(\tau -1), -y(\tau -2), \ldots ,-y(\tau -n), s(-q(\tau ))q(\tau ),\nonumber \\&\ s(-q(\tau -1))q(\tau -1),\ldots , s(-q(\tau -n+1))q(\tau -n+1),\nonumber \\&\ s(q(\tau ))q(\tau ),s(q(\tau -1))q(\tau -1), \ldots ,\nonumber \\&s(q(\tau -n+1))q(\tau -n+1),e(\tau -1),e(\tau -2),\ldots ,e(\tau -n_d)]^{\mathrm{T}}, \end{aligned}$$
(45)
$$\begin{aligned} \lambda (\tau )= & {} \lambda (\tau -1)+\hat{\varvec{{\psi }}}^{\mathrm{T}}(\tau )\hat{\varvec{{\psi }}}(\tau ). \end{aligned}$$
(46)
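As a sketch of how the unmeasured noise terms are handled in (44)–(45), the following function forms \(\hat{\varvec{{\psi }}}(\tau )\) from the measured data and the stored innovations \(e(\tau -i)\). Here e_hist is assumed to be a list appended with \(e(\tau )\) after each update, and zero initial conditions are used for \(\tau \leqslant 0\); the function name and layout are ours.

```python
import numpy as np

def build_psi_hat(y, q, e_hist, tau, n, nd):
    """Form psi_hat(tau) in R^{3n+nd} as in Eq. (45)."""
    def at(seq, t):
        return seq[t] if t >= 0 else 0.0   # zero initial conditions
    s = lambda x: 0.0 if x > 0 else 1.0    # switching function
    past_y = [-at(y, tau - i) for i in range(1, n + 1)]
    pos_part = [s(-at(q, tau - i)) * at(q, tau - i) for i in range(n)]
    neg_part = [s(at(q, tau - i)) * at(q, tau - i) for i in range(n)]
    past_e = [at(e_hist, tau - i) for i in range(1, nd + 1)]  # replaces v(tau - i)
    return np.array(past_y + pos_part + neg_part + past_e)
```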

The flowchart of the A-SG algorithm is presented in Fig. 2. The methods proposed in this paper can be combined with other identification methods [40, 41] to study the parameter estimation problems of different systems with colored noise, such as nonlinear systems [21, 22], and can be applied to other areas such as signal modeling and networked communication systems.

Fig. 2 The flowchart of the A-SG algorithm for computing \(\bar{\varvec{{\vartheta }}}(\tau )\)

5 Numerical Examples

Example 1

Consider the following Hammerstein model,

$$\begin{aligned} A(\zeta )y(\tau )= & {} B(\zeta )f(q(\tau ))+v(\tau ),\\ y(\tau )= & {} -a_1y(\tau -1)-a_2y(\tau -2)+f(q(\tau ))+b_1f(q(\tau -1))+v(\tau )\\= & {} -0.15y(\tau -1)-0.46y(\tau -2)+f(q(\tau ))+0.9f(q(\tau -1))+v(\tau ),\\ f(q(\tau ))= & {} \bigg \{\begin{array}{ll} 0.3q(\tau ), &{} \quad q(\tau )\geqslant 0,\\ 0.2q(\tau ), &{}\quad q(\tau )<0,\\ \end{array} \\ \varvec{{\xi }}= & {} [a_1,a_2,m_1,b_1m_1,m_2,b_1m_2]^{\mathrm{T}}=[0.15,0.46,0.3,0.27,0.2,0.18]^{\mathrm{T}},\\ \varvec{{\chi }}(\tau )= & {} [-y(\tau -1),-y(\tau -2),s(-q(\tau ))q(\tau ),s(-q(\tau -1))q(\tau -1),s(q(\tau ))q(\tau ),\\&s(q(\tau -1))q(\tau -1)]^{\mathrm{T}}, \end{aligned}$$

where \(\{v(\tau )\}\) is taken as a white noise sequence with zero mean and variance \(\sigma ^2=0.10^2\), and \(\{q(\tau )\}\) is an input sequence with zero mean and unit variance.
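A sketch of generating the data of Example 1 under the stated assumptions is given below; the horizon T and the random seed are illustrative choices of ours.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 3000
q = rng.standard_normal(T)                 # input: zero mean, unit variance
v = 0.10 * rng.standard_normal(T)          # white noise, sigma = 0.10
f = np.where(q >= 0, 0.3 * q, 0.2 * q)     # piecewise linearity of Example 1

y = np.zeros(T)
for tau in range(T):
    y1 = y[tau - 1] if tau >= 1 else 0.0   # zero initial conditions
    y2 = y[tau - 2] if tau >= 2 else 0.0
    f1 = f[tau - 1] if tau >= 1 else 0.0
    y[tau] = -0.15 * y1 - 0.46 * y2 + f[tau] + 0.9 * f1 + v[tau]

# q and y can now be fed to the SG, FF-SG and A-SG sketches with n = 2.
```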

The SG, the FF-SG and the A-SG algorithms are applied to estimate the parameters of the piecewise linear system. The estimation errors \(\delta :=\Vert \hat{\varvec{{\xi }}}-\varvec{{\xi }}\Vert /\Vert \varvec{{\xi }}\Vert \) or \(\delta :=\Vert \bar{\varvec{{\xi }}}-\varvec{{\xi }}\Vert /\Vert \varvec{{\xi }}\Vert \) versus \(\tau \) are shown in Fig. 3 and Tables 1, 2 and 3. The means and variances of the estimation errors of these three algorithms are given in Table 4.

Fig. 3 The SG, FF-SG and A-SG estimation errors \(\delta \) versus \(\tau \) of Example 1

Example 2

Consider the following Hammerstein model with colored noise,

$$\begin{aligned} y(\tau )= & {} \frac{B(\zeta )}{A(\zeta )}\bar{x}(\tau ) +\frac{D(\zeta )}{A(\zeta )}v(\tau ),\\ y(\tau )= & {} -a_1y(\tau -1)-a_2y(\tau -2)+f(q(\tau ))\\&+\,b_1f(q(\tau -1))+v(\tau )+d_1v(\tau -1)\\= & {} -0.21y(\tau -1)-0.10y(\tau -2)+f(q(\tau ))\\&+\,0.5f(q(\tau -1))+v(\tau )-0.38v(\tau -1),\\ f(q(\tau ))= & {} \bigg \{\begin{array}{ll} 2.0q(\tau ), &{} \quad q(\tau )\geqslant 0,\\ 1.4q(\tau ), &{}\quad q(\tau )<0,\\ \end{array}\\ \varvec{{\vartheta }}= & {} [a_1,a_2,m_1,b_1m_1,m_2,b_1m_2,d_1]^{\mathrm{T}}=[0.21,0.1,2.0,1.0,1.4,0.7,-0.38]^{\mathrm{T}},\\ \varvec{{\psi }}(\tau )= & {} [-y(\tau -1),-y(\tau -2),s(-q(\tau ))q(\tau ),s(-q(\tau -1))q(\tau -1),\\&s(q(\tau ))q(\tau ),s(q(\tau -1))q(\tau -1),v(\tau -1)]^{\mathrm{T}}, \end{aligned}$$

where \(\{v(\tau )\}\) is taken as a white noise sequence with zero mean and variance \(\sigma ^2=0.10^2\), and \(\{q(\tau )\}\) is an input sequence with zero mean and unit variance.
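A sketch of generating the colored-noise data of Example 2 is given below; as before, the horizon and seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 3000
q = rng.standard_normal(T)
v = 0.10 * rng.standard_normal(T)
f = np.where(q >= 0, 2.0 * q, 1.4 * q)     # piecewise linearity of Example 2

y = np.zeros(T)
for tau in range(T):
    y1 = y[tau - 1] if tau >= 1 else 0.0   # zero initial conditions
    y2 = y[tau - 2] if tau >= 2 else 0.0
    f1 = f[tau - 1] if tau >= 1 else 0.0
    v1 = v[tau - 1] if tau >= 1 else 0.0
    y[tau] = -0.21 * y1 - 0.10 * y2 + f[tau] + 0.5 * f1 + v[tau] - 0.38 * v1

# psi_hat is then formed with build_psi_hat, appending each innovation e(tau).
```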

The SG, the FF-SG and the A-SG algorithms are applied to estimate the parameters of the piecewise linear system with colored noise, and the estimation errors \(\delta :=\Vert \hat{\varvec{{\vartheta }}}-\varvec{{\vartheta }}\Vert /\Vert \varvec{{\vartheta }}\Vert \) or \(\delta :=\Vert \bar{\varvec{{\vartheta }}}-\varvec{{\vartheta }}\Vert /\Vert \varvec{{\vartheta }}\Vert \) versus \(\tau \) are shown in Fig. 4.

Table 1 The SG algorithm estimates and errors
Table 2 The FF-SG algorithm estimates and errors
Table 3 The A-SG algorithm estimates and errors
Table 4 The means and variances of the parameter estimation errors
Fig. 4 The SG, FF-SG and A-SG estimation errors \(\delta \) versus \(\tau \) of Example 2

From these two examples, we can draw the following findings.

  1. Tables 1, 2 and 3 show that the FF-SG algorithm and the A-SG algorithm perform better than the SG algorithm.

  2. Figures 3 and 4 show that the estimation error curve of the FF-SG algorithm oscillates seriously as the errors converge to zero, while the estimation error curve of the A-SG algorithm is relatively smooth.

  3. Table 4 shows that the A-SG algorithm is the most effective of these three algorithms.

  4. The algorithms proposed in this paper can identify not only the Hammerstein system with white noise, but also the Hammerstein system with colored noise.

6 Conclusions

In this paper, some SG algorithms are proposed for Hammerstein systems with piecewise linearity. The key term separation method is used to transform the nonlinear model into a regression model. In order to accelerate the convergence rate of the SG algorithm, an FF-SG algorithm and an A-SG algorithm are studied. Compared with the FF-SG algorithm, the A-SG algorithm has almost the same estimation error mean but a smaller estimation error variance. Therefore, the A-SG algorithm has broader application prospects in system identification.

The purpose of this paper is to develop two accelerated SG algorithms for nonlinear systems. These methods can be combined with other identification algorithms, e.g., the recursive least squares algorithm and the expectation–maximization algorithm, to study the parameter estimation issues of time-delay systems, switching systems and neural network learning systems [25, 44, 45].