1 Introduction

Parameter estimation algorithms are often obtained by minimizing a criterion function. The gradient search, the least squares search and the Newton search are useful tools for solving nonlinear optimization problems [15, 23, 44–46]. Nonlinearities exist widely in industrial processes [21]. Typical nonlinear systems are block-oriented systems, including input nonlinear systems [25, 30, 38, 42], output nonlinear systems [11, 41] or Wiener nonlinear systems [9], input–output (i.e., Hammerstein–Wiener) nonlinear systems [2, 31] and feedback nonlinear systems [14]. When the static nonlinear part of a block-oriented system can be expressed as a linear combination of known basis functions, the corresponding systems are the Hammerstein systems, the Wiener systems and their combinations [16, 40]. A direct method of identifying block-oriented nonlinear systems is the over-parametrization method [3]. By re-parameterizing the nonlinear system, the output becomes linear in the unknown parameters, so that any linear identification algorithm can be applied [4]. However, the resulting identification model contains the cross-products between the parameters of the nonlinear part and those of the linear part, so more parameters must be estimated than the nonlinear system actually contains.

In the area of system identification, linear-in-parameter output error moving average systems are common; for such systems, Wang and Tang [36] presented a recursive least squares estimation algorithm and discussed several gradient-based iterative estimation algorithms using the filtering technique [37], and Wang and Zhu [39] presented a multi-innovation parameter estimation algorithm. A system that contains product terms of parameters is called a bilinear-in-parameter system. Bai and Liu [5] discussed the least squares solution of the normalized iterative method, the over-parametrization method and the numerical method for bilinear-in-parameter systems; Wang et al. [24] revisited the unweighted least squares solution and extended it to the case of colored noise; Abrahamsson et al. [1] presented a two-stage method based on the approximation of a weighting matrix and discussed applications to submarine detection. Other methods include the Kalman filtering-based identification approaches [10, 23].

The convergence of identification algorithms is a basic topic in system identification and attracts much attention. Recently, an auxiliary model-based recursive least squares algorithm and an auxiliary model-based hierarchical gradient algorithm have been proposed for dual-rate state space systems [12] and for multivariable Box–Jenkins systems using data filtering [32–34]. A modeling and multi-innovation parameter identification method has been proposed for Hammerstein nonlinear state space systems using the filtering technique [35]; a recursive parameter and state estimation algorithm has been proposed for an input nonlinear state space system using the hierarchical identification principle [29]; an auxiliary model-based gradient algorithm has been reported for time-delay systems by transforming the input–output representation into a regression model, and its convergence was studied [13]. The convergence of the hierarchical least squares algorithm has been analyzed for bilinear-in-parameter systems [26]. On the basis of the work in [26], this paper derives a hierarchical stochastic gradient (HSG) algorithm for bilinear-in-parameter systems based on the decomposition idea and analyzes its performance.

The rest of this paper is organized as follows. Section 2 presents an HSG algorithm for bilinear-in-parameter systems. Section 3 analyzes the performance of the HSG algorithm. Section 4 provides an illustrative example to show that the proposed algorithm is effective. Finally, a brief summary of the main contents is given in Sect. 5.

2 System Description and the HSG Algorithm

Consider the following bilinear-in-parameter system [5, 26],

$$\begin{aligned} y(t)=\varvec{a}^{\tiny \text{ T }}\varvec{F}(t)\varvec{b}+v(t), \end{aligned}$$
(1)

where y(t) is the system output, \(\varvec{F}(t) \in {\mathbb {R}}^{m\times n}\) is composed of available measurement data, v(t) is a white noise sequence with zero mean and finite variance \(\sigma ^2\), and \(\varvec{a}=[a_1, a_2,\ldots , a_m]^{\tiny \text{ T }}\in {\mathbb {R}}^m\) and \(\varvec{b}=[b_1, b_2, \ldots , b_n]^{\tiny \text{ T }}\in {\mathbb {R}}^n\) are the unknown parameter vectors to be estimated.

For the identification model in (1), assume that m and n are known, and that \(y(t)=0\), \(v(t)=0\) for \(t\leqslant 0\). Note that for any constant \(\lambda \ne 0\), the pair \(\lambda \varvec{a}\), \(\varvec{b}/\lambda \) gives the same input–output relationship in (1), so the constant \(\lambda \) has to be fixed. Without loss of generality, we adopt the following assumption.

Assumption 1

\(\lambda =\Vert \varvec{b}\Vert \), and the first element of \(\varvec{b}\) is positive, i.e., \(b_1>0\), where the norm of the vector \(\varvec{X}\) is defined by \(\Vert \varvec{X}\Vert ^2:=\mathrm{tr}[\varvec{X}\varvec{X}^{\tiny \text{ T }}]\).
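As an aside on Assumption 1, the scaling ambiguity can be removed from any estimate pair by a simple rescaling that leaves the model output unchanged. A minimal Python sketch is given below; the helper `normalize_pair` and its argument names are illustrative and are not part of the algorithm presented later.

```python
import numpy as np

def normalize_pair(a_hat, b_hat):
    """Rescale (a_hat, b_hat) so that ||b_hat|| = 1 and b_hat[0] > 0.

    The rescaling (lam * a_hat, b_hat / lam) leaves the product
    a_hat^T F(t) b_hat unchanged, so the model output is not affected.
    """
    lam = np.linalg.norm(b_hat)          # lambda = ||b||
    a_hat, b_hat = lam * a_hat, b_hat / lam
    if b_hat[0] < 0:                      # flip both signs so that b_1 > 0
        a_hat, b_hat = -a_hat, -b_hat
    return a_hat, b_hat
```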

Define the vectors \({\varvec{\psi }}(t):=\varvec{F}(t)\varvec{b}\in {\mathbb {R}}^m\) and \({\varvec{\varphi }}(t):=\varvec{F}^{\tiny \text{ T }}(t)\varvec{a}\in {\mathbb {R}}^n\). Then Eq. (1) can be written as

$$\begin{aligned} y(t)={\varvec{\psi }}^{\tiny \text{ T }}(t)\varvec{a}+v(t), \end{aligned}$$
(2)

or

$$\begin{aligned} y(t)={\varvec{\varphi }}^{\tiny \text{ T }}(t)\varvec{b}+v(t). \end{aligned}$$
(3)

Define the following two cost functions:

$$\begin{aligned} J_1(\varvec{a}):= & {} \Vert y(t)-{\varvec{\psi }}^{\tiny \text{ T }}(t)\varvec{a}\Vert ^2, \\ J_2(\varvec{b}):= & {} \Vert y(t)-{\varvec{\varphi }}^{\tiny \text{ T }}(t)\varvec{b}\Vert ^2. \end{aligned}$$
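Since \(J_1(\varvec{a})\) and \(J_2(\varvec{b})\) are quadratic in \(\varvec{a}\) and \(\varvec{b}\), respectively, their gradients, which the negative gradient search below follows, are

$$\begin{aligned} \frac{\partial J_1(\varvec{a})}{\partial \varvec{a}}=-2{\varvec{\psi }}(t)\left[ y(t)-{\varvec{\psi }}^{\tiny \text{ T }}(t)\varvec{a}\right] ,\quad \frac{\partial J_2(\varvec{b})}{\partial \varvec{b}}=-2{\varvec{\varphi }}(t)\left[ y(t)-{\varvec{\varphi }}^{\tiny \text{ T }}(t)\varvec{b}\right] . \end{aligned}$$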

Using the negative gradient search and minimizing \(J_1(\varvec{a})\) and \(J_2(\varvec{b})\), we obtain the estimates \(\hat{\varvec{a}}(t)\) of \(\varvec{a}\) in Subsystem (2) and \(\hat{\varvec{b}}(t)\) of \(\varvec{b}\) in Subsystem (3) at time t:

$$\begin{aligned} \hat{\varvec{a}}(t)= & {} \hat{\varvec{a}}(t-1)+\frac{{\varvec{\psi }}(t)}{r_1(t)} \left[ y(t)-{\varvec{\psi }}^{\tiny \text{ T }}(t)\hat{\varvec{a}}(t-1)\right] , \end{aligned}$$
(4)
$$\begin{aligned} r_1(t)= & {} r_1(t-1)+\Vert {\varvec{\psi }}(t)\Vert ^2,\ r_1(0)=1, \end{aligned}$$
(5)
$$\begin{aligned} \hat{\varvec{b}}(t)= & {} \hat{\varvec{b}}(t-1)+\frac{{\varvec{\varphi }}(t)}{r_2(t)} \left[ y(t)-{\varvec{\varphi }}^{\tiny \text{ T }}(t)\hat{\varvec{b}}(t-1)\right] , \end{aligned}$$
(6)
$$\begin{aligned} r_2(t)= & {} r_2(t-1)+\Vert {\varvec{\varphi }}(t)\Vert ^2,\ r_2(0)=1. \end{aligned}$$
(7)

Since the vectors \({\varvec{\psi }}(t)\) and \({\varvec{\varphi }}(t)\) contain the unknown parameter vectors \(\varvec{b}\) and \(\varvec{a}\), the algorithm in (4)–(7) cannot be implemented directly. This problem can be solved by replacing \(\varvec{b}\) and \(\varvec{a}\) with their estimates \(\hat{\varvec{b}}(t-1)\) and \(\hat{\varvec{a}}(t-1)\) at time \(t-1\). Letting \(\hat{{\varvec{\psi }}}(t):=\varvec{F}(t)\hat{\varvec{b}}(t-1)\in {\mathbb {R}}^m\) and \(\hat{{\varvec{\varphi }}}(t):=\varvec{F}^{\tiny \text{ T }}(t)\hat{\varvec{a}}(t-1)\in {\mathbb {R}}^n\), we obtain the following HSG algorithm for the bilinear-in-parameter system in (1):

$$\begin{aligned} \hat{\varvec{a}}(t)= & {} \hat{\varvec{a}}(t-1)+\frac{\varvec{F}(t)\hat{\varvec{b}}(t-1)}{r_1(t)} \left[ y(t)-\hat{\varvec{a}}^{\tiny \text{ T }}(t-1)\varvec{F}(t)\hat{\varvec{b}}(t-1)\right] , \end{aligned}$$
(8)
$$\begin{aligned} r_1(t)= & {} r_1(t-1)+\Vert \varvec{F}(t)\hat{\varvec{b}}(t-1)\Vert ^2,\ r_1(0)=1, \end{aligned}$$
(9)
$$\begin{aligned} \hat{\varvec{b}}(t)= & {} \hat{\varvec{b}}(t-1)+\frac{\varvec{F}^{\tiny \text{ T }}(t)\hat{\varvec{a}}(t-1)}{r_2(t)} \left[ y(t)-\hat{\varvec{a}}^{\tiny \text{ T }}(t-1)\varvec{F}(t)\hat{\varvec{b}}(t-1)\right] , \end{aligned}$$
(10)
$$\begin{aligned} r_2(t)= & {} r_2(t-1)+\Vert \varvec{F}^{\tiny \text{ T }}(t)\hat{\varvec{a}}(t-1)\Vert ^2,\ r_2(0)=1. \end{aligned}$$
(11)

The initial values are taken to be \(\hat{\varvec{a}}(0)=\mathbf{1}_m/p_0\), \(\hat{\varvec{b}}(0)=\mathbf{1}_n/p_0\), where \(p_0\) is a large number, e.g., \(p_0=10^6\).
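A minimal Python sketch of the recursion (8)–(11) may clarify the computation order; the function name `hsg` and the data containers `F_seq`, `y_seq` are illustrative assumptions and not part of the original presentation.

```python
import numpy as np

def hsg(F_seq, y_seq, m, n, p0=1e6):
    """Hierarchical stochastic gradient (HSG) estimates of a and b, Eqs. (8)-(11).

    F_seq : sequence of (m x n) data matrices F(t), t = 1, 2, ...
    y_seq : sequence of scalar outputs y(t).
    Returns the estimates a_hat(t), b_hat(t) after the last sample.
    """
    a_hat = np.ones(m) / p0              # a_hat(0) = 1_m / p0
    b_hat = np.ones(n) / p0              # b_hat(0) = 1_n / p0
    r1 = r2 = 1.0                        # r1(0) = r2(0) = 1
    for F, y in zip(F_seq, y_seq):
        psi = F @ b_hat                  # psi_hat(t) = F(t) b_hat(t-1)
        phi = F.T @ a_hat                # phi_hat(t) = F(t)^T a_hat(t-1)
        e = y - a_hat @ F @ b_hat        # innovation, uses a_hat(t-1), b_hat(t-1)
        r1 += psi @ psi                  # Eq. (9)
        a_hat = a_hat + psi / r1 * e     # Eq. (8)
        r2 += phi @ phi                  # Eq. (11)
        b_hat = b_hat + phi / r2 * e     # Eq. (10)
    return a_hat, b_hat
```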

3 The Convergence Analysis

Lemma 1

[8] Assume that the nonnegative sequences T(t), \(\eta (t)\) and \(\zeta (t)\) satisfy the inequality

$$\begin{aligned} T(t)\leqslant T(t-1)+\eta (t)-\zeta (t) \end{aligned}$$

and \(\sum \limits _{t=1}^{\infty }\eta (t)<\infty \). Then \(\sum \limits _{t=1}^{\infty }\zeta (t)<\infty \) and T(t) is bounded.

The proof of Lemma 1 is straightforward: summing the inequality from \(s=1\) to t gives

$$\begin{aligned} T(t)+\sum _{s=1}^{t}\zeta (s)\leqslant T(0)+\sum _{s=1}^{t}\eta (s)\leqslant T(0)+\sum _{s=1}^{\infty }\eta (s)<\infty , \end{aligned}$$

and the nonnegativity of T(t) and \(\zeta (t)\) yields the boundedness of T(t) and the convergence of \(\sum \limits _{t=1}^{\infty }\zeta (t)\).

Theorem 1

For the system in (1) and the HSG algorithm in (8)–(11), assume that v(t) is a white noise sequence with zero mean and variance \(\sigma ^2\), and that there exist an integer N and two positive constants \(c_1\) and \(c_2\) such that the following persistent excitation conditions hold:

$$\begin{aligned}&({\mathrm{A1}}) \quad \sum _{j=0}^{N-1}\frac{\hat{{\varvec{\psi }}}(t+j)\hat{{\varvec{\psi }}}^{\tiny \text{ T }}(t+j)}{r_1(t+j)}\geqslant c_1\varvec{I}_m,\ \mathrm{a.s.},\\&({\mathrm{A2}}) \quad \sum _{j=0}^{N-1}\frac{\hat{{\varvec{\varphi }}}(t+j)\hat{{\varvec{\varphi }}}^{\tiny \text{ T }}(t+j)}{r_2(t+j)}\geqslant c_2\varvec{I}_n,\ \mathrm{a.s.} \end{aligned}$$

Then the parameter estimation errors converge to zero, i.e.,

$$\begin{aligned} \Vert \hat{\varvec{a}}(t)-\varvec{a}\Vert \rightarrow 0,\quad \Vert \hat{\varvec{b}}(t)-\varvec{b}\Vert \rightarrow 0. \end{aligned}$$

Proof

Define two parameter error vectors:

$$\begin{aligned} \tilde{\varvec{a}}(t):= & {} \hat{\varvec{a}}(t)-\varvec{a}\in {\mathbb {R}}^m, \end{aligned}$$
(12)
$$\begin{aligned} \tilde{\varvec{b}}(t):= & {} \hat{\varvec{b}}(t)-\varvec{b}\in {\mathbb {R}}^n. \end{aligned}$$
(13)

Substituting (1) and (8) into (12), we have

$$\begin{aligned} \tilde{\varvec{a}}(t)= & {} \tilde{\varvec{a}}(t-1)+\frac{\hat{{\varvec{\psi }}}(t)}{r_1(t)} \left[ y(t)-\hat{\varvec{a}}^{\tiny \text{ T }}(t-1)\varvec{F}(t)\hat{\varvec{b}}(t-1)\right] \nonumber \\= & {} \tilde{\varvec{a}}(t-1)+\frac{\hat{{\varvec{\psi }}}(t)}{r_1(t)} \left[ \varvec{a}^{\tiny \text{ T }}\varvec{F}(t)\varvec{b}-\hat{\varvec{a}}^{\tiny \text{ T }}(t-1)\varvec{F}(t)\hat{\varvec{b}}(t-1)+v(t)\right] \end{aligned}$$
(14)
$$\begin{aligned}= & {} \tilde{\varvec{a}}(t-1)+\frac{\hat{{\varvec{\psi }}}(t)}{r_1(t)} \left[ -\tilde{\varvec{a}}^{\tiny \text{ T }}(t-1)\varvec{F}(t)\hat{\varvec{b}}(t-1)-\varvec{a}^{\tiny \text{ T }}\varvec{F}(t)\tilde{\varvec{b}}(t-1)+v(t)\right] \nonumber \\=: & {} \tilde{\varvec{a}}(t-1)+\frac{\hat{{\varvec{\psi }}}(t)}{r_1(t)}\left[ -\tilde{y}_1(t)-\xi _1(t)+v(t) \right] ,\ \end{aligned}$$
(15)

where

$$\begin{aligned} \tilde{y}_1(t):= & {} \tilde{\varvec{a}}^{\tiny \text{ T }}(t-1)\varvec{F}(t)\hat{\varvec{b}}(t-1)\in {\mathbb {R}}, \end{aligned}$$
(16)
$$\begin{aligned} \xi _1(t):= & {} \varvec{a}^{\tiny \text{ T }}\varvec{F}(t)\tilde{\varvec{b}}(t-1)\in {\mathbb {R}}. \end{aligned}$$
(17)

Taking the squared norm of both sides of (15) and using (16) yields

$$\begin{aligned} \left\| \tilde{\varvec{a}}(t)\right\| ^2= & {} \left\| \tilde{\varvec{a}}(t-1) +\,\frac{\hat{{\varvec{\psi }}}(t)}{r_1(t)}\left[ -\tilde{y}_1(t)-\xi _1(t)+v(t) \right] \right\| ^2 \nonumber \\= & {} \left\| \tilde{\varvec{a}}(t-1)\right\| ^2+\frac{2\tilde{\varvec{a}}^{\tiny \text{ T }}(t-1) \hat{{\varvec{\psi }}}(t)}{r_1(t)} \left[ -\tilde{y}_1(t)-\xi _1(t)+v(t) \right] \nonumber \\&+\,\frac{\left\| \hat{{\varvec{\psi }}}(t)\right\| ^2}{r_1^2(t)} \left[ -\tilde{y}_1(t)-\xi _1(t)+v(t) \right] ^2 \nonumber \\= & {} \left\| \tilde{\varvec{a}}(t-1)\right\| ^2+\frac{2\tilde{y}_1(t)}{r_1(t)} \left[ -\tilde{y}_1(t)-\xi _1(t)+v(t) \right] \nonumber \\&+\,\frac{\left\| \hat{{\varvec{\psi }}}(t)\right\| ^2}{r_1^2(t)}\left[ -\tilde{y}_1(t) -\xi _1(t)+v(t) \right] ^2. \end{aligned}$$
(18)

Define \(\tilde{y}_2(t):=\hat{\varvec{a}}^{\tiny \text{ T }}(t-1)\varvec{F}(t)\tilde{\varvec{b}}(t-1)\in {\mathbb {R}}\), \(\xi _2(t):=\tilde{\varvec{a}}^{\tiny \text{ T }}(t-1)\varvec{F}(t)\varvec{b}\in {\mathbb {R}}\). Similarly, we have

$$\begin{aligned} \tilde{\varvec{b}}(t)= & {} \tilde{\varvec{b}}(t-1)+\frac{\hat{{\varvec{\varphi }}}(t)}{r_2(t)}\left[ -\tilde{y}_2(t)-\xi _2(t)+v(t) \right] , \nonumber \\ \left\| \tilde{\varvec{b}}(t)\right\| ^2= & {} \left\| \tilde{\varvec{b}}(t-1)\right\| ^2+\frac{2\tilde{y}_2(t)}{r_2(t)}\left[ -\tilde{y}_2(t)-\xi _2(t)+v(t) \right] \nonumber \\&+\,\frac{\left\| \hat{{\varvec{\varphi }}}(t)\right\| ^2}{r_2^2(t)}\left[ -\tilde{y}_2(t)-\xi _2(t)+v(t) \right] ^2. \end{aligned}$$
(19)

Let \(T(t):=\Vert \tilde{\varvec{a}}(t)\Vert ^2+\Vert \tilde{\varvec{b}}(t)\Vert ^2\). Using (18), (19), (9) and (11) gives

$$\begin{aligned} T(t)= & {} \left\| \tilde{\varvec{a}}(t-1)\right\| ^2+\frac{2\tilde{y}_1(t)}{r_1(t)}\left[ -\tilde{y}_1(t)-\xi _1(t)+v(t) \right] \\&+\frac{\left\| \hat{{\varvec{\psi }}}(t)\right\| ^2}{r_1^2(t)}\left[ \tilde{y}_1^2(t)+\xi _1^2(t)+v^2(t) +2\tilde{y}_1(t)\xi _1(t)-2\tilde{y}_1(t)v(t)-2\xi _1(t)v(t)\right] \\&+\,\left\| \tilde{\varvec{b}}(t-1)\right\| ^2+\frac{2\tilde{y}_2(t)}{r_2(t)} \left[ -\tilde{y}_2(t)-\xi _2(t)+v(t) \right] \\&+\frac{\left\| \hat{{\varvec{\varphi }}}(t)\right\| ^2}{r_2^2(t)} \left[ \tilde{y}_2^2(t)+\xi _2^2(t)+v^2(t) +\,2\tilde{y}_2(t)\xi _2(t)-2\tilde{y}_2(t)v(t)-2\xi _2(t)v(t)\right] \\= & {} T(t-1)-\left[ \frac{2}{r_1(t)}-\frac{\left\| \hat{{\varvec{\psi }}}(t)\right\| ^2}{r_1^2(t)}\right] \tilde{y}_1^2(t) \\&+\,2\left[ \frac{1}{r_1(t)}-\frac{\left\| \hat{{\varvec{\psi }}}(t)\right\| ^2}{r_1^2(t)}\right] \tilde{y}_1(t)\left[ v(t)-\xi _1(t)\right] \\&+\frac{\left\| \hat{{\varvec{\psi }}}(t)\right\| ^2}{r_1^2(t)}\left[ \xi _1^2(t)+v^2(t)-2\xi _1(t)v(t)\right] -\left[ \frac{2}{r_2(t)}-\frac{\left\| \hat{{\varvec{\varphi }}}(t)\right\| ^2}{r_2^2(t)}\right] \tilde{y}_2^2(t)\\&+2\left[ \frac{1}{r_2(t)} -\frac{\left\| \hat{{\varvec{\varphi }}}(t)\right\| ^2}{r_2^2(t)}\right] \tilde{y}_2(t) \left[ v(t)-\xi _2(t)\right] +\frac{\left\| \hat{{\varvec{\varphi }}}(t)\right\| ^2}{r_2^2(t)} \left[ \xi _2^2(t)+v^2(t)-2\xi _2(t)v(t)\right] \\= & {} T(t-1)-\left[ \frac{r_1(t)+r_1(t-1)}{r_1^2(t)}\right] \tilde{y}_1^2(t) +\frac{2r_1(t-1)}{r_1^2(t)}\tilde{y}_1(t)\left[ v(t)-\xi _1(t)\right] \\&+\frac{\left\| \hat{{\varvec{\psi }}}(t)\right\| ^2}{r_1^2(t)}\left[ \xi _1^2(t)+v^2(t)-2\xi _1(t)v(t)\right] -\left[ \frac{r_2(t)+r_2(t-1)}{r_2^2(t)}\right] \tilde{y}_2^2(t)\\&+\frac{2r_2(t-1)}{r_2^2(t)}\tilde{y}_2(t)\left[ v(t)-\xi _2(t)\right] +\frac{\left\| \hat{{\varvec{\varphi }}}(t)\right\| ^2}{r_2^2(t)}\left[ \xi _2^2(t)+v^2(t)-2\xi _2(t)v(t)\right] \\\leqslant & {} T(t-1)-\frac{1}{r_1(t)}\tilde{y}_1^2(t)+\frac{2r_1(t-1)}{r_1^2(t)}\tilde{y}_1(t)\left[ v(t)-\xi _1(t)\right] \\&+\frac{\left\| \hat{{\varvec{\psi }}}(t)\right\| ^2}{r_1^2(t)} \left[ \xi _1^2(t)+v^2(t)-2\xi _1(t)v(t)\right] -\frac{1}{r_2(t)}\tilde{y}_2^2(t)\\&+\frac{2r_2(t-1)}{r_2^2(t)}\tilde{y}_2(t)\left[ v(t)-\xi _2(t)\right] +\frac{\left\| \hat{{\varvec{\varphi }}}(t)\right\| ^2}{r_2^2(t)} \left[ \xi _2^2(t)+v^2(t)-2\xi _2(t)v(t)\right] \end{aligned}$$
$$\begin{aligned}= & {} T(t-1)-\gamma (t)-\frac{1}{r_1(t)}\tilde{y}_1^2(t) +\frac{2r_1(t-1)}{r_1^2(t)}\tilde{y}_1(t)v(t)\nonumber \\&+\frac{\left\| \hat{{\varvec{\psi }}}(t)\right\| ^2}{r_1^2(t)}\left[ \xi _1^2(t)+v^2(t)\right] \nonumber \\&-\frac{1}{r_2(t)}\tilde{y}_2^2(t)+\frac{2r_2(t-1)}{r_2^2(t)}\tilde{y}_2(t)v(t) \nonumber \\&+\frac{\left\| \hat{{\varvec{\varphi }}}(t)\right\| ^2}{r_2^2(t)} \left[ \xi _2^2(t)+v^2(t)-2\xi _2(t)v(t)\right] , \end{aligned}$$
(20)

where

$$\begin{aligned} \gamma (t):=\frac{2r_1(t-1)}{r_1^2(t)}\tilde{y}_1(t)\xi _1(t) +\frac{2r_2(t-1)}{r_2^2(t)}\tilde{y}_2(t)\xi _2(t). \end{aligned}$$

When \(\xi _1^2(t)>\varepsilon \) or \(\xi _2^2(t)>\varepsilon \) or \(\gamma (t)<0\) (\(\varepsilon \) is a given positive number), we let \(\tilde{\varvec{a}}(t):=\tilde{\varvec{a}}(t-1)\) and \(\tilde{\varvec{b}}(t):=\tilde{\varvec{b}}(t-1)\), and thus we have \(T(t)=T(t-1)\). When \(\xi _1^2(t) \leqslant \varepsilon \), \(\xi _2^2(t)\leqslant \varepsilon \) and \(\gamma (t)\geqslant 0\), since v(t) is a white noise with zero mean and variance \(\sigma ^2\), and \(\varvec{F}(t)\), \(\hat{\varvec{a}}(t-1)\), \(\hat{\varvec{b}}(t-1)\), \(r_1(t)\), \(r_2(t)\), \(\xi _1(t)\) and \(\xi _2(t)\) are independent of v(t), taking the expectation of both sides of (20), we have

$$\begin{aligned} \mathrm{E}[T(t)]\leqslant & {} \mathrm{E}[T(t-1)]-\mathrm{E}\left[ \frac{\tilde{y}_1^2(t)}{r_1(t)}+\frac{\tilde{y}_2^2(t)}{r_2(t)}\right] \nonumber \\&+\mathrm{E}\left[ \frac{\left\| \hat{{\varvec{\psi }}}(t)\right\| ^2}{r_1^2(t)}+\frac{\left\| \hat{{\varvec{\varphi }}}(t)\right\| ^2}{r_2^2(t)}\right] (\sigma ^2+\varepsilon ), \end{aligned}$$
(21)

From (9), we have

$$\begin{aligned} \sum _{t=1}^{\infty }\frac{\left\| \hat{{\varvec{\psi }}}(t)\right\| ^2}{r_1^2(t)}\leqslant & {} \sum _{t=1}^{\infty }\frac{\left\| \hat{{\varvec{\psi }}}(t)\right\| ^2}{r_1(t)r_1(t-1)} =\sum _{t=1}^{\infty }\frac{r_1(t)-r_1(t-1)}{r_1(t)r_1(t-1)}\\= & {} \sum _{t=1}^{\infty }\left[ \frac{1}{r_1(t-1)}-\frac{1}{r_1(t)}\right] =\frac{1}{r_1(0)}-\frac{1}{r_1(\infty )}<\infty ,\ \mathrm {a.s.} \end{aligned}$$

Similarly, from (11), we have

$$\begin{aligned} \sum _{t=1}^{\infty }\frac{\left\| \hat{{\varvec{\varphi }}}(t)\right\| ^2}{r_2^2(t)}<\infty ,\ \mathrm{a.s.}\end{aligned}$$

Hence, the sum of the last term on the right-hand side of (21) from \(t=1\) to \(\infty \) is finite. Applying Lemma 1 to (21), we conclude that \(\mathrm{E}[T(t)]\) converges to a constant, so there exist a constant \(C>0\) and a time \(t_0\) such that \(\mathrm{E}[T(t)]\leqslant C\) for \(t>t_0\). From (21), it follows that

$$\begin{aligned} \sum _{t=1}^{\infty }\left[ \frac{\tilde{y}_1^2(t)}{r_1(t)}+\frac{\tilde{y}_2^2(t)}{r_2(t)}\right] <\infty . \end{aligned}$$

Since \(r_1(t)>0\) and \(r_2(t)>0\), we have

$$\begin{aligned} \sum _{t=1}^{\infty }\frac{\tilde{y}_1^2(t)}{r_1(t)}<\infty , \quad \sum _{t=1}^{\infty }\frac{\tilde{y}_2^2(t)}{r_2(t)}<\infty , \quad \lim _{t\rightarrow \infty }\frac{\tilde{y}_1^2(t)}{r_1(t)}=0, \quad \lim _{t\rightarrow \infty }\frac{\tilde{y}_2^2(t)}{r_2(t)}=0. \end{aligned}$$
(22)

Define the identification innovation

$$\begin{aligned} e(t):=y(t)-\hat{\varvec{a}}^{\tiny \text{ T }}(t-1)\varvec{F}(t)\hat{\varvec{b}}(t-1)\in {\mathbb {R}}, \end{aligned}$$

From (14), we have

$$\begin{aligned} \tilde{\varvec{a}}(t)=\tilde{\varvec{a}}(t-1)+\frac{\hat{{\varvec{\psi }}}(t)}{r_1(t)}e(t). \end{aligned}$$
(23)

Replacing t in (23) with \(t+j\) and successive substitutions give

$$\begin{aligned} \tilde{\varvec{a}}(t+j) =\tilde{\varvec{a}}(t)+\sum _{i=1}^{j} \frac{\hat{{\varvec{\psi }}}(t+i)}{r_1(t+i)}e(t+i). \end{aligned}$$
(24)

Using (16), it follows that

$$\begin{aligned} \tilde{y}_1(t)= & {} \hat{{\varvec{\psi }}}^{\tiny \text{ T }}(t)\tilde{\varvec{a}}(t-1), \nonumber \\ \tilde{y}_1(t+j)= & {} \hat{{\varvec{\psi }}}^{\tiny \text{ T }}(t+j)\tilde{\varvec{a}}(t+j-1). \end{aligned}$$
(25)

Substituting (24) into (25) gives

$$\begin{aligned} \hat{{\varvec{\psi }}}^{\tiny \text{ T }}(t+j)\tilde{\varvec{a}}(t)=\tilde{y}_1(t+j)-\hat{{\varvec{\psi }}}^{\tiny \text{ T }}(t+j)\sum _{i=1}^{j-1}\frac{\hat{{\varvec{\psi }}}(t+i)}{r_1(t+i)}e(t+i), \end{aligned}$$
(26)

Squaring (26), dividing by \(r_1(t+j)\), summing for \(j\) from \(j=1\) to \(j=N-1\), and using (A1) and (24), we have

$$\begin{aligned} c_1\Vert \tilde{\varvec{a}}(t)\Vert ^2\leqslant & {} \tilde{\varvec{a}}^{\tiny \text{ T }}(t)\left[ \sum _{j=1}^{N-1}\frac{\hat{{\varvec{\psi }}}(t+j)\hat{{\varvec{\psi }}}^{\tiny \text{ T }}(t+j)}{r_1(t+j)}\right] \tilde{\varvec{a}}(t)\nonumber \\= & {} \sum _{j=1}^{N-1}\frac{\tilde{\varvec{a}}^{\tiny \text{ T }}(t)\hat{{\varvec{\psi }}}(t+j)\hat{{\varvec{\psi }}}^{\tiny \text{ T }}(t+j)\tilde{\varvec{a}}(t)}{r_1(t+j)}\nonumber \\\leqslant & {} \sum _{j=1}^{N-1}\left[ \frac{2\tilde{y}_1^2(t+j)}{r_1(t+j)} +\frac{2\left\| \hat{{\varvec{\psi }}}(t+j)\right\| ^2}{r_1(t+j)}\left\| \sum _{i=1}^{j-1}\frac{\hat{{\varvec{\psi }}}(t+i)}{r_1(t+i)}e(t+i)\right\| ^2\right] \nonumber \\= & {} \sum _{j=1}^{N-1}\left[ \frac{2\tilde{y}_1^2(t+j)}{r_1(t+j)}+\frac{2\left\| \hat{{\varvec{\psi }}}(t+j)\right\| ^2}{r_1(t+j)}\left\| \tilde{\varvec{a}}(t+j-1)-\tilde{\varvec{a}}(t)\right\| ^2\right] \nonumber \\\leqslant & {} \sum _{j=1}^{N-1}\left[ \frac{2\tilde{y}_1^2(t+j)}{r_1(t+j)} +\frac{4\left\| \hat{{\varvec{\psi }}}(t+j)\right\| ^2}{r_1(t+j)}\left( \left\| \tilde{\varvec{a}}(t+j-1)\right\| ^2+\left\| \tilde{\varvec{a}}(t)\right\| ^2\right) \right] ,\nonumber \\ \end{aligned}$$
(27)

Since \(\mathrm{E}[T(t)]=\mathrm{E}[\Vert \tilde{\varvec{a}}(t)\Vert ^2+\Vert \tilde{\varvec{b}}(t)\Vert ^2]\leqslant C\), we have \(\mathrm{E}[\Vert \tilde{\varvec{a}}(t)\Vert ^2]\leqslant C \). Taking the expectation and the limit of both sides of (27), it follows that

$$\begin{aligned} \lim _{t\rightarrow \infty }\mathrm {E}\left[ \left\| \tilde{\varvec{a}}(t)\right\| ^2\right] \leqslant \lim _{t\rightarrow \infty }\frac{1}{c_1}\mathrm {E}\left\{ \sum _{j=1}^{N-1}\left[ \frac{2\tilde{y}_1^2(t+j)}{r_1(t+j)} +\frac{8C\left\| \hat{{\varvec{\psi }}}(t+j)\right\| ^2}{r_1(t+j)}\right] \right\} . \end{aligned}$$

Assume that \(\lim \limits _{t\rightarrow \infty }\Vert \hat{{\varvec{\psi }}}(t+j)\Vert ^2/r_1(t+j)=0\). Using (22) gives \(\lim \limits _{t\rightarrow \infty }\mathrm{E}\left[ \Vert \tilde{\varvec{a}}(t)\Vert ^2\right] =0\). Similarly, we can obtain \(\lim \limits _{t\rightarrow \infty }\mathrm{E}[\Vert \tilde{\varvec{b}}(t)\Vert ^2]=0\). This completes the proof. \(\square \)

In order to improve the convergence rate of the HSG algorithm, we introduce a forgetting factor \(\lambda \) \((0\leqslant \lambda \leqslant 1)\) into (8)–(11); the corresponding algorithm is called the forgetting factor HSG (FF-HSG) algorithm, which is as follows:

$$\begin{aligned} \hat{\varvec{a}}(t)= & {} \hat{\varvec{a}}(t-1)+\frac{\varvec{F}(t)\hat{\varvec{b}}(t-1)}{r_1(t)} \left[ y(t)-\hat{\varvec{a}}^{\tiny \text{ T }}(t-1)\varvec{F}(t)\hat{\varvec{b}}(t-1)\right] , \end{aligned}$$
(28)
$$\begin{aligned} r_1(t)= & {} \lambda r_1(t-1)+\left\| \varvec{F}(t)\hat{\varvec{b}}(t-1)\right\| ^2,\ r_1(0)=1, \end{aligned}$$
(29)
$$\begin{aligned} \hat{\varvec{b}}(t)= & {} \hat{\varvec{b}}(t-1)+\frac{\varvec{F}^{\tiny \text{ T }}(t)\hat{\varvec{a}}(t-1)}{r_2(t)} \left[ y(t)-\hat{\varvec{a}}^{\tiny \text{ T }}(t-1)\varvec{F}(t)\hat{\varvec{b}}(t-1)\right] , \end{aligned}$$
(30)
$$\begin{aligned} r_2(t)= & {} \lambda r_2(t-1)+\left\| \varvec{F}^{\tiny \text{ T }}(t)\hat{\varvec{a}}(t-1)\right\| ^2,\ r_2(0)=1. \end{aligned}$$
(31)

Obviously, when the forgetting factor \(\lambda =1\), the FF-HSG algorithm reduces to the HSG algorithm; when \(\lambda =0\), the FF-HSG algorithm degenerates into the hierarchical projection algorithm.
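For illustration, the FF-HSG recursion (28)–(31) changes only the normalization updates of the Python sketch of the HSG algorithm in Sect. 2; a minimal sketch (again with illustrative names) is given below.

```python
import numpy as np

def ff_hsg(F_seq, y_seq, m, n, lam=0.98, p0=1e6):
    """Forgetting factor HSG (FF-HSG), Eqs. (28)-(31); lam = 1 recovers the HSG algorithm."""
    a_hat, b_hat = np.ones(m) / p0, np.ones(n) / p0
    r1 = r2 = 1.0
    for F, y in zip(F_seq, y_seq):
        psi, phi = F @ b_hat, F.T @ a_hat
        e = y - a_hat @ F @ b_hat        # innovation, uses a_hat(t-1), b_hat(t-1)
        r1 = lam * r1 + psi @ psi        # Eq. (29)
        a_hat = a_hat + psi / r1 * e     # Eq. (28)
        r2 = lam * r2 + phi @ phi        # Eq. (31)
        b_hat = b_hat + phi / r2 * e     # Eq. (30)
    return a_hat, b_hat
```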

4 Example

Consider the following bilinear-in-parameter system with \(m=2\) and \(n=3\),

$$\begin{aligned} y(t)= & {} \varvec{a}^{\tiny \text{ T }}\varvec{F}(t)\varvec{b}+v(t), \\ \varvec{F}(t)= & {} \left[ \begin{array}{c} \varvec{f}(u(t-1)) \\ \varvec{f}(u(t-2)) \end{array} \right] = \left[ \begin{array}{ccc} u(t-1) &{} u^2(t-1) &{} u^3(t-1)\\ u(t-2) &{} u^2(t-2) &{} u^3(t-2)\end{array}\right] , \\ \varvec{a}= & {} [2.06,1.00]^{\tiny \text{ T }},\quad \varvec{b}=[0.70,\sqrt{0.02},0.70]^{\tiny \text{ T }}, \\ {\varvec{\theta }}= & {} [\varvec{a}^{\tiny \text{ T }},\varvec{b}^{\tiny \text{ T }}]^{\tiny \text{ T }}=[2.06,1.00,0.70,\sqrt{0.02},0.70]^{\tiny \text{ T }}, \end{aligned}$$

where \(\Vert \varvec{b}\Vert =1\). In the simulation, the input u(t) is taken as a persistently exciting sequence with zero mean and unit variance, and v(t) is taken as an uncorrelated noise sequence with zero mean and variance \(\sigma ^2=0.10^2\). Taking the data length \(L=3000\) and applying the HSG and FF-HSG algorithms to the input–output data \(\{y(t),\varvec{F}(t):\ t=1,2,3,\ldots \}\) to generate the parameter estimates \(\hat{\varvec{a}}(t)\) and \(\hat{\varvec{b}}(t)\), the parameter estimates and their estimation errors are given in Tables 1, 2 and 3, and the estimation error \(\delta :=\Vert \hat{{\varvec{\theta }}}-{\varvec{\theta }}\Vert /\Vert {\varvec{\theta }}\Vert \) versus t is shown in Fig. 1.
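A minimal script along the lines of this experiment, assuming the `hsg` sketch from Sect. 2 and using `numpy` only, could look as follows; it illustrates the setup and is not the code that produced Tables 1, 2 and 3.

```python
import numpy as np

rng = np.random.default_rng(0)
a_true = np.array([2.06, 1.00])
b_true = np.array([0.70, np.sqrt(0.02), 0.70])    # ||b|| = 1
theta = np.concatenate([a_true, b_true])

L, sigma = 3000, 0.10
u = rng.standard_normal(L + 2)                    # zero-mean, unit-variance input
F_seq, y_seq = [], []
for t in range(2, L + 2):
    f1 = np.array([u[t-1], u[t-1]**2, u[t-1]**3])
    f2 = np.array([u[t-2], u[t-2]**2, u[t-2]**3])
    F = np.vstack([f1, f2])                       # F(t) in R^{2x3}
    y = a_true @ F @ b_true + sigma * rng.standard_normal()
    F_seq.append(F)
    y_seq.append(y)

a_hat, b_hat = hsg(F_seq, y_seq, m=2, n=3)        # HSG sketch from Sect. 2
theta_hat = np.concatenate([a_hat, b_hat])
delta = np.linalg.norm(theta_hat - theta) / np.linalg.norm(theta)
print("relative estimation error delta =", delta)
```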

Table 1 HSG parameter estimates and errors
Table 2 FF-HSG parameter estimates and errors (\(\lambda =0.98\))
Table 3 FF-HSG parameter estimates and errors (\(\lambda =0.95\))
Fig. 1 HSG estimation errors \(\delta \) versus t with different forgetting factors

From Tables 1, 2, 3 and Fig. 1, we can draw the following conclusions.

  1. The estimation errors become smaller as time t increases (see Tables 1, 2 and 3).

  2. The FF-HSG algorithm has a faster convergence rate than the HSG algorithm, and the convergence rate increases for appropriately small forgetting factors (see Fig. 1).

5 Conclusions

This paper investigates the performance of the HSG algorithm for bilinear-in-parameter systems. The theoretical analysis shows that the estimates converge to the true values under the persistent excitation conditions, and the simulation results verify the proposed convergence theorem. The method used in this paper can be extended to analyze the convergence of identification algorithms for linear or nonlinear control systems [7, 19, 20, 43], and can be applied to hybrid switching-impulsive dynamical networks [18], uncertain chaotic delayed nonlinear systems [17] and other fields [6, 27, 28].