1 Introduction

Nonlinear time series models can reveal nonlinear features of many practical processes and are widely used in finance, ecology and other fields [1]. The exponential autoregressive (ExpAR) model [2] is an important class of nonlinear time series models. It was first applied to the statistical analysis of the Canadian lynx data [3, 4], where it proved suitable for describing nonlinear behaviors such as amplitude-dependent frequency, jump phenomena and limit cycles, and for producing accurate multistep-ahead predictions [5]. In recent years, many publications have been devoted to the stationarity, estimation and application of the ExpAR model. For example, Chen et al. discussed the stationarity conditions of several generalized ExpAR models, developed a variable projection based estimation algorithm, and adopted the generalized ExpAR models to model and predict the monthly mean ozone column thickness [6].

Analyzing and controlling a nonlinear time series process relies on an appropriate dynamical model. System identification is a common tool for constructing mathematical models of dynamical systems, and parameter estimation determines the unknown system parameters from a set of observations. Both are widely used in many areas [7,8,9]. Many identification methods, such as the maximum likelihood [10], the genetic algorithm [11], blind identification [12] and subspace identification [13], have been developed over the decades. The gradient-based methods form a class of fundamental system identification methods. Combined with recursive and iterative techniques, they can be applied to many kinds of systems. However, the basic gradient-based methods have limited parameter estimation accuracy. By introducing forgetting factors, variants of the gradient-based identification algorithms have been derived with improved parameter estimation accuracy. For instance, Chen and Jiang developed a gradient-based identification method with several forgetting factors for nonlinear two-variable difference systems [14].

In the area of system identification, many techniques have been exploited to improve the identification results. For example, hierarchical identification has developed into a significant branch of system identification [15]. Recently, a hierarchical gradient-based iterative algorithm was used to simultaneously estimate the unknown amplitudes and angular frequencies of multi-frequency signals [16]. In addition, multi-innovation identification has shown its effectiveness in nonlinear system identification [17]. By expanding a scalar innovation into a multi-dimensional vector, a multi-innovation stochastic gradient (SG) algorithm was derived for Wiener–Hammerstein systems with backlash [18], and a multi-innovation fractional order SG algorithm was developed for Hammerstein nonlinear ARMAX systems [19]. However, there is little research on nonlinear time series model identification using these novel identification techniques.

This communique investigates recursive identification algorithms for the ExpAR model. Applying the hierarchical identification principle, the ExpAR model is decomposed into two sub-identification (Sub-ID) models: one contains the unknown parameter vector of the linear subsystem, and the other contains the unknown parameter of the nonlinear part. With the negative gradient search, the two unknown parameter sets are estimated interactively. In order to make full use of the available information, the scalar innovations are expanded into innovation vectors. Moreover, two forgetting factors are introduced into the multi-innovation algorithm, yielding a new recursive identification algorithm with improved parameter estimation accuracy. In brief, the contributions of this paper are as follows.

  • Considering the difficulty of the nonlinear optimal problem arising in identifying the ExpAR model, we combine the hierarchical identification principle with the negative gradient search so as to derive a hierarchical stochastic gradient (H-SG) algorithm for the ExpAR model.

  • Using the multi-innovation identification theory, a hierarchical multi-innovation stochastic gradient (H-MISG) algorithm is presented for the ExpAR model. Introducing two forgetting factors, we obtain a modified H-MISG algorithm.

  • Comparing the parameter estimation accuracies of the proposed hierarchical algorithms, we find that the modified version of the H-MISG algorithm has improved parameter estimation accuracy and can be effectively used to identify the ExpAR model.

2 Problem description

Some notations used throughout this paper are first introduced in Table 1.

Table 1 The notations used throughout this paper

Given a time series \(\{x_k,x_{k-1},x_{k-2},\ldots \}\), an ExpAR model can be expressed as

$$\begin{aligned} x_k= & {} \left( \alpha _1+\beta _1\mathrm{e}^{-\xi x^2_{k-1}}\right) x_{k-1}\nonumber \\&+\,\left( \alpha _2+\beta _2\mathrm{e}^{-\xi x^2_{k-1}}\right) x_{k-2}+\cdots \nonumber \\&+\,\left( \alpha _n+\beta _n\mathrm{e}^{-\xi x^2_{k-1}}\right) x_{k-n}+\varepsilon _k, \end{aligned}$$
(1)

where \(\varepsilon _k\) is white noise with zero mean, n denotes the degree, and \(\alpha _i\), \(\beta _i\) and \(\xi \) are the model parameters to be estimated.

When the parameters \(\beta _i=0\), \(i=1,2,\ldots ,n\), Eq. (1) reduces to an autoregressive (AR) model which has no nonlinear dynamics.
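To make the data-generating mechanism concrete, the following minimal Python/NumPy sketch simulates a series from model (1) under zero initial values; the function name and the illustrative parameter values are placeholders rather than quantities used later in the paper.

```python
import numpy as np

def simulate_expar(alpha, beta, xi, N, sigma=0.2, seed=0):
    """Simulate model (1): x_k = sum_i (alpha_i + beta_i*exp(-xi*x_{k-1}^2)) x_{k-i} + eps_k."""
    rng = np.random.default_rng(seed)
    n = len(alpha)
    x = np.zeros(N + n)                                   # zero initial values, as assumed below
    for k in range(n, N + n):
        w = alpha + beta * np.exp(-xi * x[k - 1] ** 2)    # amplitude-dependent AR coefficients
        x[k] = w @ x[k - n:k][::-1] + sigma * rng.standard_normal()
    return x[n:]

# e.g. a second-degree ExpAR process with illustrative (hypothetical) parameter values
x = simulate_expar(np.array([0.8, -0.2]), np.array([0.5, 0.3]), xi=1.0, N=3000)
```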

The form in (1) is the classic ExpAR model; some modified versions have also been presented. For instance, in order to give a more sophisticated specification of the dynamics of the characteristic roots of AR models, Ozaki derived a variant of the ExpAR model in [3] using Hermite-type polynomials:

$$\begin{aligned} x_k=\sum \limits _{i=1}^{n}\Big [\alpha _i +\Big (\beta _{i0}+\sum \limits _{j=1}^{m_i} \beta _{ij}x^j_{k-1}\Big )\mathrm{e}^{-\xi x^2_{k-1}}\Big ]x_{k-i}+\varepsilon _k. \end{aligned}$$

Introducing a time-delay d and a scalar parameter \(\zeta \), Teräsvirta developed a different variant of the ExpAR model in [4]:

$$\begin{aligned} x_k= & {} \left[ \alpha _0+\beta _0\mathrm{e}^{-\xi (x_{k-d}-\zeta )^2}\right] \\&+\, \sum \limits _{i=1}^{n}\left[ \alpha _i+\beta _i\mathrm{e}^{-\xi (x_{k-d} -\zeta )^2}\right] x_{k-i}+\varepsilon _k. \end{aligned}$$

Some other generalized ExpAR models were summarized in [6]. After parametrization, the corresponding identification models, which have different parameter and information vectors, can be derived for the ExpAR family. This paper addresses the recursive identification of the classic ExpAR model; the proposed hierarchical algorithms are also applicable to the other ExpAR models.

Assume that the degree n is known and that the data \(x_k\) are measurable. The initial values are taken as \(x_k=0\) and \(\varepsilon _k=0\) for \(k\le 0\).

Fig. 1 The hierarchical structure of the identification models for the ExpAR model

It is obvious that \(x_k\) is linear with respect to the parameters \(\alpha _i\) and \(\beta _i\), and is nonlinear with respect to the parameter \(\xi \). Define the parameter vectors of the linear subsystem

$$\begin{aligned} \varvec{\alpha }:=\,&[\alpha _1,\alpha _2,\ldots ,\alpha _n]^{\tiny \mathrm{T}}\in \mathbb {R}^n,\\ \varvec{\beta }:=\,&[\beta _1,\beta _2,\ldots ,\beta _n]^{\tiny \mathrm{T}}\in \mathbb {R}^n, \end{aligned}$$

and the information vector

$$\begin{aligned} \varvec{X}_k := [x_{k-1},x_{k-2},\ldots ,x_{k-n}]^{\tiny \mathrm{T}}\in \mathbb {R}^n. \end{aligned}$$

Then, Eq. (1) can be transformed into

$$\begin{aligned} x_k= & {} \sum \limits _{i=1}^{n}\alpha _ix_{k-i}+\mathrm{e}^{-\xi x^2_{k-1}}\sum \limits _{i=1}^{n}\beta _ix_{k-i}+\varepsilon _k\nonumber \\= & {} \varvec{X}^{\tiny \mathrm{T}}_k\varvec{\alpha }+\mathrm{e}^{-\xi x^2_{k-1}}\varvec{X}^{\tiny \mathrm{T}}_k\varvec{\beta }+\varepsilon _k. \end{aligned}$$
(2)

Furthermore, define the following vectors:

$$\begin{aligned}&\varvec{\varTheta }:=\, [\varvec{\alpha }^{\tiny \mathrm{T}},\varvec{\beta }^{\tiny \mathrm{T}}]^{\tiny \mathrm{T}}\in \mathbb {R}^{2n},\\&\varvec{\phi }(\xi ,k) :=\, [\varvec{X}^{\tiny \mathrm{T}}_k,\mathrm{e}^{-\xi x^2_{k-1}}\varvec{X}^{\tiny \mathrm{T}}_k]^{\tiny \mathrm{T}}\in \mathbb {R}^{2n}. \end{aligned}$$

Then, Eq. (2) can be equivalently transformed into the identification model

$$\begin{aligned} x_k=\varvec{\phi }^{\tiny \mathrm{T}}(\xi ,k)\varvec{\varTheta }+\varepsilon _k. \end{aligned}$$
(3)

Since the unknown parameter \(\xi \) of the nonlinear subsystem enters \(\varvec{\phi }(\xi ,k)\), the identification problem becomes a nonlinear optimization problem and the least-squares method cannot be applied directly. The present work therefore explores new recursive identification methods for the ExpAR model.
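As an illustration of this parametrization, a small sketch (Python/NumPy, hypothetical helper name) that forms the information vector \(\varvec{\phi }(\xi ,k)\) of model (3) from a data array is given below.

```python
import numpy as np

def phi(x, k, xi, n):
    """Information vector phi(xi,k) = [X_k^T, exp(-xi*x_{k-1}^2) X_k^T]^T of model (3);
    x is the data array (0-based indexing) and k is an index with k >= n."""
    Xk = x[k - n:k][::-1]                              # X_k = [x_{k-1}, ..., x_{k-n}]^T
    return np.concatenate([Xk, np.exp(-xi * x[k - 1] ** 2) * Xk])

# with Theta = [alpha^T, beta^T]^T, the model reads: x[k] is phi(x, k, xi, n) @ Theta plus noise
```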

3 The hierarchical stochastic gradient algorithm

Hierarchical identification is a decomposition-based identification approach. The key idea is to decompose the identification model into several subsystems so that the scale of the optimization problem becomes smaller [20]. In this section, by the hierarchical identification principle, the ExpAR model is decomposed into two subsystems, one containing \(\varvec{\varTheta }\) and the other containing \(\xi \); both parameter sets are to be estimated. The negative gradient search is widely adopted to solve such optimization problems by seeking the extreme point of an objective function. Applying the negative gradient search, an H-SG algorithm is proposed for the ExpAR model.

Define the information item \(\psi (\varvec{\beta })\) and the intermediate variable \(x_{1,k}\) as

$$\begin{aligned}&\psi (\varvec{\beta }) :=\, \varvec{X}^{\tiny \mathrm{T}}_k\varvec{\beta }\in \mathbb {R}, \\&x_{1,k} :=\, x_k-\varvec{X}^{\tiny \mathrm{T}}_k\varvec{\alpha }\in \mathbb {R}. \end{aligned}$$

From (2), we can see that the ExpAR model is decomposed into these two Sub-ID models:

$$\begin{aligned}&S_1: x_k = \varvec{\phi }^{\tiny \mathrm{T}}(\xi ,k)\varvec{\varTheta }+\varepsilon _k, \end{aligned}$$
(4)
$$\begin{aligned}&S_2: x_{1,k} = \psi (\varvec{\beta })\mathrm{e}^{-\xi x^2_{k-1}}+\varepsilon _k. \end{aligned}$$
(5)

The parameter sets \(\varvec{\varTheta }\) and \(\xi \) in Sub-ID models (4) and (5) contain all the parameters to be estimated. The parameter \(\xi \) in \(\varvec{\phi }(\xi ,k)\) and the parameter vector \(\varvec{\beta }\) in \(\psi (\varvec{\beta })\) are the associated terms that couple these two Sub-ID models. Decomposing the identification model in (2) or (3) into the above fictitious subsystems, we obtain the hierarchical structure shown in Fig. 1.

Define two criterion functions

$$\begin{aligned}&J_1(\varvec{\varTheta }) :=\, \frac{1}{2}[x_k-\varvec{\phi }^{\tiny \mathrm{T}}(\xi ,k)\varvec{\varTheta }]^2, \end{aligned}$$
(6)
$$\begin{aligned}&J_2(\xi ) :=\, \frac{1}{2}[x_{1,k}-\psi (\varvec{\beta })\mathrm{e}^{-\xi x^2_{k-1}}]^2. \end{aligned}$$
(7)

Computing the gradients of \(J_1(\varvec{\varTheta })\) and \(J_2(\xi )\), we have

$$\begin{aligned} \mathrm{grad}[J_1(\varvec{\varTheta })]= & {} \frac{\partial J_1(\varvec{\varTheta })}{\partial \varvec{\varTheta }} =-\varvec{\phi }(\xi ,k)[x_k-\varvec{\phi }^{\tiny \mathrm{T}}(\xi ,k)\varvec{\varTheta }],\\ \mathrm{grad}[J_2(\xi )]= & {} \frac{\partial J_2(\xi )}{\partial \xi }\\= & {} x^2_{k-1}\psi (\varvec{\beta })\mathrm{e}^{-\xi x^2_{k-1}} {[}x_{1,k}-\psi (\varvec{\beta })\mathrm{e}^{-\xi x^2_{k-1}}] \\= & {} x^2_{k-1}\psi (\varvec{\beta })\mathrm{e}^{-\xi x^2_{k-1}} {[}x_k-\varvec{X}^{\tiny \mathrm{T}}_k\varvec{\alpha }\\&-\psi (\varvec{\beta })\mathrm{e}^{-\xi x^2_{k-1}}] \\= & {} -\varvec{\varTheta }^{\tiny \mathrm{T}}\varvec{\phi }'(\xi ,k) {[}x_k-\varvec{X}^{\tiny \mathrm{T}}_k\varvec{\alpha }-\psi (\varvec{\beta })\mathrm{e}^{-\xi x^2_{k-1}}] \\= & {} -\varvec{\varTheta }^{\tiny \mathrm{T}}\varvec{\phi }'(\xi ,k)[x_k-\varvec{\phi }^{\tiny \mathrm{T}}(\xi ,k)\varvec{\varTheta }], \end{aligned}$$

where

$$\begin{aligned} \varvec{\phi }'(\xi ,k):=\,&\frac{\partial \varvec{\phi }(\xi ,k)}{\partial \xi }\\ =&[\mathbf{0}^{\tiny \mathrm{T}}_n,-x^2_{k-1}\mathrm{e}^{-\xi x^2_{k-1}}\varvec{X}^{\tiny \mathrm{T}}_k]^{\tiny \mathrm{T}}\in \mathbb {R}^{2n}. \end{aligned}$$

Let \(\hat{\varvec{\varTheta }}_k\) and \(\hat{\xi }_{k}\) signify the estimates of \(\varvec{\varTheta }\) and \(\xi \) at time k, and let \(\mu _{1,k}\) and \(\mu _{2,k}\) represent the step-sizes to be determined later. Employing the negative gradient search, we have:

$$\begin{aligned} \hat{\varvec{\varTheta }}_k= & {} \hat{\varvec{\varTheta }}_{k-1} -\mu _{1,k}\mathrm{grad}[J_1(\hat{\varvec{\varTheta }}_{k-1})]\nonumber \\= & {} \hat{\varvec{\varTheta }}_{k-1}+\mu _{1,k}\varvec{\phi }(\xi ,k)[x_k -\varvec{\phi }^{\tiny \mathrm{T}}(\xi ,k)\hat{\varvec{\varTheta }}_{k-1}], \end{aligned}$$
(8)
$$\begin{aligned} \hat{\xi }_{k}= & {} \hat{\xi }_{k-1}-\mu _{2,k} \mathrm{grad}[J_2(\hat{\xi }_{k-1})] \nonumber \\= & {} \hat{\xi }_{k-1}+\mu _{2,k}\varvec{\varTheta }^{\tiny \mathrm{T}}\varvec{\phi }'(\hat{\xi }_{k-1},k) {[}x_k{-}\varvec{\phi }^{\tiny \mathrm{T}}(\hat{\xi }_{k-1},k)\varvec{\varTheta }].\nonumber \\ \end{aligned}$$
(9)

We now determine the optimal step-sizes \(\mu _{1,k}\) and \(\mu _{2,k}\). One method is to apply a one-dimensional search, that is, to solve the optimization problems

$$\begin{aligned}&\mathop {\min }_{\mu _{1,k}\ge 0}J_1\{\hat{\varvec{\varTheta }}_{k-1}- \mu _{1,k}\mathrm{grad}[J_1(\hat{\varvec{\varTheta }}_{k-1})]\},\\&\mathop {\min }_{\mu _{2,k}\ge 0}J_2\{\hat{\xi }_{k-1}- \mu _{2,k}\mathrm{grad}[J_2(\hat{\xi }_{k-1})]\}. \end{aligned}$$

Remark 1

The one-dimensional search is a fundamental method for finding the optimal step-size in a minimization problem. The key idea is to determine the negative gradient direction (i.e., the direction in which the criterion function descends fastest) and to compute, by a one-dimensional search along that direction, the step-size that minimizes the criterion function.

For the sake of convenience, we define the innovations \(e_{1,k}\) and \(e_{2,k}\) as

$$\begin{aligned} e_{1,k} :=\,&x_k-\varvec{\phi }^{\tiny \mathrm{T}}(\xi ,k)\hat{\varvec{\varTheta }}_{k-1}\in \mathbb {R}, \end{aligned}$$
(10)
$$\begin{aligned} e_{2,k} :=\,&x_k-\varvec{\phi }^{\tiny \mathrm{T}}(\hat{\xi }_{k-1},k)\varvec{\varTheta }\in \mathbb {R}. \end{aligned}$$
(11)

Substituting \(\varvec{\varTheta }=\hat{\varvec{\varTheta }}_k\) into (6) gives

$$\begin{aligned} g_1[\mu _{1,k}] :=\,&J_1[\hat{\varvec{\varTheta }}_k]=\frac{1}{2}[x_k-\varvec{\phi }^{\tiny \mathrm{T}}(\xi ,k)\hat{\varvec{\varTheta }}_k]^2 \\ =\,&\frac{1}{2}\{x_k-\varvec{\phi }^{\tiny \mathrm{T}}(\xi ,k)[\hat{\varvec{\varTheta }}_{k-1}+\mu _{1,k}\varvec{\phi }(\xi ,k)e_{1,k}]\}^2 \\ =\,&\frac{1}{2}\{x_k-\varvec{\phi }^{\tiny \mathrm{T}}(\xi ,k)\hat{\varvec{\varTheta }}_{k-1}-\mu _{1,k}\Vert \varvec{\phi }(\xi ,k)\Vert ^2e_{1,k}\}^2 \\ =\,&\frac{1}{2}\{e_{1,k}-\mu _{1,k}\Vert \varvec{\phi }(\xi ,k)\Vert ^2e_{1,k}\}^2 \\ =\,&\frac{1}{2}\{1-\mu _{1,k}\Vert \varvec{\phi }(\xi ,k)\Vert ^2\}^2e_{1,k}^2. \end{aligned}$$

In order to make \(J_1[\hat{\varvec{\varTheta }}_k]\) minimum, we take the optimal step-size \(\mu _{1,k}\) as

$$\begin{aligned} \mu _{1,k}=\frac{1}{\Vert \varvec{\phi }(\xi ,k)\Vert ^2}. \end{aligned}$$
(12)

To avoid the denominator being zero, the above equation can be modified to

$$\begin{aligned} \mu _{1,k}=\frac{1}{1+\Vert \varvec{\phi }(\xi ,k)\Vert ^2}. \end{aligned}$$
(13)

Substituting (12) or (13) into (8) gives the gain vector \(\frac{\varvec{\phi }(\xi ,k)}{\Vert \varvec{\phi }(\xi ,k)\Vert ^2}\) or \(\frac{\varvec{\phi }(\xi ,k)}{1+\Vert \varvec{\phi }(\xi ,k)\Vert ^2}\). Neither of these gain vectors approaches zero as k increases. From (8), we can see that when \(\hat{\varvec{\varTheta }}_{k-1}\) is close to \(\varvec{\varTheta }\), a large gain vector \(\mu _{1,k}\varvec{\phi }(\xi ,k)\) will make \(\hat{\varvec{\varTheta }}_k\) deviate from \(\varvec{\varTheta }\). To address this problem, we let the step-size \(\mu _{1,k}\) tend to zero as k increases. Therefore, \(\mu _{1,k}\) is taken as

$$\begin{aligned} \mu _{1,k} :=\,&\frac{1}{r_{1,k}}, \nonumber \\ r_{1,k} =\,&r_{1,k-1}+\Vert \varvec{\phi }(\xi ,k)\Vert ^2. \end{aligned}$$
(14)

Similarly, substituting \(\xi =\hat{\xi }_{k}\) into (7) gives

$$\begin{aligned} g_2[\mu _{2,k}] :=\,&J_2[\hat{\xi }_{k}]=\frac{1}{2}[x_{1,k} -\psi (\varvec{\beta })\mathrm{e}^{-\hat{\xi }_{k} x^2_{k-1}}]^2 \\ =\,&\frac{1}{2}[x_k-\varvec{X}^{\tiny \mathrm{T}}_k\varvec{\alpha }-\psi (\varvec{\beta }) \mathrm{e}^{-\hat{\xi }_{k} x^2_{k-1}}]^2 \\ =\,&\frac{1}{2}[x_k-\varvec{\phi }^{\tiny \mathrm{T}}(\hat{\xi }_{k},k)\varvec{\varTheta }]^2. \end{aligned}$$

Plugging the first-order Taylor expansion of \(\varvec{\phi }(\xi ,k)\) at \(\xi =\hat{\xi }_{k-1}\) into the above equation, we have

$$\begin{aligned} g_2[\mu _{2,k}]&= \frac{1}{2}\{x_k-[\varvec{\phi }^{\tiny \mathrm{T}}(\hat{\xi }_{k-1},k)\\&\quad +\,[\varvec{\phi }'(\hat{\xi }_{k-1},k)]^{\tiny \mathrm{T}}(\hat{\xi }_{k}-\hat{\xi }_{k-1})\\&\quad +\, o(\hat{\xi }_{k}-\hat{\xi }_{k-1})]\varvec{\varTheta }\}^2 \\&= \frac{1}{2}\{x_k-[\varvec{\phi }^{\tiny \mathrm{T}}(\hat{\xi }_{k-1},k)\\&\quad +\,[\varvec{\phi }'(\hat{\xi }_{k-1},k)]^{\tiny \mathrm{T}}[\mu _{2,k}\varvec{\varTheta }^{\tiny \mathrm{T}} \varvec{\phi }'(\hat{\xi }_{k-1},k)e_{2,k}]\\&\quad +\,o(\hat{\xi }_{k} -\hat{\xi }_{k-1})]\varvec{\varTheta }\}^2 \\&= \frac{1}{2}[x_k-\varvec{\phi }^{\tiny \mathrm{T}}(\hat{\xi }_{k-1},k) \varvec{\varTheta }\\&\quad -[\varvec{\phi }'(\hat{\xi }_{k-1},k)]^{\tiny \mathrm{T}} {[}\mu _{2,k}\varvec{\varTheta }^{\tiny \mathrm{T}}\varvec{\phi }'(\hat{\xi }_{k-1},k) e_{2,k}]\varvec{\varTheta }\\&\quad +\,o(\hat{\xi }_{k}-\hat{\xi }_{k-1})]^2 \\&= \frac{1}{2}[e_{2,k}-\mu _{2,k}\Vert \varvec{\varTheta }^{\tiny \mathrm{T}} \varvec{\phi }'(\hat{\xi }_{k-1},k)\Vert ^2e_{2,k}\\&\quad +\,o(\hat{\xi }_{k}-\hat{\xi }_{k-1})]^2 \\&= \frac{1}{2}[1-\mu _{2,k}\Vert \varvec{\varTheta }^{\tiny \mathrm{T}} \varvec{\phi }'(\hat{\xi }_{k-1},k)\Vert ^2]^2e_{2,k}^2\\&\quad +\,o(\hat{\xi }_{k}-\hat{\xi }_{k-1})^2. \end{aligned}$$

The optimal \(\mu _{2,k}\) can be obtained by minimizing \(g_2[\mu _{2,k}]\), i.e., by solving the equation

$$\begin{aligned} 1-\mu _{2,k}\Vert \varvec{\varTheta }^{\tiny \mathrm{T}}\varvec{\phi }'(\hat{\xi }_{k-1},k)\Vert ^2=0. \end{aligned}$$

Thus, the step-size \(\mu _{2,k}\) can be chosen as

$$\begin{aligned} \mu _{2,k}=\frac{1}{\Vert \varvec{\varTheta }^{\tiny \mathrm{T}}\varvec{\phi }'(\hat{\xi }_{k-1},k)\Vert ^2}. \end{aligned}$$

Similarly, considering the stability of the identification algorithm, the above equation can be modified to

$$\begin{aligned}&\mu _{2,k}:=\,\frac{1}{r_{2,k}}, \nonumber \\&r_{2,k}=\,r_{2,k-1}+\Vert \varvec{\varTheta }^{\tiny \mathrm{T}}\varvec{\phi }'(\hat{\xi }_{k-1},k)\Vert ^2. \end{aligned}$$
(15)

Plugging (10) and (14) into (8), and (11) and (15) into (9), we obtain the following recursive relations:

$$\begin{aligned}&\hat{\varvec{\varTheta }}_k = \hat{\varvec{\varTheta }}_{k-1} +\frac{1}{r_{1,k}}\varvec{\phi }(\xi ,k)e_{1,k}, \end{aligned}$$
(16)
$$\begin{aligned}&e_{1,k} = x_k-\varvec{\phi }^{\tiny \mathrm{T}}(\xi ,k)\hat{\varvec{\varTheta }}_{k-1}, \end{aligned}$$
(17)
$$\begin{aligned}&r_{1,k} = r_{1,k-1}+\Vert \varvec{\phi }(\xi ,k)\Vert ^2, \end{aligned}$$
(18)
$$\begin{aligned}&\hat{\xi }_{k} = \hat{\xi }_{k-1}+\frac{1}{r_{2,k}} \varvec{\varTheta }^{\tiny \mathrm{T}}\varvec{\phi }'(\hat{\xi }_{k-1},k)e_{2,k}, \end{aligned}$$
(19)
$$\begin{aligned}&e_{2,k} = x_k-\varvec{\phi }^{\tiny \mathrm{T}}(\hat{\xi }_{k-1},k)\varvec{\varTheta }, \end{aligned}$$
(20)
$$\begin{aligned}&r_{2,k} = r_{2,k-1}+\Vert \varvec{\varTheta }^{\tiny \mathrm{T}}\varvec{\phi }'(\hat{\xi }_{k-1},k)\Vert ^2. \end{aligned}$$
(21)

Here, a difficulty arises: the parameter sets \(\varvec{\varTheta }\) and \(\xi \) appear on the right-hand sides of (16)–(21) but are unknown, so the algorithm in (16)–(21) cannot be implemented. Inspired by the hierarchical identification principle, we replace the unknown \(\xi \) in (16)–(18) and \(\varvec{\varTheta }\) in (19)–(21) with their estimates \(\hat{\xi }_{k-1}\) and \(\hat{\varvec{\varTheta }}_k\). It follows that

$$\begin{aligned}&\hat{\varvec{\varTheta }}_k = \hat{\varvec{\varTheta }}_{k-1} +\frac{1}{r_{1,k}}\varvec{\phi }(\hat{\xi }_{k-1},k)e_{1,k}, \end{aligned}$$
(22)
$$\begin{aligned}&e_{1,k} = x_k-\varvec{\phi }^{\tiny \mathrm{T}}(\hat{\xi }_{k-1},k)\hat{\varvec{\varTheta }}_{k-1}, \end{aligned}$$
(23)
$$\begin{aligned}&r_{1,k} = r_{1,k-1}+\Vert \varvec{\phi }(\hat{\xi }_{k-1},k)\Vert ^2, \end{aligned}$$
(24)
$$\begin{aligned}&\varvec{\phi }(\hat{\xi }_{k-1},k) = [\varvec{X}^{\tiny \mathrm{T}}_k,\mathrm{e}^{-\hat{\xi }_{k-1} x^2_{k-1}}\varvec{X}^{\tiny \mathrm{T}}_k]^{\tiny \mathrm{T}}, \end{aligned}$$
(25)
$$\begin{aligned}&\varvec{X}_k = [x_{k-1},x_{k-2},\ldots ,x_{k-n}]^{\tiny \mathrm{T}}, \end{aligned}$$
(26)
$$\begin{aligned}&\hat{\varvec{\varTheta }}_k = [\hat{\varvec{\alpha }}^{\tiny \mathrm{T}}_k,\hat{\varvec{\beta }}^{\tiny \mathrm{T}}_k]^{\tiny \mathrm{T}}, \end{aligned}$$
(27)
$$\begin{aligned}&\hat{\xi }_{k} = \hat{\xi }_{k-1}+\frac{1}{r_{2,k}}\hat{\varvec{\varTheta }}^{\tiny \mathrm{T}}_k \varvec{\phi }'(\hat{\xi }_{k-1},k)e_{2,k}, \end{aligned}$$
(28)
$$\begin{aligned}&e_{2,k} = x_k-\varvec{\phi }^{\tiny \mathrm{T}}(\hat{\xi }_{k-1},k)\hat{\varvec{\varTheta }}_k, \end{aligned}$$
(29)
$$\begin{aligned}&r_{2,k} = r_{2,k-1}+\Vert \hat{\varvec{\varTheta }}^{\tiny \mathrm{T}}_k \varvec{\phi }'(\hat{\xi }_{k-1},k)\Vert ^2, \end{aligned}$$
(30)
$$\begin{aligned}&\varvec{\phi }'(\hat{\xi }_{k-1},k) = [\mathbf{0}_n^{\tiny \mathrm{T}},-x^2_{k-1} \mathrm{e}^{-\hat{\xi }_{k-1} x^2_{k-1}}\varvec{X}^{\tiny \mathrm{T}}_k]^{\tiny \mathrm{T}}. \end{aligned}$$
(31)

The above computational process forms the H-SG algorithm for the ExpAR model.

The process of computing \(\hat{\varvec{\varTheta }}_k\) and \(\hat{\xi }_{k}\) by the H-SG algorithm is summarized in the following list; a code sketch of the recursion is given after the list.

  1. To initialize, let \(k=1\), \(\hat{\varvec{\varTheta }}_0= [\hat{\varvec{\alpha }}^{\tiny \mathrm{T}}_0,\hat{\varvec{\beta }}^{\tiny \mathrm{T}}_0]^{\tiny \mathrm{T}}=\mathbf{1}_{2n}/p_0\), \(\hat{\xi }_0=1/p_0\), \(p_0=10^6\), \(r_{1,0}=1\) and \(r_{2,0}=1\), and give an error tolerance \(\eta >0\).

  2. Collect the measurement data \(x_k\), and form the information vectors \(\varvec{X}_k\) and \(\varvec{\phi }(\hat{\xi }_{k-1},k)\) by (26) and (25).

  3. Compute the reciprocal of the step-size \(r_{1,k}\) by (24) and the innovation \(e_{1,k}\) by (23).

  4. Update the parameter estimation vector \(\hat{\varvec{\varTheta }}_k\) by (22), and read out \(\hat{\varvec{\alpha }}_k\) and \(\hat{\varvec{\beta }}_k\) from \(\hat{\varvec{\varTheta }}_k\) in (27).

  5. Form the derivative of \(\varvec{\phi }(\hat{\xi }_{k-1},k)\) with respect to \(\hat{\xi }_{k-1}\) by (31).

  6. Compute the reciprocal of the step-size \(r_{2,k}\) by (30) and the innovation \(e_{2,k}\) by (29).

  7. Update the parameter estimate \(\hat{\xi }_{k}\) by (28).

  8. Compare \(\{\hat{\varvec{\varTheta }}_k,\hat{\xi }_{k}\}\) with \(\{\hat{\varvec{\varTheta }}_{k-1},\hat{\xi }_{k-1}\}\): if \(\Vert \hat{\varvec{\varTheta }}_k-\hat{\varvec{\varTheta }}_{k-1}\Vert +\Vert \hat{\xi }_{k}-\hat{\xi }_{k-1}\Vert >\eta \), increase k by 1 and return to Step 2; otherwise, terminate this computational process.
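The following minimal Python sketch illustrates the recursion (22)–(31), reusing the hypothetical phi helper sketched in Sect. 2; it is an illustrative implementation under the stated initialization, not a reference one.

```python
import numpy as np

def dphi(x, k, xi, n):
    """Derivative phi'(xi,k) of Eq. (31)."""
    Xk = x[k - n:k][::-1]
    return np.concatenate([np.zeros(n), -x[k - 1] ** 2 * np.exp(-xi * x[k - 1] ** 2) * Xk])

def hsg(x, n, p0=1e6, eta=1e-8):
    """H-SG estimation of Theta = [alpha^T, beta^T]^T and xi, Eqs. (22)-(31)."""
    Theta, xi = np.ones(2 * n) / p0, 1.0 / p0
    r1 = r2 = 1.0
    for k in range(n, len(x)):
        ph = phi(x, k, xi, n)
        r1 += ph @ ph                                    # (24)
        e1 = x[k] - ph @ Theta                           # (23)
        Theta_new = Theta + ph * e1 / r1                 # (22)
        g = Theta_new @ dphi(x, k, xi, n)                # Theta_k^T phi'(xi_{k-1}, k)
        r2 += g ** 2                                     # (30)
        e2 = x[k] - ph @ Theta_new                       # (29)
        xi_new = xi + g * e2 / r2                        # (28)
        done = np.linalg.norm(Theta_new - Theta) + abs(xi_new - xi) <= eta
        Theta, xi = Theta_new, xi_new
        if done:
            break
    return Theta, xi
```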

The H-SG algorithm in (22)–(31) estimates the parameter sets \(\varvec{\varTheta }\) and \(\xi \) in an interactive way. The innovations \(e_{1,k}\) and \(e_{2,k}\) in (23) and (29) are scalars. In order to make the most of the information, we derive an interactive multi-innovation parameter estimation method in the next section.

4 The hierarchical multi-innovation stochastic gradient algorithm

The innovation is the useful information that can improve the parameter and state estimation accuracy. Multi-innovation identification is innovation-expansion-based identification [21]. Applying the multi-innovation identification theory, we expand the scalar innovations \(e_{1,k}\) and \(e_{2,k}\) in (23) and (29) and develop an H-MISG algorithm for the ExpAR model in this section.

Let l denote the innovation length. Expand the scalar innovations in (23) and (29) into the l-dimensional vectors:

$$\begin{aligned} \varvec{E}_1(l) :=\,&\left[ \begin{array}{c} x_k-\varvec{\phi }^{\tiny \mathrm{T}}(\hat{\xi }_{k-1},k) \hat{\varvec{\varTheta }}_{k-1}\\ x_{k-1}-\varvec{\phi }^{\tiny \mathrm{T}} (\hat{\xi }_{k-1},k-1)\hat{\varvec{\varTheta }}_{k-1}\\ \vdots \\ x_{k-l+1}-\varvec{\phi }^{\tiny \mathrm{T}}(\hat{\xi }_{k-1},k-l+1) \hat{\varvec{\varTheta }}_{k-1}\\ \end{array}\right] \in \mathbb {R}^l, \\ \varvec{E}_2(l) :=\,&\left[ \begin{array}{c} x_k-\varvec{\phi }^{\tiny \mathrm{T}}(\hat{\xi }_{k-1},k)\hat{\varvec{\varTheta }}_k\\ x_{k-1}-\varvec{\phi }^{\tiny \mathrm{T}}(\hat{\xi }_{k-1},k-1)\hat{\varvec{\varTheta }}_k\\ \vdots \\ x_{k-l+1}-\varvec{\phi }^{\tiny \mathrm{T}}(\hat{\xi }_{k-1},k-l+1)\hat{\varvec{\varTheta }}_k\\ \end{array}\right] \in \mathbb {R}^l. \end{aligned}$$

Define the following stacked vector and matrix:

$$\begin{aligned} \varvec{X}(l) :=\,&\left[ \begin{array}{c} x_k\\ x_{k-1}\\ \vdots \\ x_{k-l+1}\\ \end{array}\right] \in \mathbb {R}^l, \\ \varvec{\varPhi }(l,\hat{\xi }_{k-1}) :=\,&\left[ \begin{array}{c} \varvec{\phi }^{\tiny \mathrm{T}}(\hat{\xi }_{k-1},k)\\ \varvec{\phi }^{\tiny \mathrm{T}}(\hat{\xi }_{k-1},k-1)\\ \vdots \\ \varvec{\phi }^{\tiny \mathrm{T}}(\hat{\xi }_{k-1},k-l+1)\\ \end{array}\right] ^{\tiny \mathrm{T}} \in \mathbb {R}^{(2n)\times l}. \end{aligned}$$

Then, the innovation vectors can be equivalently transformed into

$$\begin{aligned} \varvec{E}_1(l)= & {} \varvec{X}(l)-\varvec{\varPhi }^{\tiny \mathrm{T}}(l,\hat{\xi }_{k-1})\hat{\varvec{\varTheta }}_{k-1},\\ \varvec{E}_2(l)= & {} \varvec{X}(l)-\varvec{\varPhi }^{\tiny \mathrm{T}}(l,\hat{\xi }_{k-1})\hat{\varvec{\varTheta }}_k. \end{aligned}$$

Since \(\varvec{E}_1(l)=e_{1,k}\), \(\varvec{\varPhi }(l,\hat{\xi }_{k-1})=\varvec{\phi }(\hat{\xi }_{k-1},k)\) and \(\varvec{X}(l)=x_k\) for \(l=1\), Eq. (22) can be written as

$$\begin{aligned} \hat{\varvec{\varTheta }}_k=\hat{\varvec{\varTheta }}_{k-1} +\frac{1}{r_{1,k}}\varvec{\varPhi }(l,\hat{\xi }_{k-1})\varvec{E}_1(l). \end{aligned}$$

Similarly, Eq. (28) can be transformed into

$$\begin{aligned} \hat{\xi }_{k}=\hat{\xi }_{k-1}+\frac{1}{r_{2,k}} \hat{\varvec{\varTheta }}^{\tiny \mathrm{T}}_k\varvec{\varPhi }'(l,\hat{\xi }_{k-1})\varvec{E}_2(l), \end{aligned}$$

where

$$\begin{aligned} \varvec{\varPhi }'(l,\hat{\xi }_{k-1}) :=\,&[\varvec{\phi }'(\hat{\xi }_{k-1},k), \varvec{\phi }'(\hat{\xi }_{k-1},k-1),\\&\ldots ,\varvec{\phi }'(\hat{\xi }_{k-1},k-l+1)] \in \mathbb {R}^{(2n)\times l}. \end{aligned}$$

In summary, the H-MISG algorithm for the ExpAR model can be derived as follows:

$$\begin{aligned}&\hat{\varvec{\varTheta }}_k = \hat{\varvec{\varTheta }}_{k-1}+\frac{1}{r_{1,k}} \varvec{\varPhi }(l,\hat{\xi }_{k-1})\varvec{E}_1(l), \end{aligned}$$
(32)
$$\begin{aligned}&\varvec{E}_1(l)= \varvec{X}(l)-\varvec{\varPhi }^{\tiny \mathrm{T}}(l,\hat{\xi }_{k-1})\hat{\varvec{\varTheta }}_{k-1}, \end{aligned}$$
(33)
$$\begin{aligned}&r_{1,k} = r_{1,k-1}+\Vert \varvec{\phi }(\hat{\xi }_{k-1},k)\Vert ^2, \end{aligned}$$
(34)
$$\begin{aligned}&\varvec{X}(l) = [x_k,x_{k-1},\ldots ,x_{k-l+1}]^{\tiny \mathrm{T}}, \end{aligned}$$
(35)
$$\begin{aligned}&\varvec{\varPhi }(l,\hat{\xi }_{k-1}) = [\varvec{\phi }(\hat{\xi }_{k-1},k), \varvec{\phi }(\hat{\xi }_{k-1},k-1),\nonumber \\&\quad \ldots ,\varvec{\phi }(\hat{\xi }_{k-1},k-l+1)], \end{aligned}$$
(36)
$$\begin{aligned}&\varvec{\phi }(\hat{\xi }_{k-1},k) = [\varvec{X}^{\tiny \mathrm{T}}_k, \mathrm{e}^{-\hat{\xi }_{k-1} x^2_{k-1}}\varvec{X}^{\tiny \mathrm{T}}_k]^{\tiny \mathrm{T}}, \end{aligned}$$
(37)
$$\begin{aligned}&\varvec{X}_k = [x_{k-1},x_{k-2},\ldots ,x_{k-n}]^{\tiny \mathrm{T}}, \end{aligned}$$
(38)
$$\begin{aligned}&\hat{\varvec{\varTheta }}_k = [\hat{\varvec{\alpha }}^{\tiny \mathrm{T}}_k,\hat{\varvec{\beta }}^{\tiny \mathrm{T}}_k]^{\tiny \mathrm{T}}, \end{aligned}$$
(39)
$$\begin{aligned}&\hat{\xi }_{k} = \hat{\xi }_{k-1}+\frac{1}{r_{2,k}}\hat{\varvec{\varTheta }}^{\tiny \mathrm{T}}_k \varvec{\varPhi }'(l,\hat{\xi }_{k-1})\varvec{E}_2(l), \end{aligned}$$
(40)
$$\begin{aligned}&\varvec{E}_2(l)= \varvec{X}(l)-\varvec{\varPhi }^{\tiny \mathrm{T}}(l,\hat{\xi }_{k-1})\hat{\varvec{\varTheta }}_k, \end{aligned}$$
(41)
$$\begin{aligned}&r_{2,k} = r_{2,k-1}+\Vert \hat{\varvec{\varTheta }}^{\tiny \mathrm{T}}_k\varvec{\phi }'(\hat{\xi }_{k-1},k)\Vert ^2, \end{aligned}$$
(42)
$$\begin{aligned}&\varvec{\phi }'(\hat{\xi }_{k-1},k) = [\mathbf{0}_n^{\tiny \mathrm{T}},-x^2_{k-1} \mathrm{e}^{-\hat{\xi }_{k-1} x^2_{k-1}}\varvec{X}^{\tiny \mathrm{T}}_k]^{\tiny \mathrm{T}}, \end{aligned}$$
(43)
$$\begin{aligned}&\varvec{\varPhi }'(l,\hat{\xi }_{k-1}) = [\varvec{\phi }'(\hat{\xi }_{k-1},k), \varvec{\phi }'(\hat{\xi }_{k-1},k-1),\nonumber \\&\quad \ldots ,\varvec{\phi }'(\hat{\xi }_{k-1},k-l+1)]. \end{aligned}$$
(44)

When \(l=1\), the H-MISG degenerates into the H-SG algorithm.

The H-MISG algorithm in (32)–(44) can be implemented by the following steps; a code sketch of the recursion is given after the list.

  1. Set the innovation length l and initialize: let \(k=1\), \(\hat{\varvec{\varTheta }}_0=[\hat{\varvec{\alpha }}^{\tiny \mathrm{T}}_0,\hat{\varvec{\beta }}^{\tiny \mathrm{T}}_0]^{\tiny \mathrm{T}} =\mathbf{1}_{2n}/p_0\), \(\hat{\xi }_0=1/p_0\), \(p_0=10^6\), \(r_{1,0}=1\) and \(r_{2,0}=1\), and give an error tolerance \(\eta >0\).

  2. Collect the measurement data \(x_k\), form the stacked information vector \(\varvec{X}(l)\) by (35), the information vectors \(\varvec{X}_k\) and \(\varvec{\phi }(\hat{\xi }_{k-1},k)\) by (38) and (37), and \(\varvec{\varPhi }(l,\hat{\xi }_{k-1})\) by (36).

  3. Compute the reciprocal of the step-size \(r_{1,k}\) by (34) and the innovation vector \(\varvec{E}_1(l)\) by (33).

  4. Update the parameter estimation vector \(\hat{\varvec{\varTheta }}_k\) by (32), and read out \(\hat{\varvec{\alpha }}_k\) and \(\hat{\varvec{\beta }}_k\) from (39).

  5. Form the derivative of \(\varvec{\phi }(\hat{\xi }_{k-1},k)\) by (43), and \(\varvec{\varPhi }'(l,\hat{\xi }_{k-1})\) by (44).

  6. Compute the reciprocal of the step-size \(r_{2,k}\) by (42) and the innovation vector \(\varvec{E}_2(l)\) by (41).

  7. Update the parameter estimate \(\hat{\xi }_{k}\) by (40).

  8. Compare \(\{\hat{\varvec{\varTheta }}_k,\hat{\xi }_{k}\}\) with \(\{\hat{\varvec{\varTheta }}_{k-1},\hat{\xi }_{k-1}\}\): if \(\Vert \hat{\varvec{\varTheta }}_k-\hat{\varvec{\varTheta }}_{k-1}\Vert +\Vert \hat{\xi }_{k}-\hat{\xi }_{k-1}\Vert >\eta \), increase k by 1 and return to Step 2; otherwise, stop this computational process.
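A corresponding sketch of the H-MISG recursion (32)–(44) follows, again reusing the hypothetical phi and dphi helpers; the optional arguments lam1 and lam2 anticipate the forgetting factors introduced in Remark 2 below, and lam1 = lam2 = 1 gives the plain H-MISG.

```python
import numpy as np

def hmisg(x, n, l, lam1=1.0, lam2=1.0, p0=1e6, eta=1e-8):
    """H-MISG (and, for forgetting factors below one, FF-H-MISG) estimation, Eqs. (32)-(46)."""
    Theta, xi = np.ones(2 * n) / p0, 1.0 / p0
    r1 = r2 = 1.0
    for k in range(n + l - 1, len(x)):
        Phi = np.column_stack([phi(x, k - j, xi, n) for j in range(l)])     # (36)
        dPhi = np.column_stack([dphi(x, k - j, xi, n) for j in range(l)])   # (44)
        Xl = np.array([x[k - j] for j in range(l)])                         # (35)
        r1 = lam1 * r1 + Phi[:, 0] @ Phi[:, 0]                              # (34) / (45)
        E1 = Xl - Phi.T @ Theta                                             # (33)
        Theta_new = Theta + Phi @ E1 / r1                                   # (32)
        g = Theta_new @ dPhi[:, 0]                                          # Theta_k^T phi'(xi_{k-1}, k)
        r2 = lam2 * r2 + g ** 2                                             # (42) / (46)
        E2 = Xl - Phi.T @ Theta_new                                         # (41)
        xi_new = xi + (Theta_new @ dPhi) @ E2 / r2                          # (40)
        done = np.linalg.norm(Theta_new - Theta) + abs(xi_new - xi) <= eta
        Theta, xi = Theta_new, xi_new
        if done:
            break
    return Theta, xi
```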

Remark 2

In order to obtain more accurate parameter estimates but not increase the computational cost of the H-MISG algorithm, we introduce the forgetting factors (FF) \(\lambda _1\) and \(\lambda _2\) into (34) and (42):

$$\begin{aligned} r_{1,k}= & {} \lambda _1r_{1,k-1}+\Vert \varvec{\phi }(\hat{\xi }_{k-1},k)\Vert ^2, \quad 0<\lambda _1\le 1, \end{aligned}$$
(45)
$$\begin{aligned} r_{2,k}= & {} \lambda _2r_{2,k-1}+\Vert \hat{\varvec{\varTheta }}^{\tiny \mathrm{T}}_k \varvec{\phi }'(\hat{\xi }_{k-1},k)\Vert ^2, \quad 0<\lambda _2\le 1.\nonumber \\ \end{aligned}$$
(46)

Replacing (34) and (42) in the H-MISG algorithm with (45) and (46), we obtain the variant of the H-MISG, i.e., the FF-H-MISG algorithm for the ExpAR model. When \(\lambda _1=1\) and \(\lambda _2=1\), the FF-H-MISG degenerates into the H-MISG algorithm.
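With the hypothetical hmisg sketch of the previous listing, this modification affects only the two step-size recursions, so the FF-H-MISG is obtained by passing forgetting factors smaller than one, for example (for a data array x and a known degree n):

```python
# FF-H-MISG: (45)-(46) replace (34) and (42); lam1 = lam2 = 1 recovers the plain H-MISG
Theta_hat, xi_hat = hmisg(x, n, l=5, lam1=0.91, lam2=1.00)
```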

Table 2 The H-SG estimates and errors (\(\sigma ^2=0.20^2\))
Table 3 The H-MISG estimates and errors (\(\sigma ^2=0.20^2\), \(l=5\))

Remark 3

Before using the proposed algorithms to identify the ExpAR model, we need to determine the degree n from the observed data by using order estimation methods, such as the orthogonalization procedure and the correlation analysis in [22].

At each recursion, the H-SG algorithm uses only the current measurement and innovation, whereas the H-MISG and FF-H-MISG algorithms use the current and the preceding \((l-1)\) measurements and innovations, which gives the latter a higher parameter estimation accuracy.

5 Example

Consider the following ExpAR time series

$$\begin{aligned} x_k= & {} \left( \alpha _1+\beta _1\mathrm{e}^{-\xi x^2_{k-1}}\right) x_{k-1} +\left( \alpha _2+\beta _2\mathrm{e}^{-\xi x^2_{k-1}}\right) x_{k-2}\\&+\cdots +\left( \alpha _n+\beta _n\mathrm{e}^{-\xi x^2_{k-1}}\right) x_{k-n}+\varepsilon _k \\= & {} \left( 1.25+2.00\mathrm{e}^{-2.30 x^2_{k-1}}\right) x_{k-1}\\&+\left( -0.28+1.85\mathrm{e}^{-2.30 x^2_{k-1}}\right) x_{k-2} +\varepsilon _k. \end{aligned}$$

The parameters to be identified are

$$\begin{aligned} \varvec{\varTheta }= & {} [\alpha _1,\alpha _2,\beta _1,\beta _2]^{\tiny \mathrm{T}}\\= & {} [1.25,-0.28,2.00,1.85]^{\tiny \mathrm{T}}, \quad \xi =2.30. \end{aligned}$$

In the simulation, the variance of the white noise \(\{\varepsilon _k\}\) is set to \(\sigma ^2\) and the measurement data length is taken as \(L_e=3000\). For simplicity, we define \(\varvec{\vartheta }:=[\varvec{\varTheta }^{\tiny \mathrm{T}},\xi ]^{\tiny \mathrm{T}}\).
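For illustration, this simulation setup can be reproduced with the hypothetical sketches from the previous sections (the noise realization, and hence the exact numbers in the tables, will of course differ):

```python
import numpy as np

alpha_true, beta_true, xi_true = np.array([1.25, -0.28]), np.array([2.00, 1.85]), 2.30
theta_true = np.concatenate([alpha_true, beta_true, [xi_true]])

x = simulate_expar(alpha_true, beta_true, xi_true, N=3000, sigma=0.20)   # L_e = 3000
Theta_hat, xi_hat = hmisg(x, n=2, l=5, lam1=0.91, lam2=1.00)             # FF-H-MISG

theta_hat = np.append(Theta_hat, xi_hat)
delta = np.linalg.norm(theta_hat - theta_true) / np.linalg.norm(theta_true) * 100   # error in %
```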

Table 4 The FF-H-MISG estimates and errors (\(\sigma ^2=0.20^2\), \(l=5\), \(\lambda _1=0.91\), \(\lambda _2=1.00\))
Fig. 2 The H-SG, H-MISG and FF-H-MISG estimation errors \(\delta \) versus k

Table 5 The FF-H-MISG estimates and errors (\(\sigma ^2=0.20^2\), \(\lambda _1=0.91\), \(\lambda _2=1.00\))

Taking \(\sigma ^2=0.20^2\) and using the H-SG algorithm, the H-MISG algorithm with \(l=5\) and the FF-H-MISG algorithm with \(l=5\), \(\lambda _1=0.91\) and \(\lambda _2=1.00\) to identify this ExpAR model, the parameter estimates and their errors are shown in Tables 2, 3 and 4, and the parameter estimation errors \(\delta :=\Vert \hat{\varvec{\vartheta }}_k-\varvec{\vartheta }\Vert /\Vert \varvec{\vartheta }\Vert \times 100\%\) versus k are shown in Fig. 2.

To illustrate the advantage of the proposed multi-innovation identification algorithms, we fix the noise variance \(\sigma ^2=0.20^2\), the forgetting factors \(\lambda _1=0.91\) and \(\lambda _2=1.00\), and adopt the FF-H-MISG algorithm to identify this ExpAR model with the innovation length \(l=5\), \(l=6\) and \(l=7\). The corresponding results are demonstrated in Table 5 and Fig. 3.

To demonstrate how the performance of the proposed FF-H-MISG algorithm depends on the forgetting factors, we fix the noise variance \(\sigma ^2=0.20^2\), the innovation length \(l=7\), the forgetting factor \(\lambda _2=1.00\), and adopt the FF-H-MISG algorithm to identify this ExpAR model with the forgetting factor \(\lambda _1=0.91\), \(\lambda _1=0.97\) and \(\lambda _1=0.99\). The corresponding results are exhibited in Table 6 and Fig. 4.

To show the influence of the noise level on the proposed FF-H-MISG algorithm, we fix the innovation length \(l=7\), the forgetting factors \(\lambda _1=0.91\), \(\lambda _2=1.00\), and adopt the FF-H-MISG algorithm to identify this ExpAR model with the noise variance \(\sigma ^2=0.20^2\), \(\sigma ^2=0.23^2\) and \(\sigma ^2=0.26^2\). The results are shown in Table 7 and Fig. 5.

Fig. 3 The FF-H-MISG estimation errors \(\delta \) versus k (\(\sigma ^2=0.20^2\), \(\lambda _1=0.91\), \(\lambda _2=1.00\))

From Tables 2, 3, 4, 5, 6 and 7 and Figs. 2, 3, 4 and 5, we draw the following conclusions.

  • The parameter estimation errors decrease as the data length k increases for all the algorithms proposed in this paper. The FF-H-MISG algorithm has the highest parameter estimation accuracy among these three algorithms—see Tables 2, 3, 4 and Fig. 2.

  • The parameter estimation accuracy of the FF-H-MISG algorithm improves as the innovation length l increases and as the forgetting factor \(\lambda _1\) decreases (see Tables 5, 6 and Figs. 3, 4).

  • The estimation errors of the FF-H-MISG algorithm decrease as the noise level decreases (see Table 7 and Fig. 5).

  • The proposed FF-H-MISG algorithm with appropriate innovation length and forgetting factors is effective for identifying the nonlinear ExpAR process (see Tables 5, 6 and Figs. 3, 4).

For model validation, we use \(L_r=200\) observations from \(k=L_e+1\) to \(k=L_e+L_r\) together with the model estimated by the FF-H-MISG algorithm with \(\lambda _1=0.91\), \(\lambda _2=1.00\) and \(l=7\). The predicted data \(\hat{x}_k\) and the measurement data \(x_k\) are plotted in Fig. 6. To evaluate the prediction performance, we define and compute the root mean square error (RMSE)

$$\begin{aligned} \mathrm{RMSE}:=\left[ \frac{1}{L_r}\sum \limits _{k=L_e+1}^{L_e+L_r} (\hat{x}_k-x_k)^2\right] ^{1/2}=0.19635. \end{aligned}$$

From Fig. 6, we can see that the predicted data are close to the measurement data, which means that the estimated model can capture the dynamics of this ExpAR process.
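A sketch of this validation step, assuming the simulated series x contains \(L_e+L_r\) samples and that Theta_hat and xi_hat are the FF-H-MISG estimates obtained from the first \(L_e\) samples, is:

```python
Le, Lr, n = 3000, 200, 2
# one-step-ahead predictions x_hat_k = phi(xi_hat, k)^T Theta_hat over the validation window
x_hat = np.array([phi(x, k, xi_hat, n) @ Theta_hat for k in range(Le, Le + Lr)])
rmse = np.sqrt(np.mean((x_hat - x[Le:Le + Lr]) ** 2))
```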

Table 6 The FF-H-MISG estimates and errors (\(\sigma ^2=0.20^2\), \(l=7\), \(\lambda _2=1.00\))
Fig. 4 The FF-H-MISG estimation errors \(\delta \) versus k (\(\sigma ^2=0.20^2\), \(l=7\), \(\lambda _2=1.00\))

Table 7 The FF-H-MISG estimates and errors (\(l=7\), \(\lambda _1=0.91\), \(\lambda _2=1.00\))
Fig. 5 The FF-H-MISG estimation errors \(\delta \) versus k (\(l=7\), \(\lambda _1=0.91\), \(\lambda _2=1.00\))

Fig. 6 The predicted data \(\hat{x}_k\) and the measurement data \(x_k\) for the FF-H-MISG algorithm

6 Conclusions

Applying the hierarchical identification principle and the multi-innovation identification theory, this paper derives an H-SG algorithm and an H-MISG algorithm for the ExpAR model. To further improve the estimation accuracy, two forgetting factors are introduced into the H-MISG, yielding a variant, the FF-H-MISG algorithm. The simulation results demonstrate that the FF-H-MISG algorithm with appropriate innovation length and forgetting factors is effective for identifying the ExpAR model. Combined with other methods [23], such as neural networks [24, 25] and kernel collocation [26, 27], the algorithms proposed in this paper can be exploited to study parameter identification of different systems and can be applied to other fields [28,29,30,31].