1 Introduction

Parameter estimation and mathematical models are essential for system identification [13, 31, 33], system optimization [16, 24] and state and data filtering [14, 19, 32]. Exploring new parameter estimation methods is a perennial theme of system identification [5, 6], and many identification methods have been developed for linear and nonlinear systems [1, 25, 38, 40], dual-rate sampled systems [9, 11, 36] and state-delay systems [28]. Iterative methods can be used for estimating parameters and solving matrix equations [4]. Iterative identification algorithms make full use of the measured data at each iteration and can therefore produce more accurate parameter estimates than recursive identification algorithms [29]. For decades, many iterative methods have been applied to parameter estimation, such as the Newton iterative method [7, 26, 41, 42], the gradient-based iterative methods [39] and the least squares-based iterative (LSI) method [17]. Jin et al. [23] studied LSI identification methods for multivariable integrating and unstable processes in closed loop; Wang et al. [37] derived several gradient-based iterative estimation algorithms for a class of nonlinear systems with colored noises by using the filtering technique.

The least squares identification method involves matrix inversion, and its computational complexity depends on the dimensions of the covariance matrices [18]. To reduce this complexity, the decomposition technique is usually employed to transform a large-scale system into several small-sized sub-systems, which are easier to identify. Chen et al. [2] developed a decomposition-based least squares identification algorithm for input nonlinear systems by adopting the key term separation technique; Zhang [43] proposed a decomposition-based LSI identification algorithm for output error moving average systems based on the hierarchical identification principle.

In the field of system identification, missing-data systems have received much attention. Dual-rate sampled systems and multirate (non-uniformly) sampled systems can be regarded as a class of systems with missing data [10]. In recent years, different identification methods for missing-data systems have been reported in the literature, e.g., the interval-varying auxiliary model-based recursive least squares method [8], the filtering-based multiple-model method [27] and the interval-varying auxiliary model-based multi-innovation stochastic gradient (V-AM-MISG) identification method [8, 12]. Recently, Jin et al. [22] extended the V-AM-MISG method to multivariable output error systems with scarce measurements by means of the interval-varying and multi-innovation methods in [8, 12]; Raghavan et al. [30] studied expectation maximization-based state-space model identification problems with irregular output sampling.

This paper applies the decomposition technique to the parameter identification problems of linear-in-parameters systems in order to improve computational efficiency. The key is to decompose the information vector into two sub-information vectors and the parameter vector into two sub-parameter vectors of smaller dimensions, and then to estimate the parameters of each sub-system separately. The main contributions are as follows.

  • A decomposition-based LSI (D-LSI) algorithm is developed for linear-in-parameters systems by employing the hierarchical identification principle.

  • An interval-varying D-LSI algorithm is derived for estimating the parameters of the systems with missing data.

  • The proposed algorithms have higher computational efficiency than the LSI algorithm and the interval-varying LSI algorithm.

This paper is organized as follows: Section 2 introduces the identification model of the linear-in-parameters systems. Section 3 gives an LSI algorithm for comparisons. A D-LSI algorithm for the linear-in-parameters systems is developed in Sect. 4. Section 5 describes the parameter estimation problem with missing data and proposes an interval-varying LSI algorithm. Section 6 derives an interval-varying D-LSI algorithm to reduce computational load. The effectiveness of the proposed algorithms is illustrated by two simulation examples in Sect. 7. Finally, Sect. 8 gives some conclusions.

2 System Description and Identification Model

Let us introduce some notation. “\(A=:X\)” or “\(X:=A\)” stands for “A is defined as X”; \(\hat{{\varvec{\vartheta }}}(t)\) denotes the estimate of \({\varvec{\vartheta }}\) at time t; the norm of a matrix (or a column vector) \(\varvec{X}\) is defined by \(\Vert \varvec{X}\Vert ^2:=\mathrm{tr}[\varvec{X}\varvec{X}^{\mathrm{T}}]\); \(\mathbf{1}_n\) stands for an n-dimensional column vector whose elements are all 1; the superscript T denotes the matrix transpose.

Consider the linear-in-parameters system which can be expressed as

$$\begin{aligned} A(z)y(t)=\frac{{\varvec{\phi }}^{\mathrm{T}}(t)}{F(z)}{\varvec{\theta }}+v(t), \end{aligned}$$
(1)

where \(y(t)\in {\mathbb {R}}\) is the measured output, \({\varvec{\phi }}(t)\in {\mathbb {R}}^m\) is the information vector consisting of the system input–output data, \({\varvec{\theta }}\in {\mathbb {R}}^m\) is the parameter vector to be estimated, \(v(t)\in {\mathbb {R}}\) is white noise with zero mean and variance \(\sigma ^2\), and A(z) and F(z), with known orders \(n_a\) and \(n_f\), are polynomials in the unit backward shift operator \(z^{-1}\) (with the property \(z^{-1}y(t)=y(t-1)\)), defined by

$$\begin{aligned} A(z):= & {} 1+a_1z^{-1}+a_2z^{-2}+\cdots +a_{n_a}z^{-n_a},\quad a_i\in {\mathbb {R}},\\ F(z):= & {} 1+f_1z^{-1}+f_2z^{-2}+\cdots +f_{n_f}z^{-n_f},\quad f_i\in {\mathbb {R}}. \end{aligned}$$

The objective of this paper is to use the decomposition technique to derive iterative methods for estimating the parameters \({\varvec{\theta }}\), \(a_i\) and \(f_i\) in (1) from observation data for reducing the computational load. Without loss of generality, assume that \({\varvec{\phi }}(t)=\mathbf{0}\), \(y(t)=0\) and \(v(t)=0\) for \(t\leqslant 0\).

Define the parameter vectors and the information vectors,

$$\begin{aligned} {\varvec{\vartheta }}:= & {} \left[ \varvec{a}^{\mathrm{T}}, \varvec{f}^{\mathrm{T}}, {\varvec{\theta }}^{\mathrm{T}}\right] ^{\mathrm{T}}\in {\mathbb {R}}^n,\quad n:=n_a+n_f+m, \\ \varvec{a}:= & {} \left[ a_1, a_2, \ldots , a_{n_a}\right] ^{\mathrm{T}}\in {\mathbb {R}}^{n_a},\\ \varvec{f}:= & {} \left[ f_1, f_2, \ldots , f_ {n_f}\right] ^{\mathrm{T}}\in {\mathbb {R}}^{n_f},\\ {\varvec{\varphi }}(t):= & {} \left[ {\varvec{\varphi }}_y^{\mathrm{T}}(t), {\varvec{\varphi }}_x^{\mathrm{T}}(t), {\varvec{\phi }}^{\mathrm{T}}(t)\right] ^{\mathrm{T}}\in {\mathbb {R}}^n,\\ {\varvec{\varphi }}_y(t):= & {} \left[ -y(t-1), -y(t-2), \ldots , -y(t-n_a)\right] ^{\mathrm{T}}\in {\mathbb {R}}^{n_a},\\ {\varvec{\varphi }}_x(t):= & {} \left[ -x(t-1), -x(t-2), \ldots , -x(t-n_f)\right] ^{\mathrm{T}}\in {\mathbb {R}}^{n_f}. \end{aligned}$$

Define the intermediate variable

$$\begin{aligned} x(t):= & {} \frac{{\varvec{\phi }}^{\mathrm{T}}(t){\varvec{\theta }}}{F(z)}\nonumber \\= & {} [1-F(z)]x(t)+{\varvec{\phi }}^{\mathrm{T}}(t){\varvec{\theta }}\nonumber \\= & {} {\varvec{\varphi }}_x^{\mathrm{T}}(t)\varvec{f}+{\varvec{\phi }}^{\mathrm{T}}(t){\varvec{\theta }}. \end{aligned}$$
(2)

Then, System (1) can be rewritten as

$$\begin{aligned} y(t)= & {} [1-A(z)]y(t)+x(t)+v(t)\nonumber \\= & {} {\varvec{\varphi }}_y^{\mathrm{T}}(t)\varvec{a}+{\varvec{\varphi }}_x^{\mathrm{T}}(t)\varvec{f}+{\varvec{\phi }}^{\mathrm{T}}(t){\varvec{\theta }}+v(t) \end{aligned}$$
(3)
$$\begin{aligned}= & {} {\varvec{\varphi }}^{\mathrm{T}}(t){\varvec{\vartheta }}+v(t). \end{aligned}$$
(4)

Equation (4) is the identification model of System (1), and its parameter vector \({\varvec{\vartheta }}\) contains all the parameters \({\varvec{\theta }}\), \(a_i\) and \(f_i\) of the system.

3 The Least Squares-Based Iterative Algorithm

In this section, we give a least squares-based iterative algorithm for comparisons.

Consider the newest p data from \(j=t-p+1\) to \(j=t\) (p represents the data length). According to the identification model in (4), define a quadratic function:

$$\begin{aligned} J({\varvec{\vartheta }}):= & {} \sum \limits _{j=0}^{p-1}\left[ y(t-j)-{\varvec{\varphi }}^{\mathrm{T}}(t-j){\varvec{\vartheta }}\right] ^2. \end{aligned}$$

Assume that the information vector \({\varvec{\varphi }}(t)\) is persistently exciting for large p. Minimizing the function \(J({\varvec{\vartheta }})\), we can obtain the least squares estimate of the parameter vector \({\varvec{\vartheta }}\):

$$\begin{aligned} \hat{{\varvec{\vartheta }}}(t)=\left[ \sum \limits _{j=0}^{p-1}{\varvec{\varphi }}(t-j){\varvec{\varphi }}^{\mathrm{T}}(t-j)\right] ^{-1}\sum \limits _{j=0}^{p-1}{\varvec{\varphi }}(t-j)y(t-j). \end{aligned}$$
(5)
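If the information vectors were fully known, the batch estimate in (5) would reduce to solving an ordinary normal equation. A minimal numpy sketch is given below; the array names Phi and y are illustrative and simply stack the regressors \({\varvec{\varphi }}^{\mathrm{T}}(t-j)\) and the outputs y(t-j) over the data window.

```python
import numpy as np

def batch_least_squares(Phi, y):
    """Least squares estimate in (5).

    Phi : (p, n) array whose rows are the regressors phi^T(t-j), j = 0, ..., p-1.
    y   : (p,) array of the corresponding outputs y(t-j).
    """
    # Solve the normal equations directly; no explicit matrix inverse is needed.
    return np.linalg.solve(Phi.T @ Phi, Phi.T @ y)
```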

Notice that the estimate \(\hat{{\varvec{\vartheta }}}(t)\) in (5) cannot be computed directly because the information vector \({\varvec{\varphi }}(t-j)\) contains the unknown terms \(x(t-i)\). The approach here is based on the hierarchical identification principle: let \(k=1,2,3, \ldots \) be an iteration index, let \(\hat{{\varvec{\vartheta }}}_k(t):=\left[ \begin{array}{c} \hat{\varvec{a}}_k(t) \\ \hat{\varvec{f}}_k(t) \\ \hat{{\varvec{\theta }}}_k(t) \end{array}\right] \in {\mathbb {R}}^n\) be the iterative estimate of \({\varvec{\vartheta }}\) at iteration k, and use the estimate \(\hat{x}_{k-1}(t-i)\) of \(x(t-i)\) to construct the estimate \(\hat{{\varvec{\varphi }}}_{x,k}(t)\) of \({\varvec{\varphi }}_x(t)\) at iteration k:

$$\begin{aligned} \hat{{\varvec{\varphi }}}_{x,k}(t):=[-\hat{x}_{k-1}(t-1),-\hat{x}_{k-1}(t-2),\ldots ,-\hat{x}_{k-1}(t-n_f)]^{\mathrm{T}}\in {\mathbb {R}}^{n_f}, \end{aligned}$$

and define the estimate of \({\varvec{\varphi }}(t)\):

$$\begin{aligned} \hat{{\varvec{\varphi }}}_k(t):=\left[ {\varvec{\varphi }}_y^{\mathrm{T}}(t), \hat{{\varvec{\varphi }}}_{x,k}^{\mathrm{T}}(t), {\varvec{\phi }}^{\mathrm{T}}(t)\right] ^{\mathrm{T}}\in {\mathbb {R}}^n. \end{aligned}$$

Replacing \({\varvec{\varphi }}_x(t)\), \({\varvec{\theta }}\) and \(\varvec{f}\) in (2) with \(\hat{{\varvec{\varphi }}}_{x,k}(t)\), \(\hat{{\varvec{\theta }}}_k(t)\) and \(\hat{\varvec{f}}_k(t)\), respectively, the estimate \(\hat{x}_k(t)\) of x(t) can be computed by

$$\begin{aligned} \hat{x}_k(t)=\hat{{\varvec{\varphi }}}_{x,k}^{\mathrm{T}}(t)\hat{\varvec{f}}_k(t)+{\varvec{\phi }}^{\mathrm{T}}(t)\hat{{\varvec{\theta }}}_k(t). \end{aligned}$$

Replacing \({\varvec{\varphi }}(t-j)\) in (5) with \(\hat{{\varvec{\varphi }}}_k(t-j)\), we can obtain the following least squares-based iterative (LSI) algorithm for estimating \({\varvec{\vartheta }}\):

$$\begin{aligned} \hat{{\varvec{\vartheta }}}_k(t)= & {} \hat{\varvec{S}}_k^{-1}(t)\sum \limits _{j=0}^{p-1}\hat{{\varvec{\varphi }}}_k(t-j)y(t-j), \end{aligned}$$
(6)
$$\begin{aligned} \hat{\varvec{S}}_k(t):= & {} \sum \limits _{j=0}^{p-1}\hat{{\varvec{\varphi }}}_k(t-j)\hat{{\varvec{\varphi }}}_k^{\mathrm{T}}(t-j),\end{aligned}$$
(7)
$$\begin{aligned} \hat{{\varvec{\varphi }}}_k(t)= & {} \left[ {\varvec{\varphi }}_y^{\mathrm{T}}(t), \hat{{\varvec{\varphi }}}_{x,k}^{\mathrm{T}}(t), {\varvec{\phi }}^{\mathrm{T}}(t)\right] ^{\mathrm{T}},\end{aligned}$$
(8)
$$\begin{aligned} {\varvec{\varphi }}_y(t)= & {} \left[ -y(t-1), -y(t-2), \ldots , -y(t-n_a)\right] ^{\mathrm{T}},\end{aligned}$$
(9)
$$\begin{aligned} \hat{{\varvec{\varphi }}}_{x,k}(t)= & {} \left[ -\hat{x}_{k-1}(t-1),-\hat{x}_{k-1}(t-2),\ldots ,-\hat{x}_{k-1}(t-n_f)\right] ^{\mathrm{T}},\end{aligned}$$
(10)
$$\begin{aligned} \hat{x}_k(t)= & {} \hat{{\varvec{\varphi }}}_{x,k}^{\mathrm{T}}(t)\hat{\varvec{f}}_k(t)+{\varvec{\phi }}^{\mathrm{T}}(t)\hat{{\varvec{\theta }}}_k(t),\end{aligned}$$
(11)
$$\begin{aligned} \hat{{\varvec{\vartheta }}}_k(t)= & {} \left[ \begin{array}{c} \hat{\varvec{a}}_k(t) \\ \hat{\varvec{f}}_k(t) \\ \hat{{\varvec{\theta }}}_k(t) \end{array}\right] . \end{aligned}$$
(12)

The LSI parameter estimation algorithm makes full use of all the input–output data at each iteration, and thus the parameter estimation accuracy can be greatly improved.
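The following is a minimal numerical sketch of the LSI iteration (6)–(12) over one data window. It assumes that the window of outputs and external information vectors is stored in arrays (ordered from oldest to newest) and that samples preceding the window are taken as zero; these are conventions of this sketch rather than of the algorithm itself.

```python
import numpy as np

def lsi(y, phi, na, nf, kmax=20, p0=1e6):
    """Sketch of the LSI algorithm (6)-(12) over a window of p samples.

    y   : (p,) outputs, oldest first;  phi : (p, m) external information vectors.
    Returns the final iterative estimate of vartheta = [a; f; theta].
    """
    p, m = phi.shape
    n = na + nf + m
    xhat = np.full(p, 1.0 / p0)                  # initial estimates of x(.)
    vartheta = np.full(n, 1.0 / p0)
    for _ in range(kmax):
        Phi = np.zeros((p, n))
        for j in range(p):
            # rows [phi_y^T, phi_xhat^T, phi^T] as in (8)-(10)
            phy = [-y[j - i] if j - i >= 0 else 0.0 for i in range(1, na + 1)]
            phx = [-xhat[j - i] if j - i >= 0 else 0.0 for i in range(1, nf + 1)]
            Phi[j] = np.concatenate([phy, phx, phi[j]])
        vartheta = np.linalg.solve(Phi.T @ Phi, Phi.T @ y)        # (6)-(7)
        f_hat, th_hat = vartheta[na:na + nf], vartheta[na + nf:]
        xhat = Phi[:, na:na + nf] @ f_hat + phi @ th_hat          # (11)
    return vartheta
```

In practice, a numerically safer solver (e.g., a QR-based least squares routine) can replace the normal-equation solve without changing the estimate.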

4 The Decomposition-Based LSI Algorithm

The LSI algorithm can improve the parameter estimation accuracy, but its drawback is a heavy computational load for large-scale systems. By means of the hierarchical identification principle, the following derives a D-LSI algorithm to improve the computational efficiency.

The identification model in (3) includes the known information vectors \({\varvec{\varphi }}_y(t)\) and \({\varvec{\phi }}(t)\), and the unknown information vector \({\varvec{\varphi }}_x(t)\). Define a new information vector

$$\begin{aligned} {\varvec{\varphi }}_1(t):=\left[ {\varvec{\varphi }}_y^{\mathrm{T}}(t), {\varvec{\phi }}^{\mathrm{T}}(t)\right] ^{\mathrm{T}}\in {\mathbb {R}}^{n_a+m}, \end{aligned}$$
(13)

and the corresponding parameter vector

$$\begin{aligned} {\varvec{\theta }}_1:=\left[ \varvec{a}^{\mathrm{T}}, {\varvec{\theta }}^{\mathrm{T}}\right] ^{\mathrm{T}}\in {\mathbb {R}}^{n_a+m}. \end{aligned}$$

Based on the hierarchical identification principle [3], by defining two intermediate variables

$$\begin{aligned} y_1(t):= & {} y(t)-{\varvec{\varphi }}_x^{\mathrm{T}}(t)\varvec{f}, \end{aligned}$$
(14)
$$\begin{aligned} y_2(t):= & {} y(t)-{\varvec{\varphi }}_1^{\mathrm{T}}(t){\varvec{\theta }}_1, \end{aligned}$$
(15)

we can decompose the identification model in (3) into the following two fictitious sub-models:

$$\begin{aligned} y_1(t)= & {} {\varvec{\varphi }}_1^{\mathrm{T}}(t){\varvec{\theta }}_1+v(t),\end{aligned}$$
(16)
$$\begin{aligned} y_2(t)= & {} {\varvec{\varphi }}_x^{\mathrm{T}}(t)\varvec{f}+v(t). \end{aligned}$$
(17)

The parameter vectors \({\varvec{\theta }}_1=\left[ \begin{array}{c} \varvec{a} \\ {\varvec{\theta }} \end{array} \right] \) and \(\varvec{f}\) to be identified are included in the two sub-models, respectively.

According to Eqs. (16) and (17), minimizing the quadratic functions

$$\begin{aligned} J_1({\varvec{\theta }}_1):= & {} \sum \limits _{j=0}^{p-1} \left[ y_1(t-j)-{\varvec{\varphi }}_1^{\mathrm{T}}(t-j){\varvec{\theta }}_1\right] ^2,\\ J_2(\varvec{f}):= & {} \sum \limits _{j=0}^{p-1}\left[ y_2(t-j)-{\varvec{\varphi }}_x^{\mathrm{T}}(t-j)\varvec{f}\right] ^2, \end{aligned}$$

we can obtain the following least squares estimates of the parameter vectors \({\varvec{\theta }}_1\) and \(\varvec{f}\):

$$\begin{aligned} \hat{{\varvec{\theta }}}_1(t)= & {} \left[ \sum _{j=0}^{p-1}{\varvec{\varphi }}_1(t-j){\varvec{\varphi }}_1^{\mathrm{T}}(t-j)\right] ^{-1}\sum \limits _{j=0}^{p-1}[{\varvec{\varphi }}_1(t-j)y_1(t-j)], \end{aligned}$$
(18)
$$\begin{aligned} \hat{\varvec{f}}(t)= & {} \left[ \sum _{j=0}^{p-1}{\varvec{\varphi }}_x(t-j){\varvec{\varphi }}_x^{\mathrm{T}}(t-j)\right] ^{-1}\sum \limits _{j=0}^{p-1}[{\varvec{\varphi }}_x(t-j)y_2(t-j)]. \end{aligned}$$
(19)

Here, we have used the assumption that the information vectors \({\varvec{\varphi }}_1(t)\) and \({\varvec{\varphi }}_x(t)\) are persistently exciting for large p. Substituting (14) and (15) into (18) and (19), respectively, we have

$$\begin{aligned} \hat{{\varvec{\theta }}}_1(t)= & {} \left[ \sum \limits _{j=0}^{p-1}{\varvec{\varphi }}_1(t-j){\varvec{\varphi }}_1^{\mathrm{T}}(t-j)\right] ^{-1}\sum _{j=0}^{p-1}{\varvec{\varphi }}_1(t-j)\left[ y(t-j)-{\varvec{\varphi }}_x^{\mathrm{T}}(t-j)\varvec{f}\right] , \nonumber \\\end{aligned}$$
(20)
$$\begin{aligned} \hat{\varvec{f}}(t)= & {} \left[ \sum \limits _{j=0}^{p-1}{\varvec{\varphi }}_x(t-j){\varvec{\varphi }}_x^{\mathrm{T}}(t-j)\right] ^{-1}\sum _{j=0}^{p-1}{\varvec{\varphi }}_x(t-j)\left[ y(t-j)-{\varvec{\varphi }}_1^{\mathrm{T}}(t-j){\varvec{\theta }}_1\right] .\nonumber \\ \end{aligned}$$
(21)

However, the information vector \({\varvec{\varphi }}_x(t)\) contains the unknown terms \(x(t-i)\), so the algorithm in (20) and (21) cannot be implemented directly. Again, we use the hierarchical identification principle to solve this problem: let \(\hat{{\varvec{\theta }}}_{1,k}(t):=[\hat{\varvec{a}}_k^{\mathrm{T}}(t),\hat{{\varvec{\theta }}}_k^{\mathrm{T}}(t)]^{\mathrm{T}}\in {\mathbb {R}}^{n_a+m}\) be the iterative estimate of \({\varvec{\theta }}_1\) at iteration k, and let \(\hat{{\varvec{\varphi }}}_{x,k}(t)\) be the estimate of \({\varvec{\varphi }}_x(t)\) obtained by replacing \(x(t-i)\) with its estimate \(\hat{x}_{k-1}(t-i)\) from iteration \(k-1\).

Replacing \({\varvec{\varphi }}_x(t)\), \(\varvec{f}\) and \({\varvec{\theta }}_1\) in (20) and (21) with their corresponding estimates \(\hat{{\varvec{\varphi }}}_{x,k}(t)\), \(\hat{\varvec{f}}_{k-1}(t)\) and \(\hat{{\varvec{\theta }}}_{1,k-1}(t)\), respectively, we can summarize the decomposition-based LSI (D-LSI) algorithm of the linear-in-parameters systems as

$$\begin{aligned} \hat{{\varvec{\theta }}}_{1,k}(t)= & {} \varvec{S}_1^{-1}(t)\sum _{j=0}^{p-1}{\varvec{\varphi }}_1(t-j)\left[ y(t-j)-\hat{{\varvec{\varphi }}}^{\mathrm{T}}_{x,k}(t-j)\hat{\varvec{f}}_{k-1}(t)\right] , \end{aligned}$$
(22)
$$\begin{aligned} \varvec{S}_1(t):= & {} \sum \limits _{j=0}^{p-1}{\varvec{\varphi }}_1(t-j){\varvec{\varphi }}_1^{\mathrm{T}}(t-j),\end{aligned}$$
(23)
$$\begin{aligned} \hat{\varvec{f}}_k(t)= & {} \hat{\varvec{S}}_{2,k}^{-1}(t)\sum _{j=0}^{p-1}\hat{{\varvec{\varphi }}}_{x,k}(t-j)\left[ y(t-j)-{\varvec{\varphi }}^{\mathrm{T}}_1(t-j)\hat{{\varvec{\theta }}}_{1,k-1}(t)\right] ,\end{aligned}$$
(24)
$$\begin{aligned} \hat{\varvec{S}}_{2,k}(t):= & {} \sum \limits _{j=0}^{p-1}\hat{{\varvec{\varphi }}}_{x,k}(t-j)\hat{{\varvec{\varphi }}}_{x,k}^{\mathrm{T}}(t-j),\end{aligned}$$
(25)
$$\begin{aligned} {\varvec{\varphi }}_1(t)= & {} \left[ {\varvec{\varphi }}_y^{\mathrm{T}}(t),{\varvec{\phi }}^{\mathrm{T}}(t)\right] ^{\mathrm{T}},\end{aligned}$$
(26)
$$\begin{aligned} {\varvec{\varphi }}_y(t)= & {} \left[ -y(t-1),-y(t-2),\ldots ,-y(t-n_a)\right] ^{\mathrm{T}},\end{aligned}$$
(27)
$$\begin{aligned} \hat{{\varvec{\varphi }}}_{x,k}(t)= & {} \left[ -\hat{x}_{k-1}(t-1),-\hat{x}_{k-1}(t-2),\ldots ,-\hat{x}_{k-1}(t-n_f)\right] ^{\mathrm{T}},\end{aligned}$$
(28)
$$\begin{aligned} \hat{x}_k(t)= & {} \hat{{\varvec{\varphi }}}_{x,k}^{\mathrm{T}}(t)\hat{\varvec{f}}_k(t)+{\varvec{\phi }}^{\mathrm{T}}(t)\hat{{\varvec{\theta }}}_k(t),\end{aligned}$$
(29)
$$\begin{aligned} \hat{{\varvec{\theta }}}_{1,k}(t)= & {} \left[ \begin{array}{c} \hat{\varvec{a}}_k(t) \\ \hat{{\varvec{\theta }}}_k(t) \end{array} \right] ,\end{aligned}$$
(30)
$$\begin{aligned} \hat{\varvec{a}}_k(t):= & {} \left[ \hat{a}_{1,k}(t), \hat{a}_{2,k}(t),\ldots ,\hat{a}_{n_a,k}(t)\right] ^{\mathrm{T}},\end{aligned}$$
(31)
$$\begin{aligned} \hat{\varvec{f}}_k(t):= & {} \left[ \hat{f}_{1,k}(t),\hat{f}_{2,k}(t),\ldots ,\hat{f}_{n_f,k}(t)\right] ^{\mathrm{T}}. \end{aligned}$$
(32)

In the D-LSI algorithm, the dimensions of the covariance matrices \(\varvec{S}_1^{-1}(t)\) and \(\hat{\varvec{S}}_{2,k}^{-1}(t)\) in (22) and (24) are \((n_a+m) \times (n_a+m)\) and \(n_f\times n_f\). In the LSI algorithm, the dimension of the covariance matrix \(\hat{\varvec{S}}_k^{-1}(t)\) in (6) is \((n_a+m+n_f) \times (n_a+m+n_f)\). Thus, the D-LSI algorithm requires less computational cost than the LSI algorithm.
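To make this comparison concrete, take the orders used in the simulation examples of Sect. 7, \(n_a=n_f=2\) and \(m=2\), and treat a dense \(d\times d\) inversion as an \(O(d^3)\) operation (a rough, illustrative count rather than an exact one). The LSI algorithm then inverts one \(6\times 6\) matrix per iteration, on the order of \(6^3=216\) operations, whereas the D-LSI algorithm inverts a \(4\times 4\) and a \(2\times 2\) matrix, on the order of \(4^3+2^3=72\) operations. Moreover, \(\varvec{S}_1(t)\) in (23) does not depend on the iteration index k, so its factorization can be computed once and reused across iterations at a given t.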

The steps involved in the D-LSI algorithm to compute the parameter estimation vectors \(\hat{{\varvec{\theta }}}_{1,k}(t)\) and \(\hat{\varvec{f}}_k(t)\) are listed in the following.

  1. 1.

    Set the data length p, let \(t=p\), collect the observation data {y(i), \({\varvec{\phi }}(i)\): \(i=0, 1, \ldots , p-1\)}, and set a small positive number \(\varepsilon \).

  2. 2.

    Collect the observation data y(t) and \({\varvec{\phi }}(t)\) and form \({\varvec{\varphi }}_y(t)\) using (27) and \({\varvec{\varphi }}_1(t)\) using (26).

  3. 3.

    Let \(k=1\), set the initial values \(\hat{{\varvec{\theta }}}_{1,0}(0)=\mathbf{1}_{n_a+m}/p_0\), \(\hat{\varvec{f}}_0(0)=\mathbf{1}_{n_f}/p_0\), \(\hat{x}_0(t-i)=1/p_0\) \((i=1,2,\ldots ,n_f)\), \(p_0=10^6\).

  4. 4.

    Form \(\hat{{\varvec{\varphi }}}_{x,k}(t)\) using (28), compute \(\varvec{S}_1(t)\) and \(\hat{\varvec{S}}_{2,k}(t)\) using (23) and (25).

  5. 5.

    Update the parameter estimation vectors \(\hat{{\varvec{\theta }}}_{1,k}(t)\) and \(\hat{\varvec{f}}_k(t)\) using (22) and (24), respectively.

  6. 6.

    Read \(\hat{{\varvec{\theta }}}_k(t)\) from \(\hat{{\varvec{\theta }}}_{1,k}(t)\) using (30), and compute \(\hat{x}_k(t)\) using (29).

  7. 7.

    Compare \(\hat{{\varvec{\theta }}}_{1,k}(t)\) with \(\hat{{\varvec{\theta }}}_{1,k-1}(t)\) and \(\hat{\varvec{f}}_k(t)\) with \(\hat{\varvec{f}}_{k-1}(t)\): if

    $$\begin{aligned} \Vert \hat{{\varvec{\theta }}}_{1,k}(t)-\hat{{\varvec{\theta }}}_{1,k-1}(t)\Vert +\Vert \hat{\varvec{f}}_k(t)-\hat{\varvec{f}}_{k-1}(t)\Vert \leqslant \varepsilon , \end{aligned}$$

    obtain k, \(\hat{{\varvec{\theta }}}_{1,k}(t)\) and \(\hat{\varvec{f}}_k(t)\), increase t by 1, and go to Step 2; otherwise, increase k by 1, and go to Step 4.
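As an illustration only, Steps 3–7 might be coded as follows for a single time instant; the array names, the zero-padding of samples preceding the window and the fixed iteration cap are assumptions of this sketch, not part of the algorithm statement.

```python
import numpy as np

def d_lsi_step(y, phi, na, nf, eps=1e-6, p0=1e6, kmax=100):
    """Sketch of Steps 3-7 of the D-LSI algorithm at one time instant.

    y : (p,) window of outputs, oldest first;  phi : (p, m) external regressors.
    Returns (theta1_hat, f_hat), i.e., the estimates of [a; theta] and f.
    """
    p, m = phi.shape
    # Known regressor matrix with rows [phi_y^T(t-j), phi^T(t-j)], see (26)-(27)
    Phi_y = np.array([[-y[j - i] if j - i >= 0 else 0.0
                       for i in range(1, na + 1)] for j in range(p)])
    Phi1 = np.hstack([Phi_y, phi])
    S1 = Phi1.T @ Phi1                                       # (23), fixed over k
    theta1 = np.full(na + m, 1.0 / p0)                       # Step 3 initial values
    f = np.full(nf, 1.0 / p0)
    xhat = np.full(p, 1.0 / p0)
    for _ in range(kmax):
        Phi_x = np.array([[-xhat[j - i] if j - i >= 0 else 0.0
                           for i in range(1, nf + 1)] for j in range(p)])   # (28)
        S2 = Phi_x.T @ Phi_x                                                # (25)
        theta1_new = np.linalg.solve(S1, Phi1.T @ (y - Phi_x @ f))          # (22)
        f_new = np.linalg.solve(S2, Phi_x.T @ (y - Phi1 @ theta1))          # (24)
        xhat = Phi_x @ f_new + phi @ theta1_new[na:]                        # (29)
        converged = (np.linalg.norm(theta1_new - theta1)
                     + np.linalg.norm(f_new - f) <= eps)                    # Step 7
        theta1, f = theta1_new, f_new
        if converged:
            break
    return theta1, f
```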

5 The Interval-Varying LSI Algorithm

This section derives an interval-varying LSI algorithm to solve the identification problems of systems with missing data.

In many applications, missing sampled data can arise for a variety of reasons. In general, a missing-data system is one in which most data are available and only a few data are missing over a period of time. In the following, we consider a system with missing data in which the inputs are available at every instant t (in practice, the input signals are usually generated by digital computers) and only a small number of output data are missing, as shown in Fig. 1 [8, 12], where “\(+\)” stands for missing or bad data (outliers or unbelievable data); e.g., the outputs y(3), y(8), y(9), y(23), \(\ldots \) are missing samples and y(15), \(\ldots \) are unbelievable samples.

Fig. 1 A missing output data pattern

For convenience, we define an integer sequence \(\{t_s, s=0,1,2,\ldots \}\) satisfying

$$\begin{aligned} 0=t_0<t_1<t_2<t_3<\cdots <t_{s-1}<t_s<\cdots \end{aligned}$$

with \(t^*_s:=t_s-t_{s-1}\geqslant 1\), such that y(t) and \({\varvec{\varphi }}_y(t)\) are available only when \(t=t_s\) \((s=0, 1, 2, \ldots )\), or equivalently, the data set \(\{y(t_s), {\varvec{\varphi }}_y(t_s): s=0,1,2,\ldots \}\) contains all available outputs. For instance, for the missing-data pattern in Fig. 1 with order \(n_a=3\), we define the integer sequence \(\{t_0\), \(t_1\), \(t_2\), \(\ldots \), \(t_9\), \(\ldots \}\) with \(t_0=0\), \(t_1=7\), \(t_2=13\), \(\ldots \), \(t_9=28\), \(\ldots \), i.e., {\(y(t_0), {\varvec{\varphi }}_y(t_0)\)}, {\(y(t_1), {\varvec{\varphi }}_y(t_1)\)}, {\(y(t_2), {\varvec{\varphi }}_y(t_2)\)}, \(\ldots \), {\(y(t_9), {\varvec{\varphi }}_y(t_9)\)}, \(\ldots \) are available.
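As an illustration of this bookkeeping, the sequence \(\{t_s\}\) can be extracted from a boolean availability mask of the outputs. The helper below is hypothetical and simply encodes the rule that y(t) and \({\varvec{\varphi }}_y(t)\) must both be available; samples with non-positive time indices are treated as known, consistent with the zero initial conditions assumed in Sect. 2.

```python
import numpy as np

def available_times(ok, na):
    """Return the instants t_s at which y(t) and phi_y(t) are both available.

    ok : boolean array; ok[t] is True when y(t) is measured and believable.
    na : order of A(z); phi_y(t) requires y(t-1), ..., y(t-na).
    """
    ts = []
    for t in range(len(ok)):
        if ok[t] and all(ok[t - i] for i in range(1, na + 1) if t - i >= 0):
            ts.append(t)
    return np.array(ts)
```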

Replacing t in (4) with \(t_s\) gives

$$\begin{aligned} y(t_s)={\varvec{\varphi }}^{\mathrm{T}}(t_s){\varvec{\vartheta }}+v(t_s) \end{aligned}$$
(33)

with

$$\begin{aligned} {\varvec{\varphi }}(t_s)= & {} \left[ {\varvec{\varphi }}_y^{\mathrm{T}}(t_s), {\varvec{\varphi }}_x^{\mathrm{T}}(t_s), {\varvec{\phi }}^{\mathrm{T}}(t_s)\right] ^{\mathrm{T}},\nonumber \\ {\varvec{\varphi }}_y(t_s)= & {} [-y(t_s-1), -y(t_s-2), \ldots , -y(t_s-n_a)]^{\mathrm{T}},\nonumber \\ {\varvec{\varphi }}_x(t_s)= & {} [-x(t_s-1), -x(t_s-2), \ldots , -x(t_s-n_f)]^{\mathrm{T}}. \end{aligned}$$
(34)

Consider p data from \(i=t_{s-p+1}\) to \(i=t_s\). Define the stacked output vector \(\varvec{Y}(t_s)\) and the stacked information matrix \({\varvec{\varPsi }}(t_s)\) as

$$\begin{aligned} \varvec{Y}(t_s):= & {} \left[ \begin{array}{c} y(t_s) \\ y(t_{s-1}) \\ \vdots \\ y(t_{s-p+1}) \end{array}\right] \in {\mathbb {R}}^{p},\quad {\varvec{\varPsi }}(t_s):=\left[ \begin{array}{c} {\varvec{\varphi }}^{\mathrm{T}}(t_s) \\ {\varvec{\varphi }}^{\mathrm{T}}(t_{s-1}) \\ \vdots \\ {\varvec{\varphi }}^{\mathrm{T}}(t_{s-p+1}) \end{array}\right] \in {\mathbb {R}}^{p\times n}. \end{aligned}$$

Assume that the information vector \({\varvec{\varphi }}(t_s)\) is persistently exciting for large p, that is, \([{\varvec{\varPsi }}^{\mathrm{T}}(t_s){\varvec{\varPsi }}(t_s)]\) is non-singular. The difficulty is that the information vector \({\varvec{\varphi }}_x(t_s)\) in \({\varvec{\varPsi }}(t_s)\) contains the unknown variables \(x(t_s-i)\). Replacing \(x(t_s-i)\) in (34) with the estimates \(\hat{x}_{k-1}(t_s-i)\) at iteration \(k-1\), and minimizing the quadratic function

$$\begin{aligned} J({\varvec{\vartheta }}):= & {} \Vert \varvec{Y}(t_s)-{\varvec{\varPsi }}(t_s){\varvec{\vartheta }}\Vert ^2, \end{aligned}$$

we can obtain the following interval-varying least squares-based iterative (V-LSI) algorithm for estimating the parameter vector \({\varvec{\vartheta }}\):

$$\begin{aligned} \hat{{\varvec{\vartheta }}}_k(t_s)= & {} \left[ \hat{{\varvec{\varPsi }}}_k^{\mathrm{T}}(t_s)\hat{{\varvec{\varPsi }}}_k(t_s)\right] ^{-1}\hat{{\varvec{\varPsi }}}_k^{\mathrm{T}}(t_s)\varvec{Y}(t_s), \end{aligned}$$
(35)
$$\begin{aligned} \hat{{\varvec{\vartheta }}}_k(t)= & {} \hat{{\varvec{\vartheta }}}_k(t_s),\ t\in T_s:=\{t_s, t_s+1, \ldots , t_{s+1}-1\},\end{aligned}$$
(36)
$$\begin{aligned} \varvec{Y}(t_s)= & {} \left[ y(t_s), y(t_{s-1}), \ldots , y(t_{s-p+1})\right] ^{\mathrm{T}},\end{aligned}$$
(37)
$$\begin{aligned} \hat{{\varvec{\varPsi }}}_k(t_s)= & {} \left[ \hat{{\varvec{\varphi }}}_k(t_s), \hat{{\varvec{\varphi }}}_k(t_{s-1}), \ldots , \hat{{\varvec{\varphi }}}_k(t_{s-p+1})\right] ^{\mathrm{T}},\end{aligned}$$
(38)
$$\begin{aligned} \hat{{\varvec{\varphi }}}_k(t_s)= & {} \left[ {\varvec{\varphi }}_y^{\mathrm{T}}(t_s), \hat{{\varvec{\varphi }}}_{x,k}^{\mathrm{T}}(t_s), {\varvec{\phi }}^{\mathrm{T}}(t_s)\right] ^{\mathrm{T}},\end{aligned}$$
(39)
$$\begin{aligned} {\varvec{\varphi }}_y(t_s)= & {} \left[ -y(t_s-1),-y(t_s-2),\ldots ,-y(t_s-n_a)\right] ^{\mathrm{T}},\end{aligned}$$
(40)
$$\begin{aligned} \hat{{\varvec{\varphi }}}_{x,k}(t_s)= & {} \left[ -\hat{x}_{k-1}(t_s-1),-\hat{x}_{k-1}(t_s-2),\ldots ,-\hat{x}_{k-1}(t_s-n_f)\right] ^{\mathrm{T}},\end{aligned}$$
(41)
$$\begin{aligned} \hat{{\varvec{\vartheta }}}_k(t_s)= & {} \left[ \hat{\varvec{a}}_k^{\mathrm{T}}(t_s), \hat{\varvec{f}}_k^{\mathrm{T}}(t_s), \hat{{\varvec{\theta }}}_k^{\mathrm{T}}(t_s)\right] ^{\mathrm{T}},\end{aligned}$$
(42)
$$\begin{aligned} \hat{x}_k(j)= & {} \hat{{\varvec{\varphi }}}_{x,k}^{\mathrm{T}}(j)\hat{\varvec{f}}_k(t_s)+{\varvec{\phi }}^{\mathrm{T}}(j)\hat{{\varvec{\theta }}}_k(t_s),\ j\in \left[ t_1,t_{s+1}\right] ,\nonumber \\&\hat{x}_k(i)=1/p_0, \quad i\leqslant t_1-1. \end{aligned}$$
(43)

The parameter estimate \(\hat{{\varvec{\vartheta }}}_k(t)\) is simply held unchanged over the interval \(\left[ t_s,t_{s+1}-1\right] \), as in (36).

6 The Interval-Varying D-LSI Algorithm

In the following, we study an interval-varying D-LSI algorithm based on the decomposition technique to reduce computational cost.

Replacing t in (14)–(17) with \(t_s\) gives

$$\begin{aligned} y_1(t_s)= & {} y(t_s)-{\varvec{\varphi }}_x^{\mathrm{T}}(t_s)\varvec{f}\\= & {} {\varvec{\varphi }}_1^{\mathrm{T}}(t_s){\varvec{\theta }}_1+v(t_s),\\ y_2(t_s)= & {} y(t_s)-{\varvec{\varphi }}_1^{\mathrm{T}}(t_s){\varvec{\theta }}_1\\= & {} {\varvec{\varphi }}_x^{\mathrm{T}}(t_s)\varvec{f}+v(t_s). \end{aligned}$$

Define the stacked output vectors \(\varvec{Y}(t_s)\), \(\varvec{Y}_1(t_s)\) and \(\varvec{Y}_2(t_s)\) and the stacked information matrices \({\varvec{\varPsi }}_1(t_s)\) and \({\varvec{\varPsi }}_x(t_s)\) as

$$\begin{aligned} \varvec{Y}(t_s):= & {} \left[ \begin{array}{c} y(t_s) \\ y(t_{s-1}) \\ \vdots \\ y(t_{s-p+1}) \end{array}\right] \in {\mathbb {R}}^{p},\quad \varvec{Y}_1(t_s):=\left[ \begin{array}{c} y_1(t_s) \\ y_1(t_{s-1}) \\ \vdots \\ y_1(t_{s-p+1}) \end{array}\right] \in {\mathbb {R}}^{p},\\ \varvec{Y}_2(t_s):= & {} \left[ \begin{array}{c} y_2(t_s) \\ y_2(t_{s-1}) \\ \vdots \\ y_2(t_{s-p+1}) \end{array}\right] \in {\mathbb {R}}^{p}, {\varvec{\varPsi }}_1(t_s):=\left[ \begin{array}{c} {\varvec{\varphi }}_1^{\mathrm{T}}(t_s) \\ {\varvec{\varphi }}_1^{\mathrm{T}}(t_{s-1}) \\ \vdots \\ {\varvec{\varphi }}_1^{\mathrm{T}}(t_{s-p+1}) \end{array}\right] \in {\mathbb {R}}^{p\times (n_a+m)},\quad \\ {\varvec{\varPsi }}_x(t_s):= & {} \left[ \begin{array}{c} {\varvec{\varphi }}_x^{\mathrm{T}}(t_s) \\ {\varvec{\varphi }}_x^{\mathrm{T}}(t_{s-1}) \\ \vdots \\ {\varvec{\varphi }}_x^{\mathrm{T}}(t_{s-p+1}) \end{array}\right] \in {\mathbb {R}}^{p\times n_f}. \end{aligned}$$

Define two quadratic functions:

$$\begin{aligned} J_1({\varvec{\theta }}_1):= & {} \Vert \varvec{Y}_1(t_s)-{\varvec{\varPsi }}_1(t_s){\varvec{\theta }}_1\Vert ^2,\\ J_2(\varvec{f}):= & {} \Vert \varvec{Y}_2(t_s)-{\varvec{\varPsi }}_x(t_s)\varvec{f}\Vert ^2. \end{aligned}$$

Assume that the information vectors \({\varvec{\varphi }}_1(t_s)\) and \({\varvec{\varphi }}_x(t_s)\) are persistently exciting for large p, that is, \([{\varvec{\varPsi }}_1^{\mathrm{T}}(t_s){\varvec{\varPsi }}_1(t_s)]\) and \([{\varvec{\varPsi }}_x^{\mathrm{T}}(t_s){\varvec{\varPsi }}_x(t_s)]\) are non-singular. Letting the partial derivatives of \(J_1({\varvec{\theta }}_1)\) and \(J_2(\varvec{f})\) with respect to \({\varvec{\theta }}_1\) and \(\varvec{f}\) be zero leads to the following least squares estimates of the parameter vectors \({\varvec{\theta }}_1\) and \(\varvec{f}\):

$$\begin{aligned} \hat{{\varvec{\theta }}}_1(t_s)= & {} \left[ {\varvec{\varPsi }}_1^{\mathrm{T}}(t_s){\varvec{\varPsi }}_1(t_s)\right] ^{-1}{\varvec{\varPsi }}_1^{\mathrm{T}}(t_s)\varvec{Y}_1(t_s)\nonumber \\= & {} \left[ {\varvec{\varPsi }}_1^{\mathrm{T}}(t_s){\varvec{\varPsi }}_1(t_s)\right] ^{-1}{\varvec{\varPsi }}_1^{\mathrm{T}}(t_s)[\varvec{Y}(t_s)-{\varvec{\varPsi }}_x(t_s)\varvec{f}], \end{aligned}$$
(44)
$$\begin{aligned} \hat{\varvec{f}}(t_s)= & {} \left[ {\varvec{\varPsi }}_x^{\mathrm{T}}(t_s){\varvec{\varPsi }}_x(t_s)\right] ^{-1}{\varvec{\varPsi }}_x^{\mathrm{T}}(t_s)\varvec{Y}_2(t_s)\nonumber \\= & {} \left[ {\varvec{\varPsi }}_x^{\mathrm{T}}(t_s){\varvec{\varPsi }}_x(t_s)\right] ^{-1}{\varvec{\varPsi }}_x^{\mathrm{T}}(t_s)[\varvec{Y}(t_s)-{\varvec{\varPsi }}_1(t_s){\varvec{\theta }}_1]. \end{aligned}$$
(45)

However, the right-hand sides of Eqs. (44) and (45) contain the unknown parameter vectors \({\varvec{\theta }}_1\) and \(\varvec{f}\), and the information vector \({\varvec{\varphi }}_x(t_s)\) in \({\varvec{\varPsi }}_x(t_s)\) contains the unknown terms \(x(t_s-i)\), so the estimates \(\hat{{\varvec{\theta }}}_1(t_s)\) and \(\hat{\varvec{f}}(t_s)\) cannot be computed directly. Here, we use the hierarchical identification principle to solve this problem: let

\(\hat{{\varvec{\theta }}}_{1,k}(t_s):=\left[ \begin{array}{c} \hat{\varvec{a}}_k(t_s) \\ \hat{{\varvec{\theta }}}_k(t_s) \end{array} \right] \) and \(\hat{\varvec{f}}_k(t_s)\) be the iterative estimates of \({\varvec{\theta }}_1=\left[ \begin{array}{c} \varvec{a} \\ {\varvec{\theta }} \end{array} \right] \) and \(\varvec{f}\) at iteration k, respectively, and \(\hat{x}_k(t_s-i)\) be the estimate of \(x(t_s-i)\). Define

$$\begin{aligned} \hat{{\varvec{\varphi }}}_{x,k}(t_s):= & {} \left[ -\hat{x}_{k-1}(t_s-1), -\hat{x}_{k-1}(t_s-2), \ldots , -\hat{x}_{k-1}(t_s-n_f)\right] ^{\mathrm{T}}\in {\mathbb {R}}^{n_f},\\ \hat{{\varvec{\varPsi }}}_{x,k}(t_s):= & {} \left[ \begin{array}{c} \hat{{\varvec{\varphi }}}_{x,k}^{\mathrm{T}}(t_s) \\ \hat{{\varvec{\varphi }}}_{x,k}^{\mathrm{T}}(t_{s-1}) \\ \vdots \\ \hat{{\varvec{\varphi }}}_{x,k}^{\mathrm{T}}(t_{s-p+1}) \end{array}\right] \in {\mathbb {R}}^{p\times n_f}. \end{aligned}$$

Replacing \({\varvec{\theta }}\), \(\varvec{f}\) and \({\varvec{\varphi }}_x(j)\) in (2) with \(\hat{{\varvec{\theta }}}_k(t_s)\), \(\hat{\varvec{f}}_k(t_s)\) and \(\hat{{\varvec{\varphi }}}_{x,k}(j)\), respectively, the estimate \(\hat{x}_k(j)\) of x(j) can be computed by

$$\begin{aligned} \hat{x}_k(j)=\hat{{\varvec{\varphi }}}^{\mathrm{T}}_{x,k}(j)\hat{\varvec{f}}_k(t_s)+{\varvec{\phi }}^{\mathrm{T}}(j)\hat{{\varvec{\theta }}}_k(t_s). \end{aligned}$$

Replacing \({\varvec{\varPsi }}_x(t_s)\), \({\varvec{\theta }}_1\) and \(\varvec{f}\) in (44) and (45) with their corresponding estimates \(\hat{{\varvec{\varPsi }}}_{x,k}(t_s)\), \(\hat{{\varvec{\theta }}}_{1,k-1}(t_s)\) and \(\hat{\varvec{f}}_{k-1}(t_s)\), respectively, we can summarize the interval-varying D-LSI algorithm of computing \(\hat{{\varvec{\theta }}}_{1,k}(t_s)\) and \(\hat{\varvec{f}}_k(t_s)\) as

$$\begin{aligned} \hat{{\varvec{\theta }}}_{1,k}(t_s)= & {} \left[ {\varvec{\varPsi }}_1^{\mathrm{T}}(t_s){\varvec{\varPsi }}_1(t_s)\right] ^{-1}{\varvec{\varPsi }}_1^{\mathrm{T}}(t_s)\left[ \varvec{Y}(t_s)-\hat{{\varvec{\varPsi }}}_{x,k}(t_s)\hat{\varvec{f}}_{k-1}(t_s)\right] , \end{aligned}$$
(46)
$$\begin{aligned} \hat{{\varvec{\theta }}}_{1,k}(t)= & {} \hat{{\varvec{\theta }}}_{1,k}(t_s),\ t\in T_s:=\{t_s, t_s+1, \ldots , t_{s+1}-1\},\end{aligned}$$
(47)
$$\begin{aligned} \hat{\varvec{f}}_k(t_s)= & {} \left[ \hat{{\varvec{\varPsi }}}_{x,k}^{\mathrm{T}}(t_s)\hat{{\varvec{\varPsi }}}_{x,k}(t_s)\right] ^{-1}\hat{{\varvec{\varPsi }}}_{x,k}^{\mathrm{T}}(t_s)\left[ \varvec{Y}(t_s)-{\varvec{\varPsi }}_1(t_s)\hat{{\varvec{\theta }}}_{1,k-1}(t_s)\right] ,\end{aligned}$$
(48)
$$\begin{aligned} \hat{\varvec{f}}_k(t)= & {} \hat{\varvec{f}}_k(t_s),\end{aligned}$$
(49)
$$\begin{aligned} \varvec{Y}(t_s)= & {} \left[ y(t_s), y(t_{s-1}), \ldots , y(t_{s-p+1})\right] ^{\mathrm{T}},\end{aligned}$$
(50)
$$\begin{aligned} {\varvec{\varPsi }}_1(t_s)= & {} \left[ {\varvec{\varphi }}_1(t_s), {\varvec{\varphi }}_1(t_{s-1}), \ldots , {\varvec{\varphi }}_1(t_{s-p+1})\right] ^{\mathrm{T}},\end{aligned}$$
(51)
$$\begin{aligned} \hat{{\varvec{\varPsi }}}_{x,k}(t_s)= & {} \left[ \hat{{\varvec{\varphi }}}_{x,k}(t_s), \hat{{\varvec{\varphi }}}_{x,k}(t_{s-1}), \ldots , \hat{{\varvec{\varphi }}}_{x,k}(t_{s-p+1})\right] ^{\mathrm{T}},\end{aligned}$$
(52)
$$\begin{aligned} {\varvec{\varphi }}_1(t_s)= & {} \left[ \begin{array}{c} {\varvec{\varphi }}_y(t_s) \\ {\varvec{\phi }}(t_s) \end{array} \right] ,\end{aligned}$$
(53)
$$\begin{aligned} {\varvec{\varphi }}_y(t_s)= & {} \left[ -y(t_s-1),-y(t_s-2),\ldots ,-y(t_s-n_a)\right] ^{\mathrm{T}},\end{aligned}$$
(54)
$$\begin{aligned} \hat{{\varvec{\varphi }}}_{x,k}(t_s)= & {} \left[ -\hat{x}_{k-1}(t_s-1),-\hat{x}_{k-1}(t_s-2),\ldots ,-\hat{x}_{k-1}(t_s-n_f)\right] ^{\mathrm{T}},\end{aligned}$$
(55)
$$\begin{aligned} \hat{x}_k(j)= & {} \hat{{\varvec{\varphi }}}_{x,k}^{\mathrm{T}}(j)\hat{\varvec{f}}_k(t_s)+{\varvec{\phi }}^{\mathrm{T}}(j)\hat{{\varvec{\theta }}}_k(t_s),\ j\in \left[ t_1,t_{s+1}-1\right] ,\end{aligned}$$
(56)
$$\begin{aligned} \hat{{\varvec{\theta }}}_{1,k}(t_s)= & {} \left[ \begin{array}{c} \hat{\varvec{a}}_k(t_s) \\ \hat{{\varvec{\theta }}}_k(t_s) \end{array} \right] ,\end{aligned}$$
(57)
$$\begin{aligned} \hat{\varvec{a}}_k(t_s)= & {} \left[ \hat{a}_{1,k}(t_s), \hat{a}_{2,k}(t_s), \ldots , \hat{a}_{n_a,k}(t_s)\right] ^{\mathrm{T}},\end{aligned}$$
(58)
$$\begin{aligned} \hat{\varvec{f}}_k(t_s)= & {} \left[ \hat{f}_{1,k}(t_s), \hat{f}_{2,k}(t_s), \ldots , \hat{f}_{n_f,k}(t_s)\right] ^{\mathrm{T}}. \end{aligned}$$
(59)

To initialize the algorithm, we take \(\hat{{\varvec{\theta }}}_{1,0}(t_0)\) and \(\hat{\varvec{f}}_0(t_0)\) as real vectors with small positive entries, e.g., \(\hat{{\varvec{\theta }}}_{1,0}(t_0)=\mathbf{1}_{n_a+m}/p_0\), \(\hat{\varvec{f}}_0(t_0)=\mathbf{1}_{n_f}/p_0\), \(\hat{x}_0(i)=1/p_0\) \((i\leqslant t_1-1)\), \(p_0=10^6\). The parameter estimates \(\hat{{\varvec{\theta }}}_{1,k}(t)\) and \(\hat{\varvec{f}}_k(t)\) in (47) and (49) remain unchanged over the interval \(\left[ t_s,t_{s+1}-1\right] \).
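A stacked-matrix sketch of one iteration of the updates (46) and (48) at an available instant \(t_s\) is given below; the assembly of the stacked matrices from the stored data and the variable names are assumptions of this sketch.

```python
import numpy as np

def vd_lsi_update(Y, Psi1, Psi_x_hat, theta1_prev, f_prev):
    """One iteration of the interval-varying D-LSI updates (46) and (48).

    Y          : (p,) stacked outputs y(t_s), ..., y(t_{s-p+1}),      see (50).
    Psi1       : (p, na+m) stacked known regressors phi_1^T,          see (51).
    Psi_x_hat  : (p, nf) stacked estimated regressors phi_xhat^T,     see (52).
    theta1_prev, f_prev : the estimates from iteration k-1.
    """
    theta1 = np.linalg.solve(Psi1.T @ Psi1,
                             Psi1.T @ (Y - Psi_x_hat @ f_prev))       # (46)
    f = np.linalg.solve(Psi_x_hat.T @ Psi_x_hat,
                        Psi_x_hat.T @ (Y - Psi1 @ theta1_prev))       # (48)
    return theta1, f
```

Between two available instants, the returned estimates are simply held constant, as prescribed by (47) and (49).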

The interval-varying D-LSI algorithm (abbreviated as the V-D-LSI algorithm) uses the data over a finite window of length p; thus, it can track time-varying parameters and can be used for online identification. The interval-varying identification algorithms are proposed for missing-data systems, but they can also be extended to systems with scarce measurements.

7 Examples

Example 1

Consider the following linear-in-parameters system:

$$\begin{aligned} A(z)y(t)= & {} \frac{{\varvec{\phi }}^{\mathrm{T}}(t)}{F(z)}{\varvec{\theta }}+v(t),\\ A(z)= & {} 1+a_1z^{-1}+a_2z^{-2}=1+0.27z^{-1}+0.75z^{-2},\\ F(z)= & {} 1+f_1z^{-1}+f_2z^{-2}=1-0.31z^{-1}+0.44z^{-2},\\ {\varvec{\theta }}= & {} [b_1, b_2]^{\mathrm{T}}=[-0.56, 0.91]^{\mathrm{T}},\\ {\varvec{\vartheta }}= & {} [a_1, a_2, f_1, f_2, b_1, b_2]^{\mathrm{T}}=[0.27, 0.75, -0.31, 0.44, -0.56, 0.91]^{\mathrm{T}}. \end{aligned}$$

In simulation, \(\{{\varvec{\phi }}(t)\}\) is taken as an uncorrelated persistently exciting vector sequence with zero mean and unit variance, and \(\{v(t)\}\) as a white noise sequence with zero mean and variances \(\sigma ^2=0.10^2\) and \(\sigma ^2= 0.50^2\), respectively. Take the data length \(t=p=L_e=3000\) and apply the LSI algorithm and the D-LSI algorithm to identify this example system; the parameter estimates and their errors \(\delta \) versus iteration k are given in Tables 1 and 2 and Figs. 2 and 3, where the parameter estimation error is defined as \(\delta :=\Vert \hat{{\varvec{\vartheta }}}_k-{\varvec{\vartheta }}\Vert /\Vert {\varvec{\vartheta }}\Vert \times 100\,\%\).
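For reproducibility, a data-generation sketch for this example under the stated simulation conditions is given below; it follows the recursions (2) and (3), and the random seed and function name are illustrative choices rather than part of the example specification.

```python
import numpy as np

def simulate_example1(L=3000, sigma=0.10, seed=0):
    """Generate {phi(t), y(t)} for Example 1 via the recursions (2) and (3)."""
    rng = np.random.default_rng(seed)
    a = np.array([0.27, 0.75])              # A(z) coefficients
    f = np.array([-0.31, 0.44])             # F(z) coefficients
    theta = np.array([-0.56, 0.91])
    phi = rng.standard_normal((L, 2))       # zero-mean, unit-variance regressors
    v = sigma * rng.standard_normal(L)      # white noise with variance sigma^2
    x = np.zeros(L)
    y = np.zeros(L)
    for t in range(L):
        x_prev = [x[t - i] if t - i >= 0 else 0.0 for i in (1, 2)]
        y_prev = [y[t - i] if t - i >= 0 else 0.0 for i in (1, 2)]
        x[t] = -np.dot(f, x_prev) + np.dot(phi[t], theta)   # (2)
        y[t] = -np.dot(a, y_prev) + x[t] + v[t]             # (3)
    return phi, y

# The estimation error reported in the tables is
# delta = ||vartheta_hat_k - vartheta|| / ||vartheta|| * 100 %.
```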

Table 1 The LSI parameter estimates and errors versus iteration k
Table 2 The D-LSI parameter estimates and errors versus iteration k
Fig. 2 The LSI estimation errors versus k with \(\sigma ^2=0.10^2\) and \(\sigma ^2=0.50^2\)

Fig. 3 The D-LSI estimation errors versus k with \(\sigma ^2=0.10^2\) and \(\sigma ^2=0.50^2\)

From Tables 1 and 2 and Figs. 2 and 3, we can draw the following conclusions.

  • The estimation errors given by the LSI algorithm and the D-LSI algorithm become smaller (in general) as iteration k increases or the noise variance \(\sigma ^2\) decreases—see Tables 1 and 2 and Figs. 2 and 3.

  • The parameter estimates given by the LSI algorithm and the D-LSI algorithm are very close to the true parameters for large k—see Tables 1 and 2.

Table 3 The V-LSI parameter estimates and errors versus iteration k
Table 4 The V-D-LSI parameter estimates and errors versus iteration k

Example 2

Consider the following linear-in-parameters system with missing data:

$$\begin{aligned} A(z)y(t)= & {} \frac{{\varvec{\phi }}^{\mathrm{T}}(t)}{F(z)}{\varvec{\theta }}+v(t),\\ A(z)= & {} 1+a_1z^{-1}+a_2z^{-2}=1-1.17z^{-1}+0.45z^{-2},\\ F(z)= & {} 1+f_1z^{-1}+f_2z^{-2}=1-0.35z^{-1}+0.52z^{-2},\\ {\varvec{\theta }}= & {} [b_1, b_2]^{\mathrm{T}}=[0.56, 0.93]^{\mathrm{T}},\\ {\varvec{\vartheta }}= & {} [a_1, a_2, f_1, f_2, b_1, b_2]^{\mathrm{T}}=[-1.17,0.45, -0.35,0.52, 0.56, 0.93]^{\mathrm{T}}. \end{aligned}$$

The simulation conditions are similar to those of Example 1; here, the noise variances are \(\sigma ^2=0.50^2\) and \(\sigma ^2=1.00^2\), respectively. Take \(s=p=L_e=3000\) and \(t^*_s=3\), and collect the input–output data \(\{{\varvec{\phi }}(t), y(t_s)\}\). Applying the V-LSI algorithm and the V-D-LSI algorithm to identify this example system, the parameter estimates and their estimation errors \(\delta :=\Vert \hat{{\varvec{\vartheta }}}_k(t_s)-{\varvec{\vartheta }}\Vert /\Vert {\varvec{\vartheta }}\Vert \times 100\,\%\) versus k are given in Tables 3 and 4 and Figs. 4 and 5.

Fig. 4 The V-LSI estimation errors versus k with \(\sigma ^2=0.50^2\) and \(\sigma ^2=1.00^2\)

Fig. 5 The V-D-LSI estimation errors versus k with \(\sigma ^2=0.50^2\) and \(\sigma ^2=1.00^2\)

From Tables 3 and 4 and Figs. 4 and 5, it is clear that as the iteration k increases, the parameter estimates given by the V-LSI algorithm and the V-D-LSI algorithm converge to their true values, and the estimation errors become smaller (generally); under the same data length and noise variance, the estimation accuracies of the V-LSI algorithm and the V-D-LSI algorithm are close.

8 Conclusions

A D-LSI algorithm and a V-D-LSI algorithm have been derived for identifying linear-in-parameters systems by means of the least squares search and the decomposition technique. The analysis shows that, under the same noise level and number of iterations, the D-LSI algorithm and the V-D-LSI algorithm give almost the same parameter estimation accuracy as their undecomposed counterparts, while the decomposition-based iterative algorithms require less computational cost than the LSI algorithm and the V-LSI algorithm. The simulation results indicate that the proposed algorithms can generate highly accurate parameter estimates. The identification idea can be extended to the parameter estimation problems of other linear systems and nonlinear systems with colored noises, missing-data systems and scarce-measurement systems [34, 35], hybrid networks and uncertain chaotic delayed systems [20, 21], and can be applied to other fields [15].