1 Introduction

System identification has been a focus of research in various areas in recent decades, for modeling both linear [1, 2] and nonlinear systems [3,4,5]. The main goal of system identification is to use input/output data to obtain a model whose behavior is as close as possible to that of the actual system, according to a predefined criterion. The resulting mathematical model can then be used for analysis or controller design [6,7,8,9].

Mathematical modeling is possible in several ways: analytical modeling (white-box), experimental modeling (black-box) and hybrid modeling (gray-box). In the white-box case, mathematical models relating the input and the output are formed from the nature of the components of the system and the physical laws governing them. In the black-box approach, there is no information about the internal components of the system, and a mathematical relationship between the input and the output is obtained using only input/output data. In the gray-box approach, the physical components of the system are known, but their values are not, and the unknown parameters are determined using input/output data. Several system identification methods have been proposed for linear and nonlinear systems [10, 11]. Since most industrial processes are complex and nonlinear, nonlinear system identification has attracted a lot of attention in the past two decades. However, nonlinear system identification is much more difficult than linear system identification [12]. In addition, due to the complexity of such systems, it is difficult to obtain physical models, since the exact system model is not always available. Therefore, data-driven methods, which rely on historical information about the system, have recently been developed to identify system behavior. These methods do not require complex mathematical tools and are very useful in practice [13, 14].

Bilinear systems are a class of nonlinear systems that are widely studied and used due to their simplicity, and they can be considered a suitable model for many physical systems [15,16,17]. In recent years, bilinear systems have drawn a significant amount of attention due to their intrinsic simplicity [18] and their wide range of applications, not only in engineering but also in biology, economics and chemistry [19]. Such systems can explain many physical phenomena and have been used in many fields, such as air conditioning control [20], the immune system, heart regulators, and the control of carbon dioxide in the lungs and of blood pressure [21,22,23]. Several methods have been proposed for bilinear system identification, such as least-squares methods based on minimizing the sum of squared errors [24, 25], gradient methods [26, 27], maximum likelihood methods [28], iterative methods [29, 30], recursive methods [31,32,33] and error prediction methods [34]. In addition, the interaction matrix approach has been utilized for identification and observer design of bilinear systems [35, 36]. In [35], optimal bilinear observers are designed for bilinear state-space models, and a new method for the identification of bilinear systems is introduced in [36]; there, by using the interaction matrix formulation, the bilinear system state is expressed in terms of input/output measurements. In iterative methods, to improve the estimation of the parameters, the algorithm uses all the data in every iteration together with the results of the previous iteration; such methods are applied once the input–output dataset has been collected. In recursive algorithms, only the current data are used to improve the previous estimate. In [37], recursive extended least squares and maximum likelihood methods have been used to identify the bilinear system parameters. Gibson et al. [28] have provided a maximum likelihood parameter estimation algorithm for bilinear system identification. Also, a Kalman filtering algorithm for parameter estimation of bilinear systems has been proposed in [38]. In [39], the state-space model of bilinear systems is converted to a transfer function by removing the state variables, and the recursive least squares algorithm and multi-innovation theory are used to increase the parameter estimation accuracy. In a multi-innovation identification algorithm, increasing the innovation length can increase the parameter estimation accuracy and reduce the algorithm's sensitivity to noise.

Li et al. [25] have presented a least-squares iterative algorithm to identify bilinear system parameters using the maximum likelihood principle; the maximum likelihood iterative least-squares algorithm provides a more accurate estimate of bilinear systems than the plain iterative least-squares algorithm. In [40], an iterative algorithm based on the hierarchical principle is proposed to alleviate the computational load; it improves the accuracy of parameter estimation while reducing the computational cost. In [41], to achieve better accuracy, an algorithm using the Kalman filter and multi-innovation theory has been proposed; both of its variants work well, and the Kalman filter-based multi-innovation recursive extended least squares algorithm has a higher parameter estimation accuracy than the Kalman filter-based recursive extended least squares algorithm. Using the hierarchical identification principle [40] and the data filtering method, a gradient iterative algorithm and a filtering-based gradient iterative algorithm have been presented in [26]; these algorithms provide very accurate estimates of bilinear systems. The two-step gradient-based iterative algorithm has a lower computational cost than the gradient-based iterative algorithm, and its convergence is faster than that of the other two methods in that paper; overall, the algorithms in [26] provide better parameter estimation accuracy than the methods presented in [27]. In [42], a state filtering-based hierarchical identification algorithm has been proposed. In [43], a multi-innovation-based stochastic gradient algorithm is presented using the decomposition method; the decomposition-based multi-innovation stochastic gradient algorithm has higher accuracy than the decomposition-based stochastic gradient algorithm. Also, based on the hierarchical principle and data filtering, a least squares iterative algorithm has been proposed for the identification of bilinear systems [44]. Ding et al. [45] have provided a stochastic gradient algorithm and a gradient iterative algorithm to estimate the parameters of bilinear systems using an auxiliary model. The auxiliary model-based gradient iterative algorithm uses all the input–output data measured up to each iteration, which makes it more suitable than the auxiliary model-based stochastic gradient algorithm, and it is effective for bilinear systems in a white noise environment. With the help of the subspace identification method, a data-driven design approach has been presented in [46]. In [47], a stochastic gradient algorithm using the hierarchical identification principle and the multi-innovation idea is developed. In addition, in [48], an extended stochastic gradient algorithm based on the data filtering method has been proposed.

Although the recursive least squares method is the most common estimation method among the various previously published methods and has a high convergence rate, it suffers from drawbacks such as a high computational burden. To overcome this problem, identification methods based on the hierarchical identification principle have been proposed, which divide the main system into multiple subsystems of smaller dimensions in order to estimate the unknown parameters. For example, in [25, 48], where the computational load of the identification is high, the hierarchical identification principle has been utilized to obtain parameter estimation methods with higher computational efficiency. It has been shown that the resulting algorithms have a smaller parameter estimation error than other published algorithms.

Motivated by the above concerns, a four-stage hierarchical identification approach is used in this paper to identify a bilinear system based on state-space equations. In this regard, a four-stage recursive least squares (4S-RLS) algorithm and a four-stage stochastic gradient (4S-SG) algorithm are proposed for the identification of bilinear systems. To improve the computational efficiency using the hierarchical identification principle, the identification model is decomposed into four subsystems, and the information vector is decomposed into four subvectors with smaller dimensions. In addition, an ARMA colored noise model is used in the presented model. Since only input/output data of the system are available to these algorithms, a state observer is used to estimate the system states, and the estimated states are then used in the identification algorithm. Finally, the proposed algorithms are simulated for bilinear system identification, and the convergence of the identified parameters is reported. The main contributions of this paper are listed as follows:

  • A four-stage recursive least squares algorithm and a four-stage stochastic gradient algorithm are proposed using the hierarchical identification principle to improve the computational efficiency. The hierarchical identification principle divides the main system into several subsystems with small dimensions; likewise, the information vector is broken down into several information subvectors.

  • A bilinear state observer is presented based on the Kalman filter algorithm for bilinear state-space estimation.

  • To show the high efficiency of the four-stage recursive least squares algorithm, a comparison of the computational efficiency of the recursive least squares and four-stage recursive least squares algorithms is provided.

The rest of this paper is organized as follows: In Sect. 2, the preliminary definitions, the problem statement and the bilinear state-space system are presented. In Sect. 3, a four-stage recursive least squares algorithm is described. Section 4 analyzes the computational efficiency of the 4S-RLS algorithm. Section 5 presents a four-stage stochastic gradient algorithm. A numerical example and a practical example are presented in Sect. 6 to show the effectiveness of the proposed algorithms. Finally, Sect. 7 concludes the paper.

2 Problem statement

In this section, first, a number of notations are explained. The superscript T represents the matrix transpose, \(\widehat{\rho }({t})\) is the estimate of the parameter \(\rho \) at time \(t\), \(I\) (\({I}_{n}\)) represents the \(n\times n\) identity matrix, and \(q\) is the unit shift operator:

$$ qz\left( t \right) = z\left( {t + 1} \right),\quad q^{ - 1} z\left( t \right) = z\left( {t - 1} \right) $$

Figure 1 shows the state-space representation of bilinear systems. According to this figure, the bilinear system state-space model is defined as follows:

Fig. 1 Bilinear state-space system

$${\varvec{z}}\left(t+1\right)=A{\varvec{z}}\left(t\right)+B{\varvec{z}}\left(t\right)\overline{u }\left(t\right)+f\overline{u }\left(t\right)$$
(1)
$$\overline{y }\left(t\right)=h{\varvec{z}}\left(t\right)+\omega (t)$$
(2)

where \({\varvec{z}}(t)=[{{z}_{1}(t),{z}_{2}(t),\cdots ,{z}_{n}(t)]}^{{T}}\in {\mathbb{R}}^{n}\) is the state vector, \(\overline{u }\left(t\right)\) is the system input, \(\overline{y }\left(t\right)\) is the system output, \(\omega \left(t\right)=\frac{D\left(q\right)}{C\left(q\right)}v(t)\) is colored noise and \(v\left(t\right)\in {\mathbb{R}}\) is zero-mean white noise. \(A\in {\mathbb{R}}^{n\times n}\), \(B\in {\mathbb{R}}^{n\times n}\), \(f\in {\mathbb{R}}^{n}\) and \(h\in {\mathbb{R}}^{1\times n}\) are the system matrices and vectors, with appropriate dimensions, as follows:

$$ \begin{aligned} A & = \left[ {\begin{array}{*{20}l} { - a_{1} } \hfill & 1 \hfill & 0 \hfill & \ldots \hfill & 0 \hfill \\ { - a_{2} } \hfill & 0 \hfill & 1 \hfill & \ddots \hfill & 0 \hfill \\ \vdots \hfill & \vdots \hfill & \ddots \hfill & \ddots \hfill & 0 \hfill \\ { - a_{n - 1} } \hfill & 0 \hfill & \cdots \hfill & 0 \hfill & 1 \hfill \\ { - a_{n} } \hfill & 0 \hfill & \ldots \hfill & 0 \hfill & 0 \hfill \\ \end{array} } \right] \in {\mathbb{R}}^{n \times n} \\ B & = \left[ {\begin{array}{*{20}c} {\begin{array}{*{20}c} {{\varvec{b}}_{{\varvec{1}}} } \\ {{\varvec{b}}_{{\varvec{2}}} } \\ \end{array} } \\ \vdots \\ {{\varvec{b}}_{{\varvec{n}}} } \\ \end{array} } \right] \in {\mathbb{R}}^{n \times n} \quad {\varvec{b}}_{{\varvec{i}}} \in {\mathbb{R}}^{1 \times n} \\ f & = \left[ {f_{1} ,f_{2} , \ldots ,f_{n} } \right]^{T} \in {\mathbb{R}}^{n} ,\quad h = \left[ {1,0, \ldots ,0} \right] \in {\mathbb{R}}^{1 \times n} \\ \end{aligned} $$

Using the shift operator, the polynomials \(D\left(q\right)\) and \(C\left(q\right)\) are defined as

$$ \begin{aligned} D\left( q \right) & = 1 + d_{1} q^{ - 1} + d_{2} q^{ - 2} + \cdots + d_{p} q^{ - p} \\ C\left( q \right) & = 1 + c_{1} q^{ - 1} + c_{2} q^{ - 2} + \cdots + c_{m} q^{ - m} \\ \end{aligned} $$
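To make the model concrete, the following sketch simulates (1) and (2) with the ARMA colored noise above. The system parameters are those of the numerical example in Sect. 6.1, while the input signal, noise seed and data length are illustrative assumptions.

```python
import numpy as np

# Minimal simulation sketch of the bilinear model (1)-(2) with colored
# measurement noise w(t) = D(q)/C(q) v(t) (parameter values from Sect. 6.1).
rng = np.random.default_rng(0)

n = 2
A = np.array([[-0.20, 1.0],
              [0.25, 0.0]])            # companion form, -a_i in column 1
B = np.array([[0.08, 0.17],
              [-0.12, -0.20]])
f = np.array([0.4, 2.0])
h = np.array([1.0, 0.0])
c = np.array([-0.3])                   # C(q) = 1 + c_1 q^{-1}
d = np.array([1.0])                    # D(q) = 1 + d_1 q^{-1}

L = 500                                # data length (assumption)
u = rng.standard_normal(L)             # persistently exciting input (assumption)
v = 0.10 * rng.standard_normal(L)      # zero-mean white noise, sigma = 0.10

z = np.zeros(n)
w = np.zeros(L)
y = np.zeros(L)
for t in range(L):
    # ARMA recursion: w(t) = -sum_i c_i w(t-i) + v(t) + sum_i d_i v(t-i)
    w[t] = v[t]
    for i, ci in enumerate(c, start=1):
        if t - i >= 0:
            w[t] -= ci * w[t - i]
    for i, di in enumerate(d, start=1):
        if t - i >= 0:
            w[t] += di * v[t - i]
    y[t] = h @ z + w[t]                          # output equation (2)
    z = A @ z + (B @ z) * u[t] + f * u[t]        # state equation (1)
```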

According to the method presented in [48] and also using Eqs. (1) and (2), it can be concluded that

$$ \begin{aligned} z_{1} \left( t \right) & = - \mathop \sum \limits_{i = 1}^{n} a_{i} z_{1} \left( {t - i} \right) + \mathop \sum \limits_{i = 1}^{n} {\varvec{b}}_{{\varvec{i}}} {\varvec{z}}\left( {t - i} \right)\overline{u}\left( {t - i} \right) \\ { } & \quad + \mathop \sum \limits_{i = 1}^{n} f_{i} \overline{u}\left( {t - i} \right) \\ \end{aligned} $$
(3)

The parameter vector \(\rho \) is defined as:

$$\rho ={\left[{\rho }_{s},{\rho }_{n}\right]}^{{T}}\in {\mathbb{R}}^{{n}^{2}+2n+m+p}$$

where

$$ \begin{aligned} \rho_{s} & = \left[ {\rho_{a}^{{{T}}} ,\rho_{b}^{{{T}}} ,\rho_{f}^{{{T}}} } \right]^{{{T}}} \in {\mathbb{R}}^{{n^{2} + 2n}} \\ \rho_{n} & = \left[ {c_{1} ,c_{2} , \ldots ,c_{m} ,d_{1} ,d_{2} , \ldots ,d_{p} } \right]^{{{T}}} \in {\mathbb{R}}^{m + p} \\ \end{aligned} $$

and

$$ \begin{aligned} \rho_{a} & = \left[ {a_{1} ,a_{2} { }, \ldots ,a_{n} } \right]^{{{T}}} \in {\mathbb{R}}^{n} \\ {\varvec{\rho}}_{{\varvec{b}}} & = \left[ {{\varvec{b}}_{1} ,{\varvec{b}}_{2} , \ldots ,{\varvec{b}}_{{\varvec{n}}} } \right]^{{\varvec{T}}} \in {\mathbb{R}}^{{{\varvec{n}}^{2} }} \\ \rho_{f} & = \left[ {f_{1} ,f_{2} , \ldots ,f_{n} } \right]^{{{T}}} \in {\mathbb{R}}^{n} \\ \end{aligned} $$

In addition, the information vector \(\varphi \left(t\right)\) is defined as follows:

$$ \begin{aligned} \varphi \left( t \right) & = \left[ {\varphi_{z}^{T} \left( t \right),\varphi_{{z\overline{u}}}^{T} \left( t \right),\varphi_{{\overline{u}}}^{T} \left( t \right),\varphi_{n}^{T} \left( t \right)} \right]^{{{T}}} \in {\mathbb{R}}^{{n_{0} }} , \\ n_{0} & : = n^{2} + 2n + m + p \\ \end{aligned} $$
$$ \begin{aligned} \varphi_{s} \left( t \right) & = \left[ {\varphi_{z}^{T} \left( t \right),\varphi_{{z\overline{u}}}^{T} \left( t \right),\varphi_{{\overline{u}}}^{T} \left( t \right)} \right]^{{{T}}} \in {\mathbb{R}}^{{n_{1} }} , \\ n_{1} & : = n^{2} + 2n \\ \end{aligned} $$
$$ \begin{aligned} \varphi_{z} \left( t \right) & = \left[ { - z_{1} \left( {t - 1} \right), - z_{1} \left( {t - 2} \right) , \ldots , - z_{1} \left( {t - n} \right)} \right]^{T} \in {\mathbb{R}}^{{n_{2} }} , \\ n_{2} & : = n \\ \end{aligned} $$
$$ \begin{aligned} \varphi_{{z\overline{u}}} \left( t \right) & = [{\varvec{z}}^{T} \left( {t - 1} \right)\overline{u}\left( {t - 1} \right),{\varvec{z}}^{T} \left( {t - 2} \right)\overline{u}\left( {t - 2} \right), \ldots ,{\varvec{z}}^{T} \left( {t - n} \right)\overline{u}\left( {t - n} \right)]^{T} \in {\mathbb{R}}^{{n_{3} }} , \\ n_{3} & : = n^{2} \\ \end{aligned} $$
$$ \begin{aligned} \varphi_{{\overline{u}}} \left( t \right) & = \left[ {\overline{u}\left( {t - 1} \right),\overline{u}\left( {t - 2} \right) , \ldots , \overline{u}\left( {t - n} \right)} \right]^{T} \in {\mathbb{R}}^{{n_{2} }} , \\ n_{2} & : = n \\ \end{aligned} $$

From (2), the colored noise equation can be written as

$$ \begin{aligned} \omega \left( t \right) & = \left[ {1 - C\left( q \right)} \right]\omega \left( t \right) + D\left( q \right)v\left( t \right) \\ & = - c_{1} \omega \left( {t - 1} \right) - c_{2} \omega \left( {t - 2} \right) - \cdots - c_{m} \omega \left( {t - m} \right) \\ & \quad + v\left( t \right) + d_{1} v\left( {t - 1} \right) + d_{2} v\left( {t - 2} \right) + \cdots \\ & \quad + d_{p} v\left( {t - p} \right) = \varphi_{n}^{T} \left( t \right)\rho_{n} + v\left( t \right) \\ \end{aligned} $$
(4)

where the information vector \({\varphi }_{n}\left(t\right)\) is defined as follows:

$$ \begin{aligned} \varphi_{n} \left( t \right) & = [ - \omega \left( {t - 1} \right), - \omega \left( {t - 2} \right), \ldots , - \omega \left( {t - m} \right), \\ & \quad v\left( {t - 1} \right),v\left( {t - 2} \right), \ldots ,v\left( {t - p} \right)]^{T} \in {\mathbb{R}}^{{n_{4} }} ,\quad n_{4 } : = m + p \\ \end{aligned} $$

Substituting (3) in (2) and according to the definition of information vectors, the bilinear system identification model in (1) and (2) can be expressed as:

$$ \begin{aligned} \overline{y}\left( t \right) & = \varphi_{z}^{T} \left( t \right)\rho_{a} + \varphi_{{z\overline{u}}}^{T} \left( t \right)\rho_{b} + \varphi_{{\overline{u}}}^{T} \left( t \right)\rho_{f} \\ & \quad + \varphi_{n}^{T} \left( t \right)\rho_{n} + v\left( t \right) = \varphi^{T} \left( t \right)\rho + v\left( t \right) \\ \end{aligned} $$
(5)
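For clarity, the sketch below assembles \(\varphi(t)\) in exactly this stacking order. It assumes the true states and noise sequences are available, which holds only in simulation; in Sect. 3.1 the observer estimates replace them. The function name and interface are illustrative.

```python
import numpy as np

def build_phi(z_hist, u_hist, w_hist, v_hist, t, n, m, p):
    """Information vector phi(t) of Eq. (5); requires t >= max(n, m, p).

    z_hist[k] is the state vector z(k); u_hist, w_hist, v_hist hold the
    input and the noise sequences (known here only for illustration).
    """
    phi_z = np.array([-z_hist[t - i][0] for i in range(1, n + 1)])   # -z1(t-i)
    phi_zu = np.concatenate([z_hist[t - i] * u_hist[t - i]
                             for i in range(1, n + 1)])              # z(t-i)u(t-i)
    phi_u = np.array([u_hist[t - i] for i in range(1, n + 1)])
    phi_n = np.array([-w_hist[t - i] for i in range(1, m + 1)]
                     + [v_hist[t - i] for i in range(1, p + 1)])
    return np.concatenate([phi_z, phi_zu, phi_u, phi_n])             # n^2+2n+m+p

# With rho stacked in the same order, y(t) = phi(t).T @ rho + v(t) is (5).
```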

2.1 Input–output model of bilinear system

For identification purposes, the input–output relationship of the bilinear state-space system is obtained by eliminating the state variables. Removing the state vector from Eqs. (1) and (2), the input–output relationship of the bilinear system is expressed as follows:

$$ \left[ {A\left( q \right) + u\left( {t - n} \right)B\left( q \right)} \right]y\left( t \right) = \left[ {C\left( q \right) + u\left( {t - n} \right)D\left( q \right)} \right]u\left( t \right) + v\left( t \right) $$
(6)
$$ \begin{aligned} A\left( q \right) & = 1 + a_{1} q^{ - 1} + a_{2} q^{ - 2} + \cdots + a_{{n_{a} }} q^{{ - n_{a} }} \\ B\left( q \right) & = b_{1} q^{ - 1} + b_{2} q^{ - 2} + \cdots + b_{{n_{b} }} q^{{ - n_{b} }} \\ C\left( q \right) & = c_{1} q^{ - 1} + c_{2} q^{ - 2} + \cdots + c_{m} q^{ - m} \\ D\left( q \right) & = d_{2} q^{ - 2} + d_{3} q^{ - 3} + \cdots + d_{w} q^{ - w} \\ \end{aligned} $$

The following steps have been carried out to obtain (6); note that the polynomials \(A(q)\), \(B(q)\), \(C(q)\) and \(D(q)\) of this input–output form are distinct from the noise polynomials of Sect. 2. First, using (1), one can write

$$\left\{\begin{array}{l}{z}_{1}\left(t+1\right)={z}_{2}\left(t\right)+{f}_{1}u\left(t\right)\\ {z}_{2}\left(t+1\right)={z}_{3}\left(t\right)+{f}_{2}u(t)\\ \vdots \\ {z}_{n-1}\left(t+1\right)={z}_{n}\left(t\right)+{f}_{n-1}u\left(t\right) \\ { z}_{n}\left(t+1\right)=-{a}_{n}{z}_{1}\left(t\right)-{a}_{n-1}{z}_{2}\left(t\right)-{a}_{n-2}{z}_{3}\left(t\right)-\dots -{a}_{1}{z}_{n}\left(t\right) \\ \quad -\left[{b}_{n}{z}_{1}\left(t\right)+{b}_{n-1}{z}_{2}\left(t\right)+{b}_{n-2}{z}_{3}\left(t\right)+\dots +{b}_{1}{z}_{n}\left(t\right)\right]u\left(t\right)+{f}_{n}u\left(t\right).\end{array}\right.$$
(7)

Then, by using (7), the following equations are obtained directly:

$$\left\{\begin{array}{l}{z}_{2}\left(t\right)={z}_{1}\left(t+1\right)-{f}_{1}u\left(t\right) \\ {z}_{3}\left(t\right)={z}_{2}\left(t+1\right)-{f}_{2}u\left(t\right) \\ \quad ={z}_{1}\left(t+2\right)-{f}_{1}u\left(t+1\right)-{f}_{2}u\left(t\right) \\ {z}_{4}\left(t\right)={z}_{3}\left(t+1\right)-{f}_{3}u\left(t\right) \\ \quad ={z}_{1}\left(t+3\right)-{f}_{1}u\left(t+2\right)-{f}_{2}u\left(t+1\right)-{f}_{3}u\left(t\right)\\ \vdots \\ {z}_{n}\left(t\right)={z}_{n-1}\left(t+1\right)-{f}_{n-1}u\left(t\right) \\ ={z}_{1}\left(t+n-1\right)-{f}_{1}u\left(t+n-2\right)-{f}_{2}u\left(t+n-3\right)-\dots {-f}_{n-1}u\left(t\right)\end{array}\right.$$
(8)

Multiplying both sides of the last equation of (8) by the operator \(q\), we have

$$ \begin{aligned} z_{n} \left( {t + 1} \right) & = z_{1} \left( {t + n} \right) - f_{1} u\left( {t + n - 1} \right) \\ & \quad - f_{2} u\left( {t + n - 2} \right) - \cdots - f_{n - 1} u\left( {t + 1} \right) \\ \end{aligned} $$
(9)

Replacing (9) in the last equation of (7) yields

$$ \begin{aligned} & - \left[ {a_{n} ,a_{n - 1} ,a_{n - 2} , \ldots ,a_{1} } \right]\left[ {\begin{array}{c} {z_{1} \left( t \right)} \\ {z_{2} \left( t \right)} \\ {z_{3} \left( t \right)} \\ \vdots \\ {z_{n} \left( t \right)} \\ \end{array} } \right] - \left[ {b_{n} ,b_{n - 1} ,b_{n - 2} , \ldots ,b_{1} } \right]\left[ {\begin{array}{c} {z_{1} \left( t \right)} \\ {z_{2} \left( t \right)} \\ {z_{3} \left( t \right)} \\ \vdots \\ {z_{n} \left( t \right)} \\ \end{array} } \right]u\left( t \right) + f_{n} u\left( t \right) \\ & \qquad = z_{1} \left( {t + n} \right) - f_{1} u\left( {t + n - 1} \right) - f_{2} u\left( {t + n - 2} \right) - \cdots - f_{n - 1} u\left( {t + 1} \right) \\ \end{aligned} $$
(10)

Now, using the matrix representation of (8) and (10), one can write

$$ \begin{aligned} & \left( {1 + a_{1} q^{ - 1} + a_{2} q^{ - 2} + \cdots + a_{n} q^{ - n} } \right)q^{n} z_{1} \left( t \right) \\ & \qquad + \left[ {\left( {b_{1} q^{ - 1} + b_{2} q^{ - 2} + \cdots + b_{n} q^{ - n} } \right)q^{n} z_{1} \left( t \right)} \right]u\left( t \right) \\ & \quad = \left[ {f_{n} + a_{n - 1} f_{1} + a_{n - 2} f_{2} + \cdots + a_{1} f_{n - 1} ,\;f_{n - 1} + a_{n - 2} f_{1} + a_{n - 3} f_{2} + \cdots + a_{1} f_{n - 2} ,\; \ldots ,\;f_{2} + a_{1} f_{1} ,\;f_{1} } \right]\left[ {\begin{array}{c} {u\left( t \right)} \\ {u\left( {t + 1} \right)} \\ \vdots \\ {u\left( {t + n - 1} \right)} \\ \end{array} } \right] \\ & \qquad + \left\{ {\left[ {b_{n - 1} f_{1} + b_{n - 2} f_{2} + \cdots + b_{1} f_{n - 1} ,\;b_{n - 2} f_{1} + b_{n - 3} f_{2} + \cdots + b_{1} f_{n - 2} ,\; \ldots ,\;b_{1} f_{1} ,\;0} \right]\left[ {\begin{array}{c} {u\left( t \right)} \\ {u\left( {t + 1} \right)} \\ \vdots \\ {u\left( {t + n - 1} \right)} \\ \end{array} } \right]} \right\}u\left( t \right) \\ \end{aligned} $$
(11)

In order to simplify (11), we define two vectors as follows:

$$ \begin{aligned} \left[ {c_{n} , \ldots ,c_{2} ,c_{1} } \right] & : = [ f_{n} + a_{n - 1} f_{1} + a_{n - 2} f_{2} + \cdots \\ & \quad + a_{1} f_{n - 1} , \ldots ,f_{2} + a_{1} f_{1} ,f_{1} ] \in {\mathbb{R}}^{1 \times n} \\ \end{aligned} $$
(12)
$$ \begin{aligned} \left[ {d_{n} , \ldots ,d_{3} ,d_{2} } \right] & : = [ b_{n - 1} f_{1} + b_{n - 2} f_{2} + \cdots \\ & \quad + b_{1} f_{n - 1} , \ldots ,b_{1} f_{1} ] \in {\mathbb{R}}^{{1 \times \left( {n - 1} \right)}} \\ \end{aligned} $$
(13)
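For instance, for \(n=2\), the definitions (12) and (13) reduce to

$$ c_{2} = f_{2} + a_{1} f_{1} ,\quad c_{1} = f_{1} ,\quad d_{2} = b_{1} f_{1} $$

so the input–output coefficients are simple bilinear combinations of the state-space parameters.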

Then, (11) can be written as

$$ \begin{aligned} & A\left( q \right)q^{n} z_{1} \left( t \right) + [B(q)q^{n} z_{1} (t)]u\left( t \right) = C\left( q \right)q^{n} u\left( t \right) \\ & \quad + [D(q)]q^{n} u\left( t \right)u\left( t \right). \\ \end{aligned} $$
(14)

where

$$ \begin{aligned} A\left( q \right) & = 1 + a_{1} q^{ - 1} + a_{2} q^{ - 2} + \cdots + a_{n} q^{ - n} \\ B\left( q \right) & = b_{1} q^{ - 1} + b_{2} q^{ - 2} + \cdots + b_{n} q^{ - n} \\ C\left( q \right) & = c_{1} q^{ - 1} + c_{2} q^{ - 2} + \cdots + c_{n} q^{ - n} \\ D\left( q \right) & = d_{2} q^{ - 2} + d_{3} q^{ - 3} + \cdots + d_{n} q^{ - n} \\ \end{aligned} $$

Then, Eq. (14) can be rewritten as

$$ \begin{aligned} & A\left( q \right)q^{n} z_{1} \left( t \right) + u\left( t \right)\left[ {B\left( q \right)q^{n} z_{1} \left( t \right)} \right] = C\left( q \right) q^{n} u\left( t \right) \\ & \quad + u\left( t \right)\left[ {D\left( q \right)q^{n} u\left( t \right)} \right] \\ \end{aligned} $$
(15)

By substituting \(t\) with \(t-n\), we have

$${z}_{1}\left(t\right)=\frac{C\left(q\right)+u(t-n)D\left(q\right)}{A\left(q\right)+u(t-n)B\left(q\right)} u(t)$$

Substituting \({z}_{1}\left(t\right)\) into (2), the input–output relation of the bilinear state-space system in (1) and (2) is obtained as follows:

$$y\left(t\right)= \frac{C\left(q\right)+u(t-n)D\left(q\right)}{A\left(q\right)+u(t-n)B\left(q\right)} u\left(t\right)+v(t)$$

3 Four-stage recursive least squares algorithm

In this section, a four-stage recursive least squares algorithm is proposed to alleviate the computational load, increase the convergence rate of the parameters to their actual values and reduce the error simultaneously. According to the hierarchical principle, the main system is broken down into four subsystems; then, an algorithm is presented to estimate the unknown parameters of the bilinear system. Consider the following performance index:

$$J\left(\rho \right)=\sum_{j=1}^{t}{\left[\overline{y }\left(j\right)-{\varphi }^{T}\left(j\right)\rho \right]}^{2}$$

Using the least-squares principle and minimizing the performance index, the recursive least squares algorithm can be written as

$$\widehat{\rho }\left(t\right)=\widehat{\rho }\left(t-1\right)+{K}\left(t\right)\left[\overline{y }\left(t\right)-{\varphi }^{T}\left(t\right)\widehat{\rho }\left(t-1\right)\right]$$
(16)
$${K}\left(t\right)=R\left(t-1\right)\varphi \left(t\right){[1+{\varphi }^{T}\left(t\right)R\left(t-1\right)\varphi \left(t\right)]}^{-1}$$
(17)
$$R\left(t\right)=\left[ I-{K}\left(t\right){\varphi }^{T}\left(t\right)\right]R\left(t-1\right)$$
(18)

where \(R\left(t\right)\) is the covariance matrix and \({K}\left(t\right)=R\left(t\right)\varphi \left(t\right)\) is the gain vector. The first difficulty of identification is that only the input and output data are available. Since \(\varphi \left(t\right)\) contains unknown state variables and \({\varphi }_{n}\left(t\right)\) consists of the noise variables \((\omega \left(t-i\right),\; i=1,2,\ldots ,m)\), it is not possible to estimate the parameter \(\widehat{\rho }\left(t\right)\) with Eqs. (16)–(18) alone. Therefore, a bilinear state observer must be designed for state estimation.
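For reference, the ideal recursion (16)–(18) takes only a few lines of code; the sketch below is not directly implementable for exactly the reason just stated, since \(\varphi(t)\) cannot be formed from measured data alone.

```python
import numpy as np

def rls_step(rho_hat, R, phi, y):
    """One ideal RLS update per (16)-(18); phi would contain unmeasured
    states and noise terms, hence the observer of Sect. 3.1 is needed."""
    K = R @ phi / (1.0 + phi @ R @ phi)                  # gain vector (17)
    rho_hat = rho_hat + K * (y - phi @ rho_hat)          # update (16)
    R = (np.eye(len(phi)) - np.outer(K, phi)) @ R        # covariance (18)
    return rho_hat, R
```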

3.1 Bilinear state observer algorithm

As is well known, the Kalman filter is commonly used to estimate the states of linear systems. Here, in order to apply the Kalman filter to bilinear systems, Eqs. (1) and (2) should be written in the following form:

$$ \begin{aligned} & {\varvec{z}}\left( {t + 1} \right) = A_{1} \left( {\varvec{t}} \right){\varvec{z}}\left( t \right) + f\overline{u}\left( t \right) \\ & \overline{y}\left( t \right) = h{\varvec{z}}\left( t \right) + \omega \left( t \right) \\ & A_{1} \left( {\varvec{t}} \right) = A + B\overline{u}\left( t \right) \\ \end{aligned} $$

which may be considered a (time-varying) linear state-space model. Therefore, the Kalman filter can be used to design a bilinear state observer [48].

$$ \begin{aligned} \hat{\varvec{z}}\left( {t + 1} \right) & = A\hat{\varvec{z}}\left( t \right) + B\hat{\varvec{z}}\left( t \right)\overline{u}\left( t \right) + f\overline{u}\left( t \right) \\ & \quad + K_{z} \left( t \right)\left[ { \overline{y}\left( t \right) - h\hat{\varvec{z}}\left( t \right) - \varphi_{n}^{T} \left( t \right)\rho_{n} } \right] \\ \end{aligned} $$
(19)
$$ \begin{aligned} K_{z} \left( t \right) & = AR_{z} \left( t \right)h^{T} \left[ {hR_{z} \left( t \right)h^{T} + R_{v} } \right]^{ - 1} + B\overline{u}\left( t \right)R_{z} \left( t \right) \\ & \quad \times h^{T} \left[ {hR_{z} \left( t \right)h^{T} + R_{v} } \right]^{ - 1} \\ \end{aligned} $$
(20)
$$ \begin{aligned} R_{z} \left( {t + 1} \right) & = \left[ {A - K_{z} \left( t \right)h + B\overline{u}\left( t \right)} \right]R_{z} \left( t \right) \\ & \quad \left[ {A^{T} - h^{T} K_{z}^{T} \left( t \right) + B^{T} \overline{u}\left( t \right)} \right] \\ & \quad + K_{z} \left( t \right)R_{v} K_{z}^{T} \left( t \right) \\ \end{aligned} $$
(21)

where \({\widehat{R}}_{v}\left(t\right)=\frac{1}{t}\sum_{j=1}^{t}{[\overline{y }\left(j\right)-h\widehat{{\varvec{z}}}\left(j\right)]}^{2}\) estimates the noise variance, \({K}_{z}\left(t\right)\) is the optimal state observer gain and \({R}_{z}\left(t+1\right)\) is the covariance matrix of the state estimation error.

If the matrices and vectors \(A\), \(B\), \(f\) and \(\rho_n\) are unknown, the bilinear state observer in (19)–(21) cannot be used. Therefore, \(\widehat{{\varvec{z}}}\left(t\right)\) should be computed using the estimated parameters.

Therefore, the parameter estimation vectors are defined as follows:

$$ \begin{aligned} \hat{\rho } & = \left[ {\hat{\rho }_{s} ,\hat{\rho }_{n} } \right]^{T} \in {\mathbb{R}}^{{n_{0} }} \\ \hat{\rho }_{s} & = \left[ {\hat{\rho }_{a}^{T} ,\hat{\rho }_{b}^{T} ,\hat{\rho }_{f}^{T} } \right]^{T} \in {\mathbb{R}}^{{n_{1} }} \\ \hat{\rho }_{n} & = \left[ {\hat{c}_{1} ,\hat{c}_{2} , \ldots ,\hat{c}_{m} ,\hat{d}_{1} ,\hat{d}_{2} , \ldots ,\hat{d}_{p} } \right]^{T} \in {\mathbb{R}}^{{n_{4} }} \\ \hat{\rho }_{a} & = \left[ {\hat{a}_{1} ,\hat{a}_{2} , \ldots ,\hat{a}_{n} } \right]^{T} \in {\mathbb{R}}^{{n_{2} }} \\ \hat{\rho }_{b} & = \left[ {\hat{\varvec{b}}_{1} ,\hat{\varvec{b}}_{2} , \ldots ,\hat{\varvec{b}}_{{\varvec{n}}} } \right]^{T} \in {\mathbb{R}}^{{n_{3} }} \\ \hat{\rho }_{f} & = \left[ {\hat{f}_{1} ,\hat{f}_{2} , \ldots ,\hat{f}_{n} } \right]^{T} \in {\mathbb{R}}^{{n_{2} }} \\ \end{aligned} $$
$$\widehat{A}\left(t\right)=\left[\begin{array}{ccccc} -{\widehat{a}}_{1}\left(t\right) & 1 & 0 & \cdots & 0 \\ -{\widehat{a}}_{2}\left(t\right) & 0 & 1 & \ddots & 0 \\ \vdots & \vdots & \ddots & \ddots & 0 \\ -{\widehat{a}}_{n-1}\left(t\right) & 0 & \cdots & 0 & 1 \\ -{\widehat{a}}_{n}\left(t\right) & 0 & \cdots & 0 & 0 \end{array}\right]$$
(22)
$$ \hat{B}\left( t \right): = \left[ {\begin{array}{c} {\hat{\varvec{b}}_{1} \left( t \right)} \\ {\hat{\varvec{b}}_{2} \left( t \right)} \\ \vdots \\ {\hat{\varvec{b}}_{{\varvec{n}}} \left( t \right)} \\ \end{array} } \right],\quad \hat{\varvec{b}}_{{\varvec{i}}} \left( t \right) \in {\mathbb{R}}^{1 \times n} ,\quad \hat{f}\left( t \right): = \left[ {\begin{array}{c} {\hat{f}_{1} \left( t \right)} \\ {\hat{f}_{2} \left( t \right)} \\ \vdots \\ {\hat{f}_{n} \left( t \right)} \\ \end{array} } \right] $$
(23)

By substituting \(A\), \(B\) and \(f\) in (19)–(21) with the estimated matrices and vectors \(\widehat{A}(t)\), \(\widehat{B}(t)\) and \(\widehat{f}(t)\), we have

$$ \begin{aligned} \hat{\varvec{z}}\left( {t + 1} \right) & = \hat{A}\hat{\varvec{z}}\left( t \right) + \hat{B}\hat{\varvec{z}}\left( t \right)\overline{u}\left( t \right) + \hat{\rho }_{f} \overline{u}\left( t \right) \\ & \quad + K_{z} \left( t \right)\left[ {\overline{y}\left( t \right) - h\hat{\varvec{z}}\left( t \right) - \hat{\varphi }_{n}^{T} \left( t \right)\hat{\rho }_{n} } \right] \\ \end{aligned} $$
(24)
$$ \begin{aligned} K_{z} \left( t \right) & = \hat{A}R_{z} \left( t \right)h^{T} \left[ {hR_{z} \left( t \right)h^{T} + R_{v} } \right]^{ - 1} + \hat{B}\overline{u}\left( t \right)R_{z} \left( t \right) \\ & \quad \times h^{T} \left[ {hR_{z} \left( t \right)h^{T} + R_{v} } \right]^{ - 1} \\ \end{aligned} $$
(25)
$$ \begin{aligned} R_{z} \left( {t + 1} \right) & = \left[ {\hat{A} - K_{z} \left( t \right)h + \hat{B}\overline{u}\left( t \right)} \right]R_{z} \left( t \right) \\ & \quad \left[ {\hat{A}^{T} - h^{T} K_{z}^{T} \left( t \right) + \hat{B}^{T} \overline{u}\left( t \right)} \right] \\ & \quad + K_{z} \left( t \right)R_{v} K_{z}^{T} \left( t \right) \\ \end{aligned} $$
(26)

Thus, based on the bilinear state observer, the estimates \(\widehat{{\varvec{z}}}(t)\) of the unknown states \({\varvec{z}}(t)\) can be calculated.
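Since the gain (25) is the Kalman gain for the time-varying matrix \(A_1(t)=\widehat{A}+\widehat{B}\overline{u}(t)\), one observer step can be coded compactly as below. This is a sketch: the function name is an assumption, and \(R_v\) may be replaced by the running estimate \(\widehat{R}_v(t)\) given earlier.

```python
import numpy as np

def observer_step(z_hat, Rz, u, y, A_hat, B_hat, f_hat, h, Rv, noise_corr=0.0):
    """One step of the bilinear state observer (24)-(26);
    noise_corr stands for the term phi_n(t)^T rho_n."""
    S = h @ Rz @ h + Rv                        # scalar innovation variance
    Kz = (A_hat + B_hat * u) @ Rz @ h / S      # gain (25) with A1 = A + B*u
    innov = y - h @ z_hat - noise_corr
    z_next = (A_hat @ z_hat + (B_hat @ z_hat) * u
              + f_hat * u + Kz * innov)        # state update (24)
    Acl = A_hat + B_hat * u - np.outer(Kz, h)
    Rz_next = Acl @ Rz @ Acl.T + np.outer(Kz, Kz) * Rv   # covariance (26)
    return z_next, Rz_next
```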

Replacing an unknown state \({\varvec{z}}\left(t-i\right)\) with estimated \(\widehat{{\varvec{z}}}\left(t-i\right)\) and an unknown noise term \(\omega \left(t-i\right)\) with estimated \(\widehat{\omega }\left(t-i\right)\), the information estimation vectors are defined as:

$$\widehat{\varphi }\left(t\right)={\left[{{\widehat{\varphi }}_{z}}^{T}\left(t\right),{{\widehat{\varphi }}_{z\overline{u}} }^{T}\left(t\right),{{\varphi }_{\overline{u}} }^{T}\left(t\right),{{\widehat{\varphi }}_{n}}^{T}\left(t\right)\right]}^{T}\in {\mathbb{R}}^{{n}_{0}}$$
$${\widehat{\varphi }}_{s}\left(t\right)={\left[{{\widehat{\varphi }}_{z}}^{T}\left(t\right),{{\widehat{\varphi }}_{z\overline{u}} }^{T}\left(t\right),{{\varphi }_{\overline{u}} }^{T}(t)\right]}^{T}\in {\mathbb{R}}^{{n}_{1}}$$
$${\widehat{\varphi }}_{z}\left(t\right)={\left[-{\widehat{z}}_{1}\left(t-1\right),-{\widehat{z}}_{1}\left(t-2\right) ,\dots ,-{\widehat{z}}_{1}\left(t-n\right)\right]}^{T} \in {\mathbb{R}}^{{n}_{2}}$$
$${\widehat{\varphi }}_{z\overline{u}}\left(t\right)=[{\widehat{{\varvec{z}}}}^{T}\left(t-1\right)\overline{u }\left(t-1\right),{\widehat{{\varvec{z}}}}^{T}\left(t-2\right)\overline{u }\left(t-2\right),\dots { ,\widehat{{\varvec{z}}}}^{T}\left(t-n\right)\overline{u }\left(t-n\right){]}^{T}\in {\mathbb{R}}^{{n}_{3}}$$
$${\widehat{\varphi }}_{n}\left(t\right)=[-\widehat{\omega }\left(t-1\right),-\widehat{\omega }\left(t-2\right),\dots ,-\widehat{\omega }\left(t-m\right), \widehat{v}\left(t-1\right),\widehat{v}\left(t-2\right),\dots ,\widehat{v}\left(t-p\right)]^{T}\in {\mathbb{R}}^{{n}_{4}}$$

It should be noted that the estimations of \(\omega \left(t\right)\) and \(v\left(t\right)\) are defined as:

$$\widehat{\omega }\left(t\right)=\overline{y }\left(t\right)-{\widehat{\varphi }}_{s}^{T}\left(t\right){\widehat{\rho }}_{s}\left(t-1\right)$$
(27)
$$\widehat{v}\left(t\right)=\overline{y }\left(t\right)-{\widehat{\varphi }}^{T}\left(t\right)\widehat{\rho }\left(t-1\right)$$
(28)

The identification model in (4) and (5) can be decomposed into the following four subsystems, where each fictitious output \(\overline{y}_{z}(t)\), \(\overline{y}_{z\overline{u}}(t)\) and \(\overline{y}_{\overline{u}}(t)\) is obtained from \(\overline{y}(t)\) by subtracting the contributions of the other three subvectors:

$$ \begin{aligned} \overline{y}_{z} \left( t \right) & = \varphi_{z}^{T} \left( t \right)\rho_{a} + v\left( t \right) \\ \overline{y}_{{z\overline{u}}} \left( t \right) & = \varphi_{{z\overline{u}}}^{T} \left( t \right)\rho_{b} + v\left( t \right) \\ \overline{y}_{{\overline{u}}} \left( t \right) & = \varphi_{{\overline{u}}}^{T} \left( t \right)\rho_{f} + v\left( t \right) \\ \omega \left( t \right) & = \varphi_{n}^{T} \left( t \right)\rho_{n} + v\left( t \right) \\ \end{aligned} $$

According to the least squares principle, by defining the criterion function, the recursive relationships are as follows:

$$ \begin{aligned} \hat{\rho }_{a} \left( t \right) & = \hat{\rho }_{a} \left( {t - 1} \right) + K_{1} \left( t \right)\left[ {\overline{y}_{z} \left( t \right) - \varphi_{z}^{T} \left( t \right)\hat{\rho }_{a} \left( {t - 1} \right)} \right] \\ & = \hat{\rho }_{a} \left( {t - 1} \right) + K_{1} \left( t \right)[\overline{y}\left( t \right) - \varphi_{{z\overline{u}}}^{T} \left( t \right)\rho_{b} - \varphi_{{\overline{u}}}^{T} \left( t \right)\rho_{f} \\ & \quad - \varphi_{n}^{T} \left( t \right)\rho_{n} - \varphi_{z}^{T} \left( t \right)\hat{\rho }_{a} \left( {t - 1} \right)] \\ \end{aligned} $$
(29)
$${K}_{1}\left(t\right)=\frac{{R}_{1}\left(t-1\right){\varphi }_{z}\left(t\right)}{\beta +{\varphi }_{z}^{T}\left(t\right){R}_{1}\left(t-1\right){\varphi }_{z}\left(t\right)}$$
(30)
$${R}_{1}\left(t\right)=\frac{1}{\beta }\left[I-{K}_{1}\left(t\right){\varphi }_{z}^{T}\left(t\right)\right]{R}_{1}\left(t-1\right)$$
(31)
$$ \begin{aligned} \hat{\rho }_{b} \left( t \right) & = \hat{\rho }_{b} \left( {t - 1} \right) + K_{2} \left( t \right)\left[ {\overline{y}_{{z\overline{u}}} \left( t \right) - \varphi_{{z\overline{u}}}^{T} \left( t \right)\hat{\rho }_{b} \left( {t - 1} \right)} \right] \\ & = \hat{\rho }_{b} \left( {t - 1} \right) + K_{2} \left( t \right)[\overline{y}\left( t \right) - \varphi_{z}^{T} \left( t \right)\rho_{a} - \varphi_{{\overline{u}}}^{T} \left( t \right)\rho_{f} \\ & \quad - \varphi_{n}^{T} \left( t \right)\rho_{n} - \varphi_{{z\overline{u}}}^{T} \left( t \right)\hat{\rho }_{b} \left( {t - 1} \right)] \\ \end{aligned} $$
(32)
$${K}_{2}\left(t\right)=\frac{{R}_{2}\left(t-1\right){\varphi }_{z\overline{u} }\left(t\right)}{\beta +{\varphi }_{z\overline{u} }^{T}\left(t\right){R}_{2}\left(t-1\right){\varphi }_{z\overline{u} }\left(t\right)}$$
(33)
$${R}_{2}\left(t\right)=\frac{1}{\beta }\left[I-{ K}_{2}\left(t\right){\varphi }_{z\overline{u} }^{T}\left(t\right)\right]{R}_{2}\left(t-1\right)$$
(34)
$$ \begin{aligned} \hat{\rho }_{f} \left( t \right) & = \hat{\rho }_{f} \left( {t - 1} \right) + K_{3} \left( t \right)\left[ {\overline{y}_{{\overline{u}}} \left( t \right) - \varphi_{{\overline{u}}}^{T} \left( t \right)\hat{\rho }_{f} \left( {t - 1} \right)} \right] \\ & = \hat{\rho }_{f} \left( {t - 1} \right) + K_{3} \left( t \right)[\overline{y}\left( t \right) - \varphi_{z}^{T} \left( t \right)\rho_{a} - \varphi_{{z\overline{u}}}^{T} \left( t \right)\rho_{b} \\ & \quad - \varphi_{n}^{T} \left( t \right)\rho_{n} - \varphi_{{\overline{u}}}^{T} \left( t \right)\hat{\rho }_{f} \left( {t - 1} \right)] \\ \end{aligned} $$
(35)
$${K}_{3}\left(t\right)=\frac{{R}_{3}\left(t-1\right){\varphi }_{\overline{u} }\left(t\right)}{\beta +{\varphi }_{\overline{u} }^{T}\left(t\right){R}_{3}\left(t-1\right){\varphi }_{\overline{u} }\left(t\right)}$$
(36)
$${R}_{3}\left(t\right)=\frac{1}{\beta }\left[I-{K}_{3}\left(t\right){\varphi }_{\overline{u} }^{T}\left(t\right)\right]{R}_{3}\left(t-1\right)$$
(37)
$${\widehat{\rho }}_{n}\left(t\right)={\widehat{\rho }}_{n}\left(t-1\right)+{K}_{4}\left(t\right)[\omega \left(t\right)-{{\varphi }_{n}}^{T}\left(t\right){\widehat{\rho }}_{n}\left(t-1\right)]$$
(38)
$${K}_{4}\left(t\right)=\frac{{R}_{4}\left(t-1\right){\varphi }_{n}\left(t\right)}{\beta +{\varphi }_{n}^{T}\left(t\right){R}_{4}\left(t-1\right){\varphi }_{n}\left(t\right)}$$
(39)
$${R}_{4}\left(t\right)=\frac{1}{\beta }\left[I-{K}_{4}\left(t\right){\varphi }_{n}^{T}\left(t\right)\right]{R}_{4}\left(t-1\right)$$
(40)

The information vectors \({\varphi }_{z}\) and \({\varphi }_{z\overline{u}}\) contain the unknown states \({\varvec{z}}\left(t\right)\), and \({\varphi }_{n}\) contains the unknown noise terms, while Eqs. (29), (32), (35) and (38) are needed to estimate the unknown parameters. Therefore, the algorithm (29)–(40) cannot estimate the unknown parameters directly. Consequently, by substituting the corresponding estimates, we have the following relations:

$$ \begin{aligned} \hat{\rho }_{a} \left( t \right) & = \hat{\rho }_{a} \left( {t - 1} \right) + K_{1} \left( t \right)[\overline{y}\left( t \right) - \hat{\varphi }_{{z\overline{u}}}^{T} \left( t \right)\hat{\rho }_{b} \left( {t - 1} \right) - \varphi_{{\overline{u}}}^{T} \left( t \right)\hat{\rho }_{f} \left( {t - 1} \right) \\ & \quad - \hat{\varphi }_{n}^{T} \left( t \right)\hat{\rho }_{n} \left( {t - 1} \right) - \hat{\varphi }_{z}^{T} \left( t \right)\hat{\rho }_{a} \left( {t - 1} \right)] \\ \end{aligned} $$
(41)
$${K}_{1}\left(t\right)=\frac{{R}_{1}\left(t-1\right){\widehat{\varphi }}_{z}\left(t\right)}{\beta +{\widehat{\varphi }}_{z}^{T}\left(t\right){R}_{1}\left(t-1\right){\widehat{\varphi }}_{z}\left(t\right)}$$
(42)
$${R}_{1}\left(t\right)=\frac{1}{\beta }\left[I-{K}_{1}\left(t\right){\widehat{\varphi }}_{z}^{T}\left(t\right)\right]{R}_{1}\left(t-1\right)$$
(43)
$$ \begin{aligned} \hat{\rho }_{b} \left( t \right) & = \hat{\rho }_{b} \left( {t - 1} \right) + K_{2} \left( t \right)[\overline{y}\left( t \right) - \hat{\varphi }_{z}^{T} \left( t \right)\hat{\rho }_{a} \left( {t - 1} \right) - \varphi_{{\overline{u}}}^{T} \left( t \right)\hat{\rho }_{f} \left( {t - 1} \right) \\ & \quad - \hat{\varphi }_{n}^{T} \left( t \right)\hat{\rho }_{n} \left( {t - 1} \right) - \hat{\varphi }_{{z\overline{u}}}^{T} \left( t \right)\hat{\rho }_{b} \left( {t - 1} \right)] \\ \end{aligned} $$
(44)
$${K}_{2}\left(t\right)=\frac{{R}_{2}\left(t-1\right){\widehat{\varphi }}_{z\overline{u} }\left(t\right)}{\beta +{\widehat{\varphi }}_{z\overline{u} }^{T}\left(t\right){R}_{2}\left(t-1\right){\widehat{\varphi }}_{z\overline{u} }\left(t\right)}$$
(45)
$${R}_{2}\left(t\right)=\frac{1}{\beta }\left[I-{K}_{2}\left(t\right){\widehat{\varphi }}_{z\overline{u} }^{T}\left(t\right)\right]{R}_{2}\left(t-1\right)$$
(46)
$$ \begin{aligned} \hat{\rho }_{f} \left( t \right) & = \hat{\rho }_{f} \left( {t - 1} \right) + K_{3} \left( t \right)[\overline{y}\left( t \right) - \hat{\varphi }_{z}^{T} \left( t \right)\hat{\rho }_{a} \left( {t - 1} \right) - \hat{\varphi }_{{z\overline{u}}}^{T} \left( t \right)\hat{\rho }_{b} \left( {t - 1} \right) \\ & \quad - \hat{\varphi }_{n}^{T} \left( t \right)\hat{\rho }_{n} \left( {t - 1} \right) - \varphi_{{\overline{u}}}^{T} \left( t \right)\hat{\rho }_{f} \left( {t - 1} \right)] \\ \end{aligned} $$
(47)
$${ K}_{3}\left(t\right)=\frac{{R}_{3}\left(t-1\right){\varphi }_{\overline{u} }\left(t\right)}{\beta +{\varphi }_{\overline{u} }^{T}\left(t\right){R}_{3}\left(t-1\right){\varphi }_{\overline{u} }\left(t\right)}$$
(48)
$${R}_{3}\left(t\right)=\frac{1}{\beta }\left[I-{K}_{3}\left(t\right){\varphi }_{\overline{u} }^{T}\left(t\right)\right]{R}_{3}\left(t-1\right)$$
(49)
$${\widehat{\rho }}_{n}\left(t\right)={\widehat{\rho }}_{n}\left(t-1\right)+{K}_{4}\left(t\right)[\widehat{\omega }\left(t\right)-{\widehat{\varphi }}_{n}^{T}\left(t\right){\widehat{\rho }}_{n}\left(t-1\right)]$$
(50)
$${K}_{4}\left(t\right)=\frac{{R}_{4}\left(t-1\right){\widehat{\varphi }}_{n}\left(t\right)}{\beta +{\widehat{\varphi }}_{n}^{T}\left(t\right){R}_{4}\left(t-1\right){\widehat{\varphi }}_{n}\left(t\right)}$$
(51)
$${R}_{4}\left(t\right)=\frac{1}{\beta }\left[I-{K}_{4}\left(t\right){\widehat{\varphi }}_{n}^{T}\left(t\right)\right]{R}_{4}\left(t-1\right)$$
(52)
$${\widehat{\varphi }}_{z}\left(t\right)={\left[-{\widehat{z}}_{1}\left(t-1\right),-{\widehat{z}}_{1}\left(t-2\right) ,\dots ,-{\widehat{z}}_{1}\left(t-n\right)\right]}^{T} \in {\mathbb{R}}^{n}$$
(53)
$${\widehat{\varphi }}_{z\overline{u}}\left(t\right)=[{\widehat{{\varvec{z}}}}^{T}\left(t-1\right)\overline{u }\left(t-1\right),{\widehat{{\varvec{z}}}}^{T}\left(t-2\right)\overline{u }\left(t-2\right),\dots { ,\widehat{{\varvec{z}}}}^{T}\left(t-n\right)\overline{u }\left(t-n\right){]}^{T}\in {\mathbb{R}}^{{n}^{2}}$$
(54)
$${\varphi }_{\overline{u}}\left(t\right)={\left[\overline{u }\left(t-1\right),\overline{u }\left(t-2\right) ,\dots , \overline{u }\left(t-n\right)\right]}^{T}\in {\mathbb{R}}^{n}$$
(55)
$$ \begin{aligned} \hat{\varphi }_{n} \left( t \right) & = [ - \hat{\omega }\left( {t - 1} \right), - \hat{\omega }\left( {t - 2} \right), \ldots , \\ & \quad - \hat{\omega }\left( {t - m} \right),\hat{v}\left( {t - 1} \right),\hat{v}\left( {t - 2} \right), \ldots ,\hat{v}\left( {t - p} \right)]^{T} \in {\mathbb{R}}^{m + p} \\ \end{aligned} $$
(56)
$$ \begin{aligned} \hat{\varvec{z}}\left( {t + 1} \right) & = \hat{A}\hat{\varvec{z}}\left( t \right) + \hat{B}\hat{\varvec{z}}\left( t \right)\overline{u}\left( t \right) + \hat{\rho }_{f} \overline{u}\left( t \right) \\ & \quad + K_{z} \left( t \right)\left[ { \overline{y}\left( t \right) - h\hat{\varvec{z}}\left( t \right) - \hat{\varphi }_{n}^{T} \left( t \right)\hat{\rho }_{n} } \right] \\ \end{aligned} $$
(57)
$$ \begin{aligned} K_{z} \left( t \right) & = \hat{A}R_{z} \left( t \right)h^{T} \left[ {hR_{z} \left( t \right)h^{T} + R_{v} } \right]^{ - 1} + \hat{B}\overline{u}\left( t \right)R_{z} \left( t \right) \\ & \quad \times h^{T} \left[ {hR_{z} \left( t \right)h^{T} + R_{v} } \right]^{ - 1} \\ \end{aligned} $$
(58)
$$ \begin{aligned} R_{z} \left( {t + 1} \right) & = \left[ {\hat{A} - K_{z} \left( t \right)h + \hat{B}\overline{u}\left( t \right)} \right]R_{z} \left( t \right) \\ & \quad \left[ {\hat{A}^{T} - h^{T} K_{z}^{T} \left( t \right) + \hat{B}^{T} \overline{u}\left( t \right)} \right] \\ & \quad + K_{z} \left( t \right)R_{v} K_{z}^{T} \left( t \right) \\ \end{aligned} $$
(59)
$$\widehat{A}\left(t\right)=\left[\begin{array}{ccccc} -{\widehat{a}}_{1}\left(t\right) & 1 & 0 & \cdots & 0 \\ -{\widehat{a}}_{2}\left(t\right) & 0 & 1 & \ddots & 0 \\ \vdots & \vdots & \ddots & \ddots & 0 \\ -{\widehat{a}}_{n-1}\left(t\right) & 0 & \cdots & 0 & 1 \\ -{\widehat{a}}_{n}\left(t\right) & 0 & \cdots & 0 & 0 \end{array}\right]$$
(60)
$$ \hat{B}\left( t \right): = \left[ {\begin{array}{c} {\hat{\varvec{b}}_{1} \left( t \right)} \\ {\hat{\varvec{b}}_{2} \left( t \right)} \\ \vdots \\ {\hat{\varvec{b}}_{{\varvec{n}}} \left( t \right)} \\ \end{array} } \right],\quad \hat{\varvec{b}}_{{\varvec{i}}} \left( t \right) \in {\mathbb{R}}^{1 \times n} ,\quad \hat{f}\left( t \right): = \left[ {\begin{array}{c} {\hat{f}_{1} \left( t \right)} \\ {\hat{f}_{2} \left( t \right)} \\ \vdots \\ {\hat{f}_{n} \left( t \right)} \\ \end{array} } \right] $$
(61)

Equations (41)–(61) constitute the four-stage recursive least squares (4S-RLS) algorithm for the bilinear system (1) and (2). In summary, the steps of the algorithm are given in Algorithm 1.

Algorithm 1 The four-stage recursive least squares (4S-RLS) algorithm
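A compact sketch of one time step of the parameter-update part of Algorithm 1 is given below; the observer supplies the \(\widehat{\varphi}\) subvectors, and \(\widehat{\omega}(t)\) comes from (27). The function name, the uniform use of the \((t-1)\) estimates in the cross-terms and the value \(\beta = 0.88\) (taken from Sect. 6.1) are assumptions of the sketch.

```python
import numpy as np

def fs_rls_step(est, phi_z, phi_zu, phi_u, phi_n, y, w_hat, beta=0.88):
    """One time step of the four decomposed RLS updates (41)-(52).

    est holds the pairs (rho_a, R1), (rho_b, R2), (rho_f, R3), (rho_n, R4);
    phi_* are the estimated information subvectors; w_hat is (27).
    """
    (ra, R1), (rb, R2), (rf, R3), (rn, R4) = est
    ra0, rb0, rf0, rn0 = ra, rb, rf, rn            # rho(t-1) snapshots

    def sub_update(rho, R, phi, target):
        K = R @ phi / (beta + phi @ R @ phi)       # subsystem gain
        rho = rho + K * (target - phi @ rho)       # parameter update
        R = (np.eye(len(phi)) - np.outer(K, phi)) @ R / beta
        return rho, R

    # each stage regresses on the residual left by the other three stages
    ra, R1 = sub_update(ra0, R1, phi_z,
                        y - phi_zu @ rb0 - phi_u @ rf0 - phi_n @ rn0)
    rb, R2 = sub_update(rb0, R2, phi_zu,
                        y - phi_z @ ra0 - phi_u @ rf0 - phi_n @ rn0)
    rf, R3 = sub_update(rf0, R3, phi_u,
                        y - phi_z @ ra0 - phi_zu @ rb0 - phi_n @ rn0)
    rn, R4 = sub_update(rn0, R4, phi_n, w_hat)
    return (ra, R1), (rb, R2), (rf, R3), (rn, R4)
```

Note that each of the four covariance matrices has only the dimension of its own subvector; this is the source of the flop savings quantified in Sect. 4.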

3.2 Convergence analysis

Assumption 1

Assume that \(\{v\left(t\right)\}\) is white noise with zero mean and a bounded variance \({\sigma }^{2}\):

$$E\left[v\left(t\right)\right]=0$$
(62)
$$E\left[{v}^{2}\left(t\right)\right]={\sigma }^{2}<\infty $$
(63)

Lemma 1

For the four-stage recursive least squares algorithm in (41)–(61), for any c > 1, the following inequalities hold:

$$\sum_{t=1}^{\infty }\frac{{{\widehat{\varphi }}_{z}}^{T}\left(t\right){R}_{1}(t){\widehat{\varphi }}_{z}\left(t\right)}{{\left[\mathrm{ln}\left|{{R}_{1}}^{-1}\left(t\right)\right|\right]}^{c}}<\infty $$
(64)
$$\sum_{t=1}^{\infty }\frac{{{\widehat{\varphi }}_{z\overline{u}} }^{T}\left(t\right){R}_{2}(t){\widehat{\varphi }}_{z\overline{u} }\left(t\right)}{{\left[\mathrm{ln}\left|{{R}_{2}}^{-1}\left(t\right)\right|\right]}^{c}}<\infty $$
(65)
$$\sum_{t=1}^{\infty }\frac{{{\widehat{\varphi }}_{\overline{u}} }^{T}\left(t\right){R}_{3}(t){\widehat{\varphi }}_{\overline{u} }\left(t\right)}{{\left[\mathrm{ln}\left|{{R}_{3}}^{-1}\left(t\right)\right|\right]}^{c}}<\infty $$
(66)
$$\sum_{t=1}^{\infty }\frac{{{\widehat{\varphi }}_{n}}^{T}\left(t\right){R}_{4}(t){\widehat{\varphi }}_{n}\left(t\right)}{{\left[\mathrm{ln}\left|{{R}_{4}}^{-1}\left(t\right)\right|\right]}^{c}}<\infty $$
(67)

The proof of Lemma 1 can be found in [49].

Theorem 1

For the system in (1)–(2) and the four-stage recursive least squares algorithm in (41)–(61), let

$$ \begin{aligned} M\left( t \right) & : = \left[ {\ln \left| {R_{1}^{ - 1} \left( t \right)} \right|} \right]^{c} + \left[ {\ln \left| {R_{2}^{ - 1} \left( t \right)} \right|} \right]^{c} \\ & \quad + \left[ {\ln \left| {R_{3}^{ - 1} \left( t \right)} \right|} \right]^{c} + \left[ {\ln \left| {R_{4}^{ - 1} \left( t \right)} \right|} \right]^{c} \\ \end{aligned} $$
(68)

Assume that conditions (62) and (63) hold. Then, for any c > 1, we have

$$ \begin{aligned} \left\| {\hat{\rho }_{a} \left( t \right) - \rho_{a} } \right\|^{2} & = {{O}}\left( {\frac{M\left( t \right)}{{\lambda_{\min } \left[ {R_{1}^{ - 1} \left( t \right)} \right]}}} \right), \\ \left\| {\hat{\rho }_{b} \left( t \right) - \rho_{b} } \right\|^{2} & = {{O}}\left( {\frac{M\left( t \right)}{{\lambda_{\min } \left[ {R_{2}^{ - 1} \left( t \right)} \right]}}} \right) \\ \end{aligned} $$
(69)
$$ \begin{aligned} \left\| {\hat{\rho }_{f} \left( t \right) - \rho_{f} } \right\|^{2} & = O\left( {\frac{M\left( t \right)}{{\lambda_{\min } \left[ {R_{3}^{ - 1} \left( t \right)} \right]}}} \right), \\ \left\| {\hat{\rho }_{n} \left( t \right) - \rho_{n} } \right\|^{2} & = O\left( {\frac{M\left( t \right)}{{\lambda_{\min } \left[ {R_{4}^{ - 1} \left( t \right)} \right]}}} \right) \\ \end{aligned} $$
(70)

Proof

See the detailed proof in [50]. □

Theorem 2

For the identification model in Eq. (5) and the four-stage recursive least squares algorithm in Eqs. (41)–(61), assume that there exist positive constants \({\alpha }_{1}\), \({\alpha }_{2}\), \({\alpha }_{3}\), \({\alpha }_{4}\), \({\beta }_{1}\), \({\beta }_{2}\), \({\beta }_{3}\), \({\beta }_{4}\) such that, for sufficiently large \(t\), the following persistent excitation conditions hold:

$${\alpha }_{1}{I}_{{n}_{2}}\le \frac{1}{t}\sum_{i=1}^{t}{\widehat{\varphi }}_{z}\left(i\right){{\widehat{\varphi }}_{z}}^{T}\left(i\right)\le {\beta }_{1}{I}_{{n}_{2}}$$
(71)
$${\alpha }_{2}{I}_{{n}_{3}}\le \frac{1}{t}\sum_{i=1}^{t}{\widehat{\varphi }}_{z\overline{u} }\left(i\right){{\widehat{\varphi }}_{z\overline{u}} }^{T}\left(i\right)\le {\beta }_{2}{I}_{{n}_{3}}$$
(72)
$${\alpha }_{3}{I}_{{n}_{2}}\le \frac{1}{t}\sum_{i=1}^{t}{\widehat{\varphi }}_{\overline{u} }\left(i\right){{\widehat{\varphi }}_{\overline{u}} }^{T}\left(i\right)\le {\beta }_{3}{I}_{{n}_{2}}$$
(73)
$${\alpha }_{4}{I}_{{n}_{4}}\le \frac{1}{t}\sum_{i=1}^{t}{\widehat{\varphi }}_{n}\left(i\right){{\widehat{\varphi }}_{n}}^{T}\left(i\right)\le {\beta }_{4}{I}_{{n}_{4}}$$
(74)

Then, the four-stage recursive least squares parameter estimation errors converge to zero as \(t\) goes to infinity:

$$ \left\| {\hat{\rho }_{a} \left( t \right) - \rho_{a} } \right\|^{2} \to 0,\,\,\left\| {\hat{\rho }_{b} \left( t \right) - \rho_{b} } \right\|^{2} \to 0 $$
(75)
$$ \left\| {\hat{\rho }_{f} \left( t \right) - \rho_{f} } \right\|^{2} \to 0,\,\,\left\| {\hat{\rho }_{n} \left( t \right) - \rho_{n} } \right\|^{2} \to 0 $$
(76)

The proof is given in the Appendix.
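On recorded data, the persistent excitation conditions (71)–(74) can be checked empirically by bounding the eigenvalues of the sample information matrices; a short sketch (the function name is an assumption):

```python
import numpy as np

def pe_bounds(phis):
    """Min/max eigenvalues of (1/t) * sum_i phi(i) phi(i)^T, i.e.,
    empirical candidates for the constants alpha_k and beta_k."""
    t = len(phis)
    G = sum(np.outer(p, p) for p in phis) / t
    eig = np.linalg.eigvalsh(G)
    return eig[0], eig[-1]   # condition holds if eig[0] stays away from 0
```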

4 The computational efficiency

Counting flops is a useful way to determine computational efficiency [51]. Here, a flop is a single operation of addition, multiplication, subtraction or division. In general, a division is counted as a multiplication and a subtraction as an addition; therefore, an algorithm's cost can be expressed in terms of additions and multiplications. The numbers of multiplications and additions of the proposed algorithms are listed in Tables 1 and 2. To show the computational efficiency of the 4S-RLS algorithm, an RLS algorithm is included for comparison. Tables 3 and 4 show that the computational load of the proposed algorithm is lower than that of the RLS algorithm.

Table 1 Computational efficiency of the RLS algorithm
Table 2 Computational efficiency of the 4S-RLS algorithm
Table 3 Number of additions and multiplications of the algorithms
Table 4 Comparison of total flops of the algorithms with \({n}_{1}=8\), \({n}_{2}=2\), \({n}_{3}=4\), \({n}_{4}=2\)

The flop difference between the RLS algorithm (\(N_2\)) and the 4S-RLS algorithm (\(N_1\)) is as follows:

$$ \begin{aligned} N_{2} - N_{1} & = 6\left( {2n_{2} + n_{3} + n_{4} } \right)^{2} + 6\left( {2n_{2} + n_{3} + n_{4} } \right) \\ & \quad - \left[ {6\left( {2n_{2}^{2} + n_{3}^{2} + n_{4}^{2} } \right) + 8\left( {n_{2}^{2} + 2n_{2} } \right) + 4\left( {2n_{2} + n_{3} + n_{4} } \right)} \right] \\ & = 4n_{2}^{2} + 24n_{2} n_{3} + 24n_{2} n_{4} + 12n_{3} n_{4} - 12n_{2} + 2n_{3} + 2n_{4} > 0 \\ \end{aligned} $$

Therefore, \({N}_{1}<{N}_{2}\), which means that the 4S-RLS algorithm is more efficient than the RLS algorithm.
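Reading \(N_2\) (RLS) and \(N_1\) (4S-RLS) off the difference above, the counts are easy to reproduce; a small sketch, evaluated at the parameter sizes used in Table 4:

```python
def flops_rls(n2, n3, n4):
    """N2: flops of the full RLS recursion with n0 = 2*n2 + n3 + n4."""
    n0 = 2 * n2 + n3 + n4
    return 6 * n0 ** 2 + 6 * n0

def flops_4s_rls(n2, n3, n4):
    """N1: flops of the four-stage RLS recursion."""
    return (6 * (2 * n2 ** 2 + n3 ** 2 + n4 ** 2)
            + 8 * (n2 ** 2 + 2 * n2)
            + 4 * (2 * n2 + n3 + n4))

# Table 4 setting: n2 = 2, n3 = 4, n4 = 2
print(flops_rls(2, 4, 2), flops_4s_rls(2, 4, 2))   # 660 vs 272 flops
```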

5 Four-stage stochastic gradient algorithm

In this section, a four-stage stochastic gradient algorithm is presented to estimate the unknown parameters and reduce the computational burden.

The second-order criterion function is considered as follows:

$$J\left(\rho \right)=\frac{1}{2}{\left[\overline{y }\left(t\right)-{\varphi }^{T}\left(t\right)\rho \right]}^{2}$$

By computing the gradient of \(J\), we have

$$\nabla \left[J\left(\rho \right)\right]=\frac{\partial \left(J\left(\rho \right)\right)}{\partial \rho }=-\varphi (t)\left[\overline{y }\left(t\right)-{\varphi }^{T}\left(t\right)\rho \right]$$

According to the gradient search principle, minimizing the objective function with step size \(\frac{1}{\mu (t)}\) yields the stochastic gradient algorithm:

$$ \begin{aligned} \hat{\rho }\left( t \right) & = \hat{\rho }\left( {t - 1} \right) + \frac{\varphi \left( t \right)}{{\mu \left( t \right)}}\left[ {\overline{y}\left( t \right) - \varphi^{T} \left( t \right)\hat{\rho }\left( {t - 1} \right) } \right] \\ \mu \left( t \right) & = \alpha \mu \left( {t - 1} \right) + \left\| {\varphi \left( t \right)} \right\|^{2} ,\,\,\mu \left( 0 \right) = 1 \\ \end{aligned} $$

where \(0\le \alpha <1\) is a forgetting factor that can improve the accuracy of parameter estimation.
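As a quick illustration, the recursion above is a one-line update per sample; a sketch (the forgetting-factor value is taken from the simulation settings of Sect. 6.1):

```python
import numpy as np

def sg_step(rho_hat, mu, phi, y, alpha=0.998):
    """One stochastic gradient update with forgetting factor alpha;
    initialize with mu = 1, per mu(0) = 1."""
    mu = alpha * mu + phi @ phi                    # mu(t) recursion
    rho_hat = rho_hat + (phi / mu) * (y - phi @ rho_hat)
    return rho_hat, mu
```

Hence, the four-stage stochastic gradient algorithm is obtained as follows: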

$${\widehat{\rho }}_{a}\left(t\right) ={\widehat{\rho }}_{a}\left(t-1\right)+\frac{{\widehat{\varphi }}_{z}\left(t\right)}{{\mu }_{1}\left(t\right)}[\overline{y }\left(t\right)-{\widehat{\varphi }}_{z\overline{u}}^{T}\left(t\right){\widehat{\rho }}_{b}\left(t-1\right)-{\varphi }_{\overline{u}}^{T}\left(t\right){\widehat{\rho }}_{f}\left(t-1\right)-{\widehat{\varphi }}_{n}^{T}\left(t\right){\widehat{\rho }}_{n}\left(t-1\right)-{\widehat{\varphi }}_{z}^{T}\left(t\right){\widehat{\rho }}_{a}\left(t-1\right)]$$
(77)
$${\mu }_{1}\left(t\right)=\alpha {\mu }_{1}\left(t-1\right)+{\Vert {\widehat{\varphi }}_{z}(t)\Vert }^{2}$$
(78)
$${\widehat{\rho }}_{b}\left(t\right)={\widehat{\rho }}_{b}\left(t-1\right)+\frac{{\widehat{\varphi }}_{z\overline{u}}\left(t\right)}{{\mu }_{2}\left(t\right)}[\overline{y }\left(t\right)-{\widehat{\varphi }}_{z}^{T}\left(t\right){\widehat{\rho }}_{a}\left(t-1\right)-{\varphi }_{\overline{u}}^{T}\left(t\right){\widehat{\rho }}_{f}\left(t-1\right)-{\widehat{\varphi }}_{n}^{T}\left(t\right){\widehat{\rho }}_{n}\left(t-1\right)-{\widehat{\varphi }}_{z\overline{u}}^{T}\left(t\right){\widehat{\rho }}_{b}\left(t-1\right)]$$
(79)
$${\mu }_{2}\left(t\right)=\alpha {\mu }_{2}\left(t-1\right)+{\Vert {\widehat{\varphi }}_{z\overline{u} }(t)\Vert }^{2}$$
(80)
$${\widehat{\rho }}_{f}\left(t\right)={\widehat{\rho }}_{f}\left(t-1\right)+\frac{{\varphi }_{\overline{u}}\left(t\right)}{{\mu }_{3}\left(t\right)}[\overline{y}\left(t\right)-{\widehat{\varphi }}_{z}^{T}\left(t\right){\widehat{\rho }}_{a}\left(t-1\right)-{\widehat{\varphi }}_{z\overline{u}}^{T}\left(t\right){\widehat{\rho }}_{b}\left(t-1\right)-{\widehat{\varphi }}_{n}^{T}\left(t\right){\widehat{\rho }}_{n}\left(t-1\right)-{\varphi }_{\overline{u}}^{T}\left(t\right){\widehat{\rho }}_{f}\left(t-1\right)]$$
(81)
$${\mu }_{3}\left(t\right)=\alpha {\mu }_{3}\left(t-1\right)+{\Vert {\varphi }_{\overline{u} }(t)\Vert }^{2}$$
(82)
$${\widehat{\rho }}_{n}\left(t\right)={\widehat{\rho }}_{n}\left(t-1\right)+\frac{{\widehat{\varphi }}_{n}\left(t\right)}{{\mu }_{4}\left(t\right)}[\widehat{\omega }\left(t\right)- {\widehat{\varphi }}_{n}^{T}\left(t\right){\widehat{\rho }}_{n}\left(t-1\right)]$$
(83)
$${\mu }_{4}\left(t\right)=\alpha {\mu }_{4}\left(t-1\right)+{\Vert {\widehat{\varphi }}_{n}(t)\Vert }^{2}$$
(84)
$${\widehat{\varphi }}_{z}\left(t\right)={\left[-{\widehat{z}}_{1}\left(t-1\right),-{\widehat{z}}_{1}\left(t-2\right),\dots ,-{\widehat{z}}_{1}\left(t-n\right)\right]}^{T}$$
(85)
$${\widehat{\varphi }}_{z\overline{u}}\left(t\right)=[{\widehat{{\varvec{z}}}}^{T}\left(t-1\right)\overline{u }\left(t-1\right),{\widehat{{\varvec{z}}}}^{T}\left(t-2\right)\overline{u }\left(t-2\right),\dots { ,\widehat{{\varvec{z}}}}^{T}\left(t-n\right)\overline{u }\left(t-n\right){]}^{T}\in {\mathbb{R}}^{{n}^{2}}$$
(86)
$${\varphi }_{\overline{u}}\left(t\right)={\left[\overline{u }\left(t-1\right),\overline{u }\left(t-2\right) ,\dots , \overline{u }\left(t-n\right)\right]}^{T}\in {\mathbb{R}}^{n}$$
(87)
$${\widehat{\varphi }}_{n}\left(t\right)=[-\widehat{\omega }\left(t-1\right),-\widehat{\omega }\left(t-2\right),\dots ,-\widehat{\omega }\left(t-m\right),\widehat{v}\left(t-1\right),\widehat{v}\left(t-2\right),\dots ,\widehat{v}\left(t-p\right){]}^{T}\in {\mathbb{R}}^{m+p}$$
(88)
$$\widehat{{\varvec{z}}}\left(t+1\right)=\widehat{A}\widehat{{\varvec{z}}}\left(t\right)+\widehat{B}\widehat{{\varvec{z}}}\left(t\right)\overline{u }\left(t\right)+{\widehat{\rho }}_{f}\overline{u }\left(t\right)+{K}_{z}\left(t\right)\left[\overline{y }\left(t\right)-h\widehat{{\varvec{z}}}\left(t\right)-{{\widehat{\varphi }}_{n}}^{T}\left(t\right){\widehat{\rho }}_{n}\right]$$
(89)
$${K}_{z}\left(t\right)=\widehat{A}{R}_{z}\left(t\right){h}^{T}{\left[h{R}_{z}\left(t\right){h}^{T}+{R}_{v}\right]}^{-1}+\widehat{B}\overline{u }\left(t\right){R}_{z}\left(t\right){h}^{T}{\left[h{R}_{z}\left(t\right){h}^{T}+{R}_{v}\right]}^{-1}$$
(90)
$${R}_{z}\left(t+1\right)=\left[\widehat{A}-{K}_{z}\left(t\right)h+\widehat{B}\overline{u }\left(t\right)\right]{R}_{z}\left(t\right) [{\widehat{A}}^{T}-{h}^{T}{{K}_{z}}^{T}\left(t\right)+{\widehat{B}}^{T}\overline{u }(t)]+{K}_{z}\left(t\right){R}_{v}{{K}_{z}}^{T}\left(t\right)$$
(91)
$$\widehat{A}\left(t\right)=\left[\begin{array}{ccccc} -{\widehat{a}}_{1}\left(t\right) & 1 & 0 & \cdots & 0 \\ -{\widehat{a}}_{2}\left(t\right) & 0 & 1 & \ddots & 0 \\ \vdots & \vdots & \ddots & \ddots & 0 \\ -{\widehat{a}}_{n-1}\left(t\right) & 0 & \cdots & 0 & 1 \\ -{\widehat{a}}_{n}\left(t\right) & 0 & \cdots & 0 & 0 \end{array}\right]$$
(92)
$$ \hat{B}\left( t \right): = \left[ {\begin{array}{c} {\hat{\varvec{b}}_{1} \left( t \right)} \\ {\hat{\varvec{b}}_{2} \left( t \right)} \\ \vdots \\ {\hat{\varvec{b}}_{{\varvec{n}}} \left( t \right)} \\ \end{array} } \right],\quad \hat{f}\left( t \right): = \left[ {\begin{array}{c} {\hat{f}_{1} \left( t \right)} \\ {\hat{f}_{2} \left( t \right)} \\ \vdots \\ {\hat{f}_{n} \left( t \right)} \\ \end{array} } \right] $$
(93)

In summary, the steps of the proposed algorithm are presented in Algorithm 2.

Algorithm 2 The four-stage stochastic gradient (4S-SG) algorithm
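Mirroring the 4S-RLS sketch, one time step of the parameter updates (77)–(84) can be written as follows; the same assumptions apply (uniform \((t-1)\) cross-terms, illustrative names, \(\alpha = 0.998\) from Sect. 6.1).

```python
import numpy as np

def fs_sg_step(est, phi_z, phi_zu, phi_u, phi_n, y, w_hat, alpha=0.998):
    """One time step of the four-stage stochastic gradient algorithm;
    est holds the pairs (rho_a, mu1), ..., (rho_n, mu4)."""
    (ra, m1), (rb, m2), (rf, m3), (rn, m4) = est
    ra0, rb0, rf0, rn0 = ra, rb, rf, rn            # rho(t-1) snapshots

    def sub_update(rho, mu, phi, target):
        mu = alpha * mu + phi @ phi                # step-size recursion
        return rho + (phi / mu) * (target - phi @ rho), mu

    ra, m1 = sub_update(ra0, m1, phi_z,
                        y - phi_zu @ rb0 - phi_u @ rf0 - phi_n @ rn0)
    rb, m2 = sub_update(rb0, m2, phi_zu,
                        y - phi_z @ ra0 - phi_u @ rf0 - phi_n @ rn0)
    rf, m3 = sub_update(rf0, m3, phi_u,
                        y - phi_z @ ra0 - phi_zu @ rb0 - phi_n @ rn0)
    rn, m4 = sub_update(rn0, m4, phi_n, w_hat)
    return (ra, m1), (rb, m2), (rf, m3), (rn, m4)
```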

6 Simulation results

In order to show the efficiency of the proposed methods, two examples are provided: a numerical bilinear state-space system to be identified, and a practical process represented by a bilinear state-space model.

6.1 Numerical example

Consider a bilinear state-space system as follows:

$$ \begin{aligned} {\varvec{z}}\left( {t + 1} \right) & = \left[ {\begin{array}{*{20}c} { - 0.20} & 1 \\ {0.25} & 0 \\ \end{array} } \right]{\varvec{z}}\left( t \right) \\ & \quad + { }\left[ {\begin{array}{*{20}c} {0.08} & {0.17} \\ { - 0.12} & { - 0.2} \\ \end{array} } \right]{\varvec{z}}\left( t \right)\overline{u}\left( t \right) \\ & \quad + \left[ {\begin{array}{*{20}c} {0.4} \\ 2 \\ \end{array} } \right]\overline{u}\left( t \right) \\ \overline{y}\left( t \right) & = \left[ {1,0} \right]{\varvec{z}}\left( t \right) - c\omega \left( {t - 1} \right) + dv\left( {t - 1} \right) + v\left( t \right) \\ \end{aligned} $$

The parameter vector for identification is

$$\rho ={\left[{a}_{1},{a}_{2},{b}_{11},{b}_{12},{b}_{21},{b}_{22},{f}_{1},{f}_{2},c,d\right]}^{T}$$
$$\rho ={\left[0.20 ,-0.25 , 0.08 , 0.17,-0.12 ,-0.2 , 0.40 , 2 ,-0.3 , 1\right]}^{T}$$

For the simulation studies, consider a persistently exciting input signal \(u(t)\); \(v(t)\) is white noise with zero mean and variance \({\sigma }^{2}={0.10}^{2}\), \(\beta =0.88\), \(\alpha =0.998\), and the data length is L = 3000. The parameter estimation error is calculated as \(\delta = \left\| {\hat{\rho }\left( t \right) - \rho } \right\|/\left\| \rho \right\|\). The parameter estimates and the error obtained with the four-stage recursive least squares algorithm are presented in Fig. 2 and Table 5. In addition, the results of the four-stage stochastic gradient algorithm are shown in Fig. 3 and Table 6.
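The error index and the true parameter vector translate directly to code; a sketch of the scoring used for Tables 5 and 6 (the simulation driver itself can follow the earlier 4S-RLS and 4S-SG sketches):

```python
import numpy as np

rho_true = np.array([0.20, -0.25, 0.08, 0.17, -0.12, -0.20,
                     0.40, 2.0, -0.3, 1.0])

def estimation_error(rho_hat):
    """delta = ||rho_hat(t) - rho|| / ||rho||, as defined above."""
    return np.linalg.norm(rho_hat - rho_true) / np.linalg.norm(rho_true)
```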

Fig. 2 4S-RLS estimation errors

Table 5 4S-RLS estimates and errors (\({\sigma }^{2}={0.10}^{2}\))
Fig. 3 4S-SG estimation errors

Table 6 4S-SG estimates and errors (\({\sigma }^{2}={0.10}^{2}\))

From the simulation results presented in the figures and tables, the following conclusions can be drawn. The proposed algorithms are effective, and the parameters estimated by them converge to their true values. Figures 2 and 3 show that the estimation error decreases at a suitable rate, and the results in Tables 5 and 6 confirm the convergence of the parameters to their true values. The 4S-RLS algorithm provides more effective parameter estimation than the 4S-SG algorithm: for the given data length and noise variance, it has a smaller estimation error and a higher parameter estimation accuracy. The system states and their estimates are shown in Figs. 4 and 5, respectively. The estimated states correspond well to the actual system states, indicating that the bilinear state observer is effective.

Fig. 4 States z(1) and z(2) and their estimates using the 4S-RLS algorithm

Fig. 5 States z(1) and z(2) and their estimates using the 4S-SG algorithm

6.2 Practical example

The pH neutralization process is used as a case study to demonstrate the effectiveness of the proposed methods. For this process, the input/output data are gathered using the GMN test signal [32]. In this highly nonlinear process, the acid flow (HNO3), base flow (NaOH) and buffer flow (NaHCO3) are the inputs to the system, denoted u1, u2 and u3, and the pH level is the system output. In this process, the acid flow rate and the tank capacity can be assumed constant. The structure of the pH neutralization process is shown in Fig. 6.

Fig. 6 Structure of the pH neutralization process [17]

In this example, it is assumed that only input/output data are available, and the proposed methods are used to identify the system with a bilinear state-space model. The 4S-RLS and 4S-SG algorithms are used to estimate the parameter vector \(\widehat{\rho }\left(t\right)\) with data length t = N = 1280 and variance \({\sigma }^{2}={0.10}^{2}\). To confirm the results, the estimated and actual outputs for the test dataset are shown in Figs. 7 and 8. The estimation error is calculated as \(e :=\frac{\Vert \widehat{y}\left(t\right)-y\Vert }{\Vert y\Vert }\times 100\). According to the simulation results, the error is 3.9734% for the 4S-RLS algorithm and 5.5564% for the 4S-SG algorithm.
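For completeness, the output-error score used here is simply the relative output norm in percent; a sketch:

```python
import numpy as np

def output_fit_error(y_hat, y):
    """e = ||y_hat - y|| / ||y|| * 100 on the test dataset."""
    y_hat, y = np.asarray(y_hat), np.asarray(y)
    return 100.0 * np.linalg.norm(y_hat - y) / np.linalg.norm(y)
```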

Fig. 7 Estimated output and real output of the pH neutralization process using the 4S-RLS algorithm

Fig. 8 Estimated output and real output of the pH neutralization process using the 4S-SG algorithm

7 Conclusion

In this paper, the parameter identification of bilinear state-space systems with colored noise expressed by an ARMA model was investigated. The proposed methods are based on the hierarchical principle. Since the system states are needed for identification while only the input/output data are available, a bilinear state observer was designed to estimate the states. Using the hierarchical identification principle and the gradient search, a four-stage recursive least squares algorithm and a four-stage stochastic gradient algorithm were presented to reduce the computational burden. The simulation results demonstrated that the 4S-SG algorithm is efficient for identifying bilinear systems, and that the 4S-RLS algorithm outperforms the 4S-SG algorithm, with a smaller estimation error. In addition, the accuracy of the proposed methods increases with the data length, for different noise variances.