1 Introduction

Originating from the needs of structural assessment and vibration control, identification of structural systems, states and excitations based on sparse measurements of structural responses has always been a challenging and important topic in structural health monitoring (SHM) [1,2,3,4,5].

In this regard, the classical Kalman filter (KF) is widely known as an effective algorithm for system identification [6, 7]. It is suitable for real-time recursive estimation of the complete states based on sparse measurements. However, the classical KF can only be applied to linear systems with known structural parameters and known inputs, which greatly limits its application. In subsequent studies, some scholars proposed extended Kalman filter (EKF) methods that treat the structural parameters as part of an augmented state [8,9,10]. However, Erazo et al. argued that augmenting the states with the parameters makes identification more challenging, and proposed an offline method for output-only Bayesian identification of stochastic nonlinear systems [11]. Because the structural parameters are coupled with the traditional states (displacement and velocity) in the augmented state-space model, both the state equations and the measurement equations are inevitably nonlinear. Although the EKF-based methods can perform parameter identification and nonlinear system identification, they still require that all input information be available.

With the development of the EKF method, several system identification methods under unknown inputs have appeared in the last two decades. In the early stage, Gillijns and De Moor derived a recursive joint estimation of states and inputs using linear minimum-variance unbiased estimation, which required direct feedthrough of the inputs in the measurement equations [12]. Yang et al. derived an adaptive extended Kalman filter under unknown inputs from the global optimal perspective and applied it to structural damage identification [13]. This method could identify the augmented structural states and unknown inputs in real time, but required that the unknown inputs appear in the measurement equation. This requirement is equivalent to imposing restrictions on the deployment of acceleration sensors. Hwang et al. developed a Kalman filtering method to identify unknown inputs using the generalized inverse of a matrix [14, 15], which was later improved by Niu et al. [16]. Lin proposed an EKF-based estimation method to determine the time-dependent excitation force in a nonlinear system [17]. Pan et al. used weighted least-squares estimation to derive a Kalman filtering method under unknown inputs and proved that the method is optimal in the sense of minimum variance and unbiasedness, but feedthrough of the unknown inputs was still needed in the measurement equation, and the derivation was quite complicated [18]. Lourens et al. proposed an augmented Kalman filter (AKF) for force identification, in which the unknown forces are included in the augmented state vector, and the states and unknown forces are solved using a method similar to extended Kalman filtering [19]. To reduce the dimensionality of the state, a method based on a reduced-order model was proposed [20]. Wei et al. also proposed an AKF based on sparse constraint theory, which introduces the random walk assumption when modeling the unknown inputs [21]. He et al. proposed a new EKF-based method for simultaneously identifying structural parameters and unknown inputs by introducing a projection matrix into the measurement equation [22]. Nayek et al. proposed a latent force model for joint input-state estimation by assuming that the unknown force is a stationary Gaussian process [23]. The authors also proposed some extended Kalman filter methods with unknown inputs (EKF-UI), which still required measurement of the accelerations at the excitation locations to identify the augmented states and unknown inputs in real time [24,25,26]. All the above EKF-UI methods can recursively identify the augmented structural states and unknown inputs in real time, but they all impose strict requirements on the measurement equation: it must contain direct feedthrough of the unknown inputs, which means that the acceleration responses at all positions of the unknown inputs must be available. In addition, it is worth mentioning that the significant impact of sensor deployment on identification has led to many advances in research on optimal sensor placement [27].

Some studies have also addressed EKF-UI methods that can identify the states of the current step and the unknown inputs of the previous step without requiring the unknown-input term to appear in the measurement equation [28, 29]. Pan et al. proposed a general extended Kalman filter for simultaneous estimation of the system and unknown inputs [30], which places no restrictions on the deployment of acceleration sensors, but whose real-time performance must be considered case by case. That is, when the acceleration responses at the locations of the unknown inputs are measured, the unknown inputs can be identified in real time; otherwise, the identification of the unknown inputs has a one-step lag. When the sampling interval is short, the impact of this one-step lag may not be very significant. However, when the sampling interval is long, the adverse effects of the one-step lag can no longer be neglected. In addition, when implementing vibration control strategies for structural systems subject to unknown external inputs, it is crucial to provide real-time structural state and input information to the control system. If there is a one-step lag in the input information, the control time may be significantly prolonged or the stability of the control system may change, ultimately leading to control failure. Therefore, real-time identification algorithms for unknown inputs have great potential in practical engineering applications and are worthy of further in-depth research.

With the continuous deepening of research, the new algorithms proposed in recent years have made some progress in breaking through the real-time performance constraints. The generalized extended Kalman filter with unknown input (GEKF-UI) algorithm proposed by the authors eliminates the limitation of real-time performance affected by sensor deployment by introducing first-order-hold (FOH) hypothesis into the discretization process of state equation [31,32,33]. This method can be applied to most scenarios, but it is still inadequate for some extreme cases, because the sensitivity matrix of excitation is very close to zero in these cases (Table 1).

Table 1 Summary of applicable scenarios and limitations of KF-based methods

The traditional Kalman-based methods always require linearization when dealing with nonlinear problems. This approximation may generate errors when encountering strong nonlinearity. In order to reduce the adverse impact of linearization on identification, Al Hussein et al. integrated unscented Kalman filtering (UKF) with iterative least squares (ILS) and proposed an unscented Kalman filter with unknown input (UKF-UI) [34]. This method is offline and requires measurement of the acceleration responses at all degrees of freedom. In subsequent studies, Lei et al. proposed a novel unscented Kalman filter for recursive state-input-system identification of nonlinear systems. The method works in real time and only requires partial measurement of the responses [35]. Kirchner et al. proposed a new time-domain method for joint state/input estimation of mechanical systems using compressed sensing (CS) in a moving horizon estimator (MHE). Because it relies on a sliding time window, the real-time performance of this method is limited [36]. KF-based and UKF-based methods are only applicable to Gaussian noise, whereas particle filtering (PF) can handle non-Gaussian noise. Liu et al. combined the extended Kalman particle filter (EKPF) with least squares (LS) to propose a new method (EKPF-UI) for joint identification of structural parameters and unknown excitations [37]. Lei et al. further extended the applicability of EKPF-UI to systems without direct feedthrough of the unknown excitation [38]. However, the curse of dimensionality is an inherent challenge of particle filtering, which limits the application of such methods.

In practice, identification under unknown inputs usually raises another tricky problem, namely the so-called drift in the estimated structural displacements and inputs, since the previous EKF-UI approaches based on sparse acceleration measurements are inherently unstable [39]. To solve this problem, Liu et al. proposed an improved Kalman filter with unknown inputs based on data fusion of sparse acceleration and displacement responses [26]. Ma et al. also studied data fusion-based Kalman filtering and proposed an adaptive multi-rate Kalman filter to fuse asynchronous acceleration and vision measurements, which yields better estimates of structural displacement [40, 41]. These data fusion-based Kalman filter techniques show great potential in practical applications.

In order to improve the performance of existing algorithms, an adaptive generalized extended Kalman filter with unknown inputs (AGEKF-UI) algorithm is proposed in this paper. It can simultaneously identify structural complete states, structural parameters and unknown inputs in real time based on sparse measurements of structural response. The real-time performance of this algorithm is not limited by the deployment of acceleration sensors and whether the system has direct feedthrough of unknown inputs. The proposed algorithm can automatically optimize the sensitivity matrix of unknown input and improve the identification accuracy of unknown input to the maximum extent. In order to eliminate the low-frequency drift in displacement and input estimation, data fusion technology is embedded into the proposed AGEKF-UI algorithm. Finally, a numerical example and an experimental test are used to demonstrate the effectiveness of the proposed method.

This paper is organized as follows: Section 1 reviews the research progress and the limitations of existing methods. Section 2 presents a general description of the dynamic system and derives the analytical recursive solutions of the AGEKF-UI method. In Sects. 3 and 4, a numerical example and an experimental test are conducted, respectively, to validate the performance of the proposed method. In Sect. 5, some conclusions are drawn. In addition, the appendices provide some details of the derivation of the proposed algorithm.

2 Methodology

In this section, the discrete motion equations and measurement equations of general nonlinear dynamic systems are first derived. Then, a novel adaptive generalized extended Kalman filter with unknown inputs is derived in detail based on the principle of minimum-variance unbiased estimation.

2.1 Problem formulation

The equation of motion of an n-degree-of-freedom (n-DOF) structure under external input can be generally expressed as:

$${\varvec{M}}\ddot{{\varvec{x}}}+{\varvec{F}}\left({\varvec{x}},\dot{{\varvec{x}}},{\varvec{\theta}}\right)={{\varvec{\eta}}}^{u}{{\varvec{f}}}^{u}$$
(1)

where \({\varvec{M}}\) is the mass matrix; \({\varvec{x}}\), \(\dot{{\varvec{x}}}\) and \(\ddot{{\varvec{x}}}\) are the displacement, velocity and acceleration of the structure, respectively; \({\varvec{\theta}}\) is the unknown structural parameter vector; \({\varvec{F}}\left({\varvec{x}},\dot{{\varvec{x}}},{\varvec{\theta}}\right)\) is the force vector related to the displacement, velocity and structural parameters; \({{\varvec{f}}}^{u}\) is the unmeasured external input; \({{\varvec{\eta}}}^{u}\) is the influence matrix associated with the unknown input \({{\varvec{f}}}^{u}\). The superscript ‘u’ denotes unknown. For linear structures, \({\varvec{F}}\left({\varvec{x}},\dot{{\varvec{x}}},{\varvec{\theta}}\right)={\varvec{C}}\left({\varvec{\theta}}\right)\dot{{\varvec{x}}}+{\varvec{K}}\left({\varvec{\theta}}\right){\varvec{x}}\), where \({\varvec{C}}\) and \({\varvec{K}}\) are the damping and stiffness matrices, respectively.

In this paper, structures are considered as time-invariant systems (i.e., assuming \(\dot{{\varvec{\theta}}}\equiv{\mathbf {0}}\)). Therefore, by introducing an augmented state vector \({\varvec{Z}}\stackrel{\scriptscriptstyle \vartriangle}{=}[{{\varvec{x}}}^{T},{\dot{{\varvec{x}}}}^{T},{{\varvec{\theta}}}^{T}{]}^{T}\), the equation of motion (1) can be transformed into a continuous state equation in the state-space model, and the measurement equation can also be expressed in terms of the augmented state vector.

$$\left\{\begin{array}{l}\dot{{\varvec{Z}}}=\varvec g\left({\varvec{Z}}\right)+\varvec E{{\varvec{f}}}^{u}\\ \varvec y= \varvec h\left({\varvec{Z}}\right)+\varvec D{{\varvec{f}}}^{u}\end{array}\right.$$
(2)
$${\varvec{g}}\left({\varvec{Z}}\right)=\left[\begin{array}{l}\dot{{\varvec{x}}}\\ -{{\varvec{M}}}^{-1}{\varvec{F}}\left({\varvec{x}},\dot{{\varvec{x}}},{\varvec{\theta}}\right)\\ {\mathbf{0}}\end{array}\right],\quad {\varvec{E}}=\left[\begin{array}{l}{\mathbf{0}}\\ {{\varvec{M}}}^{-1}{{\varvec{\eta}}}^{u}\\ {\mathbf{0}}\end{array}\right]$$
(3)

Here, \({\varvec{y}}\) is the measurement vector of the structural system. The specific forms of \({\varvec{h}}\left({\varvec{Z}}\right)\) and \({\varvec{D}}\) are discussed in detail in Sect. 2.4.
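As a concrete illustration of Eqs. (1)-(3), the following minimal Python sketch builds \({\varvec{g}}\left({\varvec{Z}}\right)\) and \({\varvec{E}}\) for a linear shear-type structure. The shear-building layout, the parameter ordering \({\varvec{\theta}}=[k_1,\dots,k_n,c_1,\dots,c_n]^T\) and the helper names `shear_matrix`, `g_fun` and `input_matrix` are assumptions made here for illustration only; \({\varvec{E}}\) is zero-padded so that its row dimension matches that of \({\varvec{Z}}\).

```python
import numpy as np

def shear_matrix(v):
    """Assemble a shear-building matrix from per-story values v (assumed layout)."""
    n = len(v)
    K = np.zeros((n, n))
    for i in range(n):
        K[i, i] += v[i]
        if i + 1 < n:
            K[i, i] += v[i + 1]
            K[i, i + 1] -= v[i + 1]
            K[i + 1, i] -= v[i + 1]
    return K

def g_fun(Z, M):
    """Drift term g(Z) of Eq. (3) for Z = [x, x_dot, theta], linear case."""
    n = M.shape[0]
    x, xd, theta = Z[:n], Z[n:2 * n], Z[2 * n:]
    K = shear_matrix(theta[:n])   # stiffness matrix K(theta)
    C = shear_matrix(theta[n:])   # damping matrix C(theta)
    F = C @ xd + K @ x            # F(x, x_dot, theta) for a linear structure
    # theta is time-invariant, so its derivative block is zero
    return np.concatenate([xd, -np.linalg.solve(M, F), np.zeros(theta.size)])

def input_matrix(M, eta, n_theta):
    """Input matrix E, zero-padded to the dimension of Z (assumption)."""
    n, n_f = eta.shape
    return np.vstack([np.zeros((n, n_f)),
                      np.linalg.solve(M, eta),
                      np.zeros((n_theta, n_f))])
```

Using `np.linalg.solve` instead of forming \({{\varvec{M}}}^{-1}\) explicitly is a numerical convenience and does not change the model.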

Note that, whether or not the structure itself is linear, the state equation and the measurement equation are nonlinear because the structural parameter vector \({\varvec{\theta}}\) is unknown and forms part of the augmented state. In order to derive the closed-form solution of the identification problem, Eq. (2) needs to be linearized at \({\widehat{{\varvec{Z}}}}_{k|k}\) and \({\widetilde{{\varvec{Z}}}}_{k+1|k}\) by the first-order Taylor series.

$$ \dot{\varvec{Z}} \approx {\varvec{G}}_{k|k} {\varvec{Z}} + {\varvec{Ef}}^{u} + \overline{\varvec{g}}_{k|k} $$
(4)
$${\varvec{y}}\approx {{\varvec{H}}}_{k+1|k}{\varvec{Z}}+{\varvec{D}}{{\varvec{f}}}^{u}+ {{\varvec{h}}}_{k+1|k}$$
(5)

in which

$${{\varvec{G}}}_{k|k}={\left.\frac{\partial {\varvec{g}}\left({\varvec{Z}}\right)}{\partial {{\varvec{Z}}}^{T}}\right|}_{{\varvec{Z}}={\widehat{{\varvec{Z}}}}_{k|k}}$$
(6)
$${{\varvec{H}}}_{k+1|k}={\left.\frac{\partial {\varvec{h}}\left({\varvec{Z}}\right) }{\partial {{\varvec{Z}}}^{T}}\right|}_{{\varvec{Z}}={\widetilde{{\varvec{Z}}}}_{k+1|k}}$$
(7)
$$ \overline{\varvec{g}}_{k|k} = {\varvec{g}}\left( {\widehat{\varvec{Z}}_{k|k} } \right) - {\varvec{G}}_{k|k} \widehat{\varvec{Z}}_{k|k} $$
(8)
$${{\varvec{h}}}_{k+1|k}={\varvec{h}}\left({\widetilde{{\varvec{Z}}}}_{k+1|k}\right)-{{\varvec{H}}}_{k+1|k}{\widetilde{{\varvec{Z}}}}_{k+1|k}$$
(9)

in which \({\widehat{{\varvec{Z}}}}_{k|k}\) represents the estimate of \({{\varvec{Z}}}_{k}\), and \({\widetilde{{\varvec{Z}}}}_{k+1|k}\) represents the approximate predicted value of \({{\varvec{Z}}}_{k+1}\). In order to calculate \({{\varvec{H}}}_{k+1|k}\) in Eq. (7), the value of \({\widetilde{{\varvec{Z}}}}_{k+1|k}\) should be calculated in advance. Generally, \({\widetilde{{\varvec{Z}}}}_{k+1|k}\) can be calculated by the following integration:

$${\widetilde{{\varvec{Z}}}}_{k+1|k}={\widehat{{\varvec{Z}}}}_{k|k}+{\int }_{k\Delta t}^{\left(k+1\right)\Delta t}{\varvec{g}}\left({\varvec{Z}}\right)dt+\left({\varvec{E}}\Delta t\right){\widehat{{\varvec{f}}}}_{k|k}^{u}$$
(10)

in which \({\widehat{{\varvec{f}}}}_{k|k}^{u}\) represents the estimate of \({{\varvec{f}}}_{k}^{u}\), and \(\Delta t\) denotes the sampling interval.
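The linearization in Eqs. (6)-(9) can be sketched numerically as follows. This is a minimal illustration that replaces the analytical Jacobian with a central finite difference (an assumption made here for brevity); the helper name `linearize`, the step size `eps` and the unit-mass, undamped single-DOF example `g_sdof` with \({\varvec{Z}}=[x,\dot{x},k]^T\) are hypothetical.

```python
import numpy as np

def linearize(fun, Z0, eps=1e-6):
    """Return (J, offset) such that fun(Z) ~= J @ Z + offset near Z0.

    J plays the role of G_{k|k} or H_{k+1|k} (Eqs. (6)-(7)); offset plays
    the role of g_bar_{k|k} or h_{k+1|k} (Eqs. (8)-(9)).
    """
    Z0 = np.asarray(Z0, dtype=float)
    f0 = np.asarray(fun(Z0), dtype=float)
    J = np.zeros((f0.size, Z0.size))
    for j in range(Z0.size):
        dZ = np.zeros_like(Z0)
        dZ[j] = eps
        # central difference approximation of column j of the Jacobian
        J[:, j] = (np.asarray(fun(Z0 + dZ)) - np.asarray(fun(Z0 - dZ))) / (2.0 * eps)
    return J, f0 - J @ Z0

def g_sdof(Z):
    """Unit-mass, undamped SDOF with unknown stiffness k (illustrative)."""
    x, v, k = Z
    return np.array([v, -k * x, 0.0])

G, g_bar = linearize(g_sdof, [1.0, 0.0, 2.0])
```

The product \(k x\) in `g_sdof` shows why the augmented model is nonlinear even for a linear structure: the Jacobian row of the velocity equation depends on both \(x\) and \(k\).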

The solution of the first-order differential Eq. (4) requires an assumption for the unknown input \({{\varvec{f}}}^{u}\). In the existing literature on EKF-UI algorithms [13, 22, 26, 30], the unknown input is almost always assumed to be zero-order-hold (ZOH), that is,

$${{\varvec{f}}}^{u}\left(t\right)\stackrel{\scriptscriptstyle\vartriangle}{=}{{\varvec{f}}}_{k}^{u},t\in [{t}_{k},{t}_{k+1})$$
(11)

Note: It is not necessary to provide \({{\varvec{f}}}_{k+1}^{u}\) when solving Eq. (4). With this assumption [i.e., Eq. (11)], the linearized continuous state Eq. (4) can be transformed into a linearized discrete state equation as follows [32]:

$${{\varvec{Z}}}_{k+1}={{\varvec{A}}}_{k}{{\varvec{Z}}}_{k}+{{\varvec{B}}}_{k}^{zoh}{{\varvec{f}}}_{k}^{u}+{{\varvec{g}}}_{k|k}$$
(12)
$$ \left\{ {\begin{array}{*{20}l} {{\varvec{A}}_{k} = e^{{\left( {{\varvec{G}}_{k|k} \Delta t} \right)}} { }} \\ {{\varvec{B}}_{k}^{zoh} = {\mathbb{I}}_{1} \left( {{\varvec{E}}\Delta t} \right) } \\ {{\varvec{g}}_{k|k} = {\mathbb{I}}_{1} \left( {\overline{\varvec{g}}_{k|k} \Delta t} \right)} \\ \end{array} } \right. $$
(13)

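in which \({\mathbb{I}}_{1}=\left({{\varvec{A}}}_{k}-\mathbf{I}\right){\left({{\varvec{G}}}_{k|k}\Delta t\right)}^{-1}\). Expanding the matrix exponential gives \({\mathbb{I}}_{1}=\mathbf{I}+\frac{1}{2}{{\varvec{G}}}_{k|k}\Delta t+\cdots\), so that \({\mathbb{I}}_{1}\approx \mathbf{I}\) when the sampling interval \(\Delta t\) is small.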
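A minimal numerical sketch of the ZOH discretization in Eq. (13) is given below. It evaluates \({{\varvec{A}}}_{k}\) and \({\mathbb{I}}_{1}\) by truncated power series, \({{\varvec{A}}}_{k}=\sum_{j\ge 0}{\varvec{X}}^{j}/j!\) and \({\mathbb{I}}_{1}=\sum_{j\ge 0}{\varvec{X}}^{j}/(j+1)!\) with \({\varvec{X}}={{\varvec{G}}}_{k|k}\Delta t\), which avoids the explicit inverse of \({{\varvec{G}}}_{k|k}\Delta t\). The series form, the function name `zoh_discretize` and the truncation order `terms` are choices made here for illustration, not taken from the original derivation.

```python
import numpy as np

def zoh_discretize(G, E, g_bar, dt, terms=20):
    """Return (A_k, B_zoh, g_k) of Eq. (13) using truncated series."""
    X = G * dt
    n = X.shape[0]
    A = np.eye(n)       # accumulates sum_j X^j / j!
    I1 = np.eye(n)      # accumulates sum_j X^j / (j+1)!
    P = np.eye(n)       # current term X^j / j!
    for j in range(1, terms):
        P = P @ X / j
        A = A + P
        I1 = I1 + P / (j + 1)
    B_zoh = I1 @ E * dt
    g_k = I1 @ g_bar * dt
    return A, B_zoh, g_k

# Double integrator example: G is nilpotent, so the series terminates.
G = np.array([[0.0, 1.0], [0.0, 0.0]])
E = np.array([[0.0], [1.0]])
A, B_zoh, g_k = zoh_discretize(G, E, np.array([0.0, 2.0]), dt=0.1)
```

The series is adequate when \(\Vert {{\varvec{G}}}_{k|k}\Delta t\Vert\) is small, which is the usual situation for short sampling intervals; for stiff systems a dedicated matrix-exponential routine would be a safer choice.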

Due to the fact that actual measurements are always discrete, the measurement equation should also be discretized, i.e.,

$${{\varvec{y}}}_{k+1}={{\varvec{H}}}_{k+1|k}{{\varvec{Z}}}_{k+1}+{\varvec{D}}{{\varvec{f}}}_{k+1}^{u}+ {{\varvec{h}}}_{k+1|k}$$
(14)

Substituting Eq. (12) into Eq. (14) gives:

$${{\varvec{y}}}_{k+1}={{\varvec{H}}}_{k+1|k}\left({{\varvec{A}}}_{k}{{\varvec{Z}}}_{k}+{{\varvec{B}}}_{k}^{zoh}{{\varvec{f}}}_{k}^{u}+{{\varvec{g}}}_{k|k}\right)+{\varvec{D}}{{\varvec{f}}}_{k+1}^{u}+ {{\varvec{h}}}_{k+1|k}$$
(15)

This equation shows that, in order to estimate the unknown input in real time, \({\varvec{D}}\) in the measurement equation must have full column rank. In other words, real-time identification of the unknown input depends directly on whether the measurement equation contains direct feedthrough of the unknown input.

However, in practice, it is sometimes impractical to deploy accelerometers at the positions of the unknown inputs (e.g., the case of substructure identification with unknown interaction forces at the boundary), which makes \({\varvec{D}}\) column-rank-deficient or even a zero matrix. Another application scenario is earthquake excitation: if the absolute acceleration of the structure is used as the measurement, \({\varvec{D}}\) is naturally equal to zero.

According to Eq. (15), when \({\varvec{D}}=0\) holds, the current measurement (i.e. \({{\varvec{y}}}_{k+1}\)) can only be used to estimate the unknown input of the previous instant (i.e. \({{\varvec{f}}}_{k}^{u}\)) rather than that of the current instant (i.e. \({{\varvec{f}}}_{k+1}^{u}\)), which means that there is a one-step lag in the identification of the unknown input. When \({\varvec{D}}\) is column-rank-deficient, only some unknown inputs can be identified in real time, while the others can only be identified with a one-step lag [30].
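The three cases discussed above for the ZOH scheme can be summarized by a simple rank check on \({\varvec{D}}\). The sketch below is only a diagnostic illustration; the function name `zoh_input_timing` and the returned labels are hypothetical.

```python
import numpy as np

def zoh_input_timing(D):
    """Classify input identifiability under the ZOH scheme from rank(D)."""
    D = np.atleast_2d(np.asarray(D, dtype=float))
    n_f = D.shape[1]                  # number of unknown inputs
    rank = np.linalg.matrix_rank(D)
    if rank == n_f:
        return "real-time"            # D has full column rank
    if rank == 0:
        return "one-step lag"         # D = 0: only f_k is reachable from y_{k+1}
    return "mixed"                    # some inputs real-time, others lagged

print(zoh_input_timing(np.zeros((2, 1))))
```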

2.2 Adaptive discrete equation of state for structural dynamical system

In some existing input identification methods, a virtual model is introduced to describe the evolution of the input. A well-known example is the random walk (RW) model [19, 21], that is:

$${{\varvec{f}}}_{k+1}^{u}\stackrel{\scriptscriptstyle\vartriangle}{=}{{\varvec{f}}}_{k}^{u}+{{\varvec{\varepsilon}}}_{k}^{u}$$
(16)

in which \({{\varvec{\varepsilon}}}_{k}^{u}\) is the random walk error, assumed to be zero-mean Gaussian with covariance \({{\varvec{Q}}}_{k}^{u}\).

The most important step in the novel scheme is to fuse Eqs. (12) and (16) to derive a general discrete state equation with an undetermined sensitivity matrix.

$$\begin{aligned} {{\varvec{Z}}}_{k+1}&={{\varvec{A}}}_{k}{{\varvec{Z}}}_{k}+{{\varvec{B}}}_{k}^{zoh}{{\varvec{f}}}_{k}^{u}+{{\varvec{g}}}_{k|k} \\ &={{\varvec{A}}}_{k}{{\varvec{Z}}}_{k}+\left({{\varvec{B}}}_{k}^{zoh}-{{\varvec{B}}}_{k+1}^{\text{opt}}\right){{\varvec{f}}}_{k}^{u}+{{\varvec{B}}}_{k+1}^{\text{opt}}\left({{\varvec{f}}}_{k+1}^{u}-{{\varvec{\varepsilon}}}_{k}^{u}\right)+{{\varvec{g}}}_{k|k} \\ &={{\varvec{A}}}_{k}{{\varvec{Z}}}_{k}+\left({{\varvec{B}}}_{k}^{zoh}-{{\varvec{B}}}_{k+1}^{\text{opt}}\right){{\varvec{f}}}_{k}^{u}+{{\varvec{B}}}_{k+1}^{\text{opt}}{{\varvec{f}}}_{k+1}^{u}+{{\varvec{g}}}_{k|k}-{{\varvec{B}}}_{k+1}^{\text{opt}}{{\varvec{\varepsilon}}}_{k}^{u} \\ &={{\varvec{A}}}_{k}{{\varvec{Z}}}_{k}+{{\varvec{B}}}_{k}^{\text{opt}}{{\varvec{f}}}_{k}^{u}+{{\varvec{B}}}_{k+1}^{\text{opt}}{{\varvec{f}}}_{k+1}^{u}+{{\varvec{g}}}_{k|k}+{{\varvec{w}}}_{k}^{u} \\ \end{aligned} $$
(17)

in which \({{\varvec{B}}}_{k+1}^{\text{opt}}\) is an undetermined matrix, \({{\varvec{B}}}_{k}^{\text{opt}}={{\varvec{B}}}_{k}^{zoh}-{{\varvec{B}}}_{k+1}^{\text{opt}}\), and \({{\varvec{w}}}_{k}^{u}=-{{\varvec{B}}}_{k+1}^{\text{opt}}{{\varvec{\varepsilon}}}_{k}^{u}\). It is worth mentioning that the determination of the matrices \({{\varvec{B}}}_{k}^{\text{opt}}\) and \({{\varvec{B}}}_{k+1}^{\text{opt}}\) does not rely on any prior assumption about the form of the unknown input.
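The algebra in Eq. (17) can be checked numerically: for any choice of \({{\varvec{B}}}_{k+1}^{\text{opt}}\), the split form reproduces the ZOH update of Eq. (12) exactly once \({{\varvec{\varepsilon}}}_{k}^{u}={{\varvec{f}}}_{k+1}^{u}-{{\varvec{f}}}_{k}^{u}\) from Eq. (16) is substituted. The following minimal sketch uses random matrices purely as placeholders; the function name `split_state_update` is hypothetical.

```python
import numpy as np

def split_state_update(A, B_zoh, B_opt_next, Z, f_k, f_next, g_k):
    """Evaluate the last line of Eq. (17) for a given B_opt_next."""
    eps = f_next - f_k                 # random walk increment, Eq. (16)
    B_opt_k = B_zoh - B_opt_next       # coefficient of f_k
    w_u = -B_opt_next @ eps            # RW-induced process error
    return A @ Z + B_opt_k @ f_k + B_opt_next @ f_next + g_k + w_u

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
B_zoh = rng.normal(size=(3, 1))
B_opt_next = rng.normal(size=(3, 1))   # arbitrary: the identity holds for any value
Z, g_k = rng.normal(size=3), rng.normal(size=3)
f_k, f_next = rng.normal(size=1), rng.normal(size=1)

zoh = A @ Z + B_zoh @ f_k + g_k        # Eq. (12)
split = split_state_update(A, B_zoh, B_opt_next, Z, f_k, f_next, g_k)
print(np.allclose(zoh, split))
```

In the filter itself \({{\varvec{\varepsilon}}}_{k}^{u}\) is not known, so \({{\varvec{w}}}_{k}^{u}\) is treated as process noise; the check above only confirms that the rearrangement introduces no additional modeling error.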

In order to clearly show the differences between different algorithms, Tables 2 and 3 summarize the coefficients of the unknown input of the discrete state equation and measurement equation in different algorithms, respectively. It can be concluded that: (1) the "ZOH scheme" cannot realize real-time identification when \({\varvec{D}}=0\), (2) the "FOH scheme" can realize real-time identification but performs poorly when \(\left({\varvec{D}}+{{\varvec{H}}}_{k+1|k}{{\varvec{B}}}_{k+1}^{foh}\right)\) is ill-posed, and (3) the "ZOH + RW scheme" can realize real-time identification and automatically improve \(\left({\varvec{D}}+{{\varvec{H}}}_{k+1|k}{{\varvec{B}}}_{k+1}^{\text{opt}}\right)\) as needed.

Table 2 Comparison of coefficients of \({f}_{k}^{u}\) and \({f}_{k+1}^{u}\) in the state equation of different schemes
Table 3 Comparison of coefficient of \({f}_{k+1}^{u}\) in the measurement equation of different schemes

Remark

The symbols \({\mathbb{I}}_{2}\) and \({\mathbb{O}}_{2}\) in Table 2 are defined as \({\mathbb{I}}_{2}=\left({{\varvec{A}}}_{k}-{\mathbb{I}}_{1}\right){\left({{\varvec{G}}}_{k|k}\Delta t\right)}^{-1}\approx \mathbf{I}\) and \({\mathbb{O}}_{2}=\left({\mathbb{I}}_{1}-\mathbf{I}\right){\left({{\varvec{G}}}_{k|k}\Delta t\right)}^{-1}\approx {\mathbf{0}}\), respectively.

2.3 The proposed AGEKF-UI algorithm

The standard forms of state equation and measurement equation are summarized as follows:

$${{\varvec{Z}}}_{k+1}={{\varvec{A}}}_{k}{{\varvec{Z}}}_{k}+{{\varvec{B}}}_{k}^{\text{opt}}{{\varvec{f}}}_{k}^{u}+{{\varvec{B}}}_{k+1}^{\text{opt}}{{\varvec{f}}}_{k+1}^{u}+{{\varvec{g}}}_{k|k}+{{\varvec{w}}}_{k}$$
(18)
$${{\varvec{y}}}_{k+1}={{\varvec{H}}}_{k+1|k}{{\varvec{Z}}}_{k+1}+{\varvec{D}}{{\varvec{f}}}_{k+1}^{u}+ {{\varvec{h}}}_{k+1|k}+ {{\varvec{v}}}_{k+1}$$
(19)

in which \({{\varvec{w}}}_{k}={{\varvec{w}}}_{k}^{u}+{{\varvec{w}}}_{k}^{s}=-{{\varvec{B}}}_{k+1}^{\text{opt}}{{\varvec{\varepsilon}}}_{k}^{u}+{{\varvec{w}}}_{k}^{s}\) is the total system error, which accounts for both the RW error and the modeling error; \({{\varvec{w}}}_{k}^{s}\) is the modeling error, assumed to be zero-mean Gaussian with covariance \({{\varvec{Q}}}_{k}^{s}\). Assuming that \({{\varvec{\varepsilon}}}_{k}^{u}\) and \({{\varvec{w}}}_{k}^{s}\) are uncorrelated, the total system error \({{\varvec{w}}}_{k}\) has zero mean and covariance \({{\varvec{Q}}}_{k}={{\varvec{B}}}_{k+1}^{\text{opt}}{{\varvec{Q}}}_{k}^{u}{\left({{\varvec{B}}}_{k+1}^{\text{opt}}\right)}^{T}+{{\varvec{Q}}}_{k}^{s}\) (note that the existing GEKF-UI algorithms [31,32,33] correspond to \({{\varvec{Q}}}_{k}^{u}\equiv 0\)). \({{\varvec{v}}}_{k+1}\) is the measurement error, assumed to be zero-mean Gaussian with covariance \({{\varvec{R}}}_{k+1}\).

The recursive process of the AGEKF-UI algorithm is designed as follows.

$$ \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{Z} }_{k + 1|k} = {\varvec{A}}_{k} \widehat{\varvec{Z}}_{k|k} + {\varvec{B}}_{k}^{\text{opt}} \hat{\varvec{f}}_{k|k}^{u} + {\varvec{g}}_{k|k} $$
(20)
$$ \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{y} }_{k + 1|k} = {\varvec{H}}_{k + 1|k} \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{Z} }_{k + 1|k} + {\varvec{h}}_{k + 1|k} $$
(21)
$$ {\varvec{r}}_{k + 1}^{{\varvec{y}}} = {\varvec{y}}_{k + 1} - \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{y} }_{k + 1|k} $$
(22)
$${\widehat{{\varvec{f}}}}_{k+1|k+1}^{u}={{\varvec{S}}}_{k+1}{{\varvec{r}}}_{k+1}^{{\varvec{y}}}$$
(23)
$$ \widehat{\varvec{Z}}_{k + 1|k + 1} = \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{Z} }_{k + 1|k} + {\varvec{L}}_{k + 1} {\varvec{r}}_{k + 1}^{{\varvec{y}}} $$
(24)

where \({{\varvec{B}}}_{k+1}^{\text{opt}}\), \({{\varvec{S}}}_{k+1}\) and \({{\varvec{L}}}_{k+1}\) are undetermined coefficient matrices, which will be determined later by the principle of minimum-variance unbiased estimation.
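The recursion in Eqs. (20)-(24) can be sketched as a single function once the coefficient matrices are available. In the minimal illustration below, \({{\varvec{S}}}_{k+1}\) and \({{\varvec{L}}}_{k+1}\) are simply passed in as arguments, since their expressions are derived later; the function name `agekf_ui_step` and the argument names are hypothetical.

```python
import numpy as np

def agekf_ui_step(Z_hat, f_hat, y_next, A, B_opt_k, g_k, H, h, S, L):
    """One pass through Eqs. (20)-(24) with given gain matrices S and L."""
    Z_pred = A @ Z_hat + B_opt_k @ f_hat + g_k   # Eq. (20)
    y_pred = H @ Z_pred + h                      # Eq. (21)
    r_y = y_next - y_pred                        # Eq. (22), measurement residual
    f_new = S @ r_y                              # Eq. (23), current-step input
    Z_new = Z_pred + L @ r_y                     # Eq. (24), updated state
    return Z_pred, r_y, f_new, Z_new

# Scalar toy numbers, chosen only to make the arithmetic easy to follow.
Z_pred, r_y, f_new, Z_new = agekf_ui_step(
    Z_hat=np.array([1.0]), f_hat=np.array([2.0]), y_next=np.array([3.0]),
    A=np.array([[1.0]]), B_opt_k=np.array([[0.5]]), g_k=np.array([0.0]),
    H=np.array([[1.0]]), h=np.array([0.0]),
    S=np.array([[2.0]]), L=np.array([[0.5]]))
```

Note that the prediction in Eq. (20) deliberately omits the term \({{\varvec{B}}}_{k+1}^{\text{opt}}{{\varvec{f}}}_{k+1}^{u}\), so the residual of Eq. (22) carries the information about the current input.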

2.3.1 Solving the unknown input

Define the state residual \({{\varvec{r}}}_{k+1}^{{\varvec{z}}}\) and its error \(\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{e} }_{k + 1|k}^{{\varvec{Z}}}\) as follows:

$$ {\varvec{r}}_{k + 1}^{{\varvec{z}}} \triangleq {\varvec{Z}}_{k + 1} - \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{Z} }_{k + 1|k} = {\varvec{B}}_{k + 1}^{\text{opt}} {\varvec{f}}_{k + 1}^{u} + \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{e} }_{k + 1|k}^{{\varvec{Z}}} $$
(25)
$$ \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{e} }_{k + 1|k}^{{\varvec{Z}}} \triangleq {\varvec{A}}_{k} \hat{\varvec{e}}_{k|k}^{{\varvec{Z}}} + {\varvec{B}}_{k}^{\text{opt}} \hat{\varvec{e}}_{k|k}^{{\varvec{f}}} + {\varvec{w}}_{k} $$
(26)

Then, define the covariance matrix \(\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}}\) of the state residual error as follows:

$$ \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} \triangleq E\left[ {\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{e} }_{k + 1|k}^{{\varvec{Z}}} \left( {\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{e} }_{k + 1|k}^{{\varvec{Z}}} } \right)^{T} } \right] = \left[ {\begin{array}{*{20}l} {{\varvec{A}}_{k} } & {{\varvec{B}}_{k}^{\text{opt}} } \\ \end{array} } \right]\left[ {\begin{array}{*{20}l} {\hat{\varvec{P}}_{k|k}^{{{\varvec{ZZ}}}} } & {\hat{\varvec{P}}_{k|k}^{{{\varvec{Zf}}}} } \\ {\hat{\varvec{P}}_{k|k}^{{{\varvec{fZ}}}} } & {\hat{\varvec{P}}_{k|k}^{{{\varvec{ff}}}} } \\ \end{array} } \right]\left[ {\begin{array}{*{20}l} {{\varvec{A}}_{k}^{T} } \\ {\left( {{\varvec{B}}_{k}^{\text{opt}} } \right)^{T} } \\ \end{array} } \right] + {\varvec{Q}}_{k} $$
(27)

where \(E\left[ \cdot \right]\) denotes the mathematical expectation of a random variable and \(\left( \cdot \right)^{T}\) denotes the matrix transpose. \({\widehat{{\varvec{P}}}}_{k|k}^{{\varvec{Z}}{\varvec{Z}}}\) and \({\widehat{{\varvec{P}}}}_{k|k}^{{\varvec{f}}{\varvec{f}}}\) are the state covariance matrix and the unknown input covariance matrix, respectively, while \({\widehat{{\varvec{P}}}}_{k|k}^{{\varvec{Z}}{\varvec{f}}}\) and \({\widehat{{\varvec{P}}}}_{k|k}^{{\varvec{f}}{\varvec{Z}}}\) are the cross-covariance matrices between the state and the unknown input.
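Equation (27), together with the expression \({{\varvec{Q}}}_{k}={{\varvec{B}}}_{k+1}^{\text{opt}}{{\varvec{Q}}}_{k}^{u}{\left({{\varvec{B}}}_{k+1}^{\text{opt}}\right)}^{T}+{{\varvec{Q}}}_{k}^{s}\), can be evaluated as in the following minimal sketch. The function name `predict_covariance` is hypothetical, and the code uses \({\widehat{{\varvec{P}}}}_{k|k}^{{\varvec{f}}{\varvec{Z}}}={\left({\widehat{{\varvec{P}}}}_{k|k}^{{\varvec{Z}}{\varvec{f}}}\right)}^{T}\), which holds for any joint covariance matrix.

```python
import numpy as np

def predict_covariance(A, B_opt_k, B_opt_next, P_zz, P_zf, P_ff, Q_u, Q_s):
    """Covariance of the state residual error, Eq. (27)."""
    # Total process noise covariance: RW part plus modeling part
    Q = B_opt_next @ Q_u @ B_opt_next.T + Q_s
    AB = np.hstack([A, B_opt_k])                      # [A_k, B_k^opt]
    P_joint = np.block([[P_zz, P_zf],
                        [P_zf.T, P_ff]])              # joint state/input covariance
    return AB @ P_joint @ AB.T + Q

A = np.array([[1.0, 0.1], [0.0, 1.0]])
B_k = np.array([[0.0], [0.05]])
B_next = np.array([[0.005], [0.05]])
P_pred = predict_covariance(A, B_k, B_next,
                            P_zz=0.01 * np.eye(2),
                            P_zf=np.array([[0.001], [0.002]]),
                            P_ff=np.array([[1.0]]),
                            Q_u=np.array([[4.0]]),
                            Q_s=1e-6 * np.eye(2))
```

Setting `Q_u` to zero in this sketch recovers the case noted in the text for the existing GEKF-UI algorithms.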

Define the measurement residual \({{\varvec{r}}}_{k+1}^{{\varvec{y}}}\) and its error \({{\varvec{e}}}_{k+1}^{{\varvec{y}}}\) as follows:

$$ {\varvec{r}}_{k + 1}^{{\varvec{y}}} \triangleq {\varvec{y}}_{k + 1} - \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{y} }_{k + 1|k} = {\varvec{T}}_{k + 1} {\varvec{f}}_{k + 1}^{u} + {\varvec{e}}_{k + 1}^{{\varvec{y}}} $$
(28)
$$ {\varvec{e}}_{k + 1}^{{\varvec{y}}} \triangleq {\varvec{H}}_{k + 1|k} \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{e} }_{k + 1|k}^{{\varvec{Z}}} + {\varvec{v}}_{k + 1} $$
(29)

in which \({{\varvec{T}}}_{k+1}={\varvec{D}}+{{\varvec{H}}}_{k+1|k}{{\varvec{B}}}_{k+1}^{\text{opt}}\). Then, define the covariance matrix \({\widetilde{{\varvec{R}}}}_{k+1}\) of the measurement residual error as follows:

$$ \tilde{\varvec{R}}_{k + 1} \triangleq E\left[ {{\varvec{e}}_{k + 1}^{{\varvec{y}}} \left( {{\varvec{e}}_{k + 1}^{{\varvec{y}}} } \right)^{T} } \right] = {\varvec{H}}_{k + 1|k} \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} {\varvec{H}}_{k + 1|k}^{T} + {\varvec{R}}_{k + 1} $$
(30)

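Note that the invertibility of \({\widetilde{{\varvec{R}}}}_{k+1}\) can generally be guaranteed: the first term on the right-hand side of Eq. (30) is positive semidefinite, and \({{\varvec{R}}}_{k+1}\) is positive definite (and hence invertible) whenever every measurement channel contains noise. In order to ensure that the estimate of the unknown input is unbiased, the following equation must hold.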

$$E\left[{\widehat{{\varvec{f}}}}_{k+1|k+1}^{u}\right]\equiv {{\varvec{f}}}_{k+1}^{u}\Rightarrow {{\varvec{S}}}_{k+1}{{\varvec{T}}}_{k+1}{{\varvec{f}}}_{k+1}^{u}\equiv {{\varvec{f}}}_{k+1}^{u}$$
(31)

Simplify the above equation to get the following equation:

$${{\varvec{S}}}_{k+1}{{\varvec{T}}}_{k+1}=\mathbf{I}$$
(32)

Equation (32) is the first constraint that the undetermined coefficient matrix \({{\varvec{S}}}_{k+1}\) must satisfy. In addition, \({{\varvec{S}}}_{k+1}\) should also minimize the estimation error of the unknown input. Define the error of the unknown input and the corresponding covariance matrix as follows:

$${\widehat{{\varvec{e}}}}_{k+1|k+1}^{{\varvec{f}}}\triangleq {{\varvec{f}}}_{k+1}^{u}-{\widehat{{\varvec{f}}}}_{k+1|k+1}^{u}={{\varvec{f}}}_{k+1}^{u}-{{\varvec{S}}}_{k+1}{{\varvec{r}}}_{k+1}^{{\varvec{y}}}=-{{\varvec{S}}}_{k+1}{{\varvec{e}}}_{k+1}^{{\varvec{y}}}$$
(33)
$${\widehat{{\varvec{P}}}}_{k+1|k+1}^{{\varvec{f}}{\varvec{f}}}\triangleq E\left[\left({\widehat{{\varvec{e}}}}_{k+1|k+1}^{{\varvec{f}}}\right){\left({\widehat{{\varvec{e}}}}_{k+1|k+1}^{{\varvec{f}}}\right)}^{T}\right]={{\varvec{S}}}_{k+1}{\widetilde{{\varvec{R}}}}_{k+1}{{\varvec{S}}}_{k+1}^{T}$$
(34)

Using the Lagrange multiplier method (LMM), define the objective function that minimizes the unknown input error as follows:

$${P}_{{\varvec{f}}}\left({{\varvec{S}}}_{k+1},{\boldsymbol{\Gamma }}_{k+1}|{{\varvec{B}}}_{k+1}^{\text{opt}}\right)\triangleq {\text{tr}}\left({{\varvec{S}}}_{k+1}{\widetilde{{\varvec{R}}}}_{k+1}{{\varvec{S}}}_{k+1}^{T}\right)-2{\text{tr}}\left[\left({{\varvec{S}}}_{k+1}{{\varvec{T}}}_{k+1}-\mathbf{I}\right){\boldsymbol{\Gamma }}_{k+1}^{T}\right]$$
(35)

in which \({\text{tr}}\left( \cdot \right)\) represents the operation of computing the trace of a square matrix. \({\boldsymbol{\Gamma }}_{k+1}\) is the Lagrange multiplier coefficient matrix to be determined. In order to obtain the minimum value of the objective function \({P}_{{\varvec{f}}}\), the derivative of the objective function with respect to \({{\varvec{S}}}_{k+1}\) should be set to zero, whereby the second constraint on the unknown coefficient matrix \({{\varvec{S}}}_{k+1}\) can be obtained.

$$\frac{\partial {P}_{{\varvec{f}}}}{\partial {{\varvec{S}}}_{k+1}}=0\Rightarrow {{\varvec{S}}}_{k+1}{\widetilde{{\varvec{R}}}}_{k+1}-{\boldsymbol{\Gamma }}_{k+1}{{\varvec{T}}}_{k+1}^{T}=0$$
(36)

The two constraints on \({{\varvec{S}}}_{k+1}\) [i.e., Eq. (32) and Eq. (36)] can be combined to constitute a simultaneous equation system as follows:

$$\left[\begin{array}{cc}{{\varvec{S}}}_{k+1}& {\boldsymbol{\Gamma }}_{k+1}\end{array}\right]\left[\begin{array}{cc}{\widetilde{{\varvec{R}}}}_{k+1}& {{\varvec{T}}}_{k+1}\\ -{{\varvec{T}}}_{k+1}^{T}& {\mathbf {0}}\end{array}\right]=\left[\begin{array}{cc}{\mathbf {0}}& {\mathbf{I}}\end{array}\right]$$
(37)

According to the matrix theory (refer to “Appendix 2”), when \({\widetilde{{\varvec{R}}}}_{k+1}\) is invertible, the above equation system has a unique solution. Thus, the undetermined coefficient matrix \({{\varvec{S}}}_{k+1}\) can be obtained.

$${\boldsymbol{\Gamma }}_{k+1}={\left({{\varvec{T}}}_{k+1}^{T}{\widetilde{{\varvec{R}}}}_{k+1}^{-1}{{\varvec{T}}}_{k+1}\right)}^{-1}$$
(38)
$${{\varvec{S}}}_{k+1}={\boldsymbol{\Gamma }}_{k+1}{{\varvec{T}}}_{k+1}^{T}{\widetilde{{\varvec{R}}}}_{k+1}^{-1}$$
(39)

Substituting Eq. (39) into Eq. (34) gives the minimum of \({\widehat{{\varvec{P}}}}_{k+1|k+1}^{{\varvec{f}}{\varvec{f}}}\) under the condition that \({{\varvec{B}}}_{k+1}^{\text{opt}}\) is known a priori as follows:

$${\widehat{{\varvec{P}}}}_{k+1|k+1}^{{\varvec{f}}{\varvec{f}}}={\boldsymbol{\Gamma }}_{k+1}$$
(40)
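As a numerical illustration of Eqs. (36) to (40), the following minimal sketch evaluates \({\boldsymbol{\Gamma }}_{k+1}\) and \({{\varvec{S}}}_{k+1}\) for randomly generated matrices. The dimensions, the random \({{\varvec{T}}}_{k+1}\) and the symmetric positive definite \({\widetilde{{\varvec{R}}}}_{k+1}\) are assumptions chosen only for demonstration, and the function name `input_gain` is hypothetical.

```python
import numpy as np

def input_gain(T, R_tilde):
    """Return (Gamma, S) following Eqs. (38) and (39).

    T       : (n_s, n_f) matrix, assumed to have full column rank
    R_tilde : (n_s, n_s) symmetric positive definite matrix
    """
    Rinv_T = np.linalg.solve(R_tilde, T)   # R_tilde^{-1} T without forming the inverse
    Gamma = np.linalg.inv(T.T @ Rinv_T)    # Eq. (38)
    S = Gamma @ Rinv_T.T                   # Eq. (39); uses the symmetry of R_tilde
    return Gamma, S

# Illustrative (assumed) sizes: 6 measurements, 2 unknown inputs
rng = np.random.default_rng(0)
n_s, n_f = 6, 2
T = rng.standard_normal((n_s, n_f))
A = rng.standard_normal((n_s, n_s))
R_tilde = A @ A.T + n_s * np.eye(n_s)      # symmetric positive definite by construction

Gamma, S = input_gain(T, R_tilde)

unbiased_residual = S @ T - np.eye(n_f)          # constraint S T = I used in Eq. (37)
stationarity_residual = S @ R_tilde - Gamma @ T.T  # Eq. (36)
covariance_residual = S @ R_tilde @ S.T - Gamma    # Eq. (40)
```

All three residuals vanish to machine precision, which is consistent with Eq. (40): the minimized input error covariance equals the multiplier matrix \({\boldsymbol{\Gamma }}_{k+1}\).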

In addition, it can be seen from Eq. (38) that a solution for \({\boldsymbol{\Gamma }}_{k+1}\) exists only if \(\left({{\varvec{T}}}_{k+1}^{T}{\widetilde{{\varvec{R}}}}_{k+1}^{-1}{{\varvec{T}}}_{k+1}\right)\) is invertible, which in turn requires that the number of measurements be greater than or equal to the number of unknown inputs. This conclusion is proved as follows.

Let the total number of measurements (acceleration, velocity, displacement, strain, etc.) be \({n}_{s}\), and the number of unknown inputs be \({n}_{f}\). Define \({\text{Rank}}\left( \cdot \right)\) as the operation of computing the rank of a matrix. Then, the inequality \({\text{Rank}}\left({{\varvec{T}}}_{k+1}^{T}{\widetilde{{\varvec{R}}}}_{k+1}^{-1}{{\varvec{T}}}_{k+1}\right)\le {\text{min}}\left({n}_{s},{n}_{f}\right)\) follows directly from matrix theory. When \({n}_{s}<{n}_{f}\), the \({n}_{f}\times {n}_{f}\) matrix \(\left({{\varvec{T}}}_{k+1}^{T}{\widetilde{{\varvec{R}}}}_{k+1}^{-1}{{\varvec{T}}}_{k+1}\right)\) is rank-deficient, so Eq. (38) has no solution. Therefore, \({n}_{s}\ge {n}_{f}\) is a necessary condition for applying the proposed method. So far, the unknown input of the current step has been obtained according to the principle of minimum-variance unbiased estimation. Next, a similar process is used to obtain the minimum-variance unbiased estimate of the state.
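The rank argument can be checked numerically. The sketch below is illustrative only: it uses random matrices and an identity \({\widetilde{{\varvec{R}}}}_{k+1}\) as assumptions, and the helper name `input_information_matrix` is hypothetical.

```python
import numpy as np

def input_information_matrix(T, R_tilde):
    """Form T^T R_tilde^{-1} T, the matrix inverted in Eq. (38)."""
    return T.T @ np.linalg.solve(R_tilde, T)

rng = np.random.default_rng(1)
n_f = 3                      # assumed number of unknown inputs
ranks = {}
for n_s in (2, 3, 5):        # fewer, equal and more measurements than inputs
    T = rng.standard_normal((n_s, n_f))
    R_tilde = np.eye(n_s)    # simplest symmetric positive definite choice
    G = input_information_matrix(T, R_tilde)   # always n_f x n_f
    ranks[n_s] = int(np.linalg.matrix_rank(G))
```

For \({n}_{s}=2<{n}_{f}\) the \(3\times 3\) matrix has rank 2 and cannot be inverted, whereas for \({n}_{s}\ge {n}_{f}\) a generic \({{\varvec{T}}}_{k+1}\) gives full rank. Note that \({n}_{s}\ge {n}_{f}\) is necessary but not sufficient: \({{\varvec{T}}}_{k+1}\) must also have full column rank.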

2.3.2 Solving the unknown state

In order to ensure that the estimate of the unknown state is unbiased, the following equation must hold.

$$E\left[{\widehat{{\varvec{Z}}}}_{k+1|k+1}\right]\equiv {{\varvec{Z}}}_{k+1}\Rightarrow \left({{\varvec{Z}}}_{k+1}-{{\varvec{B}}}_{k+1}^{\text{opt}}{{\varvec{f}}}_{k+1}^{u}\right)+{{\varvec{L}}}_{k+1}{{\varvec{T}}}_{k+1}{{\varvec{f}}}_{k+1}^{u}\equiv {{\varvec{Z}}}_{k+1}$$
(41)

Simplify the above equation to get the following equation:

$${{\varvec{L}}}_{k+1}{{\varvec{T}}}_{k+1}-{{\varvec{B}}}_{k+1}^{\text{opt}}={\mathbf{ 0}}$$
(42)

Equation (42) is the first constraint that the undetermined coefficients matrix \({{\varvec{L}}}_{k+1}\) must satisfy. In addition, \({{\varvec{L}}}_{k+1}\) should also meet the requirement of minimizing the estimation error of the unknown state. Define the error of the unknown state and the corresponding covariance matrix as follows:

$$ \hat{\varvec{e}}_{k + 1|k + 1}^{{\varvec{Z}}} \triangleq {\varvec{Z}}_{k + 1} - \widehat{\varvec{Z}}_{k + 1|k + 1} { } = \left( {{\mathbf{I}} - {\varvec{L}}_{k + 1} {\varvec{H}}_{k + 1|k} } \right)\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{e} }_{k + 1|k}^{{\varvec{Z}}} - {\varvec{L}}_{k + 1} {\varvec{v}}_{k + 1} $$
(43)
$$ \begin{aligned} \hat{\varvec{P}}_{k + 1|k + 1}^{{{\varvec{ZZ}}}} & \triangleq E\left[ {\left( {\hat{\varvec{e}}_{k + 1|k + 1}^{{\varvec{Z}}} } \right)\left( {\hat{\varvec{e}}_{k + 1|k + 1}^{{\varvec{Z}}} } \right)^{T} } \right] \\ & = \left( {{\mathbf{I}} - {\varvec{L}}_{k + 1} {\varvec{H}}_{k + 1|k} } \right)\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} \left( {{\mathbf{I}} - {\varvec{L}}_{k + 1} {\varvec{H}}_{k + 1|k} } \right)^{T} + {\varvec{L}}_{k + 1} {\varvec{R}}_{k + 1} {\varvec{L}}_{k + 1}^{T} \\ & = {\varvec{L}}_{k + 1} \tilde{\varvec{R}}_{k + 1} {\varvec{L}}_{k + 1}^{T} - {\varvec{L}}_{k + 1} {\varvec{H}}_{k + 1|k} \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} - \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} {\varvec{H}}_{k + 1|k}^{T} {\varvec{L}}_{k + 1}^{T} + \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} \\ \end{aligned} $$
(44)

Using the Lagrange multiplier method, define the objective function that minimizes the unknown state error as follows:

$$ \begin{aligned} P_{{\varvec{Z}}} \left( {{\varvec{L}}_{k + 1} ,{\varvec{\varLambda}}_{k + 1} |{\varvec{B}}_{k + 1}^{\text{opt}} } \right) & \triangleq {\text{tr}}\left( {{\varvec{L}}_{k + 1} \tilde{\varvec{R}}_{k + 1} {\varvec{L}}_{k + 1}^{T} - 2\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} {\varvec{H}}_{k + 1|k}^{T} {\varvec{L}}_{k + 1}^{T} + \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} } \right) \\ & \quad - 2{\text{tr}}\left[ {\left( {{\varvec{L}}_{k + 1} {\varvec{T}}_{k + 1} - {\varvec{B}}_{k + 1}^{\text{opt}} } \right){\varvec{\varLambda}}_{k + 1}^{T} } \right] \\ \end{aligned} $$
(45)

in which \({\boldsymbol{\Lambda }}_{k+1}\) is the Lagrange multiplier coefficient matrix to be determined. In order to obtain the minimum value of the objective function \({P}_{{\varvec{Z}}}\), the derivative of the objective function with respect to \({{\varvec{L}}}_{k+1}\) should be set to zero, whereby the second constraint on the unknown coefficient matrix \({{\varvec{L}}}_{k+1}\) can be obtained.

$$ \frac{{\partial P_{{\varvec{Z}}} }}{{\partial {\varvec{L}}_{k + 1} }} = {\mathbf{ 0}} \Rightarrow {\varvec{L}}_{k + 1} \tilde{\varvec{R}}_{k + 1} -{\varvec{\varLambda}}_{k + 1} {\varvec{T}}_{k + 1}^{T} = \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} {\varvec{H}}_{k + 1|k}^{T} $$
(46)

The two constraints on \({{\varvec{L}}}_{k+1}\) [i.e., Eqs. (42) and (46)] can be combined to constitute a simultaneous equation system as follows:

$$ \left[ {\begin{array}{*{20}l} {{\varvec{L}}_{k + 1} } & {{\varvec{\varLambda}}_{k + 1} } \\ \end{array} } \right]\left[ {\begin{array}{*{20}l} {\tilde{\varvec{R}}_{k + 1} } & {{\varvec{T}}_{k + 1} } \\ { - {\varvec{T}}_{k + 1}^{T} } &{\mathbf{ 0}} \\ \end{array} } \right] = \left[ {\begin{array}{*{20}l} {\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} {\varvec{H}}_{k + 1|k}^{T} } & {{\varvec{B}}_{k + 1}^{\text{opt}} } \\ \end{array} } \right] $$
(47)

According to the matrix theory (refer to “Appendix 2”), when \({\widetilde{{\varvec{R}}}}_{k+1}\) is invertible, the above equation system has a unique solution. Thus, the undetermined coefficient matrix \({{\varvec{L}}}_{k+1}\) can be obtained.

$$ \left[ {\begin{array}{*{20}l} {{\varvec{L}}_{k + 1} } \\ {{\varvec{\varLambda}}_{k + 1} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}l} {\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} {\varvec{H}}_{k + 1|k}^{T} \tilde{\varvec{R}}_{k + 1}^{ - 1} \left( {{\mathbf{I}} - {\varvec{T}}_{k + 1} {\varvec{S}}_{k + 1} } \right) + {\varvec{B}}_{k + 1}^{\text{opt}} {\varvec{S}}_{k + 1} } \\ { - \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} {\varvec{H}}_{k + 1|k}^{T} {\varvec{S}}_{k + 1}^{T} + {\varvec{B}}_{k + 1}^{\text{opt}}{\varvec{\varGamma}}_{k + 1} } \\ \end{array} } \right] $$
(48)

In order to simplify the above formulas, some new definitions are introduced as follows:

$$ {\varvec{K}}_{k + 1} \triangleq \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} {\varvec{H}}_{k + 1|k}^{T} \tilde{\varvec{R}}_{k + 1}^{ - 1} $$
(49)
$${\boldsymbol{\Phi }}_{k+1}\triangleq {{\varvec{B}}}_{k+1}^{\text{opt}}-{{\varvec{K}}}_{k+1}{{\varvec{T}}}_{k+1}$$
(50)

Substituting Eqs. (39), (49) and (50) into Eq. (48) gives the expressions of \({{\varvec{L}}}_{k+1}\) and \({\boldsymbol{\Lambda }}_{k+1}\) as follows:

$${\boldsymbol{\Lambda }}_{k+1}={\boldsymbol{\Phi }}_{k+1}{\boldsymbol{\Gamma }}_{k+1}$$
(51)
$${{\varvec{L}}}_{k+1}={{\varvec{K}}}_{k+1}+{\boldsymbol{\Phi }}_{k+1}{{\varvec{S}}}_{k+1}$$
(52)

Substituting Eq. (52) into Eq. (44) and considering \({\varvec{H}}_{k + 1|k} \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} = \tilde{\varvec{R}}_{k + 1} {\varvec{K}}_{k + 1}^{T}\), the minimum of \({\widehat{{\varvec{P}}}}_{k+1|k+1}^{{\varvec{Z}}{\varvec{Z}}}\) under the condition that \({{\varvec{B}}}_{k+1}^{\text{opt}}\) is known a priori can be expressed as follows:

$$ \begin{aligned} \hat{\varvec{P}}_{k + 1|k + 1}^{{{\varvec{ZZ}}}} & = \left( {{\varvec{K}}_{k + 1} +{\varvec{\varPhi}}_{k + 1} {\varvec{S}}_{k + 1} } \right)\tilde{\varvec{R}}_{k + 1} \left( {{\varvec{K}}_{k + 1} +{\varvec{\varPhi}}_{k + 1} {\varvec{S}}_{k + 1} } \right)^{T} \\ & \quad - \left( {{\varvec{K}}_{k + 1} +{\varvec{\varPhi}}_{k + 1} {\varvec{S}}_{k + 1} } \right){\varvec{H}}_{k + 1|k} \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} - \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} {\varvec{H}}_{k + 1|k}^{T} \left( {{\varvec{K}}_{k + 1} +{\varvec{\varPhi}}_{k + 1} {\varvec{S}}_{k + 1} } \right)^{T} + \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} \\ & = \left( {{\varvec{K}}_{k + 1} \tilde{\varvec{R}}_{k + 1} {\varvec{K}}_{k + 1}^{T} - \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} {\varvec{H}}_{k + 1|k}^{T} {\varvec{K}}_{k + 1}^{T} } \right) \\ & \quad + \left( {{\varvec{\varPhi}}_{k + 1} {\varvec{S}}_{k + 1} \tilde{\varvec{R}}_{k + 1} {\varvec{K}}_{k + 1}^{T} -{\varvec{\varPhi}}_{k + 1} {\varvec{S}}_{k + 1} {\varvec{H}}_{k + 1|k} \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} } \right) \\ & \quad + \left( {{\varvec{K}}_{k + 1} \tilde{\varvec{R}}_{k + 1} {\varvec{S}}_{k + 1}^{T}{\varvec{\varPhi}}_{k + 1}^{T} - \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} {\varvec{H}}_{k + 1|k}^{T} {\varvec{S}}_{k + 1}^{T}{\varvec{\varPhi}}_{k + 1}^{T} } \right) \\ & \quad + \left( {\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} - {\varvec{K}}_{k + 1} {\varvec{H}}_{k + 1|k} \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} 
+{\varvec{\varPhi}}_{k + 1} {\varvec{S}}_{k + 1} \tilde{\varvec{R}}_{k + 1} {\varvec{S}}_{k + 1}^{T}{\varvec{\varPhi}}_{k + 1}^{T} } \right) \\ & = \left( {{\mathbf{I}} - {\varvec{K}}_{k + 1} {\varvec{H}}_{k + 1|k} } \right)\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} +{\varvec{\varPhi}}_{k + 1} \hat{\varvec{P}}_{k + 1|k + 1}^{{{\varvec{ff}}}}{\varvec{\varPhi}}_{k + 1}^{T} \\ \end{aligned} $$
(53)
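The closed-form result in Eq. (53) can be verified against the general covariance expression in the second line of Eq. (44). The sketch below does this for random matrices. All sizes and values are assumptions for illustration, \({{\varvec{T}}}_{k+1}\) and \({{\varvec{B}}}_{k+1}^{\text{opt}}\) are treated as given, and \({\widetilde{{\varvec{R}}}}_{k+1}\) is formed as \({{\varvec{H}}}{\varvec{P}}{{\varvec{H}}}^{T}+{\varvec{R}}\), which is what the third line of Eq. (44) implies.

```python
import numpy as np

rng = np.random.default_rng(2)
n_z, n_s, n_f = 5, 4, 2          # assumed state, measurement and input sizes

def spd(n):
    """Random symmetric positive definite matrix (illustrative only)."""
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

P = spd(n_z)                              # predicted state error covariance
R = spd(n_s)                              # measurement noise covariance
H = rng.standard_normal((n_s, n_z))       # linearized measurement matrix
B = rng.standard_normal((n_z, n_f))       # stands in for B_opt
T = rng.standard_normal((n_s, n_f))       # treated as given here
R_tilde = H @ P @ H.T + R                 # implied by Eq. (44)

R_inv = np.linalg.inv(R_tilde)
Gamma = np.linalg.inv(T.T @ R_inv @ T)    # Eq. (38)
S = Gamma @ T.T @ R_inv                   # Eq. (39)
K = P @ H.T @ R_inv                       # Eq. (49)
Phi = B - K @ T                           # Eq. (50)
Lam = Phi @ Gamma                         # Eq. (51)
L = K + Phi @ S                           # Eq. (52)

I = np.eye(n_z)
P_general = (I - L @ H) @ P @ (I - L @ H).T + L @ R @ L.T   # Eq. (44), second line
P_closed = (I - K @ H) @ P + Phi @ Gamma @ Phi.T            # Eq. (53), with Eq. (40)
```

The two covariance expressions agree, and the gain satisfies both constraints, Eqs. (42) and (46). The second term of Eq. (53) shows how the uncertainty of the estimated input is carried into the state covariance through \({\boldsymbol{\Phi }}_{k+1}\).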

2.3.3 Solving the unknown matrix \({{\varvec{B}}}_{k+1}^{\text{opt}}\)

Both objective functions \({P}_{{\varvec{f}}}\) and \({P}_{{\varvec{Z}}}\) depend on the undetermined matrix \({{\varvec{B}}}_{k+1}^{\text{opt}}\), because \({\widetilde{{\varvec{R}}}}_{k+1}\), \({{\varvec{T}}}_{k+1}\) and \(\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}}\) are all functions of \({{\varvec{B}}}_{k+1}^{\text{opt}}\). Define the objective function that simultaneously minimizes the unknown input error and the unknown state error as follows:

$${\text{min}}{P}_{{\varvec{f}}{\varvec{Z}}}\stackrel{\scriptscriptstyle\vartriangle}{=}{P}_{{\varvec{f}}}+\rho {P}_{{\varvec{Z}}}$$
(54)

in which \(\rho \) is a factor used to balance the weight of unknown input and unknown state.

In order to obtain the minimum value of the objective function \({P}_{{\varvec{f}}{\varvec{Z}}}\), the derivative of the objective function with respect to \({{\varvec{B}}}_{k+1}^{\text{opt}}\) should be set to zero.

$$\frac{\partial {P}_{{\varvec{f}}{\varvec{Z}}}}{\partial {{\varvec{B}}}_{k+1}^{\text{opt}}}=\frac{\partial {P}_{{\varvec{f}}}}{\partial {{\varvec{B}}}_{k+1}^{\text{opt}}}+\rho \frac{\partial {P}_{{\varvec{Z}}}}{\partial {{\varvec{B}}}_{k+1}^{\text{opt}}}=0$$
(55)

in which

$$\frac{\partial {P}_{{\varvec{f}}}}{\partial {{\varvec{B}}}_{k+1}^{\text{opt}}}={\boldsymbol{\Pi }}_{k+1}^{T}\left({\boldsymbol{\Pi }}_{k+1}{\boldsymbol{\Psi }}_{k+1}-{\boldsymbol{\Gamma }}_{k+1}\right)$$
(56)
$$\frac{\partial {P}_{{\varvec{Z}}}}{\partial {{\varvec{B}}}_{k+1}^{\text{opt}}}={\boldsymbol{\Omega }}_{k+1}^{T}\left({\boldsymbol{\Omega }}_{k+1}{\boldsymbol{\Psi }}_{k+1}-{\boldsymbol{\Lambda }}_{k+1}\right)$$
(57)

in which

$${\boldsymbol{\Pi }}_{k+1}\stackrel{\scriptscriptstyle\vartriangle}{=}{{\varvec{S}}}_{k+1}{{\varvec{H}}}_{k+1|k}$$
(58)
$${\boldsymbol{\Omega }}_{k+1}\stackrel{\scriptscriptstyle\vartriangle}{=}{{\varvec{L}}}_{k+1}{{\varvec{H}}}_{k+1|k}-\mathbf{I}$$
(59)
$${\boldsymbol{\Psi }}_{k+1}\stackrel{\scriptscriptstyle\vartriangle}{=}\left[{{\varvec{B}}}_{k+1}^{\text{opt}}\left({\widehat{{\varvec{P}}}}_{k|k}^{{\varvec{f}}{\varvec{f}}}+{{\varvec{Q}}}_{k}^{u}\right)-\left({{\varvec{A}}}_{k}{\widehat{{\varvec{P}}}}_{k|k}^{{\varvec{Z}}{\varvec{f}}}+{{\varvec{B}}}_{k}^{zoh}{\widehat{{\varvec{P}}}}_{k|k}^{{\varvec{f}}{\varvec{f}}}\right)\right]$$
(60)

Substituting Eqs. (56) and (57) into (55) yields an equation for \({{\varvec{B}}}_{k+1}^{\text{opt}}\):

$$\mathcal{L}\left({{\varvec{B}}}_{k+1}^{\text{opt}}\right)\stackrel{\scriptscriptstyle\vartriangle}{=}{\boldsymbol{\Pi }}_{k+1}^{T}\left({\boldsymbol{\Pi }}_{k+1}{\boldsymbol{\Psi }}_{k+1}-{\boldsymbol{\Gamma }}_{k+1}\right)+\rho {\boldsymbol{\Omega }}_{k+1}^{T}\left({\boldsymbol{\Omega }}_{k+1}{\boldsymbol{\Psi }}_{k+1}-{\boldsymbol{\Lambda }}_{k+1}\right)=0$$
(61)

“Appendix 1” provides a detailed derivation of Eqs. (56) and (57). In principle, the unknown matrix \({{\varvec{B}}}_{k+1}^{\text{opt}}\) can be solved from Eq. (61), after which \({{\varvec{S}}}_{k+1}\) and \({{\varvec{L}}}_{k+1}\) can be obtained. However, Eq. (61) is a nonlinear equation in \({{\varvec{B}}}_{k+1}^{\text{opt}}\) and cannot be solved directly. To this end, a more relaxed sufficient condition is sought to replace Eq. (61). Along this line of thought, this paper assumes that \({{\varvec{B}}}_{k+1}^{\text{opt}}={\lambda }_{k+1}{{\varvec{B}}}_{k}^{zoh}\), in which \({\lambda }_{k+1}\) is a scalar factor to be determined, so that the stationarity condition is taken with respect to \({\lambda }_{k+1}\):

$$\frac{\partial {P}_{{\varvec{f}}{\varvec{Z}}}}{\partial {\lambda }_{k+1}}=\frac{\partial {P}_{{\varvec{f}}}}{\partial {\lambda }_{k+1}}+\rho \frac{\partial {P}_{{\varvec{Z}}}}{\partial {\lambda }_{k+1}}=0$$
(62)

in which

$$ \begin{aligned} \frac{{\partial P_{{\varvec{f}}} }}{{\partial \lambda_{k + 1} }} = \frac{{\partial P_{{\varvec{f}}} }}{{\partial {\varvec{B}}_{k + 1}^{\text{opt}} }}:\frac{{\partial {\varvec{B}}_{k + 1}^{\text{opt}} }}{{\partial \lambda_{k + 1} }} & = {\text{tr}}\left\{ {\left( {{\varvec{B}}_{k}^{zoh} } \right)^{T}{\varvec{\varPi}}_{k + 1}^{T} \left( {{\varvec{\varPi}}_{k + 1}{\varvec{\varPsi}}_{k + 1} -{\varvec{\varGamma}}_{k + 1} } \right)} \right\} \\ & = \frac{1}{{\lambda_{k + 1} }}{\text{tr}}\left\{ {\left( {{\mathbf{I}} - {\varvec{D}}^{T} {\varvec{S}}_{k + 1}^{T} } \right)\left( {{\varvec{\varPi}}_{k + 1}{\varvec{\varPsi}}_{k + 1} -{\varvec{\varGamma}}_{k + 1} } \right)} \right\} \\ \end{aligned} $$
(63)
$$ \begin{aligned} \frac{{\partial P_{{\varvec{Z}}} }}{{\partial \lambda_{k + 1} }} = \frac{{\partial P_{{\varvec{Z}}} }}{{\partial {\varvec{B}}_{k + 1}^{\text{opt}} }}:\frac{{\partial {\varvec{B}}_{k + 1}^{\text{opt}} }}{{\partial \lambda_{k + 1} }} & = {\text{tr}}\left\{ {\left( {{\varvec{B}}_{k}^{zoh} } \right)^{T}{\varvec{\varOmega}}_{k + 1}^{T} \left( {{\varvec{\varOmega}}_{k + 1}{\varvec{\varPsi}}_{k + 1} -{\varvec{\varLambda}}_{k + 1} } \right)} \right\} \\ & = \frac{1}{{\lambda_{k + 1} }}{\text{tr}}\left\{ { - {\varvec{D}}^{T} {\varvec{L}}_{k + 1}^{T} \left( {{\varvec{\varOmega}}_{k + 1}{\varvec{\varPsi}}_{k + 1} -{\varvec{\varLambda}}_{k + 1} } \right)} \right\} \\ \end{aligned} $$
(64)

in which (\(:\)) denotes the double-dot product (double contraction) of two tensors. Substituting Eqs. (63) and (64) into (62) yields an equation for \({\lambda }_{k+1}\):

$$\mathcal{L}\left({\lambda }_{k+1}\right)={\text{tr}}\left\{\begin{array}{l}-{{\varvec{D}}}^{T}\left[{{\varvec{S}}}_{k+1}^{T}\left({\boldsymbol{\Pi }}_{k+1}{\boldsymbol{\Psi }}_{k+1}-{\boldsymbol{\Gamma }}_{k+1}\right)+\rho {{\varvec{L}}}_{k+1}^{T}\left({\boldsymbol{\Omega }}_{k+1}{\boldsymbol{\Psi }}_{k+1}-{\boldsymbol{\Lambda }}_{k+1}\right)\right]\\ +\left({\boldsymbol{\Pi }}_{k+1}{\boldsymbol{\Psi }}_{k+1}-{\boldsymbol{\Gamma }}_{k+1}\right)\end{array}\right\}=0$$
(65)

It should be noted that Eq. (65) is only a conditional and feasible solution of Eq. (61). Therefore, subsequent research may generate more practical algorithms based on Eq. (61).
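As a worked special case (a direct reading of Eqs. (64) and (65), not an additional result of the method), setting \({\varvec{D}}={\mathbf{0}}\) removes every term premultiplied by \({{\varvec{D}}}^{T}\):

$$\left.\frac{\partial {P}_{{\varvec{Z}}}}{\partial {\lambda }_{k+1}}\right|_{{\varvec{D}}={\mathbf{0}}}=0,\qquad \left.\mathcal{L}\left({\lambda }_{k+1}\right)\right|_{{\varvec{D}}={\mathbf{0}}}={\text{tr}}\left({\boldsymbol{\Pi }}_{k+1}{\boldsymbol{\Psi }}_{k+1}-{\boldsymbol{\Gamma }}_{k+1}\right)=0$$

In this case the condition on \({\lambda }_{k+1}\) involves only the input-related quantities, and the weighting factor \(\rho \) drops out of the equation.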

2.3.4 Update of error covariance of cross terms

Define the cross-covariance matrix of the state and the unknown input, and substitute Eqs. (29), (33) and (43) for simplification as follows:

$$ \begin{aligned} \hat{\varvec{P}}_{k + 1|k + 1}^{{{\varvec{Zf}}}} & \triangleq E\left[ {\left( {\hat{\varvec{e}}_{k + 1|k + 1}^{{\varvec{Z}}} } \right)\left( {\hat{\varvec{e}}_{k + 1|k + 1}^{{\varvec{f}}} } \right)^{T} } \right] \\ & = - \left( {{\mathbf{I}} - {\varvec{L}}_{k + 1} {\varvec{H}}_{k + 1|k} } \right)\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} {\varvec{H}}_{k + 1|k}^{T} {\varvec{S}}_{k + 1}^{T} + {\varvec{L}}_{k + 1} {\varvec{R}}_{k + 1} {\varvec{S}}_{k + 1}^{T} \\ & = - \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{P} }_{k + 1|k}^{{{\varvec{ZZ}}}} {\varvec{H}}_{k + 1|k}^{T} {\varvec{S}}_{k + 1}^{T} + {\varvec{L}}_{k + 1} \tilde{\varvec{R}}_{k + 1} {\varvec{S}}_{k + 1}^{T} \\ & ={\varvec{\varPhi}}_{k + 1} {\varvec{S}}_{k + 1} \tilde{\varvec{R}}_{k + 1} {\varvec{S}}_{k + 1}^{T} ={\varvec{\varPhi}}_{k + 1} \hat{\varvec{P}}_{k + 1|k + 1}^{{{\varvec{ff}}}} \\ \end{aligned} $$
(66)
$${\widehat{{\varvec{P}}}}_{k+1|k+1}^{{\varvec{f}}{\varvec{Z}}}\triangleq E\left[\left({\widehat{{\varvec{e}}}}_{k+1|k+1}^{{\varvec{f}}}\right){\left({\widehat{{\varvec{e}}}}_{k+1|k+1}^{{\varvec{Z}}}\right)}^{T}\right]={\left({\widehat{{\varvec{P}}}}_{k+1|k+1}^{{\varvec{Z}}{\varvec{f}}}\right)}^{T}$$
(67)

2.3.5 Comparison with traditional Kalman paradigm

Substituting Eq. (52) and Eq. (23) into Eq. (24) gives the expression of \({\widehat{{\varvec{Z}}}}_{k+1|k+1}\) as follows:

$$ \begin{aligned} \widehat{\varvec{Z}}_{k + 1|k + 1} & = \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{Z} }_{k + 1|k} + \left( {{\varvec{K}}_{k + 1} +{\varvec{\varPhi}}_{k + 1} {\varvec{S}}_{k + 1} } \right){\varvec{r}}_{k + 1}^{{\varvec{y}}} \\ & = \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{Z} }_{k + 1|k} + {\varvec{K}}_{k + 1} {\varvec{r}}_{k + 1}^{{\varvec{y}}} + \left( {{\varvec{B}}_{k + 1}^{\text{opt}} - {\varvec{K}}_{k + 1} {\varvec{T}}_{k + 1} } \right)\hat{\varvec{f}}_{k + 1|k + 1}^{u} \\ & = \left( {\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{Z} }_{k + 1|k} + {\varvec{B}}_{k + 1}^{\text{opt}} \hat{\varvec{f}}_{k + 1|k + 1}^{u} } \right) + {\varvec{K}}_{k + 1} \left( {{\varvec{r}}_{k + 1}^{{\varvec{y}}} - {\varvec{T}}_{k + 1} \hat{\varvec{f}}_{k + 1|k + 1}^{u} } \right) \\ & = \overline{\varvec{Z}}_{k + 1|k} + {\varvec{K}}_{k + 1} \left( {{\varvec{y}}_{k + 1} - \overline{\varvec{y}}_{k + 1|k} } \right) \\ \end{aligned} $$
(68)

in which

$$ \overline{\varvec{Z}}_{k + 1|k} = \varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{Z} }_{k + 1|k} + {\varvec{B}}_{k + 1}^{\text{opt}} \hat{\varvec{f}}_{k + 1|k + 1}^{u} $$
(69)
$$ \overline{\varvec{y}}_{k + 1|k} = {\varvec{H}}_{k + 1|k} \overline{\varvec{Z}}_{k + 1|k} + \varvec{D\hat{f}}_{k + 1|k + 1}^{u} + {\varvec{h}}_{k + 1|k} $$
(70)

Note that \(\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{Z} }_{k + 1|k}\) and \(\overline{\varvec{Z}}_{k + 1|k}\) are the biased and unbiased estimates of \({{\varvec{Z}}}_{k+1}\), respectively, and \(\varvec{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{y} }_{k + 1|k}\) and \(\overline{\varvec{y}}_{k + 1|k}\) are the biased and unbiased estimates of \({{\varvec{y}}}_{k+1}\), respectively. Equation (68) has a form similar to that of the traditional extended Kalman filter. When \({{\varvec{A}}}_{k}{{\varvec{Z}}}_{k}+{{\varvec{B}}}_{k}^{\text{opt}}{{\varvec{f}}}_{k}^{u}\) in Eq. (18) is considered as a whole, the formula for identifying \({{\varvec{f}}}_{k}^{u}\) in the existing literature is consistent with the formula for identifying \({{\varvec{f}}}_{k+1}^{u}\) in this paper [13, 26]. Therefore, the method proposed in this paper reduces to the existing EKF-UI and GEKF-UI methods in special cases.
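The equivalence of the first and last forms of the state update in Eq. (68) can be checked numerically. The sketch below is illustrative and rests on three assumptions that are not restated in this section: the innovation is \({\varvec{r}}={\varvec{y}}-\left({\varvec{H}}{\varvec{Z}}^{\text{pred}}+{\varvec{h}}\right)\), the input estimate is \(\widehat{{\varvec{f}}}={\varvec{S}}{\varvec{r}}\) (as implied by the second line of Eq. (68)), and \({\varvec{T}}={\varvec{H}}{{\varvec{B}}}^{\text{opt}}+{\varvec{D}}\). All numerical values are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(3)
n_z, n_s, n_f = 5, 4, 2                  # assumed sizes

H = rng.standard_normal((n_s, n_z))
B = rng.standard_normal((n_z, n_f))      # stands in for B_opt
D = rng.standard_normal((n_s, n_f))
T = H @ B + D                            # assumption, see lead-in
A = rng.standard_normal((n_z, n_z))
P = A @ A.T + n_z * np.eye(n_z)          # predicted state covariance
R_tilde = H @ P @ H.T + np.eye(n_s)

R_inv = np.linalg.inv(R_tilde)
Gamma = np.linalg.inv(T.T @ R_inv @ T)   # Eq. (38)
S = Gamma @ T.T @ R_inv                  # Eq. (39)
K = P @ H.T @ R_inv                      # Eq. (49)
L = K + (B - K @ T) @ S                  # Eqs. (50) and (52)

Z_pred = rng.standard_normal(n_z)        # biased prediction (without new input)
h_off = rng.standard_normal(n_s)         # linearization offset h
y = rng.standard_normal(n_s)             # measurement

r = y - (H @ Z_pred + h_off)             # innovation (assumed definition)
f_hat = S @ r                            # input estimate

Z_first = Z_pred + L @ r                 # first line of Eq. (68)

Z_bar = Z_pred + B @ f_hat               # prediction corrected by the input
y_bar = H @ Z_bar + D @ f_hat + h_off    # Eq. (70)
Z_last = Z_bar + K @ (y - y_bar)         # Kalman-type form of Eq. (68)
```

Both forms return the same state estimate, which illustrates why the update can be read as a standard Kalman correction applied to a prediction that already contains the estimated input.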

2.4 Data fusion technology in measurement

Since taking only accelerations as measurements sometimes leads to drift in the identified structural displacement and unknown input [39], Liu et al. proposed a Kalman filter method based on data fusion of partial acceleration and displacement measurements to overcome this problem [26]. Because displacement and acceleration measurements capture low-frequency and high-frequency vibration characteristics, respectively, and because finite element theory provides a deterministic relationship between structural displacement and surface strain, acceleration is usually combined with displacement or strain measurements to suppress drift. This is the application of data fusion technology. It is worth mentioning that Huang et al. further explained the causes of the drift in the estimated unknown inputs and structural displacement [32, 42] and pointed out that acceleration-only measurements do not always cause drift, for example in earthquake scenarios. Therefore, data fusion is required in general application scenarios to prevent drift, while it is not required in earthquake scenarios.

The strain–displacement relationship is presented below in the context of finite element model.

$$ {\varvec{\varepsilon}} = \overline{\varvec{C}}\varvec{x} $$
(71)
$$ \overline{\varvec{C}} = \left[ {\left( {\overline{\varvec{B}}_{1} { }\overline{\varvec{T}}_{1} { }\overline{\varvec{S}}_{1} } \right)^{T} ,\left( {\overline{\varvec{B}}_{2} { }\overline{\varvec{T}}_{2} { }\overline{\varvec{S}}_{2} } \right)^{T} , \cdots ,\left( {\overline{\varvec{B}}_{m} \overline{\varvec{T}}_{m} { }\overline{\varvec{S}}_{m} } \right)^{T} } \right]^{T} $$
(72)

where \({\varvec{\varepsilon}}\) is the strain vector; \(\overline{\varvec{S}}_{i}\) is the matrix used to select the displacement related to element i (a total of m elements), and the number of selected DOFs is dependent on the element type; \(\overline{\varvec{T}}_{i}\) is the matrix that transforms the element nodal displacement in global coordinate to those in local coordinate; \(\overline{\varvec{B}}_{i}\) is the matrix representing the relationship between the node displacement of an element and the strain in this element, which can be developed using the shape function of the element; \(\overline{\varvec{C}}\) is the strain–displacement transformation matrix.
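To make the assembly in Eq. (72) concrete, the sketch below builds \(\overline{\varvec{C}}\) for a hypothetical chain of two-node axial bar elements. The element type, the nodal numbering, the element lengths and the function name `assemble_strain_matrix` are all assumptions for illustration. For this element, \(\overline{\varvec{B}}_{i}=\left[-1/\ell_i,\; 1/\ell_i\right]\) and \(\overline{\varvec{T}}_{i}\) is the identity because local and global axes coincide.

```python
import numpy as np

def assemble_strain_matrix(n_nodes, lengths):
    """Stack B_i T_i S_i row-wise as in Eq. (72) for a chain of axial bars.

    Element i is assumed to connect node i and node i + 1 (0-based).
    """
    rows = []
    for i, ell in enumerate(lengths):
        S_i = np.zeros((2, n_nodes))         # selects the two element DOFs
        S_i[0, i] = 1.0
        S_i[1, i + 1] = 1.0
        T_i = np.eye(2)                      # local axes equal global axes
        B_i = np.array([[-1.0 / ell, 1.0 / ell]])   # linear shape functions
        rows.append(B_i @ T_i @ S_i)
    return np.vstack(rows)

# Hypothetical 4-node chain with element lengths 1 m, 1 m and 2 m
C_bar = assemble_strain_matrix(4, [1.0, 1.0, 2.0])
x = np.array([0.0, 0.001, 0.003, 0.003])     # nodal displacements (m)
eps = C_bar @ x                              # Eq. (71): element strains
```

Here `eps` contains the three element strains, and a rigid-body translation (all nodal displacements equal) produces zero strain, as expected of a strain-displacement matrix.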

The acceleration and displacement/strain can be expressed as follows, respectively.

$${{\varvec{y}}}_{\ddot{{\varvec{x}}},k+1}=-{{\varvec{L}}}_{a}{{\varvec{M}}}^{-1}{\varvec{F}}\left({{\varvec{x}}}_{k+1},{\dot{{\varvec{x}}}}_{k+1},{{\varvec{\theta}}}_{k+1}\right)+{{\varvec{L}}}_{a}{{\varvec{M}}}^{-1}{{\varvec{\eta}}}^{u}{{\varvec{f}}}_{k+1}^{u}+{{\varvec{v}}}_{\ddot{{\varvec{x}}},k+1}$$
(73)
$$ \left\{ {\begin{array}{*{20}l} {{\varvec{y}}_{{{\varvec{x}},k + 1}} = {\varvec{L}}_{u} \left[ {\begin{array}{*{20}l} {{\mathbf{I}}_{n} } & {{\mathbf{0}}_{n} } & {{\mathbf{0}}_{{n \times n_{\theta } }} } \\ \end{array} } \right]{\varvec{Z}}_{k + 1} + {\varvec{v}}_{{{\varvec{x}},k + 1}} } \\ {{\varvec{y}}_{{{\varvec{\varepsilon}},k + 1}} = {\varvec{L}}_{{\varvec{\varepsilon}}} \overline{\varvec{C}}\left[ {\begin{array}{*{20}l} {{\mathbf{I}}_{n} } & {{\mathbf{0}}_{n} } & {{\mathbf{0}}_{{n \times n_{\theta } }} } \\ \end{array} } \right]{\varvec{Z}}_{k + 1} + {\varvec{v}}_{{{\varvec{\varepsilon}},k + 1}} } \\ \end{array} } \right. $$
(74)

in which \({{\varvec{L}}}_{a}\), \({{\varvec{L}}}_{u}\) and \({{\varvec{L}}}_{{\varvec{\varepsilon}}}\) are the deployment matrices of the accelerometers, displacement gauges and strain gauges, respectively; \({\mathbf{I}}_{n}\) is an n-dimensional identity matrix; \({\mathbf{0}}_{n}\) and \({\mathbf{0}}_{n\times {n}_{\theta }}\) are zero matrices of size \(n\times n\) and \(n\times {n}_{\theta }\), respectively.

In general application scenarios, data fusion technology is introduced into the measurement equation to eliminate drift in the identification results. When data fusion of sparse acceleration and strain measurements is adopted, the measurement data are combined to form the measurement equation as follows:

$${{\varvec{y}}}_{k+1}=\left[\begin{array}{l}{{\varvec{y}}}_{\ddot{{\varvec{x}}},k+1}\\ {{\varvec{y}}}_{{\varvec{\varepsilon}},k+1}\end{array}\right]={\varvec{h}}\left({{\varvec{Z}}}_{k+1}\right)+{\varvec{D}}{{\varvec{f}}}_{k+1}^{u}+{{\varvec{v}}}_{k+1}$$
(75)

in which \({\varvec{h}}\left( {{\varvec{Z}}_{k + 1} } \right) = \left[ {\begin{array}{*{20}l} { - {\varvec{L}}_{a} {\varvec{M}}^{ - 1} {\varvec{F}}\left( {{\varvec{x}}_{k + 1} ,\dot{\varvec{x}}_{k + 1} ,{\varvec{\theta}}_{k + 1} } \right)} \\ {{\varvec{L}}_{{\varvec{\varepsilon}}} \overline{\varvec{C}}\varvec{x}_{k + 1} } \\ \end{array} } \right]\), \({\varvec{D}} = \left[ {\begin{array}{*{20}l} {{\varvec{L}}_{a} {\varvec{M}}^{ - 1} {\varvec{\eta}}^{u} } \\ {\mathbf{0}} \\ \end{array} } \right]\)

When data fusion of sparse acceleration and displacement measurements is adopted, the integrated measurement equation is similar to Eq. (75).

2.5 Calculation process

The calculation flow of the proposed AGEKF-UI method is shown in Table 4 and Fig. 1.

Table 4 The equations list of AGEKF-UI algorithm
Fig. 1 The flowchart of the proposed AGEKF-UI algorithm

3 Numerical validation of the proposed AGEKF-UI algorithm

In order to verify the performance of the proposed AGEKF-UI algorithm, a numerical example is used for demonstration. Since the real-time performance of existing EKF-UI methods is generally limited by whether the measurement equation has a direct feedthrough of unknown input, the application scope of the proposed AGEKF-UI method is naturally larger than that of existing EKF-UI methods.

Among all application scenarios, the most unfavorable one for identification is \({\varvec{D}}={\mathbf{0}}\). In order to demonstrate that the proposed AGEKF-UI algorithm can surpass the existing GEKF-UI method, all numerical cases consider only this extreme situation, in which the existing EKF-UI methods are no longer applicable. For the case in which \({\varvec{D}}\) has full column rank, the proposed AGEKF-UI algorithm is expected to perform better than the existing EKF-UI methods. To save space, examples of this type are not given.

3.1 Multi-story nonlinear hysteretic structure

In this example, the type of structural system is a shear frame with Bouc–Wen hysteretic nonlinearity, which is used to simulate the motion state of the structure after yield failure under strong excitation.

The structural parameters of the six-story shear frame are \({m}_{i}=50\mathrm{ kg}\), \({k}_{i}=1.0\times {10}^{5} {\text{N}}/{\text{m}}\) and \({c}_{i}=500\mathrm{ Ns}/{\text{m}}\), for i = 1 to 6. A nonlinear hysteretic restoring force described by the Bouc–Wen model is assumed to exist in the first story. The nonlinear force \({f}_{1}\) and the inter-story hysteretic drift \({z}_{1}\) are described by:

$${f}_{1}={\alpha }_{1}{k}_{1}{x}_{1}+\left(1-{\alpha }_{1}\right){k}_{1}{z}_{1}$$
(76)
$$ \dot{z}_{1} = \dot{x}_{1} - \left\{ {\beta _{1} \left| {\dot{x}_{1} } \right|\left| {z_{1} } \right|^{{n_{1} - 1}} z_{1} + \gamma _{1} \dot{x}_{1} \left| {z_{1} } \right|^{{n_{1} }} } \right\} $$
(77)

in which \({\beta }_{1}\), \({\gamma }_{1}\) and \({n}_{1}\) are the Bouc–Wen hysteretic parameters; \({\alpha }_{1}\) is the ratio of post-yielding stiffness to pre-yielding stiffness. These parameters are selected as: \({\alpha }_{1}=0.5\), \({\beta }_{1}=1000{ {\text{s}}}^{2}/{{\text{m}}}^{2}\), \({\gamma }_{1}=500{ {\text{s}}}^{2}/{{\text{m}}}^{2}\) and \({n}_{1}=1.3\). The sketch of the shear frame structure is shown in Fig. 2. Two mutually independent wide-band (upper cut-off frequency 99 Hz) white noise excitations are applied to the 3rd and 5th floors.
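The hysteretic behavior of the first story can be illustrated with a short simulation. The sketch below assumes the standard Bouc–Wen evolution law \(\dot{z}=\dot{x}-\beta \left|\dot{x}\right|{\left|z\right|}^{n-1}z-\gamma \dot{x}{\left|z\right|}^{n}\) together with Eq. (76) and the parameter values listed above. The prescribed sinusoidal drift history, the time step and the classical fourth-order Runge–Kutta integrator are illustrative choices, not the excitation or the solver used in the paper.

```python
import numpy as np

ALPHA, K1 = 0.5, 1.0e5                  # stiffness ratio and k_1 (N/m)
BETA, GAMMA, N = 1000.0, 500.0, 1.3     # Bouc-Wen parameters from the text

def z_rate(x_dot, z):
    """Standard Bouc-Wen evolution law (assumed form)."""
    return (x_dot
            - BETA * abs(x_dot) * abs(z) ** (N - 1) * z
            - GAMMA * x_dot * abs(z) ** N)

def restoring_force(x, z):
    """Eq. (76): elastic part plus hysteretic part."""
    return ALPHA * K1 * x + (1.0 - ALPHA) * K1 * z

def simulate_z(x_dot_fn, t_end, dt):
    """Integrate z(t) with classical RK4 for a prescribed velocity history."""
    n_steps = int(round(t_end / dt))
    z = np.zeros(n_steps + 1)
    for k in range(n_steps):
        t = k * dt
        k1 = z_rate(x_dot_fn(t), z[k])
        k2 = z_rate(x_dot_fn(t + dt / 2), z[k] + dt / 2 * k1)
        k3 = z_rate(x_dot_fn(t + dt / 2), z[k] + dt / 2 * k2)
        k4 = z_rate(x_dot_fn(t + dt), z[k] + dt * k3)
        z[k + 1] = z[k] + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return z

# Hypothetical drift history: 0.02 m amplitude, 1 Hz, two cycles
X0, OMEGA, DT, T_END = 0.02, 2.0 * np.pi, 1.0e-3, 2.0
t = np.arange(int(round(T_END / DT)) + 1) * DT
x_hist = X0 * np.sin(OMEGA * t)
z_hist = simulate_z(lambda s: X0 * OMEGA * np.cos(OMEGA * s), T_END, DT)
f_hist = restoring_force(x_hist, z_hist)     # plot against x_hist for the loop
```

Under this law the hysteretic variable saturates near \({\left[1/\left({\beta }_{1}+{\gamma }_{1}\right)\right]}^{1/{n}_{1}}\), so plotting `f_hist` against `x_hist` gives a closed loop whose hysteretic part is bounded.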

Fig. 2 Sketch of six-story shear frame structure with Bouc–Wen hysteresis nonlinearity

In order to identify the structural parameters and external excitations, a structural health monitoring system is deployed on the structure. Three acceleration sensors are deployed on the 1st, 4th and 6th floors, and displacement sensors are deployed on the same floors. To account for unavoidable measurement error, noise with a 2% noise-to-signal ratio in root mean square (RMS) is added to the data collected by the sensors. It should be noted that many scholars have studied the optimal placement of sensors [43, 44]. Niu et al. summarized the limitations of existing load identification methods in terms of sensor type, number and location and gave guidance on method selection [16]. However, since this is not the focus of this paper, the optimal deployment of sensors is not discussed in any of the examples in this paper. In addition, note that the accelerations at the positions where the unknown forces act are not measured, so \({\varvec{D}}\) is equal to zero. In this case, the existing EKF-UI methods are not applicable, whereas the problem can still be solved by the method proposed in this paper.
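The sensor layout and the noise model can be written out explicitly. The sketch below builds the selection matrices for the stated layout and confirms that the feedthrough matrix \({\varvec{D}}\) of Eq. (75) vanishes for a diagonal mass matrix. It also adds noise under the assumption that a "2% noise-to-signal ratio in RMS" means zero-mean Gaussian noise whose standard deviation is 2% of the RMS of the clean signal. The helper names and the test signal are hypothetical.

```python
import numpy as np

n = 6
M = 50.0 * np.eye(n)                       # lumped masses, m_i = 50 kg

def selection(floors, n_dof):
    """Rows pick the listed floors (1-based) out of n_dof DOFs."""
    sel = np.zeros((len(floors), n_dof))
    for row, floor in enumerate(floors):
        sel[row, floor - 1] = 1.0
    return sel

L_a = selection([1, 4, 6], n)              # accelerometers on floors 1, 4, 6
eta_u = selection([3, 5], n).T             # forces applied on floors 3 and 5
D_acc = L_a @ np.linalg.solve(M, eta_u)    # acceleration block of D
D = np.vstack([D_acc, np.zeros((3, 2))])   # displacement rows: no feedthrough

def add_rms_noise(signal, ratio, rng):
    """Add Gaussian noise with std = ratio * RMS(signal) (assumed noise model)."""
    rms = np.sqrt(np.mean(signal ** 2))
    return signal + ratio * rms * rng.standard_normal(signal.shape)

rng = np.random.default_rng(4)
t = np.linspace(0.0, 10.0, 100001)
clean = np.sin(2.0 * np.pi * 2.0 * t)      # placeholder response signal
noisy = add_rms_noise(clean, 0.02, rng)
achieved_ratio = (np.sqrt(np.mean((noisy - clean) ** 2))
                  / np.sqrt(np.mean(clean ** 2)))
```

Because no accelerometer shares a floor with an applied force and the mass matrix is diagonal, every entry of `D` is zero, which is the scenario without direct feedthrough considered in this example.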

3.2 Identification results based on the proposed AGEKF-UI algorithm

The augmented state and unknown excitation of the structure are initialized before executing the recursive algorithm. The initial values of displacement, velocity and unknown excitation are all set to zero, and the initial values of the structural parameters are set to 0.7 times their actual values to simulate unknown structural parameters (note: since the exponent n in the Bouc–Wen model must be greater than 1, its initial value is set to 0.8 times its actual value). The initial variance and covariance of the augmented state and unknown excitation are set as:

\(\hat{\varvec{P}}_{0|0}^{{{\varvec{ZZ}}}} = {\text{diag}}\left( {10^{ - 4} 1_{6 \times 1} ;10^{ - 2} 1_{6 \times 1} ;10^{ - 5} ;10^{10} 1_{6 \times 1} ;10^{5} 1_{6 \times 1} ;0.1;10^{7} ;10^{6} ;10} \right)\), \(\hat{\varvec{P}}_{0|0}^{{{\varvec{ff}}}} = 10^{6} {\mathbf{I}}_{2}\) and \(\hat{\varvec{P}}_{0|0}^{{{\varvec{Zf}}}} = \left( {\hat{\varvec{P}}_{0|0}^{{{\varvec{fZ}}}} } \right)^{T} = 1_{29 \times 2}\), in which \(1_{m \times n}\) represents an \(m \times n\) matrix whose elements are all 1.

The covariance matrices of the system noise and of the random walk of the unknown excitation are set as \({{\varvec{Q}}}_{k}^{s}={10}^{-11}{\mathbf{I}}_{29}\) and \({{\varvec{Q}}}_{k}^{u}=2\times {10}^{7}{\mathbf{I}}_{2}\), respectively. The measurement noise covariance matrix is set as \({{\varvec{R}}}_{k+1}={\text{diag}}\left(\left[{10}^{-1}{1}_{3\times 1}; 2\times {10}^{-6}{1}_{3\times 1}\right]\right)\). The factor \(\rho \) used to balance the weights of the unknown excitation and the unknown state is set to 10. The identification results based on the proposed AGEKF-UI algorithm are shown in Figs. 3, 4, 5, 6, 7, 8 and 9.
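For reference, the initial settings listed above can be assembled as arrays. This is only a transcription sketch: the numerical values are those stated in the text, while the variable names are hypothetical and no physical meaning is assigned here to the individual diagonal blocks.

```python
import numpy as np

ones6 = np.ones(6)

# Diagonal of the initial augmented-state covariance, in the order given in the text
p0_diag = np.concatenate([
    1e-4 * ones6,
    1e-2 * ones6,
    [1e-5],
    1e10 * ones6,
    1e5 * ones6,
    [0.1], [1e7], [1e6], [10.0],
])
P0_ZZ = np.diag(p0_diag)                 # 29 x 29
P0_ff = 1e6 * np.eye(2)                  # two unknown excitations
P0_Zf = np.ones((29, 2))                 # cross-covariance, all entries equal to 1
P0_fZ = P0_Zf.T

Q_s = 1e-11 * np.eye(29)                 # system noise covariance
Q_u = 2e7 * np.eye(2)                    # random-walk covariance of the excitation
R = np.diag(np.concatenate([1e-1 * np.ones(3), 2e-6 * np.ones(3)]))
rho = 10.0                               # weighting factor in Eq. (54)
```

The diagonal has 6 + 6 + 1 + 6 + 6 + 4 = 29 entries, which matches the dimension of \({{\varvec{Q}}}_{k}^{s}\) and of the cross-covariance matrix, and the six entries of \({{\varvec{R}}}_{k+1}\) match the three acceleration and three displacement channels.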

Fig. 3 Comparison between identified and exact structural states

Fig. 4 Comparison between identified and exact hysteresis loop

Fig. 5 The convergence process of stiffness and damping coefficient

Fig. 6 The convergence process of Bouc–Wen model parameters \({\alpha }_{1}\) and \({n}_{1}\)

Fig. 7 The convergence process of Bouc–Wen model parameters \({\beta }_{1}\) and \({\gamma }_{1}\)

Fig. 8 Comparison between identified and exact excitation \({f}_{1}^{u}\)

Fig. 9 Comparison between identified and exact excitation \({f}_{2}^{u}\)

Figure 3 shows that the identified structural states match their exact values well. Figure 4 shows the identified nonlinear hysteresis loop of the first story, which exhibits some small errors. Figure 5 shows the identification of the linear structural parameters: the stiffness and damping converge quickly to their true values. Figures 6 and 7 show the convergence process of the Bouc–Wen model parameters. The convergence errors of \({\alpha }_{1}\) and \({n}_{1}\) are very small, while those of \({\beta }_{1}\) and \({\gamma }_{1}\) are slightly larger. The reason for this difference is that the measurements have a different sensitivity to each parameter. Figures 8 and 9 show the identification of the unknown excitations, where a certain error remains between the identified and true values.

3.3 Discussion and analysis

This example simulates the identification problem in the worst scenario (\({\varvec{D}}={\mathbf{0}}\)), in which the existing EKF-UI methods cannot realize real-time identification. At present, the only applicable method is GEKF-UI. Therefore, the proposed AGEKF-UI method is compared with the existing GEKF-UI method to illustrate its advancement. Tables 5 and 6 show the parameter identification errors of the AGEKF-UI and GEKF-UI methods. (The data in parentheses are based on the GEKF-UI method.) Except for a few parameters (underlined), the AGEKF-UI method is slightly better than the GEKF-UI method in terms of parameter identification accuracy. However, the difference in parameter identification error between AGEKF-UI and GEKF-UI is not significant, and it almost vanishes when fewer decimal places are displayed. Table 7 shows that, compared with the existing GEKF-UI method, the built-in optimization mechanism of the proposed AGEKF-UI method results in an average reduction of 37% in the root-mean-square error (RMSE) of the input estimation [41]. The AGEKF-UI method is therefore significantly superior to the GEKF-UI method in the identification accuracy of unknown inputs. These conclusions are basically consistent with the inference of Eqs. (62) to (65). When \({\varvec{D}}={\mathbf{0}}\) holds, Eq. (64) gives \(\frac{\partial {P}_{{\varvec{Z}}}}{\partial {\lambda }_{k+1}}\equiv 0\), indicating that the optimization factor \({\lambda }_{k+1}\) has no effect on the identification accuracy of the augmented states (including structural parameters), but is beneficial for improving the identification accuracy of unknown inputs.
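For readers who wish to reproduce this kind of comparison, the RMSE and its relative reduction can be computed as in the sketch below. The function names and the short data series are hypothetical and chosen only to make the arithmetic easy to follow; they are not the results reported in Table 7.

```python
import math


def rmse(estimate, truth):
    """Root-mean-square error between an estimated and a reference series."""
    squared = [(e - t) ** 2 for e, t in zip(estimate, truth)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(baseline_error, new_error):
    """Fractional reduction of new_error with respect to baseline_error."""
    return (baseline_error - new_error) / baseline_error


# Hypothetical excitation samples (not data from the paper).
truth = [0.0, 1.0, 0.0, -1.0]
baseline_estimate = [0.2, 1.2, -0.2, -1.2]   # every sample is off by 0.2
improved_estimate = [0.1, 1.1, -0.1, -1.1]   # every sample is off by 0.1

reduction = relative_reduction(
    rmse(baseline_estimate, truth), rmse(improved_estimate, truth)
)
print("relative RMSE reduction:", reduction)
```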

Table 5 List of identification errors of structural system parameters
Table 6 List of identification errors of Bouc–Wen nonlinear model parameters
Table 7 List of identification errors of the unknown inputs

In order to solve Eq. (61), a simplifying assumption is adopted, namely \({{\varvec{B}}}_{k+1}^{\text{opt}}={\lambda }_{k+1}{{\varvec{B}}}_{k}^{zoh}\). The adaptive process of \({\lambda }_{k+1}\) is shown in Fig. 10. It can be seen that \({\lambda }_{k+1}\) converges to 1.0189 after an initial fluctuation. In addition, the weight balance factor \(\rho \) has little influence on the identification results. This conclusion comes from numerical experiments and is not discussed in depth here.
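The effect of the scaling assumption can be illustrated on a scalar system. The sketch below assumes that \({{\varvec{B}}}_{k}^{zoh}\) is the standard zero-order-hold input matrix of a linear system \(\dot{x}=ax+bu\); this definition, the function names and the values of a, b and the time step are assumptions made for illustration and are not taken from this section. Only the value 1.0189 comes from the text.

```python
import math


def zoh_discretize(a, b, dt):
    """Standard zero-order-hold discretization of x' = a*x + b*u (scalar, a != 0).

    Returns (a_d, b_zoh) such that x[k+1] = a_d * x[k] + b_zoh * u[k]
    when u is held constant over each sampling interval.
    """
    a_d = math.exp(a * dt)
    b_zoh = math.expm1(a * dt) / a * b
    return a_d, b_zoh


def scaled_input_gain(b_zoh, lam):
    """Apply the simplifying assumption B_opt = lambda * B_zoh."""
    return lam * b_zoh


# Hypothetical scalar system and time step (not from the paper).
a_d, b_zoh = zoh_discretize(a=-2.0, b=1.0, dt=0.01)
b_opt = scaled_input_gain(b_zoh, lam=1.0189)   # converged value reported in the text
print(a_d, b_zoh, b_opt)
```

Under this reading, a converged \({\lambda }_{k+1}\) slightly above 1 means that the input gain used by the filter is about 1.9% larger than the zero-order-hold gain.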

Fig. 10 The adaptive process of \({\lambda }_{k+1}\)

4 Experimental validation of the proposed AGEKF-UI algorithm

To validate the performance of the AGEKF-UI algorithm experimentally, a test on a five-story shear frame is conducted.

4.1 Experiment model and equipment

As shown in Fig. 11, the test specimen is a five-story shear frame. The main structure is 350 mm in length and 250 mm in width. The first story is 240 mm in height and the others are 200 mm. The connections are double-row bolted and can be approximated as fixed connections. The mass of the shear frame is assumed to be lumped at each story level. The actual stiffness of each story is calibrated statically. Table 8 lists the mechanical parameters of the experimental model. The acceleration sensors are small-size 333B30 sensors produced by PCB, which have high sensitivity and are widely used in structural vibration and modal analysis experiments. The strain sensors are piezoelectric strain gauges of type 740B02, which are suitable for dynamic strain response measurement. A force sensor of type 208C03 is installed at the middle of the 3rd story and connected to the electromagnetic vibrator, as shown in Fig. 12. The excitation is generated by a RIGOL DG-1022 signal generator, and the signals (acceleration, dynamic strain and excitation) are collected synchronously by a PXIe-1082 data acquisition instrument produced by National Instruments. The measured structural responses are fed to the algorithm for identification, and the measured excitation is compared with the one identified by AGEKF-UI.

Fig. 11 Experimental equipment and sensors

Table 8 List of mechanical parameters of the experimental model
Fig. 12 Electromagnetic vibrator and signal collector

A hammer impact is applied to the structure as a pulse, so that the free-decay response of each story can be measured. After applying the fast Fourier transform (FFT) to the measurements, the first two natural frequencies of the structure are estimated as 5.6 Hz and 16.3 Hz, respectively. When the damping ratios satisfy \(\xi <0.2\), the first two damping ratios of the structure can be obtained by the free attenuation method. Assuming that the structural damping is Rayleigh damping, the damping coefficients can be obtained by conversion as follows.

$$\left[\begin{array}{l}\alpha \\ \beta \end{array}\right]=\frac{2{\omega }_{i}{\omega }_{j}}{{\omega }_{j}^{2}-{\omega }_{i}^{2}}\left[\begin{array}{cc}{\omega }_{j}& -{\omega }_{i}\\ -1/{\omega }_{j}& 1/{\omega }_{i}\end{array}\right]\left[\begin{array}{l}{\xi }_{i}\\ {\xi }_{j}\end{array}\right]$$
(78)

Based on Eq. (78), the Rayleigh damping coefficients are \(\alpha =0.524\) and \(\beta =1.453\times {10}^{-4}\).
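Equation (78) can be evaluated with a few lines of code. In the sketch below, the function names are hypothetical, and the damping ratios are an assumption: they are not reported in this section, and a value of 1% is used for both modes purely for illustration. With this assumed value and the measured frequencies of 5.6 Hz and 16.3 Hz, the sketch gives coefficients close to those reported above.

```python
import math


def rayleigh_coefficients(freq_i_hz, freq_j_hz, xi_i, xi_j):
    """Evaluate Eq. (78): Rayleigh coefficients from two modes.

    Frequencies are given in Hz and converted to rad/s internally.
    """
    w_i = 2.0 * math.pi * freq_i_hz
    w_j = 2.0 * math.pi * freq_j_hz
    factor = 2.0 * w_i * w_j / (w_j ** 2 - w_i ** 2)
    alpha = factor * (w_j * xi_i - w_i * xi_j)
    beta = factor * (-xi_i / w_j + xi_j / w_i)
    return alpha, beta


def modal_damping_ratio(alpha, beta, freq_hz):
    """Damping ratio implied by Rayleigh damping at a given frequency."""
    w = 2.0 * math.pi * freq_hz
    return alpha / (2.0 * w) + beta * w / 2.0


# Measured frequencies from the text; the 1% damping ratios are assumed.
alpha, beta = rayleigh_coefficients(5.6, 16.3, xi_i=0.01, xi_j=0.01)
print(alpha, beta)
print(modal_damping_ratio(alpha, beta, 5.6), modal_damping_ratio(alpha, beta, 16.3))
```

The second function inverts the relation at a single frequency, which gives a quick check that the coefficients reproduce the two damping ratios they were computed from.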

4.2 Experiment and result

Wide-band white noise excitation is applied to the 3rd story. Data fusion of sparse strain and acceleration measurements is adopted in this experimental test. Four acceleration sensors are deployed on the 1st, 2nd, 4th and 5th floors, respectively, and one strain gauge is deployed near the 3rd floor. (Such a deployment results in \({\varvec{D}}={\mathbf{0}}\).) The sampling frequency is 100 Hz. The strain gauge is installed on the surface of the steel sheet, 20 mm below the 3rd story level. The relationship between the strain and the displacements of the 2nd and 3rd stories is:

$$\varepsilon =\left[\frac{\left(-6+12x/l\right)}{{l}^{2}}{x}_{2}+\frac{\left(6-12x/l\right)}{{l}^{2}}{x}_{3}\right]\times \frac{d}{2}$$
(79)

where \(l\) is the length of the supporting steel sheet between adjacent story levels, \(d\) is the thickness of the supporting steel sheet, and \(x\) indicates the position of the strain gauge.
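Because Eq. (79) is linear in the displacements, the strain measurement contributes a constant row to the measurement equation. The sketch below evaluates the two coefficients of that row. The function names are hypothetical and the numerical values are assumptions for illustration only: the sheet length is taken equal to the 200 mm story height, the thickness of 2 mm is invented, and \(x\) is taken as measured from the 2nd-story end, so that a gauge 20 mm below the 3rd story level sits at \(x = l - 20\) mm.

```python
def strain_coefficients(l, d, x):
    """Coefficients (c2, c3) of Eq. (79), so that strain = c2 * x2 + c3 * x3.

    l: sheet length between story levels, d: sheet thickness,
    x: gauge position along the sheet (all in consistent units).
    """
    c3 = (6.0 - 12.0 * x / l) / l ** 2 * d / 2.0
    c2 = -c3  # the two terms of Eq. (79) are equal and opposite
    return c2, c3


def strain(x2, x3, l, d, x):
    """Strain at the gauge for given 2nd- and 3rd-story displacements."""
    c2, c3 = strain_coefficients(l, d, x)
    return c2 * x2 + c3 * x3


# Assumed geometry in metres (illustrative values, not from the paper).
L_SHEET = 0.2
D_SHEET = 0.002
X_GAUGE = L_SHEET - 0.02

c2, c3 = strain_coefficients(L_SHEET, D_SHEET, X_GAUGE)
print(c2, c3)
print(strain(0.001, 0.001, L_SHEET, D_SHEET, X_GAUGE))  # rigid translation
```

Two properties follow directly from the formula: a rigid translation with equal story displacements produces zero strain, and a gauge placed at mid-length (\(x = l/2\)) would measure nothing, so the gauge has to sit away from the middle of the sheet.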

Before starting the identification algorithm, the structural states and unknown excitation are initialized to 0, and the structural stiffnesses are initialized to 0.8 times their calibrated values. The initial variance and covariance of the augmented state and unknown excitation are set as:

$$ \hat{\varvec{P}}_{0|0}^{{{\varvec{ZZ}}}} = {\text{diag}}\left( {10^{ - 4} 1_{5 \times 1} ;10^{ - 4} 1_{5 \times 1} ;10^{10} 1_{5 \times 1} } \right),\;\hat{\varvec{P}}_{0|0}^{{{\varvec{ff}}}} = 10^{3} {\mathbf{I}}_{2} \;{\text{and}}\;\hat{\varvec{P}}_{0|0}^{{{\varvec{Zf}}}} = \left( {\hat{\varvec{P}}_{0|0}^{{{\varvec{fZ}}}} } \right)^{T} = 1_{15 \times 1} . $$

The variance matrices related to the system noise and the random walk of the unknown excitation are set as \({{\varvec{Q}}}_{k}^{s}={10}^{-7}{\mathbf{I}}_{15}\) and \({{\varvec{Q}}}_{k}^{u}={10}^{7}\), respectively. The measurement noise variance matrix is set as \({{\varvec{R}}}_{k+1}={\text{diag}}\left(\left[{10}^{2}{1}_{4\times 1}; {10}^{-8}\right]\right)\). The factor \(\rho \) used to balance the weights of unknown excitation and unknown state is set to 1. The identification results based on the proposed AGEKF-UI algorithm are shown in Figs. 13, 14, 15, 16, 17 and 18.

Fig. 13 Comparison between identified and exact structural states (\({x}_{1}\))

Fig. 14 Comparison between identified and exact structural states (\({\dot{x}}_{2}\))

Fig. 15 The convergence process of stiffness (\({k}_{1} \;{\text{and}} \;{k}_{2}\))

Fig. 16 The convergence process of stiffness (\({k}_{3} \;{\text{and}} \;{k}_{4}\))

Fig. 17 The convergence process of stiffness (\({k}_{5}\))

Fig. 18 Comparison between identified and exact excitation \({f}_{1}^{u}\)

Figures 13 and 14 show that the identified structural states contain some errors, but the overall identification is satisfactory. Figures 15, 16 and 17 show the convergence process of the structural stiffness and indicate that the proposed method is effective for the identification of structural parameters. Figure 18 shows the identification of the unknown excitation. Real-time identification of the unknown excitation is feasible under the condition \({\varvec{D}}={\mathbf{0}}\), and the identified result is basically consistent with the measured value.

4.3 Discussion and analysis

Tables 9 and 10 list the identification errors of the structural parameters and the unknown excitation, respectively. (The data in parentheses are the results based on the GEKF-UI method.) AGEKF-UI and GEKF-UI have almost the same ability in parameter identification (the maximum error does not exceed 0.45%), while AGEKF-UI is clearly superior to GEKF-UI in excitation identification (the RMSE decreases by 37%). This conclusion is consistent with the inference drawn from the algorithm itself.

Table 9 List of identification errors of structural system parameters
Table 10 List of identification errors of the unknown inputs

The adaptive process of \({\lambda }_{k+1}\) is shown in Fig. 19. It can be seen that \({\lambda }_{k+1}\) fluctuates in a narrow range close to 1. According to Tables 2 and 3, the sensitivity matrix of the unknown excitation \({{\varvec{f}}}_{k+1}^{u}\) in the GEKF-UI algorithm is close to zero, while that in the AGEKF-UI algorithm is close to the maximum, which is the underlying reason why the AGEKF-UI algorithm surpasses the GEKF-UI algorithm in excitation identification.

Fig. 19 The adaptive process of \({\lambda }_{k+1}\)

5 Conclusions

In this paper, a novel discrete state equation is constructed by combining zero-order-hold (ZOH) and random walk (RW). This innovation connects the current unknown input with the current state and further connects it with the current measurement, which is the key to ensuring the real-time performance of the system without direct feedthrough of the unknown input. The adaptive discrete state equation for structural dynamical systems regards \({{\varvec{B}}}_{k+1}^{\text{opt}}\) as a basic unknown quantity, thus avoiding the problem of finding the optimal sampling assumption for the unknown input and opening a window for optimizing the identification accuracy of the unknown input.

The proposed adaptive generalized extended Kalman filter with unknown input (AGEKF-UI) algorithm completely eliminates the limitation that real-time performance depends on whether there is a direct feedthrough of the unknown input in the measurement equation, which improves the applicability of existing extended Kalman filter with unknown input (EKF-UI) algorithms. In addition, the proposed algorithm can automatically adjust the sensitivity matrix of the unknown input in an optimal way, which improves the identification accuracy of the existing generalized extended Kalman filter with unknown input (GEKF-UI) algorithm. In order to verify the effectiveness and advancement of the proposed algorithm, a numerical case and an experimental test are presented.

In the proposed AGEKF-UI algorithm, the solution of \({{\varvec{B}}}_{k+1}^{\text{opt}}\) still needs further study. The assumption employed in the numerical example and the experiment to simplify the solution is only a feasible, suboptimal choice. If new solutions are proposed in subsequent research, more effective and practical algorithms may be produced.