INTRODUCTION

The main difficulty of optimal nonlinear filtering of Markov signals is the great complexity of implementing, at the required rate over time, an absolutely optimal filter (AOF) [1–5] that establishes an explicit dependence of the estimate on all the accumulated measurements. It is an infinite-dimensional dynamic (recurrent) system with distributed parameters, since its state is determined by the posterior probability density function. Therefore, the modern particle filter [6–8], which implements the direct trajectory realization of the AOF, requires a very powerful computer due to the use of a cumbersome sequential Monte Carlo method. In practice, this forces us to apply suboptimal finite-dimensional estimation algorithms such as the extended Kalman filter, the simplest linearized approximation to the AOF. However, the accuracy of such approximations may be insufficient, in particular due to the recurrent accumulation of methodological and computational errors over a long period of time [9, 10]. Moreover, the need to calculate, in addition to the estimate vector, the matrix of its covariances makes even such covariance approximations to the AOF rather complicated to implement, especially in the case of a significant number of estimated variables [11].

An alternative to the slow AOF is finite-dimensional optimal structure filters (OSFs) of various orders, which generalize the conditionally optimal filter of a specified structure and fixed order [3, 12, 13]. Continuously discrete versions of the OSF for Markov continuous (diffusion) signals are given in [14, 15], while the continuous OSF for piecewise continuous (jump diffusion) signals is described in [16]. In the finite-dimensional approach, the nonlinear filtering problem is split into two parts. The most complicated part, deriving the filter equations, is solved in advance, at the filter design stage, before any measurements appear. As a result, the implementation of the OSF, which consists in recurrently obtaining estimates from the incoming measurements, requires only the solution (modeling) of these equations. The complexity of this operation is determined by the number of equations and the type of their nonlinearities.

However, the listed continuously discrete OSFs have a number of disadvantages. The accuracy of the small-order filter (SOF) with a continuous prediction [14] is limited by the invariable dimension of its state vector, which is equal to the number of estimated variables. The order of the finite memory filter (FMF) with a piecewise constant prediction [15] can already be chosen, since it is a multiple of the dimension of the measurement vector, but the estimate of this filter depends only on the last few measurements.

In this paper, we propose a procedure for constructing a sufficiently simple recursive large-order filter (LOF) with a piecewise polynomial prediction. Unlike the FMF, its memory time is infinite, while the tradeoff between accuracy and complexity can be controlled by choosing the filter order. The estimate vector is still sought as the best function of the last measurement and the filter state vector. However, the latter is now formed from the vectors of several previous estimates, thereby increasing, in comparison with the SOF, the memory of the filter, which improves the accuracy by expanding the admissible set of estimates. In this case, the old measurements are not forgotten because information about them is accumulated in the estimates. Besides, the class of estimated signals is extended from diffusion to jump diffusion processes.

The synthesis of the new LOF is shown to reduce to finding, in advance, the relevant conditional probability density from a recursive chain of transformations of the prediction-correction type. It is based on the Kolmogorov–Feller integro-differential equation in partial derivatives (prediction) and the Bayes–Stratonovich integral formula (correction). Alternatively, the LOF can be synthesized numerically by the Monte Carlo method, but this involves the cumbersome construction, at each measuring step, of a histogram of the desired structural vector-function of the filter, which depends on a large number of arguments. Therefore, the construction of two traditional numerical-analytical covariance approximations to the LOF is considered. An example of estimating the state of a stochastic Van der Pol oscillator is given.

1. PROBLEM STATEMENT FOR CONTINUOUS-DISCRETE FILTERING

Let us consider the following hidden Markov model of an observation system.

Let a piecewise continuous n-dimensional estimated useful signal \({{X}_{t}} \in {{\mathbb{R}}^{n}}\), which varies in time \(t \in [0,\,T]\), is continuous on the right \({{X}_{t}} = {{X}_{{t + }}}\), and has a limit on the left \({{X}_{{t - }}}\), be the state vector of its forming filter. The latter, being the object of observation, is perturbed both by continuous Gaussian white noise and by pulsed Poisson white noise. It is described by the Ito stochastic differential equation in the following integral form:

$${{X}_{t}} = {{X}_{0}} + \int\limits_0^t {a(\tau {\text{,}}{{X}_{{\tau - }}}) d\tau } + \int\limits_0^t {B(\tau {\text{,}}{{X}_{{\tau - }}})d{{W}_{\tau }}} + \sum\limits_{k = 1}^{P_{0}^{t}({{X}_{0}})} {{{S}_{{{{\Theta }_{k}}({{X}_{{\Theta _{k}^{{\, - }}}}})}}}({{X}_{{\Theta _{k}^{{\, - }}}}})} .$$
(1.1)

Here, X0 is the random initial signal value characterized by the probability density \({{p}_{0}}(x)\), \(a(t,x)\) is the drift vector-function, \(B(t,x)\) is the diffusion matrix function, Wt is the standard Wiener process, \({{S}_{t}}({{X}_{{t - }}}) = {{X}_{t}} - {{X}_{{t - }}}\) is the random vector of the jump amplitude of the random process Xt at time t, which is determined by the conditional probability density \(\xi (t{\text{,}}s{\kern 1pt} {\text{|}}{\kern 1pt} x)\), and \({{\Theta }_{1}},{{\Theta }_{2}}, \ldots \) is the Poisson flow of the jump time points, which depend on the current value of the signal and have a conditional intensity \(\mu (t,x)\),

$$\operatorname{Prob} \,[{{\Theta }_{i}}({{X}_{{\Theta _{i}^{{\, - }}}}}) - {{\Theta }_{{i - 1}}}({{X}_{{\Theta _{{i - 1}}^{{\, - }}}}}) < \tau {\kern 1pt} {\text{|}}{\kern 1pt} {{X}_{t}} = x] = 1 - {{e}^{{ - \mu (t,x)\tau }}},\quad i = 1,2,...,$$

\(P_{0}^{t}(x)\) is an integer-valued conditional process counting the number of these jumps over the time interval [0, t),

$$P_{0}^{t}(x) = \{ j:[{{\Theta }_{1}}({{X}_{{\Theta _{1}^{{\, - }}}}}), \ldots ,{{\Theta }_{j}}({{X}_{{\Theta _{j}^{{\, - }}}}})] \in [0,t){\kern 1pt} {\text{|}}{\kern 1pt} {{X}_{t}} = x\} $$

with the Poisson distribution law:

$$\operatorname{Prob} [P_{0}^{t}(x) = l] = \frac{{{{\nu }^{l}}(t,x)}}{{l!{{e}^{{\nu (t,x)}}}}},\quad l = 0,1,...,\quad \nu (t,x) = \int\limits_0^t {\mu (\tau ,x)d\tau } .$$

Note that, in Eq. (1.1), the integral over time is understood in the mean-square sense and the integral with respect to the Wiener process Wt is understood as the Ito stochastic integral. The probability density \(p(t,x)\) of the piecewise continuous signal Xt is known to satisfy the Kolmogorov–Feller integro-differential equation [2, 17, 18], which generalizes the better-known Fokker–Planck–Kolmogorov differential equation to the class of jump diffusion processes.
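To make the jump diffusion model (1.1) concrete, the following is a minimal simulation sketch for the scalar case: the diffusion part is advanced by the Euler–Maruyama scheme, and the state-dependent Poisson flow of jumps is generated by thinning. All model functions and parameters here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def simulate_jump_diffusion(a, b, mu, sample_jump, x0, T, dt, rng):
    """Euler-Maruyama scheme with Poisson thinning for a scalar version
    of Eq. (1.1): dX = a dt + b dW + jumps of state-dependent intensity mu."""
    n_steps = int(round(T / dt))
    xs = np.empty(n_steps + 1)
    xs[0] = x0
    for i in range(n_steps):
        t, x = i * dt, xs[i]
        dw = rng.normal(0.0, np.sqrt(dt))          # Wiener increment
        x_next = x + a(t, x) * dt + b(t, x) * dw   # diffusion part
        if rng.random() < mu(t, x) * dt:           # a jump occurs with prob. ~ mu*dt
            x_next += sample_jump(t, x)            # amplitude S drawn from xi(t, s | x)
        xs[i + 1] = x_next
    return xs

# usage with illustrative (assumed) model functions
rng = np.random.default_rng(0)
path = simulate_jump_diffusion(
    a=lambda t, x: -x,                             # drift a(t, x)
    b=lambda t, x: 0.5,                            # diffusion B(t, x)
    mu=lambda t, x: 0.2,                           # jump intensity mu(t, x)
    sample_jump=lambda t, x: rng.exponential(1.0),
    x0=1.0, T=5.0, dt=1e-3, rng=rng)
```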

Also let instant inertialess measurements of the vector \({{X}_{t}}\), including incomplete or inaccurate ones, determined by the following meter formula:

$${{Y}_{k}} = {{c}_{k}}({{X}_{{t_{k}^{ - }}}},{{V}_{k}}),\quad {{V}_{k}} \sim {{q}_{k}}(v),\quad k = \overline {0,K} ,$$
(1.2)

be performed at known clock time points of

$${{t}_{0}} = 0 < {{t}_{1}} < ... < {{t}_{K}} \leqslant T.$$

Here, \({{Y}_{k}} \in {{\mathbb{R}}^{m}}\) is the m-dimensional measurement vector, \({{c}_{k}}(x,v)\) is the meter vector-function, and Vk is the vector of independent discrete white noise with a probability density \({{q}_{k}}(v)\).

Regarding relations (1.1) and (1.2), we make the following natural assumptions.

Assumption 1. Equation (1.1) has a unique strong solution \({{X}_{t}}\). Sufficient conditions for this are given in [17, 18].

Assumption 2. The random process Xt and its measurements Yk have finite second moments

$${\rm M}{{\left\| {{{X}_{t}}} \right\|}^{2}} < \infty ,\quad {\rm M}{{\left\| {{{Y}_{k}}} \right\|}^{2}} < \infty ,$$

where \({\rm M}\) is the mathematical expectation operator and \({{\left\| \varepsilon \right\|}^{2}} = {{\varepsilon }^{{\text{T}}}}\varepsilon \) is the square of the Euclidean norm.

Assumption 3. For simplicity, all random variables are considered absolutely continuous, which allows characterizing them by probability densities.

It is required, at each moment of time in any half-interval between two adjacent clock instants, to find the estimate \({{\hat {X}}_{t}}\) of the vector Xt as a function of all the measurements accumulated by this time:

$${{\hat {X}}_{t}} = {{x}_{{t,k}}}(Y_{0}^{k}),\quad t \in ({{t}_{k}},{{t}_{{k + 1}}}],\quad Y_{0}^{k} = ({{Y}_{0}},{{Y}_{1}}, \ldots ,{{Y}_{k}}),$$
(1.3)

which is optimal in the sense of the minimum mean square error of estimation:

$${\rm M}[{{({{X}_{t}} - {{\hat {X}}_{t}})}^{{\text{T}}}}{{C}_{t}}({{X}_{t}} - {{\hat {X}}_{t}})] \to \min ,\quad t \in [0,T].$$
(1.4)

Here, \({{C}_{t}} = C_{t}^{{\text{T}}} \succ 0\) is a symmetric positive definite matrix of weight coefficients. Note that criterion (1.4) ensures the unbiasedness of the estimate

$${\rm M}[{{\hat {X}}_{t}}] = {\rm M}[{{X}_{t}}],$$

which guarantees the absence of a systematic error in it. Thus, this criterion serves to minimize the spread of the estimates around the estimated value.

2. EQUATION SELECTION FOR THE NEW FILTER

We refine the given problem statement; to do this, we first describe the known result of its solution.

2.1. Constructing an Absolutely Optimal Filter

If no restrictions are imposed on the class of estimates (1.3) determined by the functions \({{x}_{{t,k}}}( \cdot )\), whose number of arguments \(m(k + 1)\) grows over time, then the optimal function in the sense of (1.4) is the posterior mean:

$$\begin{array}{*{20}{l}} {{{x}_{{t,k}}}(y_{0}^{k}) = {\rm M}[{{X}_{t}}|Y_{0}^{k} = y_{0}^{k}] = \int x{{\rho }_{k}}(t,x|y_{0}^{k})dx} \end{array},\quad t \in ({{t}_{k}},{{t}_{{k + 1}}}].$$
(2.1)

Here, \({{\rho }_{k}}( \cdot )\) is the posterior probability density, and the integrals here and below are taken over the entire Euclidean space of the corresponding dimension:

$$\int {f(x)dx} = \int\limits_{{{\mathbb{R}}^{n}}} {f(x)dx} .$$

In this case, the density \({{\rho }_{k}}( \cdot )\) does not depend on the previous estimating functions (2.1) or the corresponding estimates \({{\hat {X}}_{t}}\) and satisfies the Kolmogorov–Feller integro-differential equation in partial derivatives (the prediction equation) on each interval between measurements \(({{t}_{k}},{{t}_{{k + 1}}}]\). At the time of the next measurement \({{t}_{{k + 1}}}\), the final section \({{\rho }_{k}}(t_{{k + 1}}^{ - },x{\kern 1pt} {\text{|}}{\kern 1pt} y_{0}^{k})\) of the solution of this equation is recalculated, according to the Bayes formula (the correction equation), into the initial section \({{\rho }_{{k + 1}}}(t_{{k + 1}}^{ + },x{\kern 1pt} {\text{|}}{\kern 1pt} y_{0}^{{k + 1}})\) for the next interval \(({{t}_{{k + 1}}},{{t}_{{k + 2}}}]\). It is usually complicated to implement these calculations at the rate of arrival of the measurements.
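The cost of this chain can be seen from a minimal grid-based sketch for a scalar purely diffusion signal: the prediction step integrates the Fokker–Planck–Kolmogorov part of the Kolmogorov–Feller equation by explicit finite differences, and the correction step applies the Bayes formula. The model functions, the Gaussian likelihood, and all parameters below are assumptions made for illustration only.

```python
import numpy as np

def predict_fpk(p, x, a, bb, t0, t1, dt):
    """Prediction: explicit finite-difference integration of the
    Fokker-Planck-Kolmogorov equation dp/dt = -(a p)_x + 0.5 (bb p)_xx."""
    h = x[1] - x[0]
    n_steps = int(round((t1 - t0) / dt))
    for i in range(n_steps):
        t = t0 + i * dt
        conv = np.gradient(a(t, x) * p, h)                   # drift term (a p)_x
        diff = np.gradient(np.gradient(bb(t, x) * p, h), h)  # diffusion term (B^2 p)_xx
        p = np.clip(p + dt * (-conv + 0.5 * diff), 0.0, None)
        p /= np.trapz(p, x)                                  # renormalize the density
    return p

def correct_bayes(p, x, y, c, r):
    """Correction: Bayes formula with a Gaussian likelihood N(y || c(x), r)."""
    post = np.exp(-0.5 * (y - c(x)) ** 2 / r) * p
    return post / np.trapz(post, x)

# one prediction-correction cycle; the posterior mean is the AOF estimate (2.1)
x = np.linspace(-5.0, 5.0, 401)
p = np.exp(-0.5 * x ** 2); p /= np.trapz(p, x)               # prior N(0, 1)
p = predict_fpk(p, x, a=lambda t, x: -x, bb=lambda t, x: 0.25 + 0.0 * x,
                t0=0.0, t1=0.5, dt=1e-4)
p = correct_bayes(p, x, y=0.8, c=lambda x: x, r=1.0)
estimate = np.trapz(x * p, x)                                # posterior mean estimate
```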

Therefore, we abandon the requirement that the estimate depend explicitly on all measurements (1.3), which leads to the AOF's complexity indicated above. Note that the search for an explicit dependence of the clock estimates \({{\hat {X}}_{{{{t}_{k}}}}}\) of the diffusion signal Xt, similar to (1.3), on only the last few measurements \({{Y}_{k}},{{Y}_{{k - 1}}}, \ldots ,{{Y}_{{k - l}}}\) led to a much simpler, albeit less accurate, continuously discrete OSF [15], but its memory time is finite.

2.2. Filter with a Polynomial Prediction

In order to obtain an equally simple discrete OSF but with an infinite memory time, we propose to search, at each measuring cycle k, for the best estimate Zk of only the clock value \({{X}_{{t_{k}^{{\, - }}}}}\) of the n-dimensional jump diffusion vector Xt. To save computational costs, it is also possible to estimate only a certain most interesting \(n{\kern 1pt} '\)-dimensional part \(X_{t}^{'} \in {{\mathbb{R}}^{{n'}}}\), which, without loss of generality, consists of the first \(n{\kern 1pt} ' \leqslant n\) components of the vector Xt. Therefore, the optimality criterion (1.4), valid at any moment of time, is replaced by a similar but weaker clock optimality condition:

$${\rm M}[{{(X_{{t_{k}^{ - }}}^{'} - {{Z}_{k}})}^{{\text{T}}}}{{C}_{{{{t}_{k}}}}}(X_{{t_{k}^{ - }}}^{'} - {{Z}_{k}})] \to \min ,\quad k = 0,1, \ldots $$
(2.2)

The estimate \({{Z}_{k}}\) will be sought in the form of an explicit dependence on the last measurement Yk and on not more than \(l \in \mathbb{N}\) previous estimates \(Z_{{k - l}}^{{k - 1}} = \left[ {{{Z}_{{k - 1}}},{{Z}_{{k - 2}}}, \ldots ,{{Z}_{{k - l}}}} \right]\), namely,

$${{Z}_{k}} = {{f}_{k}}({{Y}_{k}},Z_{{\max (0,k - l)}}^{{k - 1}}),\quad \,k \geqslant 1,\quad {{Z}_{0}} = {{f}_{0}}({{Y}_{0}}),$$
(2.3)

determining the best function \({{f}_{k}}( \cdot )\) based on condition (2.2).

We heuristically construct a prediction \(\hat {X}_{t}^{'}\) of the information vector \(X_{t}^{'}\) between measurements using one or several optimal clock estimates Zk, in the form of an extrapolating polynomial: the prediction may be constant, linear, etc.:

$$\forall t \in [{{t}_{k}},{{t}_{{k + 1}}}){\text{:}}\,\,\,\hat {X}_{t}^{'} = {{Z}_{k}},\quad k \geqslant 0,\quad {\text{or}}\quad \hat {X}_{t}^{'} = {{Z}_{k}} + \frac{{{{Z}_{k}} - {{Z}_{{k - 1}}}}}{{{{t}_{k}} - {{t}_{{k - 1}}}}}(t - {{t}_{k}}),\quad k \geqslant 1,\quad {\text{etc}}{\text{.}}$$
(2.4)
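These extrapolating predictions amount to a few lines of code; the sketch below assumes the estimates are plain numbers or NumPy vectors.

```python
def predict_constant(z_k):
    """Piecewise constant prediction from (2.4): hold the last clock estimate."""
    return z_k

def predict_linear(z_k, z_km1, t_k, t_km1, t):
    """Linear prediction from (2.4): extrapolate through the last two estimates."""
    return z_k + (z_k - z_km1) / (t_k - t_km1) * (t - t_k)
```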

Thus, to evaluate the random process (1.1) from the discrete measurements (1.2), we propose to synthesize a continuously discrete filter with an optimal clock estimate (2.3) and a polynomial prediction (2.4). Unlike the FMF [15], the old measurements are not forgotten here because the information about them is accumulated in the estimates and so the memory time of the new filter as a dynamic system is infinite. A similar filter for discrete Markov signals is described in [19].

2.3. Filter with a Piecewise Polynomial Prediction

To obtain a more accurate prediction, on the time interval between adjacent measurements \([{{t}_{k}},{{t}_{{k + 1}}}]\), we introduce, similarly to [15], L additional inter-cycle points \(\tau _{k}^{i}\), \(i = \overline {1,L} \), denoting the interval boundaries as \(\tau _{k}^{0}\) and \(\tau _{k}^{{L + 1}}\):

$${{t}_{k}} = \tau _{k}^{0} < \tau _{k}^{1} < \ldots < \tau _{k}^{L} < \tau _{k}^{{L + 1}} = {{t}_{{k + 1}}}.$$

At these points, we will find the inter-cycle estimates \(Z_{k}^{i}\) as explicit functions of the l latest clock estimates

$$Z_{k}^{i} = g_{k}^{i}(Z_{{\max (0,k - l + 1)}}^{k}),$$
(2.5)

determining functions \(g_{k}^{i}( \cdot )\) from the optimality condition analogous to (2.2):

$${\rm M}[{{(X_{{\tau _{k}^{i} - }}^{'} - Z_{k}^{i})}^{{\text{T}}}}{{C}_{{\tau _{k}^{i}}}}\,(X_{{\tau _{k}^{i} - }}^{'} - Z_{k}^{i})] \to \min ,\quad i = \overline {1,L} ,\quad k \geqslant 1.$$
(2.6)

Using these additional estimates \(Z_{k}^{i}\), we can construct any of the polynomial predictions of type (2.4) on each inter-cycle interval separately:

$$\begin{gathered} \forall t \in [\tau _{k}^{i},\tau _{k}^{{i + 1}}),\quad i = \overline {0,L} {\text{:}}\,\,\,\hat {X}_{t}^{'} = Z_{k}^{i},\quad k \geqslant 0, \\ {\text{or}}\quad \hat {X}_{t}^{'} = Z_{k}^{i} + \frac{{Z_{k}^{i} - Z_{k}^{{i - 1}}}}{{\tau _{k}^{i} - \tau _{k}^{{i - 1}}}}(t - \tau _{k}^{i}),\quad k \geqslant 1,\quad {\text{etc}}. \\ \end{gathered} $$
(2.7)

3. THE STATE VECTOR AND THE PROPOSED FILTER ORDER

Equation (2.3), linking the current estimate Zk with the previous ones, is a high-order recurrence formula (difference equation). We transform it into a system of first-order equations; to do so, we collect the l latest estimates into a block column-vector of initially growing dimension:

$${{U}_{k}} = Z_{{\max (0,k - l + 1)}}^{k} = \left\{ \begin{gathered} Z_{0}^{k},\quad k = \overline {0,l - 1} \;\;({\text{accumulation stage}}),\quad \dim Z_{0}^{k} = n{\kern 1pt} '(k + 1), \hfill \\ Z_{{k - l + 1}}^{k},\quad k \geqslant l\;\;({\text{upgrade stage}}),\quad \dim Z_{{k - l + 1}}^{k} = n{\kern 1pt} 'l. \hfill \\ \end{gathered} \right.$$
(3.1)

The process of changing this vector Uk of stored estimates may be written recursively,

$${{U}_{k}} = {{s}_{k}}({{Z}_{k}},{{U}_{{k - 1}}}),\quad k \geqslant 1,\quad {{U}_{0}} = {{Z}_{0}},$$
(3.2)

using the following vector-functions of estimate accumulation or updating:

$${{s}_{k}}({{Z}_{k}},{{U}_{{k - 1}}}) = \left\{ \begin{gathered} \left[ {\begin{array}{*{20}{c}} {{{Z}_{k}}} \\ {{{U}_{{k - 1}}}} \end{array}} \right],\quad k = \overline {1,l - 1} , \hfill \\ \left[ {\begin{array}{*{20}{c}} {{{Z}_{k}}} \\ {C\,{{U}_{{k - 1}}}} \end{array}} \right],\quad k \geqslant l, \hfill \\ \end{gathered} \right.\quad C = \left[ {\begin{array}{*{20}{c}} {{{E}_{{(l - 1)n'}}}}&{{{O}_{{(l - 1)n' \times n'}}}} \end{array}} \right].$$

Here, C is the matrix that removes the last, obsolete block \({{Z}_{{k - l}}}\) from the block column-vector \({{U}_{{k - 1}}}\), while E and O are the identity and zero matrices, respectively. Then, the main filter equation (2.3) and the additional expressions for the inter-cycle estimates (2.5) take the following form:

$${{Z}_{k}} = {{f}_{k}}({{Y}_{k}},{{U}_{{k - 1}}}),\quad k \geqslant 1,\quad {{Z}_{0}} = {{f}_{0}}({{Y}_{0}}),$$
(3.3)
$$Z_{k}^{i} = g_{k}^{i}({{U}_{k}}),\quad i = \overline {1,L} ,\quad k \geqslant 0.$$
(3.4)

Remark 1. Instead of (3.4), we can also search for direct dependences of the inter-cycle estimates \(Z_{k}^{i}\) on the last measurement Yk, similar to (2.3), replacing (2.5) by the relations

$$Z_{k}^{i} = f_{k}^{i}({{Y}_{k}},Z_{{\max (0,k - l)}}^{{k - 1}}),\quad k \geqslant 1,\quad Z_{0}^{i} = f_{0}^{i}({{Y}_{0}}),\quad i = \overline {1,L} .$$

Given (3.1), these formulas will be similar to (3.3):

$$Z_{k}^{i} = f_{k}^{i}({{Y}_{k}},{{U}_{{k - 1}}}),\quad k \geqslant 1,\quad Z_{0}^{i} = f_{0}^{i}({{Y}_{0}}),\quad i = \overline {1,L} .$$

Consequently, the initial input-output filter equations (2.3) and (2.5) are represented by the equivalent input-state-output relations (3.2)–(3.4). Indeed, the first-order difference equation (3.2) is the state equation of the filter, while relations (3.3) and (3.4) are the formulas of its outputs. In this case, the state of the filter is determined by the block vector (3.1), and its maximal dimension \(p = ln{\kern 1pt} '\), which does not change starting from the lth step \(k = l - 1\), is the filter order. By increasing the multiplicity coefficient l, this order may be made arbitrarily large. Only the filter's output functions \({{f}_{k}}( \cdot )\) and \(f_{k}^{i}( \cdot )\) are subject to optimization by criterion (2.2), while its state function \({{s}_{k}}( \cdot )\) is fixed. As a result, the following statement is proved.

Lemma. The filter defined by recurrence formulas (2.3) and (2.5) is finite-dimensional with a fixed state equation (3.2) and optimized output formulas (3.3) and (3.4). Its state vector (3.1) consists of the desired number l of estimates remembered by the filter, so that the filter order is equal to \(ln{\kern 1pt} '\), where \(n{\kern 1pt} '\) is the dimension of the estimate vector.

Remark 2. Due to the invariance of the state function \({{s}_{k}}( \cdot )\), the proposed estimation algorithm belongs to the type of semi-optimal OSFs of an arbitrary order [20]. The new filter differs from the FMF in principle by replacing the past measurements \(Y_{{\max (0,k - l)}}^{{k - 1}}\) with the past estimates \(Z_{{\max (0,k - l)}}^{{k - 1}}\). In the particular case of the multiplicity coefficient l = 1, when Eqs. (2.3) and (2.5) take the simplest form

$${{Z}_{k}} = {{f}_{k}}({{Y}_{k}},{{Z}_{{k - 1}}}),\quad Z_{k}^{i} = f_{k}^{i}({{Y}_{k}},{{Z}_{{k - 1}}}),\quad i = \overline {1,L} ,\quad k \geqslant 1,\quad {{Z}_{0}} = {{f}_{0}}({{Y}_{0}}),$$

this filter degenerates into the well-known SOF [15, 21].

Therefore, we give the following definition.

Definition. The recurrent dependence of clock estimates on measurements (2.3), as well as the equivalent pair of the state equation (3.2) and the output formula (3.3), will be termed a filter of large order \(p = ln{\kern 1pt} '\) with a polynomial prediction (2.4). These same relations, but with additional formulas for inter-cycle predictions (2.5) or (3.4), will be termed a filter with a piecewise polynomial prediction (2.7).

4. FINDING FILTER OUTPUT FUNCTIONS

Substituting the desired formulas for the estimates (2.3) and (2.5) into the corresponding optimality criteria (2.2) and (2.6), we obtain the following result, similar to (2.1), based on the well-known theorem on the best mean square regression.

Theorem 1 (on the optimal structure of the LOF). Under Assumptions 1–3, the best functions among all Borel-measurable filter output functions exist and are found as the following conditional means:

$$\begin{array}{*{20}{l}} {{{f}_{0}}({{y}_{0}}) = {\text{M}}[X_{0}^{'}|{{y}_{0}}] = \int \,x_{0}^{'}{{\rho }_{0}}({{x}_{0}}{\kern 1pt} {\text{|}}{\kern 1pt} {{y}_{0}})d{{x}_{0}},} \end{array}$$
(4.1)
$$\begin{array}{*{20}{l}} {{{f}_{k}}({{y}_{k}},{{u}_{{k - 1}}}) = {\text{M}}[X_{{t_{k}^{ - }}}^{'}{\kern 1pt} {\text{|}}{\kern 1pt} {{y}_{k}},{{u}_{{k - 1}}}] = \int \,x{\kern 1pt} '{{\rho }_{k}}(t_{k}^{ - },x{\kern 1pt} {\text{|}}{\kern 1pt} {{y}_{k}},{{u}_{{k - 1}}})dx,} \end{array}\quad k \geqslant 1,$$
(4.2)
$$\begin{array}{*{20}{l}} {g_{k}^{i}({{u}_{k}}) = {\text{M}}[X_{{\tau _{k}^{i} - }}^{'}{\kern 1pt} {\text{|}}{\kern 1pt} {{u}_{k}}] = \int \,x{\kern 1pt} '{{\pi }_{k}}(\tau _{k}^{i} - ,x{\kern 1pt} {\text{|}}{\kern 1pt} {{u}_{k}})dx,\quad i = \overline {1,L} ,\quad k \geqslant 1} \end{array},$$
(4.3)

and their corresponding estimates Zk and \(Z_{k}^{i}\) are unbiased.

However, these relationships only express the desired functions in terms of three conditional probability densities: the initial \({{\rho }_{0}}({{x}_{0}}{\kern 1pt} {\text{|}}{\kern 1pt} {{y}_{0}})\), the corrective \({{\rho }_{k}}(t_{k}^{ - },x{\kern 1pt} {\text{|}}{\kern 1pt} {{y}_{k}},{{u}_{{k - 1}}})\), and the predictive \({{\pi }_{k}}(t,x{\kern 1pt} {\text{|}}{\kern 1pt} {{u}_{k}})\), \(t \in [{{t}_{k}},{{t}_{{k + 1}}})\). Using the Bayes formula, we find the first of them from the known density \({{p}_{0}}(x)\) of the initial state of object (1.1),

$${\begin{array}{*{20}{l}} {{{\rho }_{0}}({{x}_{0}}{\kern 1pt} {\text{|}}{\kern 1pt} {{y}_{0}}) = {{{{\beta }_{0}}({{y}_{0}}{\kern 1pt} {\text{|}}{\kern 1pt} {{x}_{0}}){{p}_{0}}({{x}_{0}})} \mathord{\left/ {\vphantom {{{{\beta }_{0}}({{y}_{0}}{\kern 1pt} {\text{|}}{\kern 1pt} {{x}_{0}}){{p}_{0}}({{x}_{0}})} {\int \,numerator\,d{{x}_{0}}}}} \right. \kern-0em} {\int \,numerator\,d{{x}_{0}}}}} \end{array}},$$
(4.4)

and represent each correcting density \({{\rho }_{k}}({{x}_{{{{t}_{k}}}}}{\kern 1pt} {\text{|}}{\kern 1pt} {{y}_{k}},{{u}_{{k - 1}}})\) through the final cross section at \(t = t_{k}^{ - }\) of the previous predictive \({{\pi }_{{k - 1}}}(t,x{\kern 1pt} {\text{|}}{\kern 1pt} {{u}_{{k - 1}}})\), which is valid only on the interval \(t \in [{{t}_{{k - 1}}},{{t}_{k}})\):

$${\begin{array}{*{20}{l}} {{{\rho }_{k}}(t_{k}^{ - },x{\kern 1pt} {\text{|}}{\kern 1pt} {{y}_{k}},{{u}_{{k - 1}}}) = {{{{\beta }_{k}}({{y}_{k}}{\kern 1pt} {\text{|}}{\kern 1pt} x){{\pi }_{{k - 1}}}(t_{k}^{ - },x{\kern 1pt} {\text{|}}{\kern 1pt} {{u}_{{k - 1}}})} \mathord{\left/ {\vphantom {{{{\beta }_{k}}({{y}_{k}}{\kern 1pt} {\text{|}}{\kern 1pt} x){{\pi }_{{k - 1}}}(t_{k}^{ - },x{\kern 1pt} {\text{|}}{\kern 1pt} {{u}_{{k - 1}}})} {\int \,numerator\,dx}}} \right. \kern-0em} {\int \,numerator\,dx}}} \end{array}},\quad k \geqslant 1.$$
(4.5)

In the last two relations, \({{\beta }_{k}}({{y}_{k}}|{{x}_{k}})\) is the likelihood function obtained from the measuring formula (1.2),

$${{\beta }_{k}}({{y}_{k}}{\kern 1pt} {\text{|}}{\kern 1pt} {{x}_{k}}) = \int {\delta [{{y}_{k}} - {{c}_{k}}({{x}_{k}},v)]{{q}_{k}}(v)dv} ,$$
(4.6)

where \(\delta ( \cdot )\) is the Dirac function, and the symbol \(numerator\) hereinafter designates the numerator of the corresponding fraction. Similarly, the predicted density \({{\pi }_{k}}(t,x{\kern 1pt} {\text{|}}{\kern 1pt} {{u}_{k}})\) can be expressed in terms of the joint probability density \({{r}_{k}}(t,x,{{u}_{k}})\) of random states of the observation object Xt and filter Uk:

$${\begin{array}{*{20}{l}} {{{\pi }_{k}}(t,x{\kern 1pt} {\text{|}}{\kern 1pt} {{u}_{k}}) = {{{{r}_{k}}(t,x,{{u}_{k}})} \mathord{\left/ {\vphantom {{{{r}_{k}}(t,x,{{u}_{k}})} {\int \,numerator\,dx}}} \right. \kern-0em} {\int \,numerator\,dx}}} \end{array}},\quad t \in [{{t}_{k}},{{t}_{{k + 1}}}),\quad k \geqslant 0.$$
(4.7)

Finally, using the Markov property of the jump diffusion process Xt determined by the Ito equation (1.1), each of the probability densities \({{r}_{k}}( \cdot )\) on the corresponding time interval between measurements can be shown [15] to satisfy the well-known Kolmogorov–Feller equation:

$$\frac{{\partial {{r}_{k}}(t,x,{{u}_{k}})}}{{\partial t}} = {{K}_{x}}[{{r}_{k}}(t,x,{{u}_{k}})] + {{F}_{x}}[{{r}_{k}}(t,x,{{u}_{k}})],\quad t \in [{{t}_{k}},{{t}_{{k + 1}}}),\quad k \geqslant 0.$$
(4.8)

Here, Kx and Fx are the two direct generating operators of the process: the Fokker–Planck–Kolmogorov differential operator

$${{K}_{x}}[r(t,x,u)] = - \nabla _{x}^{{\text{T}}}[a(t,x)r(t,x,u)] + 0.5{\text{tr}}\left[ {{{\nabla }_{x}}\nabla _{x}^{{\text{T}}}\left( {B(t,x){{B}^{{\text{T}}}}(t,x)r(t,x,u)} \right)} \right]$$

with the gradient vector \({{\nabla }_{x}}\) with respect to x and the matrix trace operator tr, as well as the Feller integral operator

$${{F}_{x}}[r(t,x,u)] = - \mu (t,x)r(t,x,u) + \int {\mu (t,x - s)r(t,x - s,u)\xi (t,s{\kern 1pt} {\text{|}}{\kern 1pt} x - s)ds} .$$

Note that, if the drift function \(a(t,x)\) is not once differentiable with respect to the variable x or the diffusion function \(B(t,x)\) is not twice differentiable with respect to the variable x, the solution to Eq. (4.8) is understood in the generalized (Bubnov–Galerkin) sense [22, 23].

Based on the relations between random variables \({{U}_{0}} = {{Z}_{0}}\) and \({{Z}_{0}} = {{f}_{0}}({{Y}_{0}})\) given in (3.2) and (3.3) and the known properties of probability densities, the initial condition for equation (4.8) on the first interval \(t \in [{{t}_{0}},{{t}_{1}})\) is the probability density:

$${{r}_{0}}({{t}_{0}},x,{{u}_{0}}) = {{p}_{0}}(x)\int {\delta [{{u}_{0}} - {{f}_{0}}({{y}_{0}})]{{\beta }_{0}}({{y}_{0}}{\kern 1pt} {\text{|}}{\kern 1pt} x)d{{y}_{0}}} .$$
(4.9)

Here, the initial estimate function \({{f}_{0}}({{y}_{0}})\) is already known from (4.1) and (4.4).

Similarly, the initial condition \({{r}_{k}}({{t}_{k}},x,{{u}_{k}})\) for Eq. (4.8) is found on each of the subsequent inter-cycle time intervals \(t \in [{{t}_{k}},{{t}_{{k + 1}}})\), \(k \geqslant 1\). It is related to the final value \({{r}_{{k - 1}}}(t_{k}^{ - },x,{{u}_{{k - 1}}})\) of the solution of this equation on the previous interval \(t \in [{{t}_{{k - 1}}},{{t}_{k}})\) through the change of the filter state vector from \({{U}_{{k - 1}}}\) to \({{U}_{k}}\) at the time instant \(t = {{t}_{k}}\) by formula (3.2). Indeed, when a new measurement Yk appears and the next estimate Zk is obtained according to (3.3) using the found function \({{f}_{k}}({{y}_{k}},{{u}_{{k - 1}}})\), this estimate is added to the vector \({{U}_{{k - 1}}}\), with the possible removal of the outdated estimate \({{Z}_{{k - l}}}\) at the update stage. Since \({{U}_{k}} = Z_{0}^{k}\) at the estimate accumulation stage and \({{U}_{k}} = Z_{{k - l + 1}}^{k}\) at the update stage, the sections of the probability densities \({{r}_{k}}( \cdot )\) and \({{r}_{{k - 1}}}( \cdot )\) at the same time tk are related by one of the following two relations:

$$\begin{gathered} {{r}_{k}}({{t}_{k}},x,z_{0}^{k}) = {{r}_{{k - 1}}}(t_{k}^{ - },x,z_{0}^{{k - 1}})\int {\delta [{{z}_{k}} - {{f}_{k}}({{y}_{k}},z_{0}^{{k - 1}})]{{\beta }_{k}}({{y}_{k}}{\kern 1pt} {\text{|}}{\kern 1pt} x)d{{y}_{k}}} ,\quad k = \overline {1,l - 1} , \\ {{r}_{k}}({{t}_{k}},x,z_{{k - l + 1}}^{k}) = \iint {{{r}_{{k - 1}}}(t_{k}^{ - },x,z_{{k - l}}^{{k - 1}})\delta [{{z}_{k}} - {{f}_{k}}({{y}_{k}},z_{{k - l}}^{{k - 1}})]{{\beta }_{k}}({{y}_{k}}{\kern 1pt} {\text{|}}{\kern 1pt} x)d{{y}_{k}}d{{z}_{{k - l}}}},\quad k \geqslant l. \\ \end{gathered} $$
(4.10)

Here, the function \({{f}_{k}}( \cdot )\) is known from (4.2), (4.5), and (4.7), while integration over the variable \({{z}_{{k - l}}}\) in the second of these expressions corresponds to the removal of its block \({{Z}_{{k - l}}}\) from the vector \({{U}_{{k - 1}}} = Z_{{k - l}}^{{k - 1}}\).

5. ALGORITHM FOR PRECISE FILTER SYNTHESIS

The relations given above determine the following sequence of actions. They are performed in advance, before the measurement results appear, and, therefore, can be implemented on a rather powerful computer. The latter favorably distinguishes this filter and other OSFs from the classical AOF.

The first method for finding the optimal clock functions of the filter (4.2) is to alternate solving the Kolmogorov–Feller equation (4.8) on the next inter-cycle time interval \(t \in [{{t}_{k}},{{t}_{{k + 1}}})\), \(k = 0,1, \ldots \), with the corresponding clock conversion (4.10) of the final value of the obtained probability density \({{r}_{k}}(t_{{k + 1}}^{ - },x,{{u}_{k}})\) into the initial condition \({{r}_{{k + 1}}}({{t}_{{k + 1}}},x,{{u}_{{k + 1}}})\) for solving the same equation on the next interval. To perform such a conversion, the clock function \({{f}_{{k + 1}}}({{y}_{{k + 1}}},{{u}_{k}})\) is required. Therefore, each solution \({{r}_{k}}( \cdot )\) of this equation obtained on the kth interval is converted by formula (4.7) into the predictive probability density \({{\pi }_{k}}( \cdot )\), the final section of which, according to (4.5), is converted into the correction density \({{\rho }_{{k + 1}}}( \cdot )\). The latter allows finding by (4.2) the function \({{f}_{{k + 1}}}( \cdot )\) that, firstly, is the result of the filter synthesis at the given clock cycle and, secondly, performs the aforementioned clock conversion of \({{r}_{k}}( \cdot )\) into \({{r}_{{k + 1}}}( \cdot )\) in accordance with (4.10). The initial condition for solving Eq. (4.8) on the interval \(t \in [{{t}_{0}},{{t}_{1}})\) is the probability density \({{r}_{0}}( \cdot )\) immediately obtained from (4.1), (4.4), and (4.9). At the same time, each inter-cycle cross section of the predictive density \({{\pi }_{k}}( \cdot )\) at time \(\tau _{k}^{i}\) also allows finding, by (4.3), the function \(g_{k}^{i}({{u}_{k}})\) for calculating the additional estimate \(Z_{k}^{i}\), which serves to construct the more accurate piecewise polynomial prediction (2.7).

The second, numerical, method of filter synthesis is to obtain almost exact optimal conditional-mean functions (4.1)–(4.3) by performing these calculations in advance using the Monte Carlo method applied step-by-step in time. For this purpose, it is necessary to perform multiple statistical simulations of the equations of object (1.1), meter (1.2), and filter (3.2) and (3.3) at each measuring step k in order to obtain sufficiently large samples of realizations of the random variables \(\{ {{X}_{{t_{k}^{{\, - }}}}}\} \), \(\{ {{Y}_{k}}\} \), and \(\{ {{U}_{{k - 1}}}\} \). In this case, the Ito equation can be integrated using one of the well-known stochastic difference schemes of the Runge–Kutta type, for example, the Euler–Maruyama method. Then, at the kth step, the simulation results should be processed by constructing a histogram of the optimal conditional-mean function \({{f}_{k}}({{y}_{k}},{{u}_{{k - 1}}})\) and smoothing it by any of the known methods. Knowledge of this function allows continuing the filter modeling process for the next clock cycle. If necessary, the inter-cycle values of the signal \(\{ {{X}_{{\tau _{k}^{i} - }}}\} \) should also be recorded in order to obtain the additional functions \(g_{k}^{i}({{u}_{k}})\) in the same way.
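A crude sketch of one step k of this second method: given a simulated sample of triples \(({{X}_{{t_{k}^{{\, - }}}}},{{Y}_{k}},{{U}_{{k - 1}}})\), the conditional mean \({{f}_{k}}(y,u)\) is approximated by local averaging. Here a nearest-neighbor average stands in for the histogram construction and smoothing mentioned above, and the scalar test model at the end is an arbitrary assumption.

```python
import numpy as np

def fit_conditional_mean(X, Y, U, n_neighbors=50):
    """Approximate f_k(y, u) = M[X' | Y_k = y, U_{k-1} = u] from a
    simulated sample {(X_j, Y_j, U_j)} by k-nearest-neighbor averaging,
    a simple substitute for histogram construction and smoothing."""
    A = np.column_stack([Y, U])                     # regressor pairs (y, u)

    def f_k(y, u):
        q = np.concatenate([np.atleast_1d(y), np.atleast_1d(u)])
        dist = np.linalg.norm(A - q, axis=1)        # distances to the sample
        idx = np.argpartition(dist, n_neighbors)[:n_neighbors]
        return X[idx].mean(axis=0)                  # local conditional mean

    return f_k

# usage on synthetic scalar data (assumed model: Y = X^2 + noise)
rng = np.random.default_rng(1)
X = rng.normal(2.0, 0.5, size=20_000)               # clock signal values
U = X + rng.normal(0.0, 0.3, size=X.size)           # previous estimates
Y = X**2 + rng.normal(0.0, 1.0, size=X.size)        # measurements as in (1.2)
f_k = fit_conditional_mean(X, Y, U)
z_k = f_k(4.1, 2.0)                                 # clock estimate Z_k
```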

However, both described methods of precise synthesis of the optimal filter are technically rather complicated. Therefore, we further consider the construction of two rather simple numerical-analytical approximations to the proposed filter. These approximations take into account only the covariances of certain random variables, losing some estimation accuracy as a result, while the form of their functions can be represented by analytical expressions. It only remains to obtain the parameters of these expressions numerically.

6. GAUSSIAN APPROXIMATION TO THE FILTER

We use the procedure described in [15] for constructing this well-known approximation to the FMF, which, as noted above, differs from the proposed LOF only in the form of the state vector Uk.

We approximate the numerators of the fractions in the Bayes formulas (4.4), (4.5), and (4.7), namely, one conditional and two unconditional probability densities:

$${{\eta }_{k}}(t_{k}^{ - },x,{{y}_{k}}{\kern 1pt} {\text{|}}{\kern 1pt} {{u}_{{k - 1}}}) = {{\beta }_{k}}({{y}_{k}}{\kern 1pt} {\text{|}}{\kern 1pt} x){{\pi }_{{k - 1}}}(t_{k}^{ - },x{\kern 1pt} {\text{|}}{\kern 1pt} {{u}_{{k - 1}}}),\quad {{r}_{{k - 1}}}(t_{k}^{ - },x,{{u}_{{k - 1}}}),\quad {{r}_{0}}({{x}_{0}},{{y}_{0}}) = {{\beta }_{0}}({{y}_{0}}{\kern 1pt} {\text{|}}{\kern 1pt} {{x}_{0}}){{p}_{0}}({{x}_{0}})$$

by the normal density \(N( \cdot \parallel m,D)\) with the corresponding conditional or unconditional parameters, the mean m and covariance D. Then, by the properties of the Gaussian distribution law, the conditional densities \({{\rho }_{k}}(t_{k}^{{\, - }},x{\kern 1pt} {\text{|}}{\kern 1pt} {{y}_{k}},{{u}_{{k - 1}}})\), \({{\pi }_{k}}(t,x{\kern 1pt} {\text{|}}{\kern 1pt} {{u}_{k}})\), and \({{\rho }_{0}}({{x}_{0}}{\kern 1pt} {\text{|}}{\kern 1pt} {{y}_{0}})\) determined by these fractions are also almost Gaussian, while their parameters, among which the desired conditional means (4.1)–(4.3) are present, are expressed in terms of m and D by the normal correlation theorem. Therefore, the optimal filter output formulas (3.3) obtain approximations that are linear in the measurements Yk and Y0. Furthermore, due to the following relation between the densities

$${{\pi }_{{k - 1}}}(t_{k}^{ - },x{\kern 1pt} {\text{|}}{\kern 1pt} {{u}_{{k - 1}}}) = \int {{{\eta }_{k}}(t_{k}^{ - },x,{{y}_{k}}{\kern 1pt} {\text{|}}{\kern 1pt} {{u}_{{k - 1}}})\,d{{y}_{k}}} $$

and the form (4.6) of the likelihood function \({{\beta }_{k}}({{y}_{k}}{\kern 1pt} {\text{|}}{\kern 1pt} x)\), three of the five parameters of the Gaussian approximation of the conditional density \({{\eta }_{k}}(t_{k}^{ - },x,{{y}_{k}}{\kern 1pt} {\text{|}}{\kern 1pt} {{u}_{{k - 1}}})\) can be expressed in terms of its two other parameters using the measuring function \({{c}_{k}}(x,v)\) of (1.2). As a result, similarly to [15], we obtain the following algorithm for computing the clock and inter-cycle estimates.

Theorem 2 (on the Gaussian LOF equations). If the two conditional means of the meter (1.2)

$$\begin{gathered} {{\nu }_{k}}(x) = {\text{M[}}{{c}_{k}}(x,{{V}_{k}}){\text{]}} \triangleq \int \,{{c}_{k}}(x,v){{q}_{k}}(v)dv, \\ {{\Pi }_{k}}(x) = {\text{M[}}{{c}_{k}}(x,{{V}_{k}})c_{k}^{{\text{T}}}(x,{{V}_{k}}){\text{]}} \\ \end{gathered} $$

have Gaussian moments (the characteristics of their statistical linearization according to Kazakov)

$$\begin{array}{*{20}{c}} {{{h}_{k}}(m,D) = {\text{M}}_{N}^{{m,D}}{\text{[}}{{\nu }_{k}}(X){\text{]}} \triangleq \int \,{{\nu }_{k}}(x)N(x||m,D)dx,} \\ {{{F}_{k}}(m,D) = {\text{M}}_{N}^{{m,D}}{\text{[}}{{\Pi }_{k}}(X){\text{]}} - {{h}_{k}}(m,D)h_{k}^{{\text{T}}}(m,D),} \end{array}\quad {{G}_{k}}(m,D) = \frac{{\partial {{h}_{k}}(m,D)}}{{\partial m}},$$
(6.1)

then the following suboptimal filter equations hold:

$$\begin{gathered} {{Z}_{0}} = {{H}_{0}}{{Y}_{0}} + {{e}_{0}},\quad ~{{U}_{k}} = Z_{{\max (0,k - l + 1)}}^{k},\quad {{\Lambda }_{k}} = {{\Gamma }_{k}}{{U}_{{k - 1}}} + {{\kappa }_{k}},\quad k \geqslant 1, \\ {{Z}_{k}} = \Lambda _{k}^{'} + T_{k}^{'}G_{k}^{{\text{T}}}({{\Lambda }_{k}},{{T}_{k}})F_{k}^{ \oplus }({{\Lambda }_{k}},{{T}_{k}})\left[ {{{Y}_{k}} - {{h}_{k}}({{\Lambda }_{k}},{{T}_{k}})} \right],\quad k \geqslant 1, \\ {Z_{k}^{i} = \Gamma _{k}^{i}{{U}_{k}} + \kappa _{k}^{i}},\quad i = \overline {1,L} ,\quad k \geqslant 1. \\ \end{gathered} $$
(6.2)

Here, \({{\Lambda }_{k}}\) is the clock prediction vector, \(\Lambda _{k}^{'}\) and \(T_{k}^{'}\) are formed from the first \(n{\kern 1pt} '\) rows of the \(n\)-row array \({{\Lambda }_{k}}\) and matrix \({{T}_{k}}\), \( \oplus \) is the symbol of the Moore–Penrose matrix pseudoinverse, \({{H}_{0}}\) and e0 are the initial filter parameters, \({{\Gamma }_{k}},{{\kappa }_{k}},\) and Tk are the clock parameters, and \(\Gamma _{k}^{i}\) and \(\kappa _{k}^{i}\) are the inter-cycle parameters. All these parameters are expressed through the first two moments m and D of the known random variables by the following formulas:

$${{H}_{0}} = D_{{00}}^{{x{\kern 1pt} 'y}}{{(D_{0}^{y})}^{ \oplus }},\quad {{e}_{0}} = m_{0}^{{x{\kern 1pt} '}} - {{H}_{0}}m_{0}^{y},$$
$${{\Gamma }_{k}} = D_{{t_{k}^{ - },k - 1}}^{{x,u}}\,{{(D_{{k - 1}}^{u})}^{ \oplus }},\quad {{\kappa }_{k}} = m_{{t_{k}^{ - }}}^{x} - {{\Gamma }_{k}}m_{{k - 1}}^{u},\quad {{T}_{k}} = D_{{t_{k}^{ - }}}^{x} - {{\Gamma }_{k}}{{(D_{{t_{k}^{ - },k - 1}}^{{x,u}})}^{{\text{T}}}},$$
(6.3)
$$\Gamma _{k}^{i} = D_{{\tau _{k}^{i} - ,k}}^{{x{\kern 1pt} ',u}}{{(D_{k}^{u})}^{ \oplus }},\quad \kappa _{k}^{i} = m_{{\tau _{k}^{i} - }}^{{x{\kern 1pt} '}} - \Gamma _{k}^{i}m_{k}^{u},\quad i = \overline {1,L} .$$

Note that these parameters are now also easily determined by the Monte Carlo method by processing the results of multiple cycle-by-cycle statistical modeling of the equations of object (1.1), meter (1.2), and filter (6.2), without constructing histograms.
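For instance, the clock parameters (6.3) can be computed from such a simulated sample by replacing the moments m and D with their sample counterparts; a sketch under this assumption (rows of the input arrays are realizations):

```python
import numpy as np

def clock_parameters(X_minus, U_prev):
    """Sample-based estimates of Gamma_k, kappa_k, T_k from (6.3):
    rows of X_minus are realizations of X_{t_k^-}, rows of U_prev of U_{k-1}."""
    mx = X_minus.mean(axis=0)                 # m^x at t_k^-
    mu = U_prev.mean(axis=0)                  # m^u of the filter state
    Xc, Uc = X_minus - mx, U_prev - mu
    N = X_minus.shape[0]
    D_xu = Xc.T @ Uc / (N - 1)                # cross-covariance D^{x,u}
    D_u = Uc.T @ Uc / (N - 1)                 # covariance D^u
    D_x = Xc.T @ Xc / (N - 1)                 # covariance D^x
    Gamma = D_xu @ np.linalg.pinv(D_u)        # Moore-Penrose pseudoinverse
    kappa = mx - Gamma @ mu
    T = D_x - Gamma @ D_xu.T
    return Gamma, kappa, T

# the clock prediction is then Lambda_k = Gamma @ U_{k-1} + kappa, as in (6.2)
```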

7. LINEARIZED APPROXIMATION TO THE FILTER

Unfortunately, the integral procedure (6.1) for finding the statistical linearization characteristics of the nonlinearities, which is necessary for constructing a Gaussian filter, is rather complicated, and these characteristics do not always exist in real estimation problems [24]. Therefore, another well-known approximation, which is much simpler and hence more widely used, is applicable. Similarly to [15], we can easily obtain the following statement.

Corollary (on the linearized LOF correction functions). If the nonlinearity \({{c}_{k}}(x,v)\) of meter (1.2) is differentiable with respect to both arguments and the Taylor formula at the point \(({{\Lambda }_{k}},m_{k}^{v})\), where Λk is the prediction of the vector \({{X}_{{t_{k}^{ - }}}}\) and \(m_{k}^{v}\) is the average value of the interference Vk,

$${{c}_{k}}({{X}_{{t_{k}^{ - }}}},{{V}_{k}}) \approx {{c}_{k}}({{\Lambda }_{k}},m_{k}^{v}) + C_{k}^{x}({{\Lambda }_{k}})({{X}_{{t_{k}^{ - }}}} - {{\Lambda }_{k}}) + C_{k}^{v}({{\Lambda }_{k}})({{V}_{k}} - m_{k}^{v}),$$

is valid for it, then the functions of the Gaussian correction (6.1) are approximated by the following expressions:

$$\begin{gathered} {{h}_{k}}(m,D) \approx {{c}_{k}}(m,m_{k}^{v}),\quad {{G}_{k}}(m,D) \approx C_{k}^{x}(m), \\ {{F}_{k}}(m,D) \approx C_{k}^{x}(m)DC{{_{k}^{x}}^{{\text{T}}}}(m) + C_{k}^{v}(m){{R}_{k}}C{{_{k}^{v}}^{{\text{T}}}}(m). \\ \end{gathered} $$
(7.1)

Here, \(C_{k}^{x}(x) = {{\left. {{{\partial {{c}_{k}}(x,v)} \mathord{\left/ {\vphantom {{\partial {{c}_{k}}(x,v)} {\partial x}}} \right. \kern-0em} {\partial x}}} \right|}_{{v = m_{k}^{v}}}}\) and \(C_{k}^{v}(x) = {{\left. {{{\partial {{c}_{k}}(x,v)} \mathord{\left/ {\vphantom {{\partial {{c}_{k}}(x,v)} {\partial v}}} \right. \kern-0em} {\partial v}}} \right|}_{{v = m_{k}^{v}}}}\) are the sections of the Jacobian matrices of partial derivatives, and \({{R}_{k}} = {\text{cov}}\,[{{V}_{k}},{{V}_{k}}]\) is the covariance matrix of the measurement interference.

As a result, the linearized LOF equations also have form (6.2), but with easily obtained functions (7.1), while its parameters are still calculated by formulas (6.3).
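Putting (6.2) and (7.1) together, one measurement-update step of the linearized LOF can be sketched as follows; the callbacks c, Cx, Cv and the array shapes are assumptions of this illustration, not the paper's notation.

```python
import numpy as np

def lof_correction(Lam, T, y, c, Cx, Cv, mv, R, n_prime):
    """One clock correction of (6.2) with the linearized functions (7.1):
    Z_k = Lam' + T' G^T F^+ [y - h]."""
    h = c(Lam, mv)                               # h_k ~ c_k(Lambda_k, m^v_k)
    G = Cx(Lam)                                  # G_k ~ dc_k/dx at the prediction
    Cv_k = Cv(Lam)                               # dc_k/dv at the prediction
    F = G @ T @ G.T + Cv_k @ R @ Cv_k.T          # F_k from (7.1)
    gain = T[:n_prime] @ G.T @ np.linalg.pinv(F) # T' G^T F^+ (Moore-Penrose)
    return Lam[:n_prime] + gain @ (y - h)        # the clock estimate Z_k
```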

Remark 3. The disadvantages of the two considered numerical-analytical approximations to the OSF, linearized (differentiability of the nonlinearities and a small deviation from the linearization point) and Gaussian (the difficulty of analytically obtaining the Gaussian moments of the nonlinearities and the possibility of their nonexistence), may be overcome by using purely numerical approximations occupying an intermediate position between them, such as the cubature [25] and unscented (sigma-point) [9, 26, 27] approximations.

8. COMPARISON OF THE NEW AND CLASSIC COVARIANCE FILTERS

To clarify the fundamental differences of the suboptimal LOF (6.2) and FMF [15] from the similar approximations to the continuously discrete AOF, we give the equations of the latter. For simplicity, we restrict ourselves to the problem of estimating a purely diffusion signal, assuming that the process Xt has no jumps, so that \({{X}_{{t - }}} = {{X}_{t}}\). Moreover, let the LOF estimate the entire vector Xt, i.e., \(n{\kern 1pt} ' = n\), so that its estimates \({{Z}_{k}}\) and \(Z_{k}^{i}\) can be denoted by \({{\hat {X}}_{{{{t}_{k}}}}}\) and \(\hat {X}_{{{{t}_{k}}}}^{i}\).

The linearized version of the AOF is well known as the extended Kalman filter (EKF), while its Gaussian version, the normal approximation filter (NAF), is less well known. Their equations also differ only in the form of the structural functions. Either of them builds the prediction between measurements \(t \in [{{t}_{k}},{{t}_{{k + 1}}})\) continuously, integrating an autonomous (closed) system of ordinary differential equations for the estimate \({{\hat {X}}_{t}} = {\rm M}[{{X}_{t}}{\kern 1pt} {\text{|}}{\kern 1pt} Y_{0}^{k}]\) and the matrix \({{P}_{t}} = \operatorname{cov} [{{X}_{t}}{\kern 1pt} {\text{|}}{\kern 1pt} Y_{0}^{k}]\) of the posterior covariances of the estimated vector [3, p. 354]:

$$\left\{ \begin{gathered} \tfrac{d}{{dt}}{{{\hat {X}}}_{t}} = \tau (t,{{{\hat {X}}}_{t}},{{P}_{t}}), \hfill \\ \tfrac{d}{{dt}}{{P}_{t}} = A(t,{{{\hat {X}}}_{t}},{{P}_{t}}){{P}_{t}} + {{P}_{t}}{{A}^{{\text{T}}}}(t,{{{\hat {X}}}_{t}},{{P}_{t}}) + \Theta (t,{{{\hat {X}}}_{t}},{{P}_{t}}), \hfill \\ \end{gathered} \right.\quad t \in [{{t}_{k}},{{t}_{{k + 1}}}),\quad k = 0,1, \ldots $$
(8.1)

Then, at each clock time point \({{t}_{k}}\), the final values of the solutions \({{\hat {X}}_{{t_{k}^{ - }}}}\) and \({{P}_{{t_{k}^{ - }}}}\) of system (8.1) on the previous interval \(t \in [{{t}_{{k - 1}}},{{t}_{k}})\) are adjusted by the measurement Yk into the initial conditions \({{\hat {X}}_{{{{t}_{k}}}}}\) and \({{P}_{{{{t}_{k}}}}}\) for the next interval \(t \in [{{t}_{k}},{{t}_{{k + 1}}})\) using the following formulas [3, p. 458]:

$$\left\{ \begin{gathered} {{{\hat {X}}}_{{{{t}_{k}}}}} = {{{\hat {X}}}_{{t_{k}^{ - }}}} + {{H}_{k}}[{{Y}_{k}} - {{h}_{k}}({{{\hat {X}}}_{{t_{k}^{ - }}}},{{P}_{{t_{k}^{ - }}}})], \hfill \\ {{P}_{{{{t}_{k}}}}} = {{P}_{{t_{k}^{ - }}}} - {{H}_{k}}{{G}_{k}}({{{\hat {X}}}_{{t_{k}^{ - }}}},{{P}_{{t_{k}^{ - }}}}){{P}_{{t_{k}^{ - }}}}. \hfill \\ \end{gathered} \right.\quad {{H}_{k}} = {{P}_{{t_{k}^{ - }}}}G_{k}^{{\text{T}}}({{\hat {X}}_{{t_{k}^{ - }}}},{{P}_{{t_{k}^{ - }}}})F_{k}^{ \oplus }({{\hat {X}}_{{t_{k}^{ - }}}},{{P}_{{t_{k}^{ - }}}}),\quad k = 0,1, \ldots $$
(8.2)

Here, the initial values are deterministic:

$${{\hat {X}}_{{t_{0}^{ - }}}} = m_{0}^{x},\quad {{P}_{{t_{0}^{ - }}}} = D_{0}^{x}.$$

Thus, in addition to the same correction functions \({{h}_{k}}( \cdot ),\,\,{{G}_{k}}( \cdot ),\) and \({{F}_{k}}( \cdot )\) as in the LOF, three prediction functions \(\tau \,( \cdot )\), \(A\,( \cdot )\), and \(\Theta \,( \cdot )\) are now also needed. In the Gaussian approximation, for the NAF, they are found, similarly to (6.1), as the statistical linearization characteristics of the drift and diffusion functions of the object's equation (1.1):

$$\begin{array}{*{20}{c}} {\tau (t,m,D) = {\text{M}}_{N}^{{m,D}}[a(t,X)],} \\ {\Theta (t,m,D) = {\text{M}}_{N}^{{m,D}}[B(t,X){{B}^{{\text{T}}}}(t,X)],} \end{array}\quad A(t,m,D) = \frac{{\partial \tau (t,m,D)}}{{\partial m}}$$
(8.3)

while, in the linearized approximation, for the EKF, as in (7.1), they have a simpler form:

$$\begin{array}{*{20}{c}} {\tau (t,m,D) = a(t,m),} \\ {\Theta (t,m,D) = B(t,m){{B}^{{\text{T}}}}(t,m),} \end{array}\quad A(t,m,D) = \frac{{\partial a(t,m)}}{{\partial m}}.$$
(8.4)

However, the system of differential equations (8.1), the second of which is a Riccati-type equation, has to be solved numerically with a sufficiently small step while ensuring the symmetry and positive definiteness of the covariance matrix, which is not at all simple [9, 10]. The n-dimensional estimate vector \({{\hat {X}}_{t}}\) and the distinct elements of the n × n covariance matrix \({{P}_{{{{t}_{k}}}}}\) form a rather large state vector of the filter, so that its order is equal to \(n(n + 3){\text{/}}2\). Moreover, here it is not possible to economize by estimating only part of the state variables.
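For comparison, a sketch of one EKF cycle according to (8.1), (8.2), and (8.4): the moment equations are integrated by plain Euler steps, and the correction assumes an additive meter; all callbacks and shapes are placeholders of this illustration.

```python
import numpy as np

def ekf_predict(xh, P, a, A, BBt, t0, t1, dt):
    """Euler integration of the moment equations (8.1) with functions (8.4):
    d xh/dt = a, dP/dt = A P + P A^T + B B^T."""
    n_steps = int(round((t1 - t0) / dt))
    for i in range(n_steps):
        t = t0 + i * dt
        Ak = A(t, xh)
        P = P + dt * (Ak @ P + P @ Ak.T + BBt(t, xh))
        xh = xh + dt * a(t, xh)
        P = 0.5 * (P + P.T)                      # keep the covariance symmetric
    return xh, P

def ekf_correct(xh, P, y, c, Cx, R):
    """Clock correction (8.2), assuming an additive meter Y = c(X) + V, cov V = R."""
    G = Cx(xh)                                   # Jacobian of c at the prediction
    H = P @ G.T @ np.linalg.pinv(G @ P @ G.T + R)
    return xh + H @ (y - c(xh)), P - H @ G @ P
```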

Comparing the equations of the new filter (6.2) with the well-known (8.1) and (8.2), we note the following fundamental advantages of approximate OSFs.

1. Clock estimates of suboptimal filters are calculated using the same (common to them) correction function

$${{\tilde {f}}_{k}}(\xi ,\Psi ,y) = \xi + \Psi G_{k}^{{\text{T}}}(\xi ,\Psi )\,F_{k}^{ \oplus }(\xi ,\Psi )\left[ {y - {{h}_{k}}(\xi ,\Psi )} \right],$$

but with respect to different values of its arguments:

$$\hat {X}_{{{{t}_{k}}}}^{{{\text{AOF}}}} = {{\tilde {f}}_{k}}({{\hat {X}}_{{t_{k}^{ - }}}},{{P}_{{t_{k}^{ - }}}},{{Y}_{k}}),\quad \hat {X}_{{{{t}_{k}}}}^{{{\text{LOF}}}} = {{\tilde {f}}_{k}}({{\Lambda }_{k}},{{T}_{k}},{{Y}_{k}}).$$

Here, the clock AOF prediction \({{\hat {X}}_{{t_{k}^{ - }}}}\) obtained by integrating the system of Eqs. (8.1) is replaced by the OSF prediction vector Λk, which is easily found from the state vector of this filter \({{U}_{{k - 1}}}~ = \hat {X}_{{{{t}_{{\max (0,k - l)}}}}}^{{{{t}_{{k - 1}}}}}\) as Λk = \({{\Gamma }_{k}}{{U}_{{k - 1}}} + {{\kappa }_{k}}\) using the precalculated parameters Γk and \({{\kappa }_{k}}\), and is thus unbiased. Besides, the clock matrix \({{P}_{{t_{k}^{ - }}}} = {\text{cov}}[{{X}_{{t_{k}^{ - }}}}{\kern 1pt} {\text{|}}{\kern 1pt} Y_{0}^{{k - 1}}]\) obtained from the same system (8.1) is replaced in the OSF by the precomputed deterministic matrix \({{T}_{k}} = {\text{cov}}[{{X}_{{t_{k}^{ - }}}} - {{\Lambda }_{k}}]\), so that it is not necessary to solve a complex Riccati-type equation either.

2. Over the time interval between measurements, the AOF prediction is constructed by solving a complex system of differential equations (8.1), while the LOF prediction is determined by a simple extrapolation (2.4) or (2.7) of the clock \({{\hat {X}}_{{{{t}_{k}}}}}\) or inter-cycle \(\hat {X}_{{{{t}_{k}}}}^{i}\) estimates. The latter are calculated as \({\hat {X}_{{{{t}_{k}}}}^{i} = \Gamma _{k}^{i}{{U}_{k}} + \kappa _{k}^{i}}\) and are unbiased.

The performed analysis allows drawing the following conclusion.

Statement (on comparing approximations to the LOF and AOF). The approximate OSFs do not need to integrate the system of equations (8.1) of the approximate AOFs for the prediction vector and its covariance matrix and, therefore, do not require finding the three prediction functions according to (8.3) or (8.4). Instead, it is necessary to find in advance, from (6.3), and then use the clock parameters \({{\Gamma }_{k}},{{\kappa }_{k}},{{T}_{k}}\) and the inter-cycle parameters \(\Gamma _{k}^{i},\kappa _{k}^{i}\), which compensate for the lack of information about the covariance Pt of the estimation error.

9. AN EXAMPLE OF THE NUMERICAL COMPARISON OF SUBOPTIMAL FILTERS

The calculations were performed by a student, A.A. Rick, under the supervision of the author.

To analyze the accuracy of the approximations to the AOF and the OSF, we consider the problem of estimating the two-dimensional state vector of a stochastic version of the nonlinear Van der Pol oscillator. The corresponding system of Ito equations (1.1) is given here in the nonrigorous Langevin form:

$$\left\{ \begin{gathered} {{{\dot {X}}}_{{1,t}}} = {{X}_{{2,t}}},\quad \omega = 0.1\pi ,\quad \alpha = 2,\quad \beta = 1,\quad {{X}_{{1,0}}} \sim N({{x}_{1}}||2,0.15), \hfill \\ {{{\dot {X}}}_{{2,t}}} = - {{\omega }^{2}}{{X}_{{1,t}}} + \alpha {{X}_{{2,t}}}(1 - \beta X_{{1,t}}^{2}) + {{X}_{{1,t}}}{{{\dot {W}}}_{t}},\quad {{X}_{{2,0}}} \sim N({{x}_{2}}||0,0.15), \hfill \\ \end{gathered} \right.$$

where \({{\dot {W}}_{t}}\) is Gaussian white noise. Figure 1 shows the phase plane (\({{X}_{{1,t}}}\), \({{X}_{{2,t}}}\)) with two limit cycles of this oscillator, perturbed (at \({{\dot {W}}_{t}} \ne 0\)) and unperturbed (at \({{\dot {W}}_{t}} = 0\)), which form the corresponding hysteresis loops.

Fig. 1. Limit cycles of the Van der Pol oscillator on the phase plane (\({{X}_{{1,t}}}\), \({{X}_{{2,t}}}\)).

Let both state variables of this oscillator be measured with additive errors, so that the discrete meter (1.2) is two-dimensional and nonlinear in X1,t:

$$\left\{ \begin{gathered} {{Y}_{{1,k}}} = X_{{1,{{t}_{k}}}}^{2} + {{V}_{{1,k}}},\quad {{V}_{{1,k}}} \sim N({{v}_{1}}||0,1), \hfill \\ {{Y}_{{2,k}}} = {{X}_{{2,{{t}_{k}}}}} + {{V}_{{2,k}}},\quad {{V}_{{2,k}}} \sim N({{v}_{2}}||0,1). \hfill \\ \end{gathered} \right.$$

The accuracy of the compared filters was determined by the Monte Carlo method with averaging over a sample of 800 realizations. Statistical modeling of the differential equations of the oscillator and of the filter with a continuous prediction was carried out using a Runge–Kutta-type method on the interval [0, 3] with an integration step of Δtint = 0.01. The time interval between measurements was taken to be Δtmeas = 0.5, and between piecewise constant discrete predictions Δtprediction = 0.1. The structural functions of the linearized and Gaussian filters for this example were found in [15].
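For reproducibility, a minimal Euler–Maruyama sketch of this test model (oscillator plus meter) under the stated parameters may look as follows; the interpretation of the second parameter of \(N( \cdot \parallel m,D)\) as a variance and the random seed are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
omega, alpha, beta = 0.1 * np.pi, 2.0, 1.0
dt, t_meas, T = 0.01, 0.5, 3.0                   # steps as in the experiment
meas_every = int(round(t_meas / dt))             # 50 integration steps per cycle

x = np.array([rng.normal(2.0, np.sqrt(0.15)),    # X_{1,0} ~ N(2, 0.15)
              rng.normal(0.0, np.sqrt(0.15))])   # X_{2,0} ~ N(0, 0.15)
measurements = []
for i in range(int(round(T / dt))):
    if i % meas_every == 0:                      # clock time point t_k
        v = rng.normal(0.0, 1.0, size=2)         # unit-variance meter noise
        measurements.append((i * dt, np.array([x[0]**2 + v[0], x[1] + v[1]])))
    drift = np.array([x[1],
                      -omega**2 * x[0] + alpha * x[1] * (1.0 - beta * x[0]**2)])
    dw = rng.normal(0.0, np.sqrt(dt))            # Wiener increment
    x = x + dt * drift + np.array([0.0, x[0]]) * dw  # noise enters X_2 with gain X_1
```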

Figure 2 shows the plots of the standard deviations (RMS) Se1 of the errors in estimating the variable X1,t by several linearized (L) continuously discrete filters with various types of predictions. Four filters are compared:

Fig. 2. Time-dependent standard deviations of errors in estimating the variable \({{X}_{{1,t}}}\) of the Van der Pol oscillator by various linearized filters: (1) L-AOF, (2) L-SOF with a continuous prediction, (3) L-LOF of multiplicity l = 1 with piecewise constant predictions, and (4) L-FMF of multiplicity l = 4 with piecewise constant predictions.

1. L-AOF (EKF) (8.1), (8.2), and (8.4), the prediction of which is necessarily continuous;

2. L-SOF with a continuous prediction [14];

3. L-LOF of multiplicity l = 1 with piecewise constant predictions (2.7) at \(L = 4\);

4. L-FMF of multiplicity l = 4 [15] with the same piecewise constant predictions.

As can be seen in Fig. 2, the accuracy of the L-AOF, whose order in this example is \({p} = 2 \times (2 + 3){\text{/}}2 = 5\), increasingly loses over time both to the order p = 2 filters (the L-SOF and the L-LOF) and, moreover, to the 4-fold L-FMF of order \({p} = 4 \times 2 = 8\).

The implementation time of the new filters is also better in comparison with the classic L-AOF:

—for the L-SOF with a continuous prediction, it is smaller by a factor of 1.6, since, unlike the L-AOF, this filter does not integrate the three differential equations for the covariances;

—for the L-LOF and L-FMF with piecewise constant predictions, it is smaller even by a factor of 2.8, because they do not integrate any differential equations at all, even for the prediction, while the large order of the L-FMF barely affected the estimation rate.

Thus, this example both confirms the conclusion formulated in Section 8 about the computational advantages of the linearized LOF and FMF and even demonstrates some superiority of theirs in accuracy over the EKF.

CONCLUSIONS

A synthesis method is proposed for a fast continuously discrete nonlinear filter with infinite memory that produces estimates only at discrete time points, from which a prediction of the desired type can be constructed. This filter can be implemented in real time on a low-performance computer and has the highest accuracy in its class of simple filters, which remember the last few clock estimates of the information part of the state vector of a continuous object that is also subject to pulsed impacts.

Formulas for the optimal output functions of the new filter are obtained. For the corresponding probability densities, a chain of integro-differential equations and clock conversion formulas for its solution are found. A method for calculating the filter's structural functions by the Monte Carlo method is described. A Gaussian approximation to the proposed filter and its linearized simplification are also constructed. A comparative analysis of these approximations with similar ones for the classical filter is performed. The accuracy and estimation rate of the linearized filter are illustrated by a two-dimensional example, which confirms the theoretical conclusions.