INTRODUCTION

The resolution of digital images is determined by the characteristics of the systems that form and record them and is further limited by the constraints of transmission over communication channels. Many information processing systems require high-resolution (HR) images that provide the necessary level of scene detail but cannot be obtained in hardware, primarily because of the limited capabilities of the recording and data transmission facilities. In this regard, a general approach to the construction of multiframe superresolution (SR) algorithms is known [1–12], which reconstructs an HR image by accumulating a sequence of low-resolution (LR) images. Under this approach, an HR image is reconstructed from an observed sequence of LR images of the same scene, provided there are fractional pixel displacements (not multiples of one LR pixel) between them.

Another important factor determining the quality of the recorded images is that the resulting graphic materials are often exposed not only to additive noise but also to so-called applicative noise (AN): the shadowing of objects, damaged areas in the images, and anomalous observations. The impact of the latter produces extended areas of anomalous observations in each original image; this can be regarded as an additional factor reducing the resolution, distinguished by the irregular placement of areas of low or zero resolution.

Currently, there are several different algorithms for constructing SR [1, 2], but only some of them attempt to compensate for the losses caused by applicative distortions while simultaneously increasing the resolution. Some of these algorithms [3, 4] combat the effect of AN by inpainting the affected areas of the LR images, while others [5–12] are based on the accumulation and processing of a sequence of LR images. The latter include the previous publications of the authors of this article [8–12], which rely on methods and algorithms for the optimal Kalman-type filtering of an image sequence in combination with machine learning algorithms that identify and localize the areas affected by AN.

At the same time, the known works pay insufficient attention to solving this problem under conditions of statistical uncertainty in the parameters of the mathematical observation model, for example, the parameters of the interframe shifts [11] and the variance of the blur of the imaging system [12]. The problem stated under these conditions can be viewed as the synthesis of optimal filtering algorithms for an image sequence that are adaptive to the parameters of the models used. In addition, the known works provide no quantitative data or comparisons of the known approaches and algorithms that combine optimal filtering methods with machine learning algorithms to solve the complete problem of increasing the image resolution under AN.

The aim of this paper is to study algorithms for constructing multiframe SR under AN in an adaptive setting and to compare them with the known algorithms that are used, or are potentially applicable, for increasing the resulting resolution of a sequence of images.

1. ANALYSIS OF KNOWN ALGORITHMS

1.1. Algorithms Based on the Spin Glass Model (ABSGM)

The algorithms presented in [5, 6] are based on the spin glass model [13]; this model describes the noise acting on the images rather than the images themselves. Hereinafter, images are treated as vectors obtained by the progressive scanning of two-dimensional pixel arrays. Let {\({{{\mathbf{y}}}^{{(t)}}}\)} be a sequence of the initial LR images obtained at the moments of time \(t = \overline {1,T} \) from which we want to reconstruct the HR image x, and let \({{{\mathbf{z}}}^{{(t)}}} \in {{\{ - 1, + 1\} }^{M}}\) be the hidden vector indicating which pixels of \({{{\mathbf{y}}}^{{(t)}}}\) are affected by noise (–1 indicates the presence of noise and +1 indicates its absence). We present the probabilistic models for \({{{\mathbf{y}}}^{{(t)}}}\), x, and \({{{\mathbf{z}}}^{{(t)}}}\) used in [5, 6].

The prior distribution of the hidden variables \({{{\mathbf{z}}}^{{(t)}}}\) is described by the Boltzmann distribution:

$$p({{{\mathbf{z}}}^{{(t)}}}) = \frac{1}{Z}\exp ( - E({{{\mathbf{z}}}^{{(t)}}})).$$
(1.1)

Energy \(E({{{\mathbf{z}}}^{{(t)}}})\) is set as follows:

$$E({{{\mathbf{z}}}^{{(t)}}}) = - {{J}_{{{\text{self}}}}}\sum\limits_i {z_{i}^{{(t)}} - {{J}_{{{\text{inner}}}}}\sum\limits_{i\sim j} {z_{i}^{{(t)}}z_{j}^{{(t)}}} } ,$$
(1.2)

where i ~ j means that the ith and jth pixels are adjacent, and the constants \({{J}_{{{\text{self}}}}}\) and \({{J}_{{{\text{inner}}}}}\) are set based on the properties of the noise: the self-coupling coefficient \({{J}_{{{\text{self}}}}}\) characterizes the tendency of local areas of closure (LAC) of AN to propagate, and the internal-coupling coefficient \({{J}_{{{\text{inner}}}}}\) characterizes the degree of correlation between adjacent pixels of the LAC. It is assumed that noise usually occupies the smaller part of the image; hence, \({{J}_{{{\text{self}}}}} > 0\). Since AN affects entire areas of the image rather than individual pixels, \({{J}_{{{\text{inner}}}}} > 0\). Since the elements of z(t) take only the two values ±1 (the Ising model), expressions (1.1) and (1.2) are analogs of the spin glass model.
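The prior (1.1), (1.2) is easy to probe numerically. The sketch below is our own illustration (the positive values chosen for \({{J}_{{{\text{self}}}}}\) and \({{J}_{{{\text{inner}}}}}\) are arbitrary, not taken from [5, 6]); it evaluates the Ising energy of a 4-connected binary field and confirms that a noise-free configuration has lower energy, i.e., higher prior probability under (1.1), than one containing an LAC:

```python
import numpy as np

def ising_energy(z, J_self=0.5, J_inner=1.0):
    """Energy (1.2) for a 2-D field z with entries in {-1, +1}.
    J_self and J_inner are illustrative values; the model only
    requires both to be positive."""
    self_term = -J_self * z.sum()
    # 4-connected neighbor pairs i ~ j: horizontal and vertical couplings
    inner_term = -J_inner * ((z[:, :-1] * z[:, 1:]).sum()
                             + (z[:-1, :] * z[1:, :]).sum())
    return self_term + inner_term

# A noise-free field (all +1) has lower energy, i.e., higher prior
# probability under (1.1), than a field with an occluded patch.
z_clean = np.ones((4, 4), dtype=int)
z_noisy = z_clean.copy()
z_noisy[1:3, 1:3] = -1            # a 2x2 LAC of applicative noise
assert ising_energy(z_clean) < ising_energy(z_noisy)
```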

The conditional distribution of the observations \({{{\mathbf{y}}}^{{(t)}}}\) is described by the Gaussian law

$$p({{{\mathbf{y}}}^{{(t)}}}\,{\text{|}}\,{\mathbf{x}},{{{\mathbf{z}}}^{{(t)}}}) = N({{{\mathbf{y}}}^{{(t)}}}\,{\text{|}}\,{{{\mathbf{W}}}^{{(t)}}}{\mathbf{x}},{{{\mathbf{B}}}^{{ - 1}}}({{{\mathbf{z}}}^{{(t)}}})),$$
(1.3)

where \({{{\mathbf{W}}}^{{(t)}}}\) is the operator characterizing the effect of the system for forming the observed images and \({\mathbf{B}}({{{\mathbf{z}}}^{{(t)}}})\) is the matrix inverse to the diagonal covariance matrix, the diagonal elements of which are specified as follows:

$$\beta (z_{i}^{{(t)}}) = \left\{ {\begin{array}{*{20}{c}} {{{\beta }_{H}},\quad z_{i}^{{(t)}} = + 1,} \\ {{{\beta }_{L}},\quad z_{i}^{{(t)}} = - 1.} \end{array}} \right.$$
(1.4)

Here \({{\beta }_{H}} > {{\beta }_{L}}\), because \(z_{i}^{{(t)}} = + 1\) indicates a more reliable observation and \(z_{i}^{{(t)}} = - 1\) a less reliable one. The a priori distribution of x is also described by the Gaussian law

$$p({\mathbf{x}}) = N({\mathbf{x}},0,{{(\rho {\mathbf{A}})}^{{ - 1}}}),$$
(1.5)

where ρ is the accuracy coefficient. In (1.5) A is a precision matrix with a smoothing effect:

$${{{\mathbf{A}}}_{{ij}}} = \left\{ {\begin{array}{*{20}{l}} {{\text{|}}{\rm N}(i){\text{|}},\quad i = j,} \\ { - 1,\quad i\sim j,} \\ {0\quad {\text{otherwise}}{\text{,}}} \end{array}} \right.$$
(1.6)

where \({\rm N}(i) = \{ j\,{\text{|}}\,i\sim j\} \) is the set of neighbors of the ith pixel of the image and \({\text{|}}{\rm N}(i){\text{|}}\) is the number of these neighbors.
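As an illustration of (1.6), the matrix A for a 4-connected pixel grid can be assembled as a sparse graph Laplacian. The following sketch is our own construction under the conventional 4-neighborhood assumption:

```python
import numpy as np
import scipy.sparse as sp

def smoothing_precision(n1, n2):
    """Precision matrix A of (1.6) for an n1 x n2 image with a
    4-connected neighborhood: A_ii = |N(i)| and A_ij = -1 for i ~ j.
    This is exactly the graph Laplacian of the pixel grid."""
    n = n1 * n2
    idx = np.arange(n).reshape(n1, n2)
    pairs = np.vstack([
        np.column_stack([idx[:, :-1].ravel(), idx[:, 1:].ravel()]),  # horizontal i ~ j
        np.column_stack([idx[:-1, :].ravel(), idx[1:, :].ravel()]),  # vertical i ~ j
    ])
    rows = np.concatenate([pairs[:, 0], pairs[:, 1]])
    cols = np.concatenate([pairs[:, 1], pairs[:, 0]])
    off = sp.coo_matrix((-np.ones(len(rows)), (rows, cols)), shape=(n, n))
    deg = -np.asarray(off.sum(axis=1)).ravel()        # |N(i)| per pixel
    return (off + sp.diags(deg)).tocsr()

A = smoothing_precision(3, 3)
assert A[4, 4] == 4                      # interior pixel: 4 neighbors
assert A[0, 0] == 2                      # corner pixel: 2 neighbors
assert np.allclose(A.sum(axis=1), 0)     # rows of a Laplacian sum to 0
```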

The HR image x can be defined as the mathematical expectation of the a posteriori distribution \(p({\mathbf{x}}\,{\text{|}}\,{\mathbf{y}})\); however, this cannot be done in explicit analytical form because of the hidden variables. Therefore, in [13], it is proposed to use the method of variational Bayesian inference, in which the a posteriori distribution over the set of unobservable variables is approximated by another distribution, called the variational distribution, \(p({\mathbf{z}}\,{\text{|}}\,{\mathbf{x}}) \approx q({\mathbf{z}})\), chosen so that the Kullback–Leibler divergence between the two distributions is minimal.

In this case, the solution for x is the expectation \({\boldsymbol{\mu }_{{\mathbf{x}}}}\) of the Gaussian distribution:

$$q\text{*}({\mathbf{x}}) = N({\mathbf{x}},{\boldsymbol{\mu }_{{\mathbf{x}}}},{\boldsymbol{\Sigma }_{{\mathbf{x}}}}),\quad {\boldsymbol{\mu }_{{\mathbf{x}}}} = {\boldsymbol{\Sigma }_{{\mathbf{x}}}}\sum\limits_{t = 1}^T {{{{\mathbf{W}}}^{{(t){\text{T}}}}}\langle B({{{\mathbf{z}}}^{{(t)}}})\rangle {{{\mathbf{y}}}^{{(t)}}}} ,\quad {\boldsymbol{\Sigma }_{{\mathbf{x}}}} = {{\left( {\rho {\mathbf{A}} + \sum\limits_{t = 1}^T {{{{\mathbf{W}}}^{{(t){\text{T}}}}}\langle B({{{\mathbf{z}}}^{{(t)}}})\rangle {{{\mathbf{W}}}^{{(t)}}}} } \right)}^{{ - 1}}},$$
(1.7)

where \(\left\langle {...} \right\rangle \) is the expectation operator, and the diagonal elements of the expected inverse covariance matrix \(\langle B({{{\mathbf{z}}}^{{(t)}}})\rangle \) are set as follows:

$$\langle \beta (z_{i}^{{(t)}})\rangle = q(z_{i}^{{(t)}} = 1){{\beta }_{H}} + q(z_{i}^{{(t)}} = - 1){{\beta }_{L}}.$$
(1.8)

Due to the large dimension of \({{{\mathbf{\Sigma }}}_{{\mathbf{x}}}}\), it is difficult to obtain this matrix directly. However, it follows from (1.7) that

$${\mathbf{S}}{\boldsymbol{\mu }_{{\mathbf{x}}}} = {\mathbf{b}},\quad {\mathbf{S}} = \boldsymbol{\Sigma} _{{\mathbf{x}}}^{{ - 1}} = \rho {\mathbf{A}} + \sum\limits_{t = 1}^T {{{{\mathbf{W}}}^{{(t){\text{T}}}}}\langle B({{{\mathbf{z}}}^{{(t)}}})\rangle {{{\mathbf{W}}}^{{(t)}}}} ,\quad {\mathbf{b}} = \sum\limits_{t = 1}^T {{{{\mathbf{W}}}^{{(t){\text{T}}}}}\langle B({{{\mathbf{z}}}^{{(t)}}})\rangle {{{\mathbf{y}}}^{{(t)}}}} .$$
(1.9)

Thus, to find \({\boldsymbol{\mu }_{{\mathbf{x}}}}\), it suffices to solve the linear equation (1.9), taking into account the strong sparseness of the matrix S. For this, it is proposed to use the conjugate gradient method [14].
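A minimal numerical sketch of this step follows; the sparse positive definite matrix below is merely our stand-in for \(\rho {\mathbf{A}} + \sum\nolimits_t {{{\mathbf{W}}}^{{(t){\text{T}}}}}\langle B\rangle {{{\mathbf{W}}}^{{(t)}}}\), and the point is that conjugate gradients solve (1.9) without ever forming the dense inverse:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg

# Toy instance of (1.9): S is sparse and positive definite, so the
# conjugate gradient method recovers mu_x without ever forming the
# dense matrix Sigma_x = S^{-1}.
rng = np.random.default_rng(0)
n = 100
L = (sp.diags([2.0] * n)
     - sp.diags([1.0] * (n - 1), 1)
     - sp.diags([1.0] * (n - 1), -1))   # sparse second-difference matrix
S = L + sp.identity(n)                  # shift makes it positive definite
b = rng.standard_normal(n)

mu, info = cg(S, b)
assert info == 0                                          # CG converged
assert np.linalg.norm(S @ mu - b) < 1e-3 * np.linalg.norm(b)
```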

The optimal variational distribution for the values \(z_{i}^{{(t)}}\) is described by the Bernoulli distribution:

$$\begin{gathered} q\text{*}(z_{i}^{{(t)}}) = Ber(z_{i}^{{(t)}}\,{\text{|}}\,{{v}_{{ti}}}) = v_{{ti}}^{{\frac{1}{2}(1 + z_{i}^{{(t)}})}}{{(1 - {{v}_{{ti}}})}^{{\frac{1}{2}(1 - z_{i}^{{(t)}})}}},\quad {{v}_{{ti}}} = \text{sig} 2{{\lambda }_{{ti}}} = \frac{1}{{1 + \exp ( - 2{{\lambda }_{{ti}}})}}, \\ {{\lambda }_{{ti}}} = {{J}_{i}} + \sum\limits_{j \in {\rm N}(i)} {{{J}_{{ij}}}\left\langle {{{z}_{{tj}}}} \right\rangle + \frac{1}{4}\left( {\ln \frac{{{{\beta }_{H}}}}{{{{\beta }_{L}}}} - ({{\beta }_{H}} - {{\beta }_{L}})\langle e_{{ti}}^{2}\rangle } \right)} , \\ \end{gathered} $$
(1.10)

where sig is the sigmoid function and \({{e}_{{ti}}} = {{{\mathbf{y}}}_{{ti}}} - {{[{{{\mathbf{W}}}_{t}}{\mathbf{x}}]}_{i}}\) is the error in the ith pixel of the tth observation (LR image), whose expected value can be calculated approximately as

$$\left\langle {e_{{ti}}^{2}} \right\rangle \approx {{({{{\mathbf{y}}}_{{ti}}} - {\mathbf{w}}_{{ti}}^{{\text{T}}}{\boldsymbol{\mu }_{{\mathbf{x}}}})}^{2}},$$
(1.11)

where \({{{\mathbf{w}}}_{{ti}}}\) is the ith row of the matrix Wt.

Taking into account the relations described above, the algorithm for obtaining an HR image takes the following form.

1. \(l \leftarrow 0\).

2. \(l \leftarrow l + 1\).

3. Find \(\boldsymbol{\mu} _{{\mathbf{x}}}^{{(l)}}\) (\(\boldsymbol{\mu} _{{\mathbf{x}}}^{{}}\) at step l) according to (1.7)–(1.9) using the conjugate gradient method.

4. Recalculate \(v_{{ti}}^{{(l)}}\) (the coefficients \(v_{{ti}}^{{}}\) at step l) according to (1.10).

5. Repeat 2–4 until the following condition is met: \({\text{||}}\boldsymbol{\mu} _{{\mathbf{x}}}^{{(l)}} - \boldsymbol{\mu} _{{\mathbf{x}}}^{{(l - 1)}}{\text{||/||}}\boldsymbol{\mu} _{{\mathbf{x}}}^{{(l - 1)}}{\text{||}} < \varepsilon \).

6. \({\mathbf{\tilde {x}}} \leftarrow \boldsymbol{\mu} _{{\mathbf{x}}}^{{(l)}}\).
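Steps 1–6 can be sketched as follows. This is our schematic one-dimensional illustration with \({{{\mathbf{W}}}^{{(t)}}} = {\mathbf{I}}\) (no decimation) and with the \({{J}_{{{\text{inner}}}}}\) coupling term of (1.10) omitted for brevity; all numeric constants are illustrative, not taken from [5, 6]:

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg

# Schematic 1-D, multi-frame version of steps 1-6 with W^(t) = I (no
# decimation) and the J_inner neighbor coupling of (1.10) omitted.
rng = np.random.default_rng(1)
n, T = 64, 3
x_true = np.linspace(0.0, 1.0, n)
ys = [x_true + 0.05 * rng.standard_normal(n) for _ in range(T)]
ys[0][20:30] += 3.0                      # AN patch in frame 0 only
beta_H, beta_L, J_self, rho = 400.0, 1e-4, 0.5, 1.0

# rho*A prior of (1.5)-(1.6): 1-D second-difference precision matrix
L = (diags([2.0] * n) - diags([1.0] * (n - 1), 1)
     - diags([1.0] * (n - 1), -1))
v = [np.full(n, 0.9) for _ in range(T)]  # v_ti = q(z_i^(t) = +1)
for _ in range(20):
    betas = [vi * beta_H + (1 - vi) * beta_L for vi in v]       # (1.8)
    S = rho * L + diags(sum(betas))                             # (1.9)
    mu, _ = cg(S, sum(b * y for b, y in zip(betas, ys)))        # step 3
    for t in range(T):                                          # step 4
        e2 = (ys[t] - mu) ** 2                                  # (1.11)
        lam = J_self + 0.25 * (np.log(beta_H / beta_L)
                               - (beta_H - beta_L) * e2)        # (1.10)
        v[t] = 1.0 / (1.0 + np.exp(-2.0 * np.clip(lam, -50, 50)))

assert v[0][20:30].mean() < 0.2   # occluded pixels of frame 0 flagged
assert v[1][20:30].mean() > 0.8   # same pixels in the clean frames kept
assert np.abs(mu - x_true).max() < 0.2
```

Because the AN patch appears in only one frame, the clean frames dominate the estimate \({\boldsymbol{\mu }_{{\mathbf{x}}}}\), the residual (1.11) in the occluded pixels stays large, and the corresponding \({{v}_{{ti}}}\) converge to zero.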

1.2. ABSGM Taking into Account the Inertial Motion of the AN (ABSGMIM)

The authors of [6] supplement the approach given above, proceeding from the assumption that the areas affected by AN do not appear at random places in the LR images but move inertially between frames. For this purpose, the vector \({\boldsymbol{\theta }^{{(t)}}} = {{[\theta _{1}^{{(t)}},\theta _{2}^{{(t)}}]}^{{\text{T}}}}\) describing the displacement of the AN between observations t and t + 1 in Cartesian coordinates is introduced into the model. With this in mind, the energy function E is defined as

$$E({{{\mathbf{z}}}^{{(t)}}},{{{\mathbf{z}}}^{{(t + 1)}}},{\boldsymbol{\theta }^{{(t)}}},{\mathbf{J}}) = - {{J}_{{{\text{self}}}}}\sum\limits_i {z_{i}^{{(t + 1)}} - {{J}_{{{\text{inner}}}}}\sum\limits_{i\sim j} {z_{i}^{{(t + 1)}}z_{j}^{{(t + 1)}}} - {{J}_{{{\text{move}}}}}{{{\mathbf{z}}}^{{(t + 1){\text{T}}}}}G({\boldsymbol{\theta }^{{(t)}}}){{{\mathbf{z}}}^{{(t)}}}} ,$$
(1.12)

where \(G({\boldsymbol{\theta }^{{(t)}}})\) is the shift matrix, \(G({\boldsymbol{\theta }^{{(t)}}}){{{\mathbf{z}}}^{{(t)}}}\) is the predicted position of the AN in frame t + 1, and the constant \({{J}_{{{\text{move}}}}} > 0\) describes the degree of similarity of the AN areas between adjacent frames.

Here, the a priori distribution of \(\boldsymbol{\theta}\) is set based on the fact that the displacement process is Markovian and the LAC will most likely keep its direction of movement:

$$p(\boldsymbol{\theta} ) = p({{\boldsymbol{\theta} }^{{(1)}}})\prod\limits_{t = 1}^{T - 1} {N({{\boldsymbol{\theta} }^{{(t + 1)}}}\,{\text{|}}\,{\boldsymbol{\theta }^{{(t)}}},{{{(r{\mathbf{I}})}}^{{ - 1}}})} ,$$
(1.13)

where r is the accuracy coefficient and I is the identity matrix.

Using the Laplace method, the density of the variational distribution of θ is described as

$$q\text{*}(\boldsymbol{\theta} ) = N(\boldsymbol{\theta} \,{\text{|}}\,{\boldsymbol{\mu }_{\boldsymbol{\theta} }},{\boldsymbol{\Sigma }_{\boldsymbol{\theta} }}),$$
(1.14)

where \({\boldsymbol{\Sigma }_{\boldsymbol{\theta} }} = {{\left\langle {{\mathbf{\bar {H}}}} \right\rangle }^{{ - 1}}}\) and \(\left\langle {{\mathbf{\bar {H}}}} \right\rangle \) is the Hessian of \({{\langle {\text{ln}}q {\text{*}}(\boldsymbol{\theta })\rangle }_{{q({\mathbf{z}})}}}\) evaluated at \({\boldsymbol{\mu }_{\boldsymbol{\theta }}}\).

Taking into account these considerations, the values \({{\lambda }_{{ti}}}\) are calculated as follows:

$$\begin{gathered} {{\lambda }_{{ti}}} = {{J}_{{{\text{self}}}}} + {{J}_{{{\text{inner}}}}}\sum\limits_{j \in {\rm N}(i)} {\langle z_{j}^{{(t)}}\rangle + {{J}_{{{\text{move}}}}}{{{[{{{\bar {G}}}^{{(t - 1)}}}\langle {{{\mathbf{z}}}^{{(t - 1)}}}\rangle + \bar {G}_{{}}^{{(t){\text{T}}}}\langle {{{\mathbf{z}}}^{{(t + 1)}}}\rangle ]}}_{i}}} \\ \, + \frac{1}{2}{{J}_{{{\text{move}}}}}\sum\limits_{j{\text{,}}l{\text{,}}k} {[\bar {G}_{{{{\theta }_{k}}{{\theta }_{l}}ij}}^{{(t - 1)}}\langle z_{j}^{{(t - 1)}}\rangle {\boldsymbol{\Sigma }_{{\boldsymbol{\theta} kl}}} + \bar {G}_{{{{\theta }_{k}}{{\theta }_{l}}ji}}^{{(t)}}\langle z_{j}^{{(t + 1)}}\rangle {\boldsymbol{\Sigma }_{{\boldsymbol{\theta} kl}}}]} + \frac{1}{4}\left( {\ln \frac{{{{\beta }_{H}}}}{{{{\beta }_{L}}}} - ({{\beta }_{H}} - {{\beta }_{L}})\langle e_{{ti}}^{2}\rangle } \right), \\ \end{gathered} $$
(1.15)

where \(\bar {G}_{{{{\theta }_{k}}}}^{{(t)}} = {{\partial G(\boldsymbol{\mu} _{\boldsymbol{\theta }}^{{(t)}})} \mathord{\left/ {\vphantom {{\partial G(\boldsymbol{\mu} _{\boldsymbol{\theta }}^{{(t)}})} {\partial \theta _{k}^{{(t)}}}}} \right. \kern-0em} {\partial \theta _{k}^{{(t)}}}}\) and \(\bar {G}_{{{{\theta }_{l}}{{\theta }_{k}}}}^{{(t)}} = {{{{\partial }^{2}}G(\mu _{\boldsymbol{\theta }}^{{(t)}})} \mathord{\left/ {\vphantom {{{{\partial }^{2}}G(\boldsymbol{\mu} _{\boldsymbol{\theta }}^{{(t)}})} {\partial \theta _{l}^{{(t)}}\partial \theta _{k}^{{(t)}}}}} \right. \kern-0em} {\partial \theta _{l}^{{(t)}}\partial \theta _{k}^{{(t)}}}}\).

1.3. Algorithms Based on Models of Markov Random Fields (ABMMRFs)

The approach presented in [6] is based on the Markov random field model [15]. The observation model is described by the following relation:

$${{{\mathbf{y}}}_{t}} = {{{\mathbf{O}}}_{t}}{\mathbf{D}}{{{\mathbf{H}}}_{t}}{\mathbf{x}} + {\boldsymbol{\omega }_{t}},$$
(1.16)

where the operator Ot removes the pixels affected by AN, the operator D performs the decimation of the HR image, the operator Ht characterizes the effect of the system forming the observed images, and \({\boldsymbol{\omega }_{t}}\) is Gaussian noise. It is assumed that these operators are either known or can be determined. In this case, estimating Ot involves the independent segmentation of each LR image yt into useful and false observations, which is discussed below.

The HR image x is considered an inhomogeneous discontinuity-adaptive Markov random field (the DAMRF model) [15], which preserves the inhomogeneities and details of the HR image x. The joint distribution density of x is set as follows:

$$p({\mathbf{x}}) = \frac{1}{Z}\exp \left( { - \sum\limits_{c \in C} {{{V}_{c}}({\mathbf{x}})} } \right),$$
(1.17)

where Z is the normalization constant, C is the set of all cliques, and \({{V}_{c}}({\mathbf{x}})\) are potential functions such that

$$\sum\limits_{c \in C} {{{V}_{c}}({\mathbf{x}})} = \sum\limits_{c \in C} {g({{d}_{c}}{\mathbf{x}})} .$$
(1.18)

The choice of the model is important, since it reflects information about the smoothness of the image through the measure of local spatial changes \({{d}_{c}}{\mathbf{x}}\). The DAMRF model adaptively estimates the level of similarity of the pixels in order to preserve inhomogeneities:

$$g(\eta ) = - \gamma \cdot \exp \left( { - \frac{{{{\eta }^{2}}}}{\gamma }} \right),$$
(1.19)

where η is the difference between the values of two adjacent pixels.

The maximum a posteriori estimate of x can be obtained by graduated nonconvexity (GNC) optimization [15] as follows:

$${\mathbf{\tilde {x}}} = \arg \mathop {\min }\limits_{\mathbf{x}} \left( {{{{\left\| {{{{\mathbf{y}}}_{t}} - {{{\mathbf{O}}}_{t}}{\mathbf{D}}{{{\mathbf{H}}}_{t}}{\mathbf{x}}} \right\|}}^{2}} + \beta \sum\limits_{c \in C} {{{V}_{c}}({\mathbf{x}})} } \right),$$
(1.20)

where β is the regularization factor. During the optimization, the variable γ changes at each iteration according to the rule \({{\gamma }^{{(i + 1)}}} = k{{\gamma }^{{(i)}}}\), \(0 < k < 1\).
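A toy illustration of the GNC minimization of (1.20) for a one-dimensional signal with \({{{\mathbf{O}}}_{t}}{\mathbf{D}}{{{\mathbf{H}}}_{t}} = {\mathbf{I}}\); the gradient-descent step size and the values of β, γ, and k below are our illustrative choices, not those of [15]:

```python
import numpy as np

def g(eta, gamma):
    """DAMRF potential (1.19)."""
    return -gamma * np.exp(-eta ** 2 / gamma)

def gnc_restore(y, beta=4.0, gamma0=1.0, k=0.9,
                n_outer=30, n_inner=50, lr=0.05):
    """Toy 1-D version of (1.20) with O_t*D*H_t = I, minimized by
    gradient descent while gamma is annealed as gamma <- k*gamma."""
    x = y.copy()
    gamma = gamma0
    for _ in range(n_outer):
        for _ in range(n_inner):
            d = np.diff(x)                           # local differences d_c x
            gp = 2.0 * d * np.exp(-d ** 2 / gamma)   # g'(eta)
            grad = 2.0 * (x - y)                     # data-fidelity term
            grad[:-1] -= beta * gp                   # d g(d_j) / d x_j
            grad[1:] += beta * gp                    # d g(d_{j-1}) / d x_{j+1}
            x -= lr * grad
        gamma *= k                                   # graduated nonconvexity
    return x

rng = np.random.default_rng(2)
clean = np.concatenate([np.zeros(32), np.ones(32)])  # a step edge
y = clean + 0.1 * rng.standard_normal(64)
x = gnc_restore(y)
# noise within the plateaus is suppressed, the discontinuity survives
assert np.var(np.diff(x[:30])) < np.var(np.diff(y[:30]))
assert x[40] - x[23] > 0.5
```

Because \(g(\eta )\) flattens out for large \(|\eta |\), differences across the edge are barely penalized, which is exactly the discontinuity-preserving property the DAMRF model is chosen for.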

2. ALGORITHMS BASED ON THE USE OF OPTIMAL FILTERING METHODS (AOF)

The approach proposed by the authors of this work in [8–10] and developed further is based on the use of optimal linear filtering methods. The state model describing the initial sequence of images is given by the following relation:

$${{{\mathbf{x}}}_{{k + 1}}} = {{{\mathbf{F}}}_{k}}{{{\mathbf{x}}}_{k}} + {{{\mathbf{u}}}_{k}},$$
(2.1)

where \(k = \overline {1,K} \); \({{{\mathbf{x}}}_{{k + 1}}}\) and xk are L-dimensional image vectors, \(L = {{L}_{1}}{{L}_{2}}\); \({{{\mathbf{F}}}_{k}}\) is an L × L operator defining the values of the interframe shifts of the objects in the images; and uk is an L-dimensional centered Gaussian random vector with the covariance matrix Qk.

Each image xk corresponds to an observable LR image (of size \({{M}_{1}} \times {{M}_{2}}\)) characterized by the M-dimensional vector yk \((M = {{M}_{1}}{{M}_{2}} < L)\). The model of the imaging system that receives the image xk should take into account the following factors [8–12]:

—shifts and displacements caused by the movement of the camera (or objects) relative to the scene;

—blur due to the scattering function of photodetectors;

—uniform decimation to match the resolution of the observed images.

The possibility of areas of closure and AN in the generated images, which replace the useful information about the observed object with extraneous (false) information unrelated to the object of observation, should also be taken into account. The observation model corresponding to the considered formation system is given by the following relations:

$$\begin{gathered} {{{\mathbf{y}}}_{k}} = {{{\mathbf{A}}}_{k}}({{{\mathbf{H}}}_{k}}{{{\mathbf{x}}}_{k}} + {{{\mathbf{v}}}_{k}}) + {{{\mathbf{B}}}_{k}}({\mathbf{\tilde {y}}}_{{k|k - 1}}^{{}} + {{{\mathbf{w}}}_{k}}),\quad {{{\mathbf{A}}}_{k}} + {{{\mathbf{B}}}_{k}} = {\mathbf{I}}, \\ {{{\mathbf{A}}}_{k}} = \left( {\begin{array}{*{20}{c}} {{{a}_{{k\;1}}}}&0&0 \\ 0& \ddots &0 \\ 0&0&{{{a}_{{k\;{{M}^{2}}}}}} \end{array}} \right),\quad {{{\mathbf{B}}}_{k}} = \left( {\begin{array}{*{20}{c}} {{{b}_{{k\;1}}}}&0&0 \\ 0& \ddots &0 \\ 0&0&{{{b}_{{k\;{{M}^{2}}}}}} \end{array}} \right), \\ \end{gathered} $$
(2.2)

where yk is the vector of dimension M2 corresponding to the next LR frame received for processing; vk is the centered Gaussian vector of additive noise with the covariance matrix \({{{\mathbf{R}}}_{k}}\); \({{{\mathbf{F}}}_{k}} = {\mathbf{I}},\) \(k = \overline {1,K} \); \({{{\mathbf{H}}}_{k}}\) is the operator characterizing the effect of the system forming the observed images (it corresponds in meaning to the operator \({\mathbf{D}}{{{\mathbf{H}}}_{t}}\) of model (1.16)) and takes into account all of the factors mentioned above (displacement, decimation, and blurring); \({{{\mathbf{A}}}_{k}}\) and \({{{\mathbf{B}}}_{k}}\) are diagonal matrices with random elements taking the value zero or one when the primary sensor delivers useful (\({{a}_{{k\;l}}} = 1,\) \({{b}_{{k\;l}}} = 0,\) \(l = \overline {1,M} \)) or false (\({{a}_{{k\;l}}} = 0,\) \({{b}_{{k\;l}}} = 1\)) information (Ak corresponds in meaning to the operator Ot of model (1.16)); \({\mathbf{\tilde {y}}}_{{k|k - 1}}^{{}}\) is the prediction of the estimate of the observed LR image, obtained from the a priori information about the nature of the images and the processing of frame k – 1; and wk is a vector with zero mean and covariance matrix \({{{\mathbf{S}}}_{k}}\) describing the deviation of the emerging false observations from the vector \({\mathbf{\tilde {y}}}_{{k|k - 1}}^{{}}\).

Diagonal matrices \({{{\mathbf{P}}}_{{{\text{A}}k}}}\) and \({{{\mathbf{P}}}_{{{\text{B}}k}}}\) characterizing the probabilities of the usefulness of the pixels are introduced, such that \({{{\mathbf{P}}}_{{{\text{A}}k}}} + {{{\mathbf{P}}}_{{{\text{B}}k}}} = {\mathbf{I}}\). The diagonal elements of these matrices contain the probabilities of unit values, \({{p}_{{akl}}} = P({{a}_{{kl}}} = 1)\) and \({{p}_{{bkl}}} = P({{b}_{{kl}}} = 1)\). Matrices of the pairwise probabilities of the joint occurrence of useful (\({{{\mathbf{P}}}_{{{\text{AA}}k}}} = \left\| {{{p}_{{aklm}}}} \right\|\)) or false (\({{{\mathbf{P}}}_{{{\text{BB}}k}}} = \left\| {{{p}_{{bklm}}}} \right\|\)) observations for each pair of the image's pixels, where \({{p}_{{aklm}}} = P\left( {{{a}_{{kl}}} = 1,\;{{a}_{{km}}} = 1} \right)\), \({{p}_{{bklm}}} = P\left( {{{b}_{{kl}}} = 1,\;{{b}_{{km}}} = 1} \right)\), and \(l,m = \overline {1,M} \), are also introduced.

Using the obtained models, two types of algorithms can be synthesized: the algorithm that is optimal in the class of linear filtering algorithms and the optimal conditionally linear filtering algorithm. The former implements the estimation using the standard relations for the weight matrices of the recurrent filter, which do not depend on the observations. Its capabilities are discussed in detail in [9, 10]; it was found that this type of algorithm is insufficiently efficient for constructing SR under the influence of anomalous observations and, in particular, AN. The second approach, developing the first, is based on an optimal conditionally linear filtering algorithm and uses a posteriori information on the false observations in each of the obtained LR images. It is shown that, in such problems, the conditionally linear filter reconstructs the HR image with a significantly higher accuracy than the optimal estimate in the class of linear filters.

To obtain an estimate of the HR image, the model of a Gaussian random field \({{{\mathbf{x}}}_{1}}\sim \operatorname{N} ({{{\mathbf{x}}}_{1}},{\mathbf{\tilde {x}}}_{{1|0}}^{{}},{\mathbf{P}}_{{1|0}}^{{}})\) with the given initial estimates of the HR image \({\mathbf{\tilde {x}}}_{{1|0}}^{{}}\) and its covariance matrix \({\mathbf{P}}_{{1|0}}^{{}}\) is used here. In the case of conditionally linear filtering, the equations for updating the HR image estimate \({\mathbf{\tilde {x}}}_{{k|k}}^{{}}\) when the next frame \({\mathbf{y}}_{k}^{{}}\) arrives have the following form [9]:

$${\mathbf{\tilde {x}}}_{{k + 1|k}}^{{}} = {\mathbf{F\tilde {x}}}_{{k|k}}^{{}} = {\mathbf{\tilde {x}}}_{{k|k - 1}}^{{}} + {\mathbf{W}}_{k}^{{}}({\boldsymbol{\theta }^{k}})({\mathbf{y}}_{k}^{{}} - {\mathbf{\tilde {y}}}_{{k|k - 1}}^{{}}),\quad {\mathbf{\tilde {y}}}_{{k|k - 1}}^{{}} = {\mathbf{H}}_{k}^{{}}{\mathbf{\tilde {x}}}_{{k|k - 1}}^{{}},\quad {{{\mathbf{F}}}_{k}} = {\mathbf{I}},\quad k = \overline {1,K} ,$$
$$\begin{gathered} {\mathbf{V}}_{{k\theta }}^{{}} = {\mathbf{P}}_{{k|k - 1}}^{{}}({\boldsymbol{\theta }^{{k - 1}}}){\mathbf{H}}_{k}^{T}{\mathbf{P}}_{{{\text{A}}k}}^{T}\left( {{\boldsymbol{\theta }_{k}}} \right),\quad {\mathbf{W}}_{k}^{{}}({\boldsymbol{\theta }^{k}}) = {\mathbf{V}}_{{k\theta }}^{{}}{\mathbf{U}}_{{k\theta }}^{{ - 1}}, \\ {\mathbf{U}}_{{k\theta }}^{{}} = {\mathbf{P}}_{{{\text{AA}}k\theta }}^{{}} \circ [{\mathbf{H}}_{k}^{{}}{\mathbf{P}}_{{k|k - 1}}^{{}}({\boldsymbol{\theta }^{{k - 1}}}){\mathbf{H}}_{k}^{\operatorname{T} } + {\mathbf{R}}_{k}^{{}}] + {\mathbf{P}}_{{{\text{BB}}k\theta }}^{{}} \circ {\mathbf{S}}_{k}^{{}}, \\ \end{gathered} $$
(2.3)
$${\mathbf{P}}_{{k + 1|k}}^{{}}({\boldsymbol{\theta }^{k}}) = {\mathbf{F}}_{k}^{{}}{\mathbf{P}}_{{k|k}}^{{}}({\boldsymbol{\theta }^{k}}){\mathbf{F}}_{k}^{T} + {\mathbf{Q}}_{k}^{{}} = {\mathbf{P}}_{{k|k - 1}}^{{}}({\boldsymbol{\theta }^{{k - 1}}}) - {\mathbf{W}}_{k}^{{}}({\boldsymbol{\theta }^{k}}){\mathbf{U}}_{{k\theta }}^{{}}{\mathbf{W}}_{k}^{T}({\boldsymbol{\theta }^{k}}) + {\mathbf{Q}}_{k}^{{}},$$

where the vectors \({\mathbf{\tilde {x}}}_{{k|k}}^{{}}\) and \({\mathbf{\tilde {x}}}_{{k + 1|k}}^{{}}\) are the estimate of the HR image and its forecast for the next frame; \({\mathbf{P}}_{{k|k}}^{{}}\) and \({\mathbf{P}}_{{k + 1|k}}^{{}}\) are the estimate of the covariance matrix and its extrapolation to the next frame; \({\boldsymbol{\theta }^{k}} = \{ {{\theta }_{1}},...,{{\theta }_{k}}\} \) is a sequence of binary vectors obtained as a result of segmentation, containing unit values in the positions of the pixels of frame yk that lie in the areas of closure; and the notation \({\mathbf{A}} \circ {\mathbf{B}}\) denotes the elementwise multiplication of the operators A and B.
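A single update step of (2.3) can be sketched in one dimension. In this illustration of ours, the segmentation output θk and the matrices \({{{\mathbf{P}}}_{{{\text{A}}k}}}\), \({\mathbf{P}}_{{{\text{AA}}k\theta }}^{{}}\), and \({\mathbf{P}}_{{{\text{BB}}k\theta }}^{{}}\) are supplied by hand (with θ = 1 marking useful pixels), whereas in the algorithm they are produced by the segmentation stage; all dimensions and covariances are toy values:

```python
import numpy as np

rng = np.random.default_rng(3)
L_dim, M = 8, 4
x_true = np.sin(np.linspace(0.0, np.pi, L_dim))

H = np.zeros((M, L_dim))
for i in range(M):                     # 2x decimation with averaging blur
    H[i, 2 * i:2 * i + 2] = 0.5

theta = np.array([1, 1, 0, 1])         # LR pixel 2 is occluded by AN
P_A = np.diag(theta.astype(float))     # diagonal of P_Ak(theta_k)
P_AA = np.outer(theta, theta).astype(float)
P_BB = np.outer(1 - theta, 1 - theta).astype(float)

R = 0.01 * np.eye(M)                   # additive-noise covariance R_k
S_k = 4.0 * np.eye(M)                  # false-observation covariance S_k
P_prior = np.eye(L_dim)                # P_{k|k-1}
x_prior = np.zeros(L_dim)              # x~_{k|k-1}

y = H @ x_true + 0.1 * rng.standard_normal(M)
y[2] = 5.0                             # the false (occluded) observation

# U_k and W_k of (2.3); '*' is the elementwise (Hadamard) product
U = P_AA * (H @ P_prior @ H.T + R) + P_BB * S_k
W = P_prior @ H.T @ P_A @ np.linalg.inv(U)
x_post = x_prior + W @ (y - H @ x_prior)
P_post = P_prior - W @ U @ W.T

assert np.allclose(W[:, 2], 0.0)       # the occluded pixel gets zero weight
assert np.abs(x_post).max() < 1.5      # no blow-up from the false y[2] = 5
```

The Hadamard structure of \({\mathbf{U}}_{{k\theta }}^{{}}\) zeroes the weight-matrix column of the occluded pixel, so the false observation does not corrupt the HR estimate.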

In the process of operation, the conditionally linear filter uses an additional information component θk, which is obtained as a result of the segmentation of each incoming frame yk and controls the changes in the matrix \({{{\mathbf{W}}}_{k}}({\boldsymbol{\theta }^{k}})\) of the filter's weight coefficients from step to step in (2.3). To obtain this additional information for each processed LR image, we propose to use two-stage processing based on machine learning methods, including superpixel segmentation [16, 17] followed by the clustering of the superpixels using the EM algorithm.

Each superpixel is an atomic region (fragment) of the image, and all of the pixels included in it are treated as a single whole during further processing. The superpixel map of an image has a number of advantages over the conventional regular pixel grid, because each superpixel is a consistent unit of data: the pixels belonging to it have similar color, brightness, and texture properties. These properties of superpixels make them effective in segmenting objects with both known and unknown properties.

As a result of applying the EM algorithm to the resulting superpixel map, all of the original pixels of the LR image are divided into two classes (the useful image and AN fragments). In the course of the EM algorithm, the a posteriori probabilities of the usefulness of the pixels (the diagonal elements of \({{{\mathbf{P}}}_{{{\text{A}}k}}}\left( {{\boldsymbol{\theta }_{k}}} \right)\) and \({{{\mathbf{P}}}_{{{\text{B}}k}}}\left( {{\boldsymbol{\theta }_{k}}} \right)\)) and the binary vector θk, whose unit values correspond to the pixels of yk not obscured by the interference, are determined.
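The clustering stage can be illustrated with a minimal two-component EM algorithm implemented directly on pixel intensities. This is a simplified stand-in of our own: the actual algorithm clusters superpixel features (color, brightness, texture) rather than raw pixels, and the intensity levels below are invented for the example:

```python
import numpy as np

def em_two_classes(I, n_iter=50):
    """Minimal 1-D EM for a two-component Gaussian mixture; returns
    the posterior probability that each pixel belongs to the first
    (here, darker = useful) component. A stand-in for the superpixel
    + EM stage, which would operate on superpixel feature vectors."""
    mu = np.array([I.min(), I.max()], dtype=float)
    var = np.array([I.var(), I.var()]) + 1e-6
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibilities under the current Gaussian parameters
        lik = pi / np.sqrt(2 * np.pi * var) * \
              np.exp(-(I[:, None] - mu) ** 2 / (2 * var))
        r = lik / lik.sum(axis=1, keepdims=True)
        # M-step: update the mixture weights, means, and variances
        nk = r.sum(axis=0)
        pi = nk / len(I)
        mu = (r * I[:, None]).sum(axis=0) / nk
        var = (r * (I[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
    return r[:, 0]

rng = np.random.default_rng(4)
frame = 0.2 + 0.05 * rng.standard_normal(400)          # useful background
frame[100:140] = 0.9 + 0.05 * rng.standard_normal(40)  # bright AN patch
p_useful = em_two_classes(frame)       # diagonal of P_Ak(theta_k)
theta = (p_useful > 0.5).astype(int)   # binary vector theta_k
assert theta[:100].mean() > 0.95
assert theta[100:140].mean() < 0.05
```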

The numerical determination of the elements of the matrices \({\mathbf{P}}_{{{\text{AA}}k\theta }}^{{}}\) and \({\mathbf{P}}_{{{\text{BB}}k\theta }}^{{}}\) is based on averaging the elements of θk corresponding to the pixels in the vicinity of each element of the LR image for which θk contains unit and zero values, respectively. Let the matrix \({{{\mathbf{C}}}_{{{\text{A}}ki}}}\) contain the values of the elements of θk in the neighborhood of the ith unit element of θk, and let the matrix \({{{\mathbf{C}}}_{{{\text{B}}kj}}}\) contain the values \(1 - \theta {}_{k}\) in the neighborhood of the jth zero element of θk. The rows of the matrices \({\mathbf{P}}_{{{\text{AA}}k\theta }}^{{}}\) and \({\mathbf{P}}_{{{\text{BB}}k\theta }}^{{}}\) correspond to the vector scans of the matrices \({{{\mathbf{\bar {C}}}}_{{{\text{A}}ki}}}\) and \({{{\mathbf{\bar {C}}}}_{{{\text{B}}kj}}}\) averaged over all matrices \({{{\mathbf{C}}}_{{{\text{A}}ki}}}\) and \({{{\mathbf{C}}}_{{{\text{B}}kj}}}\): \({{{\mathbf{\bar {C}}}}_{{{\text{A}}ki}}} = \mathop {\text{M}}\limits_i \left\{ {{{{\mathbf{C}}}_{{{\text{A}}ki}}}} \right\}\) and \({{{\mathbf{\bar {C}}}}_{{{\text{B}}kj}}} = \mathop {\text{M}}\limits_j \left\{ {{{{\mathbf{C}}}_{{{\text{B}}kj}}}} \right\}.\)

3. SYNTHESIS OF ALGORITHMS FOR OPTIMAL FILTERING IN AN ADAPTIVE SETTING (AOFA)

The implementation of SR algorithms involves significant computational and time costs. To achieve the highest processing speed, the stages of segmentation (determining the parts of the useful image and the areas of AN localization) and registration (determining the parameters of the interframe shifts) are carried out separately, based only on the LR frames. However, the greatest processing efficiency can clearly be achieved when these steps are carried out jointly with the procedure for increasing the resolution, taking into account the uncertainty in some of the processing parameters. This makes it possible to select the parameters of the SR algorithm that provide the best (according to a given criterion) estimate of the HR image.

In this approach, it is proposed to introduce into models (2.3) a dependence on the random vector δ, whose components correspond to the processing parameters of the kth frame. For example, the vector δ may contain the values of the parameters of the interframe shifts, the variance of the blur due to the scattering function of the photodetectors, or the parameters of the segmentation algorithm for the kth frame.

As an example, a variant with the selection of the shift parameters is considered. In this case, the vector δ contains the parameters of the affine transform that approximates the interframe shifts in HR: \(\boldsymbol{\delta} = {{(\Delta x,\Delta y,\Delta \theta )}^{{\text{T}}}}\). The ranges of possible values of \(\Delta x,\Delta y,\) and Δθ are assumed to be known and are discretized by a lattice of samples containing Nδ nodes, for each of which the probability of realization \({{p}_{{\delta i}}}\) is set. When the next frame arrives, for each possible value of the vector of unknown parameters \(\boldsymbol{\delta} = {\mathbf{d}}_{i}^{{(k)}}\), \(i = \overline {1,{{N}_{\delta }}} ,\) the conditional estimates of the HR image \({\mathbf{\tilde {x}}}_{{k|k}}^{{}}({\mathbf{d}}_{i}^{{(k)}})\) are calculated based on relations (2.1)–(2.3), where the conditional operator \({\mathbf{H}}_{k}^{{}} = {\mathbf{H}}_{k}^{{}}({\mathbf{d}}_{i}^{{(k)}})\) specified by the ith possible value \({\mathbf{d}}_{i}^{{(k)}}\) is used.

Using the conditional estimates \({\mathbf{\tilde {x}}}_{{k|k}}^{{}}({\mathbf{d}}_{i}^{{(k)}})\), the a posteriori probabilities \(P({\mathbf{d}}_{i}^{{(k)}}\,{\text{|}}\,{{{\mathbf{y}}}_{k}})\) are determined:

$$P({\mathbf{d}}_{i}^{{\left( k \right)}}\,{\text{|}}\,{{{\mathbf{y}}}_{k}}) = \frac{{P({{{\mathbf{y}}}_{k}}\,{\text{|}}\,{\mathbf{d}}_{i}^{{\left( k \right)}}){{p}_{{\delta i}}}}}{{\sum\limits_{i = 1}^{{{N}_{\delta }}} {(P({{{\mathbf{y}}}_{k}}\,{\text{|}}\,{\mathbf{d}}_{i}^{{\left( k \right)}}){{p}_{{\delta i}}})} }},$$
$$P({{{\mathbf{y}}}_{k}}\,{\text{|}}\,{\mathbf{d}}_{i}^{{\left( k \right)}}) = {{({{(2\pi )}^{{{{M}^{2}}}}}{\text{|}}{\mathbf{U}}_{k}^{{}}({\mathbf{d}}_{i}^{{\left( k \right)}}){\text{|}})}^{{ - \frac{1}{2}}}}\exp \left( { - \frac{1}{2}e({\mathbf{d}}_{i}^{{\left( k \right)}})} \right),$$
(3.1)
$$e({\mathbf{d}}_{i}^{{\left( k \right)}}) = {{({{{\mathbf{y}}}_{k}} - {\mathbf{H}}_{k}^{{}}({\mathbf{d}}_{i}^{{\left( k \right)}}){\mathbf{\tilde {x}}}_{{k|k - 1}}^{{}})}^{T}}{\mathbf{U}}_{k}^{{ - 1}}({\mathbf{d}}_{i}^{{\left( k \right)}})({{{\mathbf{y}}}_{k}} - {\mathbf{H}}_{k}^{{}}({\mathbf{d}}_{i}^{{\left( k \right)}}){\mathbf{\tilde {x}}}_{{k|k - 1}}^{{}}).$$

The unconditional estimates \({\mathbf{\tilde {x}}}_{{k|k}}^{{}}\) relative to vector δ and the corresponding covariance matrices \({\mathbf{P}}_{{k|k}}^{{}}\) are calculated as the weighted sums of the conditional estimates:

$${\mathbf{\tilde {x}}}_{{k + 1|k}}^{{}} = {\mathbf{\tilde {x}}}_{{k|k}}^{{}} = \sum\limits_{i = 1}^{{{N}_{\delta }}} {{\mathbf{\tilde {x}}}_{{k|k}}^{{}}({\mathbf{d}}_{i}^{{\left( k \right)}})P({\mathbf{d}}_{i}^{{\left( k \right)}}\,{\text{|}}\,{{{\mathbf{y}}}_{k}})} ,$$
$${\mathbf{P}}_{{k|k}}^{{}} = \sum\limits_{i = 1}^{{{N}_{\delta }}} {({\mathbf{P}}_{{k|k}}^{{}}({\mathbf{d}}_{i}^{{\left( k \right)}}) + {\mathbf{C}}_{k}^{{(i)}})P({\mathbf{d}}_{i}^{{\left( k \right)}}\,{\text{|}}\,{{{\mathbf{y}}}_{k}})} ,\quad {\mathbf{P}}_{{k + 1|k}}^{{}} = {\mathbf{P}}_{{k|k}}^{{}} + {\mathbf{Q}}_{k}^{{}},$$
(3.2)
$${\mathbf{C}}_{k}^{{(i)}} = ({\mathbf{\tilde {x}}}_{{k|k}}^{{}}({\mathbf{d}}_{i}^{{\left( k \right)}}) - {\mathbf{\tilde {x}}}_{{k|k}}^{{}}){{({\mathbf{\tilde {x}}}_{{k|k}}^{{}}({\mathbf{d}}_{i}^{{\left( k \right)}}) - {\mathbf{\tilde {x}}}_{{k|k}}^{{}})}^{T}}.$$

In expression (3.1), the likelihood functional \(P({{{\mathbf{y}}}_{k}}\,{\text{|}}\,{\mathbf{d}}_{i}^{{(k)}})\) is calculated based on the error \(e({\mathbf{d}}_{i}^{{(k)}})\) between the frame yk and the projection of \({\mathbf{\tilde {x}}}_{{k|k - 1}}^{{}}\) onto the LR grid for the specific value of the shift-parameter vector \(\boldsymbol{\delta} = {\mathbf{d}}_{i}^{{(k)}}\). As information is accumulated, the a posteriori probabilities \(P({\mathbf{d}}_{i}^{{(k)}}\,{\text{|}}\,{{{\mathbf{y}}}_{k}})\) corresponding to values \({\mathbf{d}}_{i}^{{(k)}}\) close to the true value of the undefined parameter δ increase. As a result, the final estimate assigns large weights to the conditional filters that correspond to the true value of parameter δ.
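The posterior-weighting step described by Eqs. (3.1) and (3.2) can be sketched as follows. This is a minimal illustrative fragment in Python (the paper's implementation is in Matlab); the function and argument names are assumptions, and the log-domain stabilization is an implementation detail not stated in the text.

```python
import numpy as np

def fuse_conditional_estimates(residuals, covs, priors, cond_estimates):
    """Sketch of the adaptive fusion in Eqs. (3.1)-(3.2): given the
    per-candidate residuals y_k - H_k(d_i) x~_{k|k-1}, the innovation
    covariances U_k(d_i), the prior probabilities p_{delta,i}, and the
    conditional estimates x~_{k|k}(d_i), compute the posterior weights
    P(d_i | y_k) and the weighted (unconditional) estimate."""
    log_lik = []
    for e, U in zip(residuals, covs):
        # Gaussian log-likelihood up to a common additive constant
        _, logdet = np.linalg.slogdet(U)
        q = e @ np.linalg.solve(U, e)          # e^T U^{-1} e
        log_lik.append(-0.5 * (logdet + q))
    log_post = np.array(log_lik) + np.log(priors)
    log_post -= log_post.max()                 # numerical stabilization
    post = np.exp(log_post)
    post /= post.sum()                         # P(d_i | y_k)
    # weighted sum of the conditional estimates, Eq. (3.2)
    x_fused = sum(p * x for p, x in zip(post, cond_estimates))
    return post, x_fused
```

A candidate whose residual is small relative to its innovation covariance receives a posterior weight close to one, which mirrors the accumulation behavior described above.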

The features of the implementation of the proposed adaptive processing are as follows. To increase the speed and reduce the complexity of the computations, in particular, to eliminate the need for inversion and other operations on large matrices, it is proposed to process the images in small overlapping blocks. Images are divided into N1 and N2 blocks vertically and horizontally, respectively. Upon completion of processing, the image is formed from the central (nonoverlapping) fragments of the blocks. The size of the HR image blocks is \({{s}_{{\operatorname{HE} }}} \times {{s}_{{\operatorname{HE} }}}\), where \({{s}_{{\operatorname{HE} }}} = {{s}_{\operatorname{H} }} + 2\Delta {{s}_{\operatorname{H} }}\) and \({{s}_{\operatorname{H} }} = {{L}_{1}}{\text{/}}{{N}_{1}} = {{L}_{2}}{\text{/}}{{N}_{2}}\); the size of the LR blocks is \({{s}_{{\operatorname{LE} }}} \times {{s}_{{\operatorname{LE} }}}\), where \({{s}_{{\operatorname{LE} }}} = {{s}_{\operatorname{L} }} + 2\Delta {{s}_{\operatorname{L} }}\) and \({{s}_{\operatorname{L} }} = {{M}_{1}}{\text{/}}{{N}_{1}} = {{M}_{2}}{\text{/}}{{N}_{2}}\). The parameters \(\Delta {{s}_{\operatorname{H} }}\) and \(\Delta {{s}_{\operatorname{L} }}\) determine the size of the blocks’ overlap (the width of the area in which the pixel values are interdependent). It should be noted that superpixel segmentation is carried out for the entire image as a whole.
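The overlapping-block scheme can be illustrated with a short Python sketch (the paper's implementation is in Matlab). The reflection padding at the image border and all names here are assumptions not specified in the text; only the block sizes s + 2Δs and the reassembly from central fragments follow the description above.

```python
import numpy as np

def split_blocks(img, n1, n2, pad):
    """Split an L1 x L2 image into n1 x n2 overlapping blocks of size
    (s + 2*pad) x (s + 2*pad), where s = L1/n1 = L2/n2 (the sizes s_HE
    or s_LE of the text); `pad` plays the role of Delta_s."""
    L1, L2 = img.shape
    s1, s2 = L1 // n1, L2 // n2
    padded = np.pad(img, pad, mode="reflect")   # border handling: assumption
    return [[padded[p * s1:(p + 1) * s1 + 2 * pad,
                    q * s2:(q + 1) * s2 + 2 * pad]
             for q in range(n2)] for p in range(n1)]

def merge_blocks(blocks, pad):
    """Reassemble the image from the central (nonoverlapping) fragments."""
    rows = [np.hstack([b[pad:b.shape[0] - pad, pad:b.shape[1] - pad]
                       for b in row])
            for row in blocks]
    return np.vstack(rows)
```

Splitting and then merging with the same `pad` recovers the original image exactly, which is the property the central-fragment reassembly relies on.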

This approach provides the ability to flexibly adjust the complexity of the computations. First, in the process of calculating the estimate of the conditionally linear filter (3.1), only those image blocks are corrected for which at least one element of \(\boldsymbol{\theta } _{k}^{{pq}}\) (\(p = \overline {1,{{N}_{1}}} ,\) \(q = \overline {1,{{N}_{2}}} \)) is equal to one (i.e., the block contains at least one useful pixel). Second, to calculate the a posteriori densities of the adaptive filter \(P({\mathbf{d}}_{i}^{{(k)}}\,{\text{|}}\,{{{\mathbf{y}}}_{k}})\), only a few (\({{N}_{b}} \ll {{N}_{1}}{{N}_{2}}\)) informative blocks of the frame can be used: the most detailed blocks not obscured by the AN, which can be selected, for example, by the maximum average magnitude of the gradient of the block’s pixels. In this case, the error values \(e({\mathbf{d}}_{i}^{{(k)}})\) for different blocks of one frame are assumed to be independent [11].
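The gradient-based selection of informative blocks can be sketched as follows (Python; illustrative only). The requirement that a selected block be entirely free of AN is a simplifying assumption, as are the function and argument names.

```python
import numpy as np

def select_informative_blocks(blocks, mask_blocks, n_b):
    """Choose the N_b most detailed blocks not obscured by AN, ranked by
    the mean gradient magnitude of their pixels (the criterion named in
    the text).  `mask_blocks` holds per-block masks (1 = useful pixel);
    here a block is eligible only if fully useful (assumption)."""
    scores = []
    for idx, (blk, m) in enumerate(zip(blocks, mask_blocks)):
        if m.all():                          # skip blocks touched by AN
            gy, gx = np.gradient(blk)        # row- and column-wise gradients
            scores.append((np.mean(np.hypot(gx, gy)), idx))
    scores.sort(reverse=True)                # most detailed first
    return [idx for _, idx in scores[:n_b]]
```

A flat (low-detail) block scores near zero and is never preferred over a textured one, so the likelihood in (3.3) is evaluated only where the residual is informative about the shift.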

Also, instead of the weighted sum of the conditional estimates, the conditional estimate that maximizes the a posteriori density \(P({\mathbf{d}}_{i}^{{(k)}}\,{\text{|}}\,{{{\mathbf{y}}}_{k}})\) can be chosen as \({\mathbf{\tilde {x}}}_{{k|k}}^{{pq}}\). Such an economical adaptive filtering scheme is slightly inferior to the classical scheme in accuracy, but significantly surpasses it in terms of speed due to the use of a directed selection of possible values of the vector of unknown parameters δ.

In the implementation of block processing and the techniques mentioned above, expressions (3.1) and (3.2) have the form

$$P({{{\mathbf{y}}}_{k}}\,{\text{|}}\,{\mathbf{d}}_{i}^{{\left( k \right)}}) = {{\left( {{{{\left( {2\pi } \right)}}^{{{{N}_{b}}s_{{\operatorname{LE} }}^{2}}}}\prod\limits_{r = 1}^{{{N}_{b}}} {\left| {{\mathbf{U}}_{k}^{r}({\mathbf{d}}_{i}^{{\left( k \right)}})} \right|} } \right)}^{{ - \frac{1}{2}}}}\exp\left( { - \frac{{e({\mathbf{d}}_{i}^{{\left( k \right)}})}}{2}} \right),$$
$$\begin{gathered} e({\mathbf{d}}_{i}^{{\left( k \right)}}) = \sum\limits_{r = 1}^{{{N}_{b}}} {{{{({\mathbf{y}}_{k}^{r} - {\mathbf{H}}_{k}^{r}({\mathbf{d}}_{i}^{{\left( k \right)}}){\mathbf{\tilde {x}}}_{{k|k - 1}}^{r})}}^{T}}{{({\mathbf{U}}_{k}^{r}({\mathbf{d}}_{i}^{{\left( k \right)}}))}^{{ - 1}}}({\mathbf{y}}_{k}^{r} - {\mathbf{H}}_{k}^{r}({\mathbf{d}}_{i}^{{\left( k \right)}}){\mathbf{\tilde {x}}}_{{k|k - 1}}^{r})} , \\ {\mathbf{\tilde {x}}}_{{k|k}}^{{pq}} = {\mathbf{\tilde {x}}}_{{k|k}}^{{pq}}({\mathbf{\tilde {d}}}_{i}^{{\left( k \right)}}),\quad {\mathbf{P}}_{{k|k}}^{{pq}} = {\mathbf{P}}_{{k|k}}^{{pq}}({\mathbf{\tilde {d}}}_{i}^{{\left( k \right)}}),\quad {\mathbf{\tilde {d}}}_{i}^{{\left( k \right)}} = \mathop {\arg \max }\limits_i P({\mathbf{d}}_{i}^{{\left( k \right)}}\,{\text{|}}\,{{{\mathbf{y}}}_{k}}), \\ \end{gathered} $$
(3.3)
$$r = \overline {1,{{N}_{b}}} ,\quad p = \overline {1,{{N}_{1}}} ,\quad q = \overline {1,{{N}_{2}}} .$$

4. RESULTS OF EXPERIMENTAL STUDIES

Each of the algorithms considered above was implemented in the Matlab R2019b environment using block image processing [10], which reduces the dimensions of the processed and inverted matrices.

For ABSGM, AOF, and AOFA, an initial estimate is required. In the implemented algorithms, two methods of forming the initial estimate were considered:

—the first LR image of the original sequence (sequential increase in resolution as new observations arrive);

—the LR image averaged over all frames, taking into account the interframe displacements and the localized AN.

In both cases, the resolution of the initial LR estimate was increased using the VDSR single-frame upscaling algorithm [18], which is based on deep learning and built into the Matlab environment.
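The second initialization method can be sketched as follows (Python; the paper's implementation is in Matlab). Integer-pixel shift compensation via `np.roll` is a simplifying assumption, as are all names; the paper then upscales the averaged image with VDSR, for which any single-frame upscaler can stand in here.

```python
import numpy as np

def averaged_initial_estimate(frames, masks, shifts):
    """Average the LR frames after compensating the interframe shifts,
    counting only pixels outside the localized AN (mask = 1).  Shifts
    are given per frame as (dy, dx) relative to the first frame."""
    acc = np.zeros_like(frames[0], dtype=float)
    cnt = np.zeros_like(frames[0], dtype=float)
    for f, m, (dy, dx) in zip(frames, masks, shifts):
        f_al = np.roll(f, (-dy, -dx), axis=(0, 1))   # align to frame 0
        m_al = np.roll(m, (-dy, -dx), axis=(0, 1))
        acc += f_al * m_al
        cnt += m_al
    # average over the frames that actually observed each pixel
    return np.divide(acc, cnt, out=np.zeros_like(acc), where=cnt > 0)
```

Pixels obscured by AN in some frames are averaged only over the frames in which they are visible, which is what makes this initialization robust to the localized distortions.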

For an experimental comparison of the approaches and algorithms for constructing SR given above, 12 sets of 128 × 128 color LR images, 20 frames each, were formed. To create each set, localized areas of AN (spots of random shape) were first generated in the form of a forming binary mask overlaid on the HR image. The false observations themselves were then formed as realizations of a random field with given parameters and blended with the original HR image within the created binary mask. After that, the resulting image was downsampled to create the first LR frame. When forming each subsequent LR frame, these actions were repeated, while the camera motion was simulated by a random shift and rotation relative to the previously obtained image, as was the motion of the areas affected by the AN, by shifting and rotating the forming binary mask.
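The formation of one corrupted LR frame can be sketched as follows (Python; illustrative only). The uniform random field, the integer shift (the paper uses a random affine shift plus rotation), and the block-average downsampling are all assumptions standing in for the unspecified details.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_lr_frame(hr, scale, shift, noise_mask, noise_level=0.3):
    """Form one LR frame: blend a random-field realization into the HR
    image inside the binary AN mask, shift the result (simulated camera
    motion; integer shift here), and block-average down to the LR grid."""
    anomaly = rng.random(hr.shape) * noise_level      # random-field realization
    corrupted = np.where(noise_mask, anomaly, hr)     # blend inside the mask
    moved = np.roll(corrupted, shift, axis=(0, 1))    # simulated camera motion
    h, w = moved.shape
    # block-average downsampling by the integer factor `scale`
    return moved.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))
```

Repeating this per frame with freshly shifted masks and shifts yields a test sequence of the kind described above, with fractional-pixel information spread across the frames.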

Examples of the initial LR images with a different AN localization and used initial estimates are shown in Fig. 1.

Fig. 1.

Examples of the original LR image (a), the initial estimate in the form of the first image of the processed sequence (b), the averaged initial estimate with VDSR (c), and several images of the processed LR sequence exposed to AN (d).

In the course of the experiments, the resolution of the original images was doubled. Figures 2 and 3 show examples of the HR images obtained at the output. A qualitative analysis of the results shows that ABSGM and ABSGMIM are worse at eliminating applicative interference than their analogs. ABMMRF performs this task somewhat better, but its results depend on the initial estimate and are less detailed than those of AOF and AOFA.

Fig. 2.

Examples of HR images obtained based on the initial estimate in the form of the first LR image, and their enlarged fragments: ABSGM (a); ABSGMIM (b); ABMMRF (c); AOF (d); AOFA (e).

Fig. 3.

Examples of HR images obtained based on the averaged initial estimate with VDSR, and their enlarged fragments: ABSGM (a); ABSGMIM (b); ABMMRF (c); AOF (d); AOFA (e).

For a numerical comparison of the quality of the obtained HR images, the following indicators were used: the peak signal-to-noise ratio (PSNR; the higher, the better), the structural similarity index (SSIM; the higher, the better) [19], and the natural image quality evaluator (NIQE; the lower, the better) [20]. The values of these criteria averaged over the entire group of processed images are given in Tables 1 and 2.

Table 1. The accuracy of restoring the original HR image in the case of using the first LR image as the initial estimate
Table 2. The accuracy of restoring the original HR image in the case of using the averaged initial estimate
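Of the three indicators, PSNR is simple enough to state explicitly; a minimal Python sketch follows (SSIM and NIQE are available in common toolboxes, e.g., Matlab or scikit-image, and are not re-implemented here). The `peak` default of 1.0 assumes images normalized to [0, 1].

```python
import numpy as np

def psnr(ref, img, peak=1.0):
    """Peak signal-to-noise ratio in dB (the higher, the better):
    10 * log10(peak^2 / MSE) between a reference and a test image."""
    mse = np.mean((ref.astype(float) - img.astype(float)) ** 2)
    if mse == 0:
        return float("inf")        # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

For example, a constant error of 0.1 on a [0, 1] image gives an MSE of 0.01 and hence a PSNR of 20 dB.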

The data obtained show that the algorithm based on conditionally linear filtering with adaptive processing restores the original HR image more accurately than its analogs. At the same time, almost all the algorithms perform slightly better with the second method of forming the initial estimate, based on averaging all frames and applying the VDSR algorithm. Nevertheless, the use of the initial estimate based on the first frame in combination with AOFA makes it possible to obtain practically the same quality while retaining the possibility of processing in the mode of sequential information accumulation.

CONCLUSIONS

This article is devoted to the problem of constructing multiframe SR under anomalous observations of an applicative nature. Well-known algorithms based on various approaches are considered: spin-glass models, Markov random field models, and optimal linear and conditionally linear filtering within the framework of the proposed approach. An optimal conditionally linear filter of a sequence of LR images under AN in an adaptive setting is synthesized. An experimental comparison of all the above algorithms on synthetic and real sets of test images showed the advantage of the algorithms based on conditionally linear optimal filtering of a sequence of LR images and a two-stage procedure for processing each image using superpixel segmentation.