1. Introduction

Imaging systems (sensors) are widely used in applications such as remote sensing, non-destructive control, medical diagnostics, and photography [1]. Some acquired images need special pre-processing (e.g., denoising, deblurring, edge detection, compression [2–4]) before further exploitation (e.g., object recognition, visual inspection, diagnostics). To enhance an image using, e.g., modern denoising methods based on wavelets [5–7] or the discrete cosine transform (DCT) [8, 9], one has to know the noise type and its basic characteristics such as probability density function (PDF), variance, or two-dimensional (2D) spatial correlation function (if the observed noise is not independent and identically distributed (i.i.d.)). Proper thresholds in edge detection and image segmentation depend on noise statistics as well [1, 10]. In lossy image compression, the quantization step has to be adjusted adaptively depending on noise variance [11].

A practical problem is that a priori information on noise type and basic characteristics is not always available. There are applications, such as synthetic aperture radar (SAR) imaging with a known number of looks and image forming mode, for which speckle characteristics can be accurately predicted [10]. However, the more common practical situation is that noise characteristics are fully or partly unknown. For example, for color images acquired by digital cameras, noise properties are determined by camera settings, illumination conditions, and other factors [9, 11–13]. Similarly, noise characteristics might differ considerably between sub-band images of multi- and hyperspectral remote sensing data acquired from airborne and space-borne platforms [14–16].

Then, a task of blind estimation of noise characteristics for each particular image subject to further processing, in particular, denoising [9, 11, 12] becomes of prime importance. General requirements to the methods intended for blind (automatic) estimation of additive and multiplicative noise variance can be found in [8, 17]. Clearly, it is desirable to provide almost unbiased estimates with estimation variance as small as possible. It is also needed to ensure applicability of a method to images of different content (including highly textural ones) and different noise levels (intensities) including non-intensive noise. A blind estimation method is practical if it is fast enough.

Moreover, for the methods of pure additive or multiplicative noise variance estimation, it has been established that the estimation relative error in practice should not be larger than ±20% [8, 18]. If an obtained variance estimate is outside this limit, under- or oversmoothing takes place in image denoising based on blind estimation result.

With a new generation of sensors, signal-dependent noise models have become more popular since they describe noise statistics better [8, 9, 11–13, 15, 16, 19, 20]. There is considerable interest in designing and testing methods for blind estimation of mixed noise parameters [9, 12, 15, 16, 21, 22]. However, to the best of our knowledge, it has not yet been studied how accurate these estimates have to be. It is worth mentioning that the dependence of noise variance on signal (local mean in images) can be set parametrically, i.e., by some polynomial [9, 12, 15, 22]. To avoid the difficulties of choosing the polynomial order, we consider below the simplest case of mixed noise, where the signal-dependent (SD) and signal-independent (SI) components are characterized by one parameter each. This model is typical for raw data in digital photos [9, 11], sub-band images of hyperspectral remote sensing data [15, 16], and radar images formed by multi-look SARs and side-look aperture radars [21, 22].

Being applied at the first stage of image processing chain [8, 9, 11, 12], blind estimation of noise parameters provides data for next stages. Therefore, accuracy of blind estimation has to be analyzed in the combination with efficiency of further image processing. Since image filtering is one of the most common operations (stages) of image processing chain, we have decided to carry out our analysis just from the viewpoint of filtering efficiency. We consider estimation accuracy acceptable if estimation errors that always happen in practice do not produce essential reduction of image denoising efficiency. In the next sections of the paper, we give quantitative definition of what is essential. We characterize image filtering efficiency not only by the conventional criteria, such as peak signal-to-noise ratio (PSNR) (or mean square error (MSE)), but also using metrics that describe image visual quality [23]. This is important since for many considered applications, visual quality of filtered images has a great importance (e.g., digital photography, medical imaging). The main goal of this paper is to give practical recommendations (requirements) on accuracy of mixed noise parameters estimation to ensure efficient image filtering.

2. Noise models and estimated parameters

In general, noise is considered signal-dependent if its statistical characteristics (variance, PDF) depend on the information signal (image). A typical example of signal-dependent noise is Poisson noise, for which the variance is equal to the true value of the image pixel, $\sigma_{sd}^2 = I^{tr}$, and the noise PDF changes with the true value: it is close to Gaussian for large true values but differs considerably from Gaussian if $I^{tr}$ is small (less than 12). Other examples of signal-dependent noise are film-grain noise and speckle [24, 25]. For all these cases, the dependence of signal-dependent noise variance on the true value, $\sigma_{sd}^2 = f(I^{tr})$, is monotonically increasing (although other forms of this dependence are, in general, possible).

The aforementioned examples relate to the cases when there is a single source of noise. However, noise can originate from several different sources, for example, image pixel generation (photon counting, coherent processing of registered signals) and circuitry (thermal) noise [19]. Then, one deals with a mixed noise, such as mixed additive and impulse noise [26, 27] where impulse noise usually originates from coding/decoding errors at image transmission via communication channels. However, here, we address another type of mixed noise that has SI and SD components. A prominent example is noise in modern imaging sensors namely dark noise, thermal noise, and photon-counting noise [19].

Here, we assume that noise sources are independent, and consider two typical models of the mixed noise consisting of two components, SI and SD. For the first model, noise variance depends on image true value as

$$\sigma_{sd}^2 = \sigma_{si}^2 + k I^{tr}, \tag{1}$$

where $\sigma_{si}^2$ is the variance of SI noise (e.g., of dark current noise) and $k$ is a proportionality factor for the SD component [9, 15]. The model is valid for raw images acquired by digital cameras and sub-band images formed by hyperspectral sensors.

For the second model, typical for radar images [28], one has

$$\sigma_{sd}^2 = \sigma_{si}^2 + k \left(I^{tr}\right)^2. \tag{2}$$

In both cases, we have dependencies that are fully described by two parameters, $\sigma_{si}^2$ and $k$, that have to be estimated in a blind manner and used in filtering (we assume that we know a priori which model, (1) or (2), fits the data). Moreover, in both cases, it is possible to find a value $I_t^{tr}$ such that for $I^{tr} > I_t^{tr}$ the SD component is dominant and vice versa. It is also possible to determine which component is dominant for an entire considered test image. For this purpose, one has to calculate

$$\mathrm{MSE}_{inp} = \sigma_{si}^2 + \frac{k}{I_{IM} J_{IM}} \sum_{i=1}^{I_{IM}} \sum_{j=1}^{J_{IM}} I_{ij}^{tr} \tag{3}$$

for the model (1) and

$$\mathrm{MSE}_{inp} = \sigma_{si}^2 + \frac{k}{I_{IM} J_{IM}} \sum_{i=1}^{I_{IM}} \sum_{j=1}^{J_{IM}} \left(I_{ij}^{tr}\right)^2 \tag{4}$$

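for the model (2), where $I_{IM}$ and $J_{IM}$ define the image size. If $\mathrm{MSE}_{inp}$ is more than twice as large as $\sigma_{si}^2$ (i.e., the mean contribution of the SD component exceeds that of the SI component), the impact of SD noise can be considered prevailing (the SD noise is considered dominant) and vice versa. Note that $\mathrm{MSE}_{inp}$ can also be treated as the equivalent variance of the noise in original images.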
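The dominance check based on (3) and (4) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the image is assumed to be a noise-free array given as a list of rows, and the crossover value follows from equating the SI and SD terms in (1) or (2).

```python
def equivalent_noise_variance(image, sigma_si2, k, model=1):
    """MSE_inp from Eq. (3) (model=1) or Eq. (4) (model=2).

    `image` holds noise-free (true) pixel values as a list of rows.
    """
    power = 1 if model == 1 else 2
    pixels = [value for row in image for value in row]
    mean_term = sum(value ** power for value in pixels) / len(pixels)
    return sigma_si2 + k * mean_term


def sd_noise_is_dominant(image, sigma_si2, k, model=1):
    """Rule from the text: SD noise prevails if MSE_inp > 2 * sigma_si^2."""
    return equivalent_noise_variance(image, sigma_si2, k, model) > 2.0 * sigma_si2


def crossover_intensity(sigma_si2, k, model=1):
    """True value at which the SD variance term equals the SI variance."""
    ratio = sigma_si2 / k
    return ratio if model == 1 else ratio ** 0.5
```

For a flat image with true value 100 and the model (1), the set k = 0.2, $\sigma_{si}^2$ = 10 gives an equivalent variance of 30 (SD dominant), whereas k = 0.2, $\sigma_{si}^2$ = 50 gives 70 (SI dominant).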

Such preliminary analysis can be useful since both situations can be met in practice. For example, SI noise is usually assumed to be dominant for sub-band images formed by old generation sensors, e.g., AVIRIS [14, 15], while SD noise is prevailing for new generation sensors [15, 16]. Thus, it is desirable to study both situations in our further analysis.

It is also supposed that noise is spatially uncorrelated. Although this is often not true in practice, this assumption simplifies our analysis. In the future, we plan to carry out similar analysis for spatially correlated noise as well.
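For illustration, spatially uncorrelated mixed noise obeying the model (1) or (2) can be simulated as sketched below. The sketch assumes zero-mean Gaussian noise drawn independently for each pixel with the signal-dependent variance; the Gaussian PDF, the absence of clipping, and the function name are assumptions made here for the example and are not specified in the text.

```python
import random


def add_mixed_noise(image, sigma_si2, k, model=1, rng=None):
    """Return a noisy copy of `image` (list of rows of true values).

    Each pixel gets independent zero-mean Gaussian noise (an assumption)
    with variance sigma_si2 + k * I  (model 1) or sigma_si2 + k * I**2 (model 2).
    """
    rng = rng or random.Random()
    power = 1 if model == 1 else 2
    noisy = []
    for row in image:
        noisy_row = []
        for true_value in row:
            variance = sigma_si2 + k * true_value ** power
            noisy_row.append(true_value + rng.gauss(0.0, variance ** 0.5))
        noisy.append(noisy_row)
    return noisy
```

Passing a seeded `random.Random` instance makes the simulation reproducible.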

3. Denoising method and quantitative criteria

There is a great number of denoising methods, especially for processing color images (note that just color image case for which enhancement is of great value will be analyzed below) [27, 29, 30]. However, here, we are not interested in image denoising methods that do not exploit information on noise type and statistics in their operation. Instead, we have to focus on filtering methods able to adapt to quite complex types of signal-dependent noise [8, 9, 31] described by the models (1) and (2).

Then, there are two possible approaches. The first one is to apply a variance-stabilizing transformation [32, 33] that converts an image corrupted by a given type of signal-dependent noise to an image corrupted by an additive noise and then to apply a filter designed to suppress an additive noise. For the model (1), this can be done by, e.g., generalized Anscombe transformation [33].

The second possibility is to apply denoising methods that can be adapted to signal-dependent nature of the noise. Since the DCT-based filtering allows to do this easily [34, 35], we concentrate below on this method of denoising. The DCT-based filtering [34, 35] is carried out in blocks of a limited support, usually 8 × 8 pixels. This feature of the DCT-based filtering allows easy adaptation to non-stationary (signal-dependent) properties of the noise. Direct DCT is performed in each block. Then, a thresholding of DCT coefficients (hard, soft, or combined [36]) is performed for all coefficients except DC. Here, we focus on hard thresholding since it is simple and the most efficient in terms of provided output PSNR [36].

For signal-dependent noise, absolute values of DCT coefficients in an 8 × 8 block, $D(n, m, q, s)$, have to be compared with a threshold $T(n, m) = \beta \sqrt{f(\bar{I}_{nm})}$, where the indices $n$, $m$ define the upper left corner of a block, $q = 0, \ldots, 7$ and $s = 0, \ldots, 7$ are indices of DCT coefficients in the 8 × 8 block, $\bar{I}_{nm}$ is the local mean of the $nm$-th block, and $\beta$ is a constant. Hard thresholding sets to zero those coefficients $D(n, m, q, s)$, $q = 0, \ldots, 7$, $s = 0, \ldots, 7$ (except the DC component with $q = 0$ and $s = 0$) whose absolute value is below the threshold, $|D(n, m, q, s)| < T(n, m)$, and keeps all others unchanged (for each possible position of a block). After this, an inverse DCT is applied to the thresholded DCT coefficients in each block. Here, we consider the DCT-based filtering with full overlapping (shifting by one position to the next window); thus, multiple denoised (filtered) values are obtained for each image pixel (except those placed at the image corners). These multiple denoised values of the same pixel coming from different windows are averaged. This procedure is similar to translation-invariant wavelet shrinkage, where, instead of block DCTs, wavelet transforms of an image and of all its possible shifted versions are computed. Full overlapping improves denoising performance and diminishes blocking artifacts compared to filtering performed in non-overlapping blocks. Note also that, in the case of fully overlapping blocks, DCT-based filtering efficiency is close to that of state-of-the-art filters [23] such as BM3D [37]. This is one more reason why the DCT-based filtering was selected for our analysis.

Usually, the value of β recommended in hard thresholding is 2.6 [36] although one can find slightly different recommendations [38] which we will discuss later.

Then, for the spatially uncorrelated signal-dependent noise models (1) and (2), one gets

$$T(n, m) = 2.6 \sqrt{\sigma_{si}^2 + k \bar{I}_{nm}} \tag{5}$$

and

$$T(n, m) = 2.6 \sqrt{\sigma_{si}^2 + k \bar{I}_{nm}^2}, \tag{6}$$

respectively. In practice, if $\sigma_{si}^2$ and $k$ are estimated, the obtained estimates of these parameters are to be used in (5) and (6).
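The filtering procedure described above (8 × 8 blocks, hard thresholding of all coefficients except DC, full overlapping, averaging of the multiple estimates) can be sketched as follows. This is a plain-Python illustration rather than the implementation used in the experiments: it is slow, all names are hypothetical, and it assumes an orthonormal DCT-II (so that noise variance is preserved in the DCT domain) and a threshold equal to β times the local noise standard deviation.

```python
import math

BLOCK = 8

# Orthonormal 8-point DCT-II matrix: C[q][x] = a(q) * cos(pi * (2x + 1) * q / 16).
C = [[(math.sqrt(1.0 / BLOCK) if q == 0 else math.sqrt(2.0 / BLOCK))
      * math.cos(math.pi * (2 * x + 1) * q / (2 * BLOCK))
      for x in range(BLOCK)] for q in range(BLOCK)]
CT = [list(col) for col in zip(*C)]


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def block_threshold(local_mean, sigma_si2, k, model=1, beta=2.6):
    """beta times the local noise standard deviation for model (1) or (2)."""
    power = 1 if model == 1 else 2
    return beta * math.sqrt(max(sigma_si2 + k * local_mean ** power, 0.0))


def dct_filter(noisy, sigma_si2, k, model=1, beta=2.6):
    """Hard-threshold DCT filter with fully overlapping 8 x 8 blocks.

    `noisy` is a list of rows with at least 8 rows and 8 columns.
    """
    rows, cols = len(noisy), len(noisy[0])
    acc = [[0.0] * cols for _ in range(rows)]   # sum of block estimates
    hits = [[0] * cols for _ in range(rows)]    # number of estimates per pixel
    for n in range(rows - BLOCK + 1):
        for m in range(cols - BLOCK + 1):
            block = [noisy[n + i][m:m + BLOCK] for i in range(BLOCK)]
            local_mean = sum(map(sum, block)) / BLOCK ** 2
            t = block_threshold(local_mean, sigma_si2, k, model, beta)
            coeffs = matmul(matmul(C, block), CT)       # forward 2D DCT
            for q in range(BLOCK):
                for s in range(BLOCK):
                    if (q or s) and abs(coeffs[q][s]) < t:
                        coeffs[q][s] = 0.0              # DC (q = s = 0) is kept
            restored = matmul(matmul(CT, coeffs), C)    # inverse 2D DCT
            for i in range(BLOCK):
                for j in range(BLOCK):
                    acc[n + i][m + j] += restored[i][j]
                    hits[n + i][m + j] += 1
    return [[acc[i][j] / hits[i][j] for j in range(cols)] for i in range(rows)]
```

With zero parameter values the threshold is zero, no coefficient is removed, and the output coincides with the input up to rounding errors.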

A traditional approach to filter efficiency characterization and comparison to other filters is to calculate and analyze output MSE or PSNR. In this paper, we mainly consider R component of RGB color image. Results for other color components are in good agreement with those for the R component denoising (this will be shown by particular examples).

Since we deal with color images (and their components) that are usually intended for visual inspection, visual quality of processed images is important. Thus, alongside the conventional output PSNR ($\mathrm{PSNR} = 10 \log_{10}(255^2/\mathrm{MSE}_{out})$, where $\mathrm{MSE}_{out}$ denotes MSE after denoising), it is worth using visual quality metrics.
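As a small worked illustration of the PSNR definition above for 8-bit data (peak value 255), consider the following sketch; the function names are hypothetical and images are assumed to be lists of rows of equal size.

```python
import math


def mse(reference, processed):
    """Mean square error between two images given as lists of rows."""
    diffs = [(r - p) ** 2
             for ref_row, proc_row in zip(reference, processed)
             for r, p in zip(ref_row, proc_row)]
    return sum(diffs) / len(diffs)


def psnr(reference, processed, peak=255.0):
    """PSNR = 10 * log10(peak^2 / MSE) in dB; infinite for identical images."""
    error = mse(reference, processed)
    return math.inf if error == 0 else 10.0 * math.log10(peak ** 2 / error)
```

For example, an MSE of 30 corresponds to a PSNR of about 33.4 dB.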

In our analysis, we have used two quality metrics inspired by the human visual system (HVS). The first one is the recently proposed metric PSNR human visual system masking (PSNR-HVS-M) [39], available at [40] and defined as $\text{PSNR-HVS-M} = 10 \log_{10}(255^2/\mathrm{MSE}_{out}^{HVS})$, where $\mathrm{MSE}_{out}^{HVS}$ is a specific MSE determined in the DCT domain that takes into account such peculiarities of the human vision system as different sensitivity to distortions in different spatial frequencies and masking effects. This metric is among the best in characterizing visual quality of images corrupted by noise as well as images with distortions due to filtering and compression [41]. PSNR-HVS-M is intended to assess visual quality of both gray-scale and color images. Similarly to the conventional PSNR, the metric PSNR-HVS-M is expressed in decibels, and larger values correspond to better visual quality. If PSNR-HVS-M exceeds 40 dB, distortions can hardly be noticed [42].

Another widely used HVS metric is multi-scale structural similarity (MSSIM) [43]. It takes into account human’s ability to adapt to a structural similarity at different scales and HVS sensitivity to distortions of luminance and contrast. The metric values vary from 0 (extremely bad quality) to 1 (perfect or ideal quality). As it is seen, this metric has another range of variation where the value 0.99 corresponds to practical invisibility of distortions if they are spread over entire image [42]. While PSNR-HVS-M exploits discrete cosine transform in blocks for its calculation, MSSIM is based on wavelets. Thus, we can expect that if analysis of both metrics leads to drawing similar conclusions, then these conclusions will be well grounded.

Note that performance analysis of filtering efficiency is usually carried out under the assumption that noise characteristics are known in advance (or accurately pre-estimated) and filter parameters are set according to certain recommendations [6, 9, 31]. Recall that, in our case, the thresholds for blocks are set to $T(n, m) = \beta \sqrt{\hat{\sigma}_{si}^2 + \hat{k} \bar{I}_{nm}}$ or $T(n, m) = \beta \sqrt{\hat{\sigma}_{si}^2 + \hat{k} \bar{I}_{nm}^2}$, depending on the considered model, where $\hat{\sigma}_{si}^2$ and $\hat{k}$ are model parameter estimates obtained by some technique. Thus, all three parameters, $\beta$, $\hat{\sigma}_{si}^2$, and $\hat{k}$, can influence filter performance. Besides, image properties also have an impact on the efficiency of filtering.

Assume that statistical characteristics of the noise are not known in advance. Properties of image signal component (for example, its spatial spectrum in DCT or wavelet domain) are also unknown since an observed image is noisy. Thus, to partly simplify the situation, we have, at least, to set the parameter β fixed. Let us demonstrate that setting β equal to 2.6 is a good choice for the considered hard threshold DCT filter with full overlapping of blocks.

To demonstrate this, we have considered several sets (combinations) of the parameters σ si 2 and k for the model (1) to simulate the cases of dominant additive noise and dominant signal-dependent noise with different intensities. Besides, we have carried out tests for 25 color images from the database TID2008 [41]. It contains noise-free images where there are 24 images of natural scenes (Kodak images) and one (the 25th) is artificially created (see Figure 1). This allows simulating noisy images with pre-determined statistical characteristics of the mixed noise described by the considered models (1) and (2) easily.

Figure 1. Noise-free color images in TID2008.

The optimal values of $\beta$ that provide minimal output MSE (maximal output PSNR) for the red component are presented for three sets of the parameters $\sigma_{si}^2$ and $k$ for the model (1) in Figure 2a (the corresponding dependences for other color components are very similar). The model parameters are set so that either additive noise is prevailing ($k = 0.2$, $\sigma_{si}^2 = 50$), or signal-dependent noise is dominant ($k = 1$, $\sigma_{si}^2 = 30$), or the impact of both noise components is comparable ($k = 0.2$, $\sigma_{si}^2 = 10$).

Figure 2. Dependences of optimal $\beta$ on image index in the TID2008 database for three sets of mixed noise parameters according to the metrics PSNR (a) and MSSIM (b).

The observed tendencies are the following. First, the average value of β is really about 2.6 for all three sets of the noise parameters. Second, there are complex structure images (e.g., the test images #1 and #13) for which optimal β are slightly smaller than 2.6. There are also simple structure images (#3, 7, 20, 23, 25) for which the optimal β can be slightly larger than 2.6, especially if noise is quite intensive (e.g., k = 1 and σ si 2 = 30).

Dependences for the metric MSSIM (Figure 2b) are very similar. The only difference is that optimal values of β are by about 3% smaller than for the corresponding cases in Figure 2a. This tendency has been earlier observed in [38]. It can be explained by the fact that by setting a slightly smaller β one provides better edge/detail/texture preservation while noise suppression in homogeneous image regions becomes worse. This is just the case when a filtered image is perceived as having better visual quality. Dependences of optimal β for another HVS metric, PSNR-HVS-M (not presented in the paper), are similar to those ones for PSNR. The difference is again in smaller optimal β similarly to MSSIM.

For more detailed analysis, we have further concentrated on two color images from the database TID2008 [41], namely the test image #3 (one of the simplest) and the test image #13 (the most complex one) (see Figure 1) since, according to our previous experience [17, 18], just these marginal cases determine basic requirements. The test image #25 has not been chosen since it is artificial and we are more interested in enhancing natural scene images. The test image #20 has not been used for the analysis since clipping (overexposure) effects occur for it in bright upper region that corresponds to sky.

Our idea is that a joint analysis of filtering efficiency for these two images carried out for different noise parameters’ sets can produce initial insight on basic requirements to an accuracy of blind estimation of noise parameters σ si 2 and k. We assume that these requirements will be correct for the majority of real-life images. At the end of the paper, we will check validity of these requirements for two extreme cases and for entire database of images corrupted by mixed noise with different sets of parameters.

Thus, we assume that the DCT-based filter parameter $\beta$ is set equal to 2.6 but the estimates of $\sigma_{si}^2$ and $k$ obtained by some technique and then used in filtering can be erroneous. To characterize these estimates, let us use the parameters

$$\delta_V = \frac{\hat{\sigma}_{si}^2 - \sigma_{si}^2}{\sigma_{si}^2} \quad \text{and} \quad \delta_k = \frac{\hat{k} - k}{k}, \tag{7}$$

where $\hat{\sigma}_{si}^2$ and $\hat{k}$ are estimates of $\sigma_{si}^2$ and $k$, respectively. Then, $\delta_V = 0$ and $\delta_k = 0$ correspond to true values of $\sigma_{si}^2$ and $k$; $\delta_V = -1$ relates to the assumption that the additive noise component is absent. Similarly, $\delta_k = -1$ corresponds to the assumption that the signal-dependent noise component is absent and the present noise is pure additive. The case $\delta_V = -1$ and $\delta_k = -1$ means that both estimates are zero, so all thresholds are zero and no filtering is applied, i.e., it corresponds to the noisy image.

In our experiments, we have analyzed $\delta_V$ and $\delta_k$ ranging from -1 to +1 (from -100% to +100%); that is, for $\delta_V = 1$ and/or $\delta_k = 1$, it is assumed that the estimates $\hat{\sigma}_{si}^2$ and $\hat{k}$ are twice as large as (100% larger than) the true values of these parameters. According to our experience [22], practical estimates are rarely outside these limits.
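The relation (7) between the relative errors and the parameter estimates can be illustrated by the short sketch below (function names are hypothetical). The second function inverts (7) and is convenient for generating erroneous estimates on a grid of $\delta_V$ and $\delta_k$.

```python
def relative_errors(sigma_si2_est, k_est, sigma_si2, k):
    """Relative estimation errors (delta_V, delta_k) according to (7)."""
    return (sigma_si2_est - sigma_si2) / sigma_si2, (k_est - k) / k


def estimates_from_errors(delta_v, delta_k, sigma_si2, k):
    """Estimates that correspond to given relative errors (inverse of (7))."""
    return sigma_si2 * (1.0 + delta_v), k * (1.0 + delta_k)
```

For instance, for true values $\sigma_{si}^2$ = 50 and k = 0.2, the errors $\delta_V$ = 0.4 and $\delta_k$ = 0.3 correspond to the estimates 70 and 0.26, while $\delta_V$ = $\delta_k$ = -1 gives zero estimates.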

An important question is what is considerable (essential) impact of errors in parameter estimation on filtering efficiency? Our proposition is to consider that impact is essential if PSNR or PSNR-HVS-M reduction is more than 0.5 dB compared to PSNR or PSNR-HVS-M observed for δ V  = 0 and δ k  = 0, i.e., for perfect (recommended) settings. The value 0.5 dB is selected since such a difference in output PSNR or PSNR-HVS-M is noticeable if filtered images are visually inspected together (compared). This statement follows from our experience in creation and exploitation of the database TID2008 [41] (e.g., the difference equal to 3 dB for a given filtered image is easily recognized by any observer). To partly prove this, Figure 3 presents a fragment of the noise-free color image #13 (Figure 3a), its noisy version for the red components (model (1), σ si 2 = 50 and k = 0.2, Figure 3b), the filtered image fragment under assumption that one has accurate estimates of the noise parameters (δ V  = 0 and δ k  = 0, Figure 3c) and the same fragment in the case of erroneous estimates (δ V  = 0.4 and δ k  = 0.3, i.e., noise parameters are both overestimated, Figure 3d). For the latter two images, the values of all three metrics are presented. Reduction of PSNR and PSNR-HVS-M for the image in Figure 3d compared to the image in Figure 3c is about 0.5 dB. Due to overestimation of both parameters of the mixed noise, oversmoothing is observed. It mainly appears itself in smearing low contrast texture (bushes) in the central part of the picture in Figure 3d. The reduction of the metric MSSIM is observed as well. For the case considered in Figure 3, reduction is equal to 0.004.

Figure 3. The fragment of noise-free test image #13 (a), its noisy ($k = 0.2$; $\sigma_{si}^2 = 50$) version for the red component (b), and filtered results for accurate estimates ($\delta_V = 0$; $\delta_k = 0$) (c) and for erroneous estimates ($\delta_V = 0.4$; $\delta_k = 0.3$) (d) of the noise parameters.

Besides, Figure 4 represents images for another case. The simple structure image #3 (its noise-free color version is presented in Figure 4a) is corrupted by rather intensive signal-dependent noise with σ si 2 = 30 and k = 1.0 (Figure 4b, red component). Noise is well visible, especially in image homogeneous regions. Figure 4c presents the denoised image under assumption that mixed noise parameters are estimated absolutely accurately. Meanwhile, Figure 4d represents the output image obtained in the case of underestimation of mixed noise parameters. As it is seen, underestimation leads to residual noise that appears itself clearly in image homogeneous regions. Again, reduction of PSNR and PSNR-HVS-M is close to 0.5 dB while decrease of MSSIM is close to 0.005.

Figure 4. The fragment of noise-free test image #03 (a), its noisy ($k = 1.0$; $\sigma_{si}^2 = 30$) version for the red component (b), and filtered results for accurate estimates ($\delta_V = 0$; $\delta_k = 0$) (c) and for erroneous estimates ($\delta_V = -0.5$; $\delta_k = -0.2$) (d) of the noise parameters.

Thus, both underestimation and overestimation of the mixed noise parameters are undesirable. Underestimation is crucial for simple structure images and overestimation is undesired for complex structure images.

We have checked other test images and other noise parameter sets. Analysis carried out for all images in TID2008 has shown that reduction of PSNR and PSNR-HVS-M by about 0.5 dB approximately corresponds to MSSIM reduction by 0.005. Note that all three dependences are nonlinear and there is no strict relationship between them. The only observation is that if PSNR-HVS-M decreases, MSSIM usually diminishes as well and vice versa.

Thus, our approach to analysis is the following. The first task is to determine, for each considered image and each set of mixed noise parameters, the 2D areas of $\delta_V$ and $\delta_k$ where the reduction is smaller than 0.5 dB according to the metrics PSNR and PSNR-HVS-M or smaller than 0.005 according to MSSIM. It is assumed that if the mixed noise parameters' estimates are within these areas, the estimation errors do not essentially influence the filtering efficiency.
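The first stage can be sketched as follows, assuming that a metric has already been evaluated on a grid of ($\delta_V$, $\delta_k$) pairs that includes the point (0, 0). The dictionary layout and the function name are assumptions made for this example; any metric values used with it are hypothetical.

```python
def acceptable_area(metric_grid, tolerance):
    """Grid points where the metric reduction relative to (0, 0) is below `tolerance`.

    `metric_grid` maps (delta_v, delta_k) pairs to metric values
    (PSNR or PSNR-HVS-M in dB with tolerance 0.5, MSSIM with tolerance 0.005).
    """
    reference = metric_grid[(0.0, 0.0)]
    return {point for point, value in metric_grid.items()
            if reference - value < tolerance}
```

Points where the metric is larger than at (0, 0) are also accepted, since the reduction is then negative.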

Note that, in addition to the three aforementioned sets of mixed noise parameters for the model (1), we consider below one set (combination) of $\sigma_{si}^2$ and $k$ for the model (2) to simulate the real-life situation in which the multiplicative noise is dominant. Then, at the second stage, the obtained areas have to be aggregated by the AND rule to provide final requirements to estimation accuracy under the assumption that neither noise statistics nor image properties are available in advance.
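If each acceptable area is represented as a set of ($\delta_V$, $\delta_k$) grid points, aggregation by the AND rule reduces to a set intersection. The following sketch assumes this set representation (an assumption of the example, with a hypothetical function name) and the same grid for all images, metrics, and noise parameter sets.

```python
def aggregate_by_and(areas):
    """Keep only the grid points that are acceptable in every individual area."""
    areas = [set(area) for area in areas]
    return set.intersection(*areas) if areas else set()
```

A point survives the aggregation only if the metric reduction is acceptable for all considered images, metrics, and noise cases simultaneously.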

4. Analysis of results

The obtained dependences of output PSNR on $\delta_V$ and $\delta_k$ are presented as red 2D surfaces (see Figure 5). The black horizontal surface corresponds to the level PSNR($\delta_V = 0$, $\delta_k = 0$) - 0.5 dB (and, in the plots for PSNR-HVS-M, to the level PSNR-HVS-M($\delta_V = 0$, $\delta_k = 0$) - 0.5 dB). Thus, it is easy to see the area of $\delta_V$ and $\delta_k$ in which filtering efficiency is acceptable for each particular case (where the red surface lies above the black one).

Figure 5. Dependences of PSNR on $\delta_V$ and $\delta_k$ for the test image #3 for different noise cases: $k = 0.2$, $\sigma_{si}^2 = 10$ (a); $k = 0.2$, $\sigma_{si}^2 = 50$ (b); and $k = 1$, $\sigma_{si}^2 = 30$ (c).

Three combinations of $k$ and $\sigma_{si}^2$ are considered, namely, $k = 0.2$, $\sigma_{si}^2 = 10$; $k = 0.2$, $\sigma_{si}^2 = 50$; and $k = 1$, $\sigma_{si}^2 = 30$. For each combination, Table 1 presents the following data: $\mathrm{MSE}_{inp}$, to discriminate the cases of dominant signal-dependent or signal-independent noise; PSNR($\delta_V = -1$, $\delta_k = -1$), i.e., for the original noisy image; PSNR($\delta_V = 0$, $\delta_k = 0$), i.e., for the image filtered with the recommended parameter under the condition of absolutely accurate estimation of $\sigma_{si}^2$ and $k$; maxPSNR, i.e., the maximal attained value; and the values $\delta_V^{max}$, $\delta_k^{max}$ for which maxPSNR has been provided.

Table 1 Simulation data for the test image #3, noise model (1), PSNR metric

As can be seen, for two cases ($k = 0.2$, $\sigma_{si}^2 = 10$ and $k = 1$, $\sigma_{si}^2 = 30$), the SD noise component is dominant ($\mathrm{MSE}_{inp} > 2\sigma_{si}^2$). For the case $k = 0.2$, $\sigma_{si}^2 = 50$, the SI noise component is dominant. The first observation is that if noise is more intensive (compare the case $k = 1$, $\sigma_{si}^2 = 30$ to the case $k = 0.2$, $\sigma_{si}^2 = 10$), the DCT-based filtering is more efficient (PSNR($\delta_V = 0$, $\delta_k = 0$) differs more from the corresponding PSNR($\delta_V = -1$, $\delta_k = -1$)). In particular, PSNR($\delta_V = 0$, $\delta_k = 0$) is larger by about 8 dB than PSNR($\delta_V = -1$, $\delta_k = -1$) for the case $k = 1$, $\sigma_{si}^2 = 30$.

If the SD noise component is dominant (see the plots in Figure 5a, c), it does not matter much how accurate the estimate $\hat{\sigma}_{si}^2$ is. It is considerably more important how accurate the estimate of the SD noise parameter $k$ is. If $\hat{\sigma}_{si}^2$ is quite accurate (let us say, $|\delta_V| \le 0.5$), then $|\delta_k|$ should be smaller than about 0.4. Thus, if the SD noise component is dominant, the requirement to accuracy of $\hat{k}$ is stricter than the requirement to accuracy of $\hat{\sigma}_{si}^2$. This manifests itself in the fact that the red surface 'strip' that is 'over' the corresponding threshold (black surface) is oriented more parallel to the axis $\delta_V$.

Another situation is observed if the SI noise component is dominant (see the plot in Figure 5b). Then the accuracy of estimating the parameter $k$ is less important, but the requirement to estimation of $\sigma_{si}^2$ is stricter. In fact, for the considered case, it is desirable to provide $|\delta_V|$ less than 0.3…0.4. The red surface 'strip' is oriented more parallel to the axis $\delta_k$. Thus, the initial conclusion is quite trivial: it is necessary to estimate more accurately the parameter of the noise component that is dominant.

If one parameter is overestimated, then it is desirable to have the other parameter underestimated to provide rather efficient filtering (see the values $\delta_V^{max}$ and $\delta_k^{max}$ in Table 1). It is better if $\hat{\sigma}_{si}^2 > \sigma_{si}^2$ ($\delta_V > 0$) and $\hat{k} < k$. The worst case is if both parameters, $\sigma_{si}^2$ and $k$, are underestimated. Then, undersmoothing is observed (look at data for $\delta_V \to -1$, $\delta_k \to -1$) and filtering efficiency is far from optimal (attainable). One more interesting observation that follows from analysis of data in Table 1 is that maxPSNR is only slightly (by 0.01…0.05 dB) larger than PSNR($\delta_V = 0$, $\delta_k = 0$). This shows that the practical recommendation (5) works well enough.

Consider now the dependences for the metric PSNR-HVS-M presented in Figure 6. The corresponding data are collected in Table 2. For each combination, Table 2 presents the values of PSNR-HVS-M($\delta_V = -1$, $\delta_k = -1$) for original noisy images; PSNR-HVS-M($\delta_V = 0$, $\delta_k = 0$), i.e., for the images filtered with the recommended parameter under the condition of absolutely accurate estimation of $\sigma_{si}^2$ and $k$; maxPSNR-HVS-M, i.e., the maximal reachable value; and the values $\delta_V^{maxvis}$, $\delta_k^{maxvis}$ for which maxPSNR-HVS-M has been attained.

Figure 6. Dependences of PSNR-HVS-M on $\delta_V$ and $\delta_k$ for the test image #3 for different noise cases: $k = 0.2$, $\sigma_{si}^2 = 10$ (a); $k = 0.2$, $\sigma_{si}^2 = 50$ (b); and $k = 1$, $\sigma_{si}^2 = 30$ (c).

Table 2 Simulation data for the test image #3, noise model (1), PSNR-HVS-M metric

First, let us stress that the values PSNR-HVS-M($\delta_V = -1$, $\delta_k = -1$) are larger than the corresponding PSNR($\delta_V = -1$, $\delta_k = -1$). This is due to the masking effects [39]. As for the metric PSNR, PSNR-HVS-M increases for the recommended setting of the DCT filter parameter $\beta$ (compare PSNR-HVS-M($\delta_V = 0$, $\delta_k = 0$) to PSNR-HVS-M($\delta_V = -1$, $\delta_k = -1$)). Visual quality improvement due to filtering is essential: from about 4 dB for non-intensive noise ($k = 0.2$, $\sigma_{si}^2 = 10$), it reaches 5.5 dB for the case $k = 1$, $\sigma_{si}^2 = 30$. Note that PSNR-HVS-M($\delta_V = 0$, $\delta_k = 0$) for $k = 0.2$, $\sigma_{si}^2 = 10$ exceeds 40 dB, i.e., residual distortions in the filtered image are almost invisible [35].

The observed values maxPSNR-HVS-M practically do not differ from the corresponding PSNR-HVS-M($\delta_V = 0$, $\delta_k = 0$). Interestingly, all $\delta_V^{maxvis}$ are larger than 0 while all $\delta_k^{maxvis}$ are smaller than 0. This means that it is less risky to overestimate SI noise variance and underestimate the parameter $k$ than to fall into other possible situations. Meanwhile, all $\delta_V^{maxvis}$ (Table 2) are smaller than the corresponding $\delta_V^{max}$ (see Table 1). This shows that, to provide higher visual quality, it is undesirable to have considerable overestimation of mixed noise parameters. In some sense, it is equivalent to the recommendation to have $\beta$ slightly smaller than 2.6 in threshold setting (5) to guarantee good visual quality of the filtered image [38].

Figure 7 shows dependences of the metric MSSIM on δ V and δ k for the test image #3. The conclusions that can be drawn from their analysis are similar to those presented above. Overestimation of mixed noise parameters is less risky than underestimation. It is more important to correctly estimate the parameter of mixed noise that corresponds to the dominant component of the noise.

Figure 7. Dependences of MSSIM on $\delta_V$ and $\delta_k$ for the R component of the test image #3 for different noise cases: $k = 0.2$, $\sigma_{si}^2 = 10$ (a); $k = 0.2$, $\sigma_{si}^2 = 50$ (b); and $k = 1$, $\sigma_{si}^2 = 30$ (c).

Consider now the results for the test image #13 (Figures 1 or 3a). Again, we present data only for R component processed but the dependences for other color components are very similar. The dependences obtained for PSNR and PSNR-HVS-M are represented in Figures 8 and 9, respectively. Particular data are collected in Tables 3 and 4. Note that the cases k = 0.2, σ si 2 = 10 and k = 1; σ si 2 = 30, as earlier, correspond to the dominant SD noise component while the SI noise component is prevailing if k = 0.2; σ si 2 = 50.

Figure 8. Dependences of PSNR on δ V and δ k for the R component of the test image #13 for different noise cases: k = 0.2; σ si 2 = 10 (a), k = 0.2; σ si 2 = 50 (b), and k = 1; σ si 2 = 30 (c).

Figure 9. Dependences of PSNR-HVS-M on δ V and δ k for the R component of the test image #13 for different noise cases: k = 0.2; σ si 2 = 10 (a), k = 0.2; σ si 2 = 50 (b), and k = 1; σ si 2 = 30 (c).

Table 3. Simulation data for the test image #13, noise model (1), PSNR metric
Table 4. Simulation data for the test image #13, noise model (1), PSNR-HVS-M metric

At first glance, the dependences in Figures 8 and 9 are quite similar to the corresponding dependences in Figures 5 and 6. However, there are several distinctive differences. One of them is that filtering of the test image #13 is considerably less efficient than filtering of the test image #3. Only for the most intensive noise (k = 1; σ si 2 = 30) does the difference between PSNR(δ V = 0, δ k = 0) and PSNR for the original image (PSNR(δ V = -1, δ k = -1)) reach 2 dB. This difference is only 0.83 dB for the case k = 0.2; σ si 2 = 10. The value maxPSNR is 0.05 to 0.13 dB larger than the corresponding PSNR(δ V = 0, δ k = 0), and this takes place if δV max < 0, i.e., if the SI noise variance is underestimated and the parameter k is estimated properly (without error).

Overestimation of both parameters is highly undesirable since it reduces filtering efficiency and leads to image oversmoothing (see the data for δ V → 1, δ k → 1, where the dependences PSNR(δ V , δ k ) decrease rapidly as δ V and δ k increase).

Analysis of the plots in Figure 9 and the data in Table 4 leads to the same conclusions. Note that in the case k = 0.2; σ si 2 = 10, the PSNR-HVS-M increase due to filtering is only 0.28 dB, i.e., it is practically invisible. Only for k = 1; σ si 2 = 30 is the value PSNR-HVS-M(δ V = 0, δ k = 0) 0.71 dB larger than for the original image, i.e., there is a small noticeable improvement of visual quality. This means that if filtering is applied to improve the visual quality of highly textural images based on mixed noise parameters estimated in a blind manner, then these estimates should not be substantially larger than the true values of these parameters (i.e., in fact, it is desirable to have δ V ≤ 0, δ k ≤ 0).

Figure 10 presents the results obtained for the metric MSSIM. Their analysis also shows that overestimation of mixed noise parameters is highly undesirable for this (highly textural) test image. The parameter that relates to the dominant component of the mixed noise has to be estimated more accurately.

Figure 10. Dependences of MSSIM on δ V and δ k for the R component of the test image #13 for different noise cases: k = 0.2; σ si 2 = 10 (a), k = 0.2; σ si 2 = 50 (b), and k = 1; σ si 2 = 30 (c).

Let us now aggregate the results obtained for three noise parameter combinations and two test images of sufficiently different complexity. For this purpose, we have marked the 2D areas in which the mixed noise parameter estimates are acceptable (see Figure 11 and the description of the notations used) and determined a joint acceptable area (where all areas overlap). This area is indicated by a solid (blue) line and is located in the central part (the horizontal axis corresponds to δ V and the vertical axis to δ k ). Although all particular areas are rather large, their intersection is, naturally, smaller. However, it is still quite large. In the worst case, |δ k | = 0.2 and |δ V | = 0.3 (see Figure 11a). The conclusions that can be drawn from analysis of the area according to the metric PSNR-HVS-M (Figure 11b) are practically the same. The intersection area obtained for the metric MSSIM (Figure 11c) almost coincides with that for the metric PSNR.

Figure 11. Areas of acceptable δ V and δ k for different noise cases with marked boundaries for appropriate estimation accuracy: (a) PSNR, (b) PSNR-HVS-M, and (c) MSSIM for the red color component; (d) PSNR for the green color component. Notations used: black color for image #3, red color for image #13, solid line for the case k = 1; σ si 2 = 30; dashed line for the case k = 0.2; σ si 2 = 10; dashed-dotted line for the case k = 0.2; σ si 2 = 50.

To show that the results obtained for other color components are similar to the results obtained for the red component, Figure 11d presents intersection areas for the green component according to the metric PSNR. Comparison of the obtained acceptable areas to the corresponding areas presented in Figure 11a shows that there is no essential difference.
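The construction of a joint acceptable area can be sketched as follows. This is a minimal illustration, assuming that a grid point (δ V , δ k ) is acceptable when the metric drops by no more than 0.5 dB relative to its value at δ V = 0, δ k = 0 (the tolerance quoted later in the paper for PSNR and PSNR-HVS-M). The function names and the synthetic surfaces are hypothetical and are not the simulation data of this paper.

```python
import numpy as np

MAX_DROP_DB = 0.5  # tolerated drop from the error-free value (assumed criterion)


def acceptable_area(metric, delta_v, delta_k, max_drop=MAX_DROP_DB):
    """Boolean mask of grid points whose metric drop is within max_drop.

    metric[i, j] is the metric value at (delta_k[i], delta_v[j]); the
    reference is the value at the grid point closest to (0, 0).
    """
    i0 = int(np.argmin(np.abs(delta_k)))
    j0 = int(np.argmin(np.abs(delta_v)))
    return (metric[i0, j0] - metric) <= max_drop


def joint_area(metrics, delta_v, delta_k, max_drop=MAX_DROP_DB):
    """Intersection of the acceptable areas of several noise cases/images."""
    masks = [acceptable_area(m, delta_v, delta_k, max_drop) for m in metrics]
    return np.logical_and.reduce(masks)


# Synthetic surfaces only (NOT the paper's data): smooth bowls around (0, 0).
delta_v = np.linspace(-1.0, 1.0, 21)
delta_k = np.linspace(-1.0, 1.0, 21)
dv, dk = np.meshgrid(delta_v, delta_k)  # rows follow delta_k, columns delta_v
surfaces = [30.0 - 4.0 * dv**2 - 9.0 * dk**2,
            27.0 - 8.0 * dv**2 - 3.0 * dk**2]
joint = joint_area(surfaces, delta_v, delta_k)
print("grid points in the joint acceptable area:", int(joint.sum()))
```

Because the joint area is a logical AND of the individual masks, it can never be larger than any single acceptable area, which matches the observation that the intersection is smaller than the particular areas.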

Let us now consider the data for the model (2). Only one case has been simulated: k = 0.01, σ si 2 = 20. The dependences of PSNR and PSNR-HVS-M on δ V and δ k for the R component of the test image #3 are presented in Figure 12. The data are collected in Tables 5 and 6.

Figure 12. Dependences of PSNR (a) and PSNR-HVS-M (b) on δ V and δ k for the test image #3 with noise simulated according to model (2) for k = 0.01 and σ si 2 = 20.

Table 5. Simulation data for the test images #3 and #13, noise model (2), PSNR metric
Table 6. Simulation data for the test images #3 and #13, noise model (2), PSNR-HVS-M metric

As can be seen, the multiplicative noise is dominant. Because of this, the parameter k has to be estimated with higher accuracy than the parameter σ si 2 . The PSNR and PSNR-HVS-M improvement due to filtering is quite large: about 8 dB for PSNR and about 5 dB for PSNR-HVS-M. Similarly, Figure 13 presents the dependences of PSNR and PSNR-HVS-M on δ V and δ k for the R component of the test image #13.

Figure 13. Dependences of PSNR (a) and PSNR-HVS-M (b) on δ V and δ k for the test image #13 with noise simulated according to model (2) for k = 0.01 and σ si 2 = 20.

The data are given in Tables 5 and 6, respectively. In this case, the PSNR and PSNR-HVS-M improvement for accurately estimated k and σ si 2 (see the data for δ V = 0, δ k = 0) is considerably smaller than for the test image #3. The accuracy requirement for estimating the parameter k is stricter than that for the parameter σ si 2 , although estimation errors with |δ k | ≤ 0.2 are still acceptable. Overestimation of the parameter k is especially undesirable.

A shortcoming of the analysis carried out above is that only two test images (although with essentially different properties) and only four sets of mixed noise parameters in total (although quite different) have been considered. Thus, let us also study two extreme cases. According to the previous analysis, the strictest restrictions are observed for the following cases: (a) a complex structure image corrupted by non-intensive noise with obvious dominance of one component, for example, the additive one; (b) a simple structure image corrupted by intensive noise with one prevailing component, for example, the signal-dependent one of the model (2).

For the case (a), we have used the test image #1 from the database TID2008 (Figure 1). The model (1) noise has been simulated with k = 0.01 and σ si 2 = 25. These settings relate to, e.g., the practical case of noisy (junk) component images acquired by old generation hyperspectral sensors [8]. For the case (b), the test image #23 from TID2008 (Figure 1), which is one of the simplest, has been used. The model (2) noise has been generated with k = 0.1 and σ si 2 = 10. Such noise can be observed in images formed by multi-look synthetic aperture radars.

The dependences of PSNR, PSNR-HVS-M, and MSSIM on δ V and δ k obtained for the case (a) are presented in Figure 14. Clearly, σ si 2 has to be estimated with high accuracy, and its overestimation is strongly undesirable. Quality improvement due to filtering is small even for the best settings.

Figure 14. Dependences of PSNR (a), PSNR-HVS-M (b), and MSSIM (c) on δ V and δ k for the test image #1 with noise simulated according to model (1) for k = 0.01 and σ si 2 = 25.

The dependences for the case (b) are shown in Figure 15. It is seen that the parameter k has to be estimated correctly. Its underestimation can lead to considerable undersmoothing and is undesirable. In both extreme cases, it is still sufficient to estimate the dominant noise parameter with a relative error not exceeding 0.2.

Figure 15. Dependences of PSNR (a), PSNR-HVS-M (b), and MSSIM (c) on δ V and δ k for the test image #23 with noise simulated according to model (2) for k = 0.1 and σ si 2 = 10.

Finally, we have obtained intersection areas for two metrics (PSNR and PSNR-HVS-M) for all images (three components for each image) and all considered sets of signal-dependent noise models. These areas are shown in Figure 16. The first observation is that these acceptable areas are only slightly smaller than those presented earlier in Figure 11. The conclusions that follow from the analysis of these areas are the following. First, the absolute values of both errors δ V and δ k should not, in general, be larger than 0.2. Second, as an exceptional situation, the absolute value of one of these errors can be slightly larger than 0.2 (but in this case, the other error should have the opposite sign). This property manifests itself in the non-circular (quasi-elliptical) shape of the acceptable areas. Third, the acceptable area for the metric PSNR-HVS-M is slightly shifted with respect to the acceptable area for the metric PSNR toward smaller values of both errors. In fact, this means that overestimation of mixed noise parameters is riskier for providing high visual quality of filtered images than it is in conventional analysis of denoising efficiency in terms of output MSE or PSNR.

Figure 16. Areas of acceptable δ V and δ k for two metrics, (a) PSNR and (b) PSNR-HVS-M, for all TID2008 database images (three components for each image) and all considered sets of signal-dependent noise models.

Clearly, the approach followed in our analysis is not the only one possible. The tolerance of filtering techniques to ambiguity or inaccuracy of the available a priori information can also be studied in other ways. One of them is to simulate estimates of the noise parameters and to use them instead of the true values in denoising, with statistical assessment of filtering efficiency. Following this approach, we have assumed unbiased estimation of both parameters for the noise model (1), with the estimates modeled as $\hat{k} = k + \Delta k$, $\hat{\sigma}_{si}^{2} = \sigma_{si}^{2} + \Delta\sigma_{si}^{2}$, where $\Delta k$ and $\Delta\sigma_{si}^{2}$ are mutually independent zero-mean random variables assumed to be Gaussian with standard deviations δ k and δ v , respectively. Then, by varying δ k and δ v , it is possible to simulate erroneous estimation for a set of realizations, to determine the values of the filtering criteria (PSNR, PSNR-HVS-M, MSSIM) for each realization, and to process them statistically, obtaining the mean and standard deviation of each considered metric as well as confidence intervals.

Note that erroneous estimation of noise parameters leads to degradation of all metrics (reduction of their mean values compared to the corresponding optimal ones). Because of this, we have determined the confidence interval width as MCW = |MO - MM| + 3MS, where MO is the optimal value of a visual quality metric for error-free noise parameter estimates, MM is the mean value of the metric, and MS is its standard deviation.
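The statistical procedure and the MCW definition above can be sketched in a few lines. This is an illustration only, under stated assumptions: the standard deviations of the estimation errors are interpreted as relative (scaled by the true parameter values), the sample standard deviation is computed with `ddof=1`, and `toy_metric` is a hypothetical stand-in for "filter the image with the given estimates and compute PSNR". None of these names or choices is taken from the paper.

```python
import numpy as np


def simulate_estimates(k, var_si, std_k, std_v, n_trials, rng):
    """Draw unbiased Gaussian estimates of k and of the SI noise variance.

    Assumption: std_k and std_v are RELATIVE standard deviations, so the
    errors are scaled by the true parameter values.
    """
    k_hat = k + rng.normal(0.0, std_k * k, n_trials)
    var_hat = var_si + rng.normal(0.0, std_v * var_si, n_trials)
    return k_hat, var_hat


def mcw(metric_opt, metric_samples):
    """MCW = |MO - MM| + 3 * MS for one quality metric."""
    samples = np.asarray(metric_samples, dtype=float)
    return abs(metric_opt - samples.mean()) + 3.0 * samples.std(ddof=1)


def toy_metric(k_hat, var_hat, k, var_si):
    """Hypothetical metric surface with its maximum at the true parameters."""
    rel_k = (k_hat - k) / k
    rel_v = (var_hat - var_si) / var_si
    return 30.0 - 5.0 * rel_k**2 - 3.0 * rel_v**2


rng = np.random.default_rng(0)
k, var_si = 0.2, 10.0
k_hat, var_hat = simulate_estimates(k, var_si, 0.15, 0.15, 1000, rng)
samples = toy_metric(k_hat, var_hat, k, var_si)
metric_opt = toy_metric(k, var_si, k, var_si)  # value for error-free estimates
print("MCW for the toy metric:", mcw(metric_opt, samples))
```

In an actual experiment, `toy_metric` would be replaced by running the denoising filter with each pair of simulated estimates and computing PSNR, PSNR-HVS-M, or MSSIM against the noise-free image.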

Simulation results for the test images #3 and #13 for two sets of parameters for the model (1) are presented in Table 7. Recall that we need MCW to be less than 0.5 dB for the metrics PSNR and PSNR-HVS-M and less than 0.005 for the metric MSSIM. The conditions under which these requirements are satisfied depend on the test image and the noise parameters. Larger values of δ k , δ v result in larger degradations of all metrics. However, the aforementioned requirements are usually satisfied if both δ k , δ v are smaller than 0.15. If these standard deviations are both equal to 0.25, the requirements may not be satisfied. Thus, we come to approximately the same conclusions as in our previous analysis.

Table 7. Simulation data for the test images #3 and #13, noise model (1), all considered metrics

5. Conclusions and future work

The influence of the accuracy of mixed noise parameter estimation on filtering efficiency in image enhancement applications has been studied. It is demonstrated that the parameter that corresponds to the dominant noise type has to be estimated with higher accuracy. This accuracy is characterized by a relative error that should be less than 20% for the dominant noise type and less than 30% for the other noise type. Then, the decrease of filtering efficiency, characterized by the PSNR or PSNR-HVS-M drop compared to the optimum, does not exceed 0.5 dB (the MSSIM drop does not exceed 0.005), and thus, it is practically not noticeable (not crucial). Note that these requirements practically coincide with the requirements on the accuracy of pure additive or pure multiplicative noise variance estimation [17]: an estimate has to differ from the true value by less than 20%. It is also important to stress that even the most advanced modern methods of blind estimation of mixed noise parameters do not always provide the required accuracy of parameter estimation [44].

It is also shown that for highly textural images it is better to underestimate the mixed noise parameters than to overestimate them. Unfortunately, in practice, mixed noise parameters are usually overestimated for complex structure images [22]. This motivates the design of more accurate methods for blind estimation, including analysis of dependences between the estimates of the components of the mixed noise.

Besides, in the future, we plan to concentrate on considering spatially correlated noise which is rather typical in practice. Other filters based on using estimated parameters of mixed noise can be studied as well since restrictions for them can differ from restrictions obtained for the DCT-based denoising.