1 Introduction

Digital images have become an important medium for accessing and transmitting complex information [5, 13, 17]. Owing to the limitations of imaging methods and conditions, as well as outside interference, real images are inevitably contaminated by noise, which makes them harder to interpret and significantly affects subsequent image processing operations [10, 15, 25, 41]. Research on image denoising not only improves the performance of image processing systems but also promotes the development of many related fields. Image denoising therefore has important theoretical and practical value [10, 14, 15, 41].

Image denoising has two main goals: removing noise, and preserving image details such as texture, edges and contrast as far as possible. Traditional image denoising approaches can be roughly divided into two categories [10]: frequency-domain approaches and spatial-domain approaches. In the first category [39], the image is transformed, filtered with an appropriate bandpass filter, and the denoised image is then obtained by the inverse transform. In the second category [28], various smoothing templates are convolved with the image to suppress or eliminate noise. Traditional spatial- and frequency-domain filters, such as mean, median, Butterworth and exponential filters, process a contaminated image as a whole and therefore ignore both the noise distribution and the texture details of the image. Even when they filter out the noise, these filters blur the edges, so it is important to design a denoising approach that preserves edges and local details [18].

Usually, an image contains many kinds of regions, such as edges, textures and flat regions. Traditional image denoising approaches are relatively simple to implement; for example, they achieve denoising mainly by filtering out the high-frequency information of an image. However, both the noise and the structural information of an image are high-frequency information, so it is difficult to preserve the structural information while denoising [14]. To overcome these limitations, many mathematical theories have been applied to image denoising. Image denoising based on partial differential equations (PDEs) is a typical example. Among PDE-based approaches, the total variation (TV) model [11] based on TV regularization, proposed by Rudin et al. [26], has been the focus of many researchers. The TV model has the anisotropic diffusion characteristic of a PDE, so it can not only remove image noise effectively but also protect image edges from being blurred. It therefore strikes a better balance between denoising and detail preservation [30,31,32]. As a classical PDE-based model, the TV model has had an important influence on image denoising. Its advantages are obvious, but it still has the following shortcomings [20, 35,36,37, 40].

  1. (1)

    In the TV model, all pixels of an image are processed with the same operations, so the detail regions of the image are over-smoothed.

  2. (2)

    The TV model smooths the image along the direction orthogonal to the pixel gradient, and the denoising result tends to be piecewise constant, which easily produces the staircase effect in regions where the image intensity changes slowly.

In this paper, a novel iterative image denoising approach based on the TV model and a weighting function is proposed. Specifically, the TV model is applied to transform image denoising into the problem of minimizing an energy function. The gradient magnitude and local variance of each pixel are calculated, their characteristics in the different regions of an image are analyzed, and a weighting function built on these two quantities is introduced. The main contributions of this work are as follows.

  1. (1)

    The mechanism of generating staircase effect by the traditional TV model is analyzed, which provides a theoretical basis for proposing an improved TV model.

  2. (2)

    The weighting function is employed so that different regions of the image are processed differently during image denoising.

  3. (3)

    In order to suppress the staircase effect, an improved TV model based on weighting function is proposed. Its advantages are that it is unsupervised and has lower CPU time.

The remainder of this paper is organized as follows. Section 2 reviews related work on image denoising approaches based on the TV model. Section 3 proposes the improved TV model, covering the traditional TV model, the staircase effect analysis of the TV model, the analysis of variance and gradient, the weighting function, and the improved TV model based on the weighting function. Section 4 describes the experimental setup, including performance evaluation and parameter selection. Section 5 presents the experimental and comparison results. Finally, conclusions are given in Section 6.

2 Related work

The principle of image denoising with the traditional TV model is that the TV of a contaminated image is larger than that of the original image, so image denoising can be transformed into the minimization of an energy function according to the TV optimality criterion. In general, the traditional TV model contains a regularization term, a data fidelity term and a regularization parameter [35]. The regularization term performs the denoising, the data fidelity term measures how closely the denoised image approximates the observed (noisy) image, and the regularization parameter balances the regularization and data fidelity terms. The main steps of the traditional image denoising approach based on the TV model are as follows.

  1. Step 1.

    According to the actual problem of image denoising, an energy function and corresponding constraints are designed.

  2. Step 2.

    Using the variational principle and the gradient descent method, the Euler-Lagrange equation corresponding to the energy function is derived.

  3. Step 3.

    Solve the Euler-Lagrange equation to obtain an approximate solution of the image denoising problem.

Since the traditional TV model has a good denoising effect, many improved image denoising approaches based on it have been proposed [2, 3, 8, 16, 22, 29, 38]. Chen et al. [2] proposed a proximity algorithm for solving the fractional-order TV optimization problem, which provides an effective tool for studying the fractional-order TV denoising model and for implementing the algorithm. Mousavi et al. [16] proposed a TV-based shearlet shrinkage for discontinuity-preserving denoising that combines shearlets with a TV model; two approaches were used for the numerical TV denoising procedure, and both gave very good denoising results. He et al. [8] proposed an improved fractional differential operator for image denoising and further constructed a denoising filter mask based on the G-L fractional derivative; the coefficients of this nonlinear mask do not sum to zero, and the mask can preserve detail features when denoising. Shen et al. [29] proposed a denoising model that combines TV and nonlocal similarity in the wavelet domain in order to suppress noise while keeping distinct edges: TV regularization in the wavelet domain, with a biorthogonal wavelet function, effectively suppresses the noise, and the nonlocal similarity regularization may improve the image details. Ma et al. [22] proposed an image denoising model based on total generalized variation (TGV) regularization, in which a spatially dependent regularization parameter adapts to local image features and further exploits the denoising potential of TGV regularization. Inspired by lp-regularized algorithms and the close connection between TV and the l1 norm, Yan et al. [38] proposed a p-th power type TV, denoted TVp, for 0 ≤ p ≤ 1. Because the TVp-regularized denoising problem is non-convex, the authors handled it with a weighted TV (WTV) minimization that iteratively updates the weights to locally approximate the TVp-regularized problem. Du et al. [3] introduced the minmax-concave TV (MCTV), which strongly induces signal sparsity in the gradient domain. Although MCTV is non-convex, the cost function remains convex when the parameter is chosen in a proper range.

The TV model has good properties. For example, it propagates information only along the edge direction and is anisotropic. It can therefore compensate for the shortcomings of traditional image denoising approaches in diffusing information and better protect image details. However, its disadvantages are also obvious: it easily generates false edges, and the steady-state numerical solution of its Euler equation exhibits a significant staircase effect [38].

3 Improved TV model

3.1 Traditional TV model

Assume that the original image I(x, y) is contaminated by additive noise n(x, y) with zero mean and variance σ². The noise model of the image is

$$ {I}_0\left(x,y\right)=I\left(x,y\right)+n\left(x,y\right),\kern0.5em \left(x,y\right)\in \Omega $$
(1)

where I0(x, y) is the noisy image, and (x, y) is the position of a pixel in the whole image region Ω.

Since the TV of a noisy image is usually significantly larger than that of the original image [26], image denoising can be achieved by minimizing the TV. The minimization problem is

$$ \min {\int}_{\Omega}\sqrt{I_x^2+{I}_y^2} dxdy=\min {\int}_{\Omega}\left|\nabla I\right| dxdy $$
(2)

where Ix and Iy are the gradients of the image in the x and y directions, respectively. The gradient magnitude at position (x, y) is calculated as follows

$$ \left|\nabla I\left(x,y\right)\right|=\sqrt{I_x^2+{I}_y^2}=\sqrt{{\left(\frac{\partial I}{\partial x}\right)}^2+{\left(\frac{\partial I}{\partial y}\right)}^2} $$
(3)

Eq. (2) satisfies the following constraints

$$ {\int}_{\Omega} Idxdy={\int}_{\Omega}{I}_0 dxdy $$
(4)
$$ \frac{1}{\left|\Omega \right|}{\int}_{\Omega}{\left(I-{I}_0\right)}^2 dxdy={\sigma}^2 $$
(5)

where |Ω| is the total number of pixels in the image region Ω. The energy function of the TV model is therefore constructed as follows

$$ \underset{I}{\min }E(I)={\int}_{\Omega}\left|\nabla I\right| dxdy+\frac{\lambda }{2}{\int}_{\Omega}{\left|I-{I}_0\right|}^2 dxdy $$
(6)

where the first term on the right side of Eq. (6) is the regularization term of the energy function, which removes the noise of the image; the second term is the approximation (fidelity) term, which measures how closely the solution I approximates the observed image I0; and λ is the Lagrange multiplier, which controls the balance between the regularization and approximation terms.

According to the Euler-Lagrange equation and gradient descent method [23], the diffusion equation is calculated as follows

$$ \left\{\begin{array}{c}\frac{\partial I\left(x,y,t\right)}{\partial t}=\mathit{\operatorname{div}}\left(\frac{\nabla I}{\left|\nabla I\right|}\right)-\lambda \left(I-{I}_0\right)\\ {}{\left.I\left(x,y,t\right)\right|}_{t=0}={I}_0\left(x,y\right)\kern4em \end{array}\right.,\kern1em t>0 $$
(7)

where, div is the divergence operator. According to previous research results [2, 8, 16], although TV model can protect the edges of the image well, it may generate false edges in the non edge regions of an image, i.e., the staircase effect [33, 37].

3.2 Staircase effect analysis of TV model

In recent years, the TV model has attracted the attention of many researchers and has been widely used in image denoising. However, denoising results based on the TV model easily exhibit a significant staircase effect. In this subsection, the staircase effect is analyzed mathematically in order to improve the TV model.

Let γ and ξ be the unit vectors along the image gradient direction and the direction orthogonal to the gradient, respectively, and let Iγγ and Iξξ be the second-order directional derivatives of the image along γ and ξ, respectively. Then we have

$$ \left\{\begin{array}{c}\gamma =\frac{\nabla I}{\left|\nabla I\right|}=\frac{1}{\sqrt{I_x^2+{I}_y^2}}\left(\begin{array}{c}{I}_x\\ {}{I}_y\end{array}\right)\\ {}\xi =\frac{1}{\sqrt{I_x^2+{I}_y^2}}\left(\begin{array}{c}-{I}_y\\ {}{I}_x\end{array}\right)\kern3.5em \end{array}\right. $$

and

$$ \left\{\begin{array}{c}{I}_{\gamma \gamma}=\frac{I_x^2{I}_{xx}+2{I}_x{I}_y{I}_{xy}+{I}_y^2{I}_{yy}}{I_x^2+{I}_y^2}\\ {}{I}_{\xi \xi}=\frac{I_y^2{I}_{xx}-2{I}_x{I}_y{I}_{xy}+{I}_x^2{I}_{yy}}{I_x^2+{I}_y^2}\end{array}\right.\kern0.5em \Rightarrow {I}_{\gamma \gamma}+{I}_{\xi \xi}={I}_{xx}+{I}_{yy} $$

Decomposing the divergence term in \( \frac{\partial I\left(x,y,t\right)}{\partial t}=\mathit{\operatorname{div}}\left(\frac{\nabla I}{\left|\nabla I\right|}\right)-\lambda \left(I-{I}_0\right) \), we get

$$ \mathit{\operatorname{div}}\left(\frac{\nabla I}{\left|\nabla I\right|}\right)=\frac{1}{\left|\nabla I\right|}\nabla \cdot \left(\nabla I\right)+\nabla I\cdot \nabla \left(\frac{1}{\left|\nabla I\right|}\right)=\frac{I_{xx}+{I}_{yy}}{\left|\nabla I\right|}+\left({I}_x,{I}_y\right){\left(\frac{\partial }{\partial x}\left(\frac{1}{\sqrt{I_x^2+{I}_y^2}}\right),\frac{\partial }{\partial y}\left(\frac{1}{\sqrt{I_x^2+{I}_y^2}}\right)\right)}^T $$
$$ =\frac{I_{xx}+{I}_{yy}}{\left|\nabla I\right|}+\left({I}_x,{I}_y\right){\left(-\frac{I_x{I}_{xx}+{I}_y{I}_{xy}}{{\left|\nabla I\right|}^3},-\frac{I_x{I}_{xy}+{I}_y{I}_{yy}}{{\left|\nabla I\right|}^3}\right)}^T=\frac{I_{xx}+{I}_{yy}}{\left|\nabla I\right|}-\frac{I_x^2{I}_{xx}+2{I}_x{I}_y{I}_{xy}+{I}_y^2{I}_{yy}}{{\left|\nabla I\right|}^3} $$
$$ =\frac{I_{xx}+{I}_{yy}}{\left|\nabla I\right|}-\frac{I_x^2{I}_{xx}+2{I}_x{I}_y{I}_{xy}+{I}_y^2{I}_{yy}}{I_x^2+{I}_y^2}\frac{1}{\left|\nabla I\right|}=\frac{I_{\gamma \gamma}+{I}_{\xi \xi}}{\left|\nabla I\right|}-\frac{I_{\gamma \gamma}}{\left|\nabla I\right|}=\frac{I_{\xi \xi}}{\left|\nabla I\right|}+0{I}_{\gamma \gamma} $$

The above equation shows that the TV model diffuses only along the direction ξ orthogonal to the gradient (i.e., along the edge), while the diffusion coefficient along the gradient direction γ is zero, so the edge information of the image is well preserved. However, the pixels in a flat (non-edge) region of an image have no well-defined gradient direction or orthogonal direction, so false edges, i.e., staircase effects, easily occur in the flat regions of the image.
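As a quick numerical sanity check of the decomposition above, the following minimal Python sketch evaluates both sides for one set of arbitrarily chosen derivative values. The function names and the sample values are illustrative assumptions and not part of the original derivation; Iξξ is taken in the form whose numerator matches the divergence expression in Eq. (17).

```python
import math

def directional_second_derivatives(ix, iy, ixx, ixy, iyy):
    """Return (I_gg, I_xx_tangent): second derivatives along the gradient
    direction gamma and along the orthogonal direction xi."""
    g2 = ix * ix + iy * iy  # squared gradient magnitude, assumed non-zero
    i_gg = (ix * ix * ixx + 2 * ix * iy * ixy + iy * iy * iyy) / g2
    i_tt = (iy * iy * ixx - 2 * ix * iy * ixy + ix * ix * iyy) / g2
    return i_gg, i_tt

def tv_divergence(ix, iy, ixx, ixy, iyy):
    """div(grad I / |grad I|) written out with Cartesian derivatives."""
    g = math.sqrt(ix * ix + iy * iy)
    cross = ix * ix * ixx + 2 * ix * iy * ixy + iy * iy * iyy
    return (ixx + iyy) / g - cross / g ** 3

# Hypothetical sample derivatives, chosen only for the check.
ix, iy, ixx, ixy, iyy = 3.0, 4.0, 1.5, -0.5, 2.0
i_gg, i_tt = directional_second_derivatives(ix, iy, ixx, ixy, iyy)
grad_mag = math.hypot(ix, iy)
divergence = tv_divergence(ix, iy, ixx, ixy, iyy)
```

For these sample values both identities hold up to rounding error: the two directional derivatives sum to Ixx + Iyy, and the divergence equals Iξξ / |∇I|, i.e., the model diffuses only along ξ.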

Fig. 1(a) is an original image. Fig. 1(b) shows the result of denoising it with the TV model after Gaussian noise with zero mean and standard deviation σ = 20 has been added.

Fig. 1 An example of the staircase effect

It can be seen from Fig. 1 that the denoising results using TV model may generate significant staircase effect.

3.3 Analysis of variance and gradient

3.3.1 Local variance

Variance measures the deviation of sample values from the sample mean. In image processing, the local variance, computed over a small neighborhood of each pixel, is often used.

Taking a 3 × 3 neighborhood window as an example, at the k-th iteration of an image I(x, y), the local variance \( {\sigma}_k^2\left(x,y\right) \) at the position (x, y) is calculated as follows

$$ {\sigma}_k^2\left(x,y\right)=\frac{1}{3\times 3}\sum \limits_{i=-1}^{i=1}\sum \limits_{j=-1}^{j=1}{\left({I}_k\left(x+i,y+j\right)-{\overline{I}}_k\left(x,y\right)\right)}^2 $$
(8)

where \( {\overline{I}}_k\left(x,y\right) \) is the mean intensity in the 3 × 3 neighborhood window at the k-th iteration, which is calculated as follows

$$ {\overline{I}}_k\left(x,y\right)=\frac{1}{3\times 3}\sum \limits_{i=-1}^{i=1}\sum \limits_{j=-1}^{j=1}{I}_k\left(x+i,y+j\right) $$
(9)
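Eqs. (8) and (9) can be sketched in a few lines of Python. This is a minimal illustration for an interior pixel only; the function name, the list-of-lists image representation and the two tiny test patches are assumptions made for the example.

```python
def local_mean_and_variance(image, x, y):
    """3x3 local mean (Eq. 9) and local variance (Eq. 8) at an interior
    pixel; `image` is a list of lists indexed as image[x][y]."""
    window = [image[x + i][y + j] for i in (-1, 0, 1) for j in (-1, 0, 1)]
    mean = sum(window) / 9.0
    variance = sum((v - mean) ** 2 for v in window) / 9.0
    return mean, variance

# Hypothetical 3x3 patches: a flat region and a vertical step edge.
flat_patch = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
edge_patch = [[0, 0, 255], [0, 0, 255], [0, 0, 255]]

flat_mean, flat_var = local_mean_and_variance(flat_patch, 1, 1)
edge_mean, edge_var = local_mean_and_variance(edge_patch, 1, 1)
```

The flat patch has zero local variance, whereas the step edge has a large one, which is the contrast between smooth and edge regions that the local variance is meant to capture.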

3.3.2 Gradient

At the k-th iteration of the image I(x, y), the gradient magnitude at the position (x, y) is calculated as follows

$$ {\left|\nabla I\left(x,y\right)\right|}_k=\sqrt{{\left(\frac{\partial I}{\partial x}\right)}_k^2+{\left(\frac{\partial I}{\partial y}\right)}_k^2} $$
(10)

where \( {\left(\frac{\partial I}{\partial x}\right)}_k \) and \( {\left(\frac{\partial I}{\partial y}\right)}_k \) are the gradients in the horizontal and vertical directions at the k-th iteration, respectively. They are calculated by central differences as follows

$$ \left\{\begin{array}{c}{\left(\frac{\partial I}{\partial x}\right)}_k=\frac{I_k\left(x+1,y\right)-{I}_k\left(x-1,y\right)}{2}\\ {}{\left(\frac{\partial I}{\partial y}\right)}_k=\frac{I_k\left(x,y+1\right)-{I}_k\left(x,y-1\right)}{2}\end{array}\right. $$
(11)
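A minimal Python sketch of Eqs. (10) and (11) for an interior pixel follows. The function name and the synthetic ramp image are assumptions for illustration; boundary pixels would need separate handling.

```python
import math

def gradient_magnitude(image, x, y):
    """Gradient magnitude (Eq. 10) at an interior pixel, using the central
    differences of Eq. (11); `image` is indexed as image[x][y]."""
    gx = (image[x + 1][y] - image[x - 1][y]) / 2.0
    gy = (image[x][y + 1] - image[x][y - 1]) / 2.0
    return math.sqrt(gx * gx + gy * gy)

# Hypothetical linear ramp I(x, y) = 3x + 4y, whose exact gradient is (3, 4).
ramp = [[3 * x + 4 * y for y in range(3)] for x in range(3)]
ramp_gradient = gradient_magnitude(ramp, 1, 1)
```

Central differences are exact for a linear ramp, so the sketch recovers the gradient magnitude 5 at the centre pixel.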

We take the Camera image as an example to illustrate the gradient and local variance. In the Camera image, 200 pixels are randomly selected from different regions. Among these 200 pixels, the red, blue, yellow and green points belong to the texture, building, edge and smooth regions of the Camera image, respectively, as shown in Fig. 2.

Fig. 2 Different regions of Camera image

The gradient magnitudes and local variances of 10 pixels randomly selected from each of the red, blue, yellow and green point regions, computed according to Eqs. (10) and (8), are shown in Tables 1, 2, 3 and 4, respectively.

Table 1 Gradient magnitude and local variance values of red point regions in Camera image
Table 2 Gradient magnitude and local variance values of blue point regions in Camera image
Table 3 Gradient magnitude and local variance values of yellow point regions in Camera image
Table 4 Gradient magnitude and local variance values of green point regions in Camera image

As shown in Tables 1, 2, 3, and 4, the gradient magnitudes and local variances are very small in the smooth regions (i.e., green regions) of the Camera image and very large in the edge regions (i.e., yellow regions). Although the gradient magnitudes of the blue and red regions are similar, the difference between their local variances is large.

In order to preserve the details of the image and effectively remove the noise, both the gradient magnitude and the local variance are utilized in the improved TV model. Since the local variances are usually larger than the gradient magnitudes, both quantities are rescaled to the common range [0, 255], and Eqs. (8) and (10) are rewritten as follows

$$ {\sigma}_{k, New}^2\left(x,y\right)=\frac{\sigma_k^2\left(x,y\right)-\mathit{\operatorname{Min}}{\sigma}_k^2}{\mathit{\operatorname{Max}}{\sigma}_k^2-\mathit{\operatorname{Min}}{\sigma}_k^2}\times 255 $$
(10)
$$ {\left|\nabla I\left(x,y\right)\right|}_{k, New}=\frac{{\left|\nabla I\left(x,y\right)\right|}_k-\mathit{\operatorname{Min}}{\left|\nabla I\right|}_k}{\mathit{\operatorname{Max}}{\left|\nabla I\right|}_k-\mathit{\operatorname{Min}}{\left|\nabla I\right|}_k}\times 255 $$
(11)

where \( \mathit{\operatorname{Min}}{\sigma}_k^2 \) and \( \mathit{\operatorname{Max}}{\sigma}_k^2 \) are the minimum and maximum local variances of the denoised image at the iteration k respectively; Min|∇I|k and Max|∇I|k are the minimum and maximum gradient magnitudes of the denoised image at the iteration k respectively. For simplicity, let σ2 and |∇I| be the abbreviations of \( {\sigma}_{k, New}^2\left(x,y\right) \) and |∇I(x, y)|k, New respectively.
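The min-max rescaling above can be sketched as follows. The function name is hypothetical, and the handling of a constant map (where the maximum equals the minimum and the formula would divide by zero) is an assumption of this sketch, since the text does not specify it.

```python
def rescale_to_255(values):
    """Min-max rescale a 2-D list (a variance or gradient map) to [0, 255]."""
    flat = [v for row in values for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # Assumption: a constant map is mapped to all zeros.
        return [[0.0 for _ in row] for row in values]
    scale = 255.0 / (hi - lo)
    return [[(v - lo) * scale for v in row] for row in values]

# Hypothetical 2x2 map of local variances.
rescaled = rescale_to_255([[2, 4], [6, 10]])
```

After rescaling, the smallest entry maps to 0 and the largest to 255, so the local variance and the gradient magnitude become comparable in scale before they are multiplied.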

3.4 Weighting function

The TV model can protect the edges of the image well. To preserve this advantage, a weighting function is introduced into the TV model. In this paper, the weighting function is calculated as follows

$$ p(x)=\frac{1}{1+{\left(\frac{x}{T}\right)}^2} $$
(12)

where p(x) is a non-negative, monotonically decreasing function of x ≥ 0 that satisfies the following conditions.

0 ≤ p(x) ≤ 1, p(0) = 1, and \( \underset{x\to \infty }{\lim }p(x)=0 \).

Figure 3 shows the relationship between p(x) and x, where x = |∇I| ⋅ σ2, and T > 0 is an adjustment parameter.

Fig. 3 The relationship between p(x) and x

According to the properties of the weighting function p(x), the gradient magnitude and the local variance, we further discuss the relationship between p(x) and the regions of an image.

  1. (1)

    In the edge regions.

In this case, the value of |∇I| ⋅ σ2 is very large. We also notice that p(|∇I| ⋅ σ2) → 0 as |∇I| ⋅ σ2 → ∞, which indicates that the intensity diffusion is very small, so the edges of the image are protected.

  2. (2)

    In the flat regions.

In this case, the value of |∇I| ⋅ σ2 is very small. We also notice that p(|∇I| ⋅ σ2) → 1 as |∇I| ⋅ σ2 → 0, which indicates that the intensity diffusion is very large, so the image noise is removed effectively.

  3. (3)

    In the gradient and detail regions.

In these cases, we have 0 < p(|∇I| ⋅ σ2) < 1, which indicates that the intensity diffusion is moderate and the image is relatively smooth, so the staircase effect can be reduced.

Through the above discussions, we notice that, if weighting function p(x) is selected according to the gradient magnitude and local variance of an image, the image denoising approach based on TV model and weighting function can achieve good results for denoising and maintaining the details of the image.
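The three regimes discussed above can be illustrated with a minimal Python sketch of Eq. (12). The function name and the value T = 10 are illustrative assumptions; the values of T actually used in the experiments are those of the parameter table, not the one below.

```python
def weight(x, T):
    """Weighting function p(x) = 1 / (1 + (x / T)^2) of Eq. (12), T > 0."""
    return 1.0 / (1.0 + (x / T) ** 2)

T = 10.0  # hypothetical adjustment parameter
flat_weight = weight(0.0, T)      # flat region: product of gradient and variance near 0
middle_weight = weight(10.0, T)   # intermediate region: x equal to T
edge_weight = weight(1.0e6, T)    # edge region: very large product
```

The weight is 1 in the flat case, 0.5 when x equals T, and close to 0 for a very large x, matching the strong, moderate and weak diffusion described for flat, detail and edge regions.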

3.5 Improved TV model based on weighting function

To address the tendency of the TV model to generate the staircase effect, a weighting function p(x) is introduced into the regularization term of the TV model, i.e., the first term on the right side of Eq. (6). The energy functional of the improved TV model is rewritten as follows

$$ E(I)={\int}_{\Omega}\left(p\left(\left|\nabla I\right|\cdot {\sigma}^2\right)\left|\nabla I\right|\right) dxdy+\frac{\lambda }{2}{\int}_{\Omega}{\left|I-{I}_0\right|}^2 dxdy $$
(13)
$$ p\left(\left|\nabla I\right|\cdot {\sigma}^2\right)=\frac{1}{1+{\left(\frac{\left|\nabla I\right|\cdot {\sigma}^2}{T}\right)}^2} $$
(14)

According to the Euler-Lagrange equation and gradient descent method, PDE scheme of Eq. (13) is

$$ \left\{\begin{array}{c}\frac{\partial I\left(x,y,t\right)}{\partial t}=p\left(\left|\nabla I\right|\cdot {\sigma}^2\right)\mathit{\operatorname{div}}\left(\frac{\nabla I}{\left|\nabla I\right|}\right)-\lambda \left(I-{I}_0\right)\\ {}{\left.I\left(x,y,t\right)\right|}_{t=0}={I}_0\left(x,y\right)\kern9em \end{array}\right.\kern1.5em t>0 $$
(15)

We notice that, in a flat region, |∇I| = 0 and σ2 = 0, so \( \mathit{\operatorname{div}}\left(\frac{\nabla I}{\left|\nabla I\right|}\right) \) is undefined. To avoid this situation, the regularized gradient magnitude and local variance are generally used in practice. The regularized gradient magnitude is calculated as follows

$$ {\left|\nabla I\right|}_{\varepsilon }=\sqrt{{\left|\nabla I\right|}^2+\varepsilon } $$
(16)

The regularized local variance is calculated as follows

$$ {\left|{\sigma}^2\right|}_{\varepsilon }=\sqrt{{\left|{\sigma}^2\right|}^2+\varepsilon } $$

where ε is a positive constant.

In the improved TV model, the PDE is discretized by a finite difference scheme. In this paper, in order to utilize the information surrounding each pixel, an eight-neighbor system is employed, as shown in Fig. 4.

Fig. 4 Eight-neighbor system

Let the time and grid step sizes be Δt and h respectively. For convenience and simplicity, let \( I\left(x,y\right)={I}_{i,j},\kern0.36em {I}_0\left(i,j\right)={I}_{i,j}^0 \), we have

$$ \left\{\begin{array}{c}{\left({I}_x\right)}_{i,j}^k=\frac{I_{i+1,j}^k-{I}_{i-1,j}^k}{2h}\\ {}{\left({I}_y\right)}_{i,j}^k=\frac{I_{i,j+1}^k-{I}_{i,j-1}^k}{2h}\\ {}{\left({I}_{xx}\right)}_{i,j}^k=\frac{I_{i+1,j}^k-2{I}_{i,j}^k+{I}_{i-1,j}^k}{h^2}\\ {}{\left({I}_{yy}\right)}_{i,j}^k=\frac{I_{i,j+1}^k-2{I}_{i,j}^k+{I}_{i,j-1}^k}{h^2}\\ {}{\left({I}_{xy}\right)}_{i,j}^k=\frac{I_{i+1,j+1}^k-{I}_{i-1,j+1}^k-{I}_{i+1,j-1}^k+{I}_{i-1,j-1}^k}{4{h}^2}\\ {}\mathit{\operatorname{div}}\left(\frac{\nabla I}{{\left|\nabla I\right|}_{\varepsilon }}\right)=\frac{I_{xx}{I}_y^2-2{I}_x{I}_y{I}_{xy}+{I}_x^2{I}_{yy}}{{\left({I}_x^2+{I}_y^2+{\varepsilon}^2\right)}^{\frac{3}{2}}}\end{array}\right. $$
(17)

So, the discrete scheme of Eq. (15) is calculated as follows

(15’)

The discrete scheme of Eq. (15’) satisfies the following boundary conditions

$$ {I}_{0,j}^k={I}_{1,j}^k,\kern0.36em {I}_{N,j}^k={I}_{N-1,j}^k,\kern0.36em {I}_{i,0}^k={I}_{i,1}^k,\kern0.36em {I}_{i,N}^k={I}_{i,N-1}^k $$

where, N is total number of iterations.
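To make the discretization concrete, the following Python sketch performs one update of the image using the finite differences of Eq. (17) and the weighting function of Eq. (14). It is a sketch under stated assumptions, not the authors' implementation: the time stepping is assumed to be explicit (forward Euler), the boundary is handled by replicating border pixels, the min-max rescaling is applied to the whole image at each step, and the function names and default values of `dt`, `lam`, `T` and `eps` are hypothetical.

```python
import math

def pad_replicate(img):
    """Copy border pixels outward by one pixel (zero-flux boundary)."""
    rows = [[r[0]] + list(r) + [r[-1]] for r in img]
    return [rows[0][:]] + rows + [rows[-1][:]]

def rescale_255(a):
    """Min-max rescaling to [0, 255]; a constant map becomes zeros (assumption)."""
    lo = min(min(r) for r in a)
    hi = max(max(r) for r in a)
    if hi == lo:
        return [[0.0] * len(r) for r in a]
    return [[(v - lo) * 255.0 / (hi - lo) for v in r] for r in a]

def itvwf_step(img, img0, dt=0.1, lam=0.05, T=50.0, eps=1e-3, h=1.0):
    """One assumed forward-Euler step of Eq. (15); img0 is the noisy input."""
    rows, cols = len(img), len(img[0])
    p = pad_replicate(img)
    grad = [[0.0] * cols for _ in range(rows)]
    var = [[0.0] * cols for _ in range(rows)]
    curv = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            a, b = i + 1, j + 1  # position inside the padded image
            ix = (p[a + 1][b] - p[a - 1][b]) / (2 * h)
            iy = (p[a][b + 1] - p[a][b - 1]) / (2 * h)
            ixx = (p[a + 1][b] - 2 * p[a][b] + p[a - 1][b]) / h ** 2
            iyy = (p[a][b + 1] - 2 * p[a][b] + p[a][b - 1]) / h ** 2
            ixy = (p[a + 1][b + 1] - p[a - 1][b + 1]
                   - p[a + 1][b - 1] + p[a - 1][b - 1]) / (4 * h ** 2)
            g2 = ix * ix + iy * iy
            grad[i][j] = math.sqrt(g2)
            # Regularized divergence term, last line of Eq. (17).
            curv[i][j] = ((ixx * iy * iy - 2 * ix * iy * ixy + ix * ix * iyy)
                          / (g2 + eps ** 2) ** 1.5)
            # 3x3 local variance, Eqs. (8) and (9).
            win = [p[a + u][b + v] for u in (-1, 0, 1) for v in (-1, 0, 1)]
            mean = sum(win) / 9.0
            var[i][j] = sum((w - mean) ** 2 for w in win) / 9.0
    grad_n, var_n = rescale_255(grad), rescale_255(var)
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            weight = 1.0 / (1.0 + (grad_n[i][j] * var_n[i][j] / T) ** 2)
            out[i][j] = img[i][j] + dt * (weight * curv[i][j]
                                          - lam * (img[i][j] - img0[i][j]))
    return out
```

Calling `itvwf_step` repeatedly, starting from `img = img0`, gives an iterative denoiser in the spirit of the scheme above. A constant image is a fixed point of the update, since both the divergence term and the fidelity term vanish.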

The pseudo-code of the improved TV model based on the weighting function is summarized in Algorithm ITVWF.

figure a

4 Experimental setup

All experiments are implemented in MATLAB R2016a and executed on a computer with an Intel Core i5-4200 CPU at 1.6 GHz and 8 GB of physical memory.

4.1 Performance evaluation

To evaluate the performance of image processing techniques [4, 19], the peak signal-to-noise ratio (PSNR) [12, 24] and the structural similarity (SSIM) index [21, 27] are commonly used metrics. PSNR is calculated as follows

$$ PSNR=10\times {\log}_{10}\frac{255^2}{\frac{1}{M_1\times {N}_1}{\sum}_{i,j}{\left(I\left(i,j\right)-\hat{I}\left(i,j\right)\right)}^2} $$
(18)

where M1 × N1 is the size of the original image I(i, j), and \( \hat{I}\left(i,j\right) \) is the denoised image. PSNR measures how closely the denoised image approximates the original image; a higher PSNR indicates better denoising performance.
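Eq. (18) translates directly into code. The following minimal Python sketch assumes 8-bit images stored as lists of lists of equal size; the function name and the convention of returning infinity for identical images are assumptions of the example.

```python
import math

def psnr(original, denoised):
    """PSNR in dB between two equally sized 8-bit images (Eq. 18)."""
    squared_error = 0.0
    count = 0
    for row_o, row_d in zip(original, denoised):
        for o, d in zip(row_o, row_d):
            squared_error += (o - d) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images: the ratio is unbounded
    return 10.0 * math.log10(255.0 ** 2 / mse)

# Hypothetical tiny images that differ by one gray level at every pixel.
reference = [[10, 20], [30, 40]]
estimate = [[11, 21], [31, 41]]
value = psnr(reference, estimate)
```

With a mean squared error of 1, the PSNR equals 20 log10(255), i.e., about 48.13 dB; larger errors lower the value.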

SSIM is another evaluation metric based on image structures, which is closer to human subjective visual features. SSIM is calculated as follows

$$ SSIM=\frac{\left(2{\mu}_I{\mu}_{\hat{I}}+{C}_1\right)\left(2{\sigma}_{I\hat{I}}+{C}_2\right)}{\left({\mu}_I^2+{\mu}_{\hat{I}}^2+{C}_1\right)\left({\sigma}_I^2+{\sigma}_{\hat{I}}^2+{C}_2\right)} $$
(19)

where μI and \( {\mu}_{\hat{I}} \) are the means of images I and \( \hat{I} \) respectively; σI and \( {\sigma}_{\hat{I}} \) are the standard deviations of the images I and \( \hat{I} \) respectively. \( {\sigma}_{I\hat{I}} \) is the covariance of I and \( \hat{I} \); C1 and C2 are positive constants to avoid a null denominator, and they are determined as follows.

$$ {C}_1={\left({k}_1L\right)}^2\kern0.5em \mathrm{and}\ {C}_2={\left({k}_2L\right)}^2 $$
(20)

where L is the dynamic range of the image (255 for 8-bit grayscale images) while k1 and k2 are two constants whose values are k1 = 0.01 and k2 = 0.03 respectively [7].

SSIM typically takes values in [0, 1]. A larger SSIM indicates that the denoised image is more similar to the original image, i.e., a better denoising effect.
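The following Python sketch evaluates Eq. (19) with the constants of Eq. (20), using statistics computed over the whole image. This is a simplifying assumption made for illustration: practical SSIM implementations usually compute the index in local windows and average the results. The function name and the sample images are hypothetical.

```python
def ssim_global(image_a, image_b, L=255.0, k1=0.01, k2=0.03):
    """SSIM of Eq. (19) computed from whole-image statistics."""
    xs = [v for row in image_a for v in row]
    ys = [v for row in image_b for v in row]
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    var_x = sum((x - mu_x) ** 2 for x in xs) / n
    var_y = sum((y - mu_y) ** 2 for y in ys) / n
    cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / n
    c1 = (k1 * L) ** 2  # Eq. (20)
    c2 = (k2 * L) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

# Hypothetical images: the second is the first shifted by two gray levels.
image_a = [[10, 20], [30, 40]]
image_b = [[12, 22], [32, 42]]
same = ssim_global(image_a, image_a)
shifted = ssim_global(image_a, image_b)
```

Identical images give a value of 1, while the brightness shift lowers the index slightly through the mean term, leaving the structure term unchanged.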

4.2 Parameters selection

The parameters have an impact on the performance of image processing techniques [34]. In these experiments, to allow the denoising performance to be compared with that of other models, six parameters of ITVWF are fixed in advance; they are listed in Table 5. They are not meant to be optimal.

Table 5 Used parameters

5 Experiment results and analysis

The experiments are divided into three parts. In the first part, Gaussian random noise with zero mean and standard deviation 20 is added to three original images (Camera, Boat and Plane) and a test image (Test). In the second part, other types of noise are added to four original images (Bridge, Barbara, Peppers and Male): salt-and-pepper noise with density (d) 0.03 or 0.1, Poisson noise, and speckle noise with deviation 0.04. In the third part, the CPU times of the TV and ITVWF models are compared on different images.

5.1 Experimental results and analysis of adding Gaussian noise

Gaussian random noise with zero mean and standard deviation 20 is first added to three original images (Camera, Boat and Plane) and a test image (Test). Camera, Boat and Plane are 256 × 256 pixels, Test is 98 × 256 pixels, and all images have 256 gray levels. These images are shown in Fig. 5. The denoising results are first analyzed by visual observation.

Fig. 5 Original images and their noisy images
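Noisy test inputs of the kind used in this subsection can be generated as in the following Python sketch, which adds zero-mean Gaussian noise with standard deviation 20 to every pixel. The function name, the optional seed and the clipping of the result to [0, 255] are assumptions of the example; the text does not state how out-of-range values are handled.

```python
import random

def add_gaussian_noise(image, sigma=20.0, seed=None):
    """Add zero-mean Gaussian noise to a grayscale image (list of lists).

    Values are clipped to [0, 255] (an assumption of this sketch)."""
    rng = random.Random(seed)
    noisy = []
    for row in image:
        noisy.append([min(255.0, max(0.0, v + rng.gauss(0.0, sigma)))
                      for v in row])
    return noisy

# Hypothetical 4x6 mid-gray image standing in for a real test image.
clean = [[128.0] * 6 for _ in range(4)]
noisy = add_gaussian_noise(clean, sigma=20.0, seed=0)
```

Fixing the seed makes the noisy input reproducible, which is convenient when several denoising models are compared on exactly the same noise realization.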

To verify the effectiveness of ITVWF, its denoising results for the four noisy images in Fig. 5 are compared with those of the traditional TV model and the non-local means (NLM) method. The denoising results are shown in Figs. 6, 7, 8, and 9, respectively.

Fig. 6 Three different denoising results for the noisy image Camera with Gaussian noise σ = 20

Fig. 7 Three different denoising results for the noisy image Boat with Gaussian noise σ = 20

Fig. 8 Three different denoising results for the noisy image Plane with Gaussian noise σ = 20

Fig. 9 Three different denoising results for the noisy image Test with Gaussian noise σ = 20

By observing Figs. 6, 7, 8, and 9, the denoising results are discussed as follows.

  1. (1)

    Observe Fig. 6(a)-(c)

Comparing Fig. 6(a) with the original Camera, we find that, although the TV model preserves the edges of the original Camera, there are some staircase effects in regions where the intensity changes slowly. Comparing Fig. 6(b) with the original Camera, there is no staircase effect in Fig. 6(b), but many details of the original Camera have been lost. Comparing Fig. 6(a) and (c), the denoising effect of ITVWF is better than that of the TV model: although there is no obvious difference between the two models in the human body and camera regions, ITVWF is significantly better in the building region.

  2. (2)

    Observe Fig. 7(a)-(c)

Comparing Fig. 7(a) with the original Boat, we find that, although the TV model preserves many details of the boat itself, its denoising effect is not ideal in regions where the intensity changes slowly, such as the boat's sides and the sky. Comparing Fig. 7(b) with the original Boat, there is no staircase effect in Fig. 7(b), but a lot of detail has been lost across the entire image, especially in the sea and boat regions; that is, the NLM method smooths the image more strongly than the TV model and ITVWF. Comparing Fig. 7(a) and (c), the denoising effect of ITVWF is better than that of the TV model: although there is no obvious difference between the two models in the region of the boat itself, ITVWF is significantly better in the sky region.

  3. (3)

    Observe Fig. 8(a)-(c)

Comparing Fig. 8(a) with the original Plane, we find that, although the TV model preserves many details of the original Plane, there are staircase effects in regions where the intensity changes slowly, such as outside the cabin. Comparing Fig. 8(b) with the original Plane, there is no staircase effect in Fig. 8(b), but many details of the original Plane have been lost. Comparing Fig. 8(a) and (c), there is almost no visible difference between the denoising effects of the TV model and ITVWF. In other words, judged by observation alone, the denoising effect of ITVWF is not better than that of the TV model for the Plane image.

  4. (4)

    Observe Fig. 9(a)-(c)

Although the original Test looks like a binary image, it is actually a grayscale image. Because the original Test has no details, there is almost no difference between the original Test and the denoising results of the TV model, the NLM method and ITVWF.

In addition to the above experiments, the local amplification images of Camera, Boat, Plane and Test are also used to further observe the denoising results in more detail, which are shown in Figs. 10, 11, 12, and 13.

Fig. 10 Local amplification images of the noisy image Camera

Fig. 11 Local amplification images of the noisy image Boat

Fig. 12 Local amplification images of the noisy image Plane

Fig. 13 Local amplification images of the noisy image Test

By observing Figs. 10, 11, 12, and 13, the denoising results are discussed as follows.

  1. (1)

    Observe Fig. 10(a-c)

From Fig. 10(a-c), we find that TV model and ITVWF preserve more detailed information than NLM method, and their denoising results look more natural. However, comparing Fig. 10(a) and (c), the staircase effects in the sky region are more obvious in Fig. 10(a) than in Fig. 10(c). Comparing Fig. 10(b) and (c), the detailed information in Fig. 10(c) is richer than that in Fig. 10(b), especially in the neck, eyes, nose and mouth regions of the human face. These facts show that the denoising effect of ITVWF is better than that of TV model and NLM method.

  1. (2)

    Observe Fig. 11(a-c)

From Fig. 11(a-c), we find that there are obvious staircase effects in the sky region of Fig. 11(a), which indicates that the denoising performance of TV model is not very good. In Fig. 11(b), almost no detailed information remains in the sky and mast regions, which indicates that the denoising performance of NLM method is not ideal. In Fig. 11(c), there are almost no staircase effects in the sky region, and detailed information in the sky and mast regions is well preserved, which shows that the denoising performance of ITVWF is better than that of TV model and NLM method.

  1. (3)

    Observe Fig. 12(a-c)

From Fig. 12(a-c), we find that the denoising result of ITVWF is clearer and richer than those of TV model and NLM method in the letter region of the cabin. However, there are some staircase effects outside the cabin in Fig. 12(a) and (c), and no further details are visible outside the cabin in Fig. 12(b). These facts show that, in general, the denoising performance of ITVWF is not better than that of TV model and NLM method for this image.

  1. (4)

    Observe Fig. 13(a-c)

From Fig. 13(a-c), the denoising results of TV model, NLM method and ITVWF are almost the same.

Overall, for these four experimental images, ITVWF achieves better denoising results in most cases by visual observation alone, which suggests that the denoising performance of ITVWF for Gaussian noise is better than that of TV model and NLM method in most cases.

In addition to the above experiments, we also calculated PSNR and SSIM for each denoised image when σ = 20 to analyze the denoising results. Table 6 lists the highest PSNR and SSIM obtained, and the number in parentheses is the optimal number of iterations. Since NLM is not an iterative algorithm, it has no optimal number of iterations.

Table 6 Comparison results of TV, NLM and ITVWF for four denoised images

From Table 6, the optimal number of iterations of ITVWF is the smallest for the denoised images Boat and Plane. Although TV model achieves the smallest optimal number of iterations for the other two denoised images, Camera and Test, the difference is not significant. Therefore, in terms of the optimal number of iterations, the performance of ITVWF is preferable. Further, PSNR and SSIM of ITVWF are compared with those of TV model and NLM method. We found that PSNR and SSIM of ITVWF are always higher. Therefore, in terms of PSNR and SSIM, the performance of ITVWF is also preferable.
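The two quality measures used above can be sketched in a few lines of NumPy. This is a minimal illustration and not the code used in the experiments: the function names `psnr` and `ssim`, the peak value 255, the uniform 7 × 7 sliding window and the constants 0.01 and 0.03 are assumptions (the last two are the commonly used SSIM defaults), and the synthetic ramp image only stands in for a real test image.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB; `peak` is the maximum gray level."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)


def ssim(reference, estimate, peak=255.0, win=7):
    """Mean SSIM over sliding win x win windows with uniform weights.

    Simplified variant (assumption): reference implementations often
    use Gaussian-weighted windows instead of uniform ones.
    """
    x = sliding_window_view(np.asarray(reference, dtype=np.float64), (win, win))
    y = sliding_window_view(np.asarray(estimate, dtype=np.float64), (win, win))
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    mu_x, mu_y = x.mean(axis=(-2, -1)), y.mean(axis=(-2, -1))
    var_x, var_y = x.var(axis=(-2, -1)), y.var(axis=(-2, -1))
    cov = (x * y).mean(axis=(-2, -1)) - mu_x * mu_y
    ssim_map = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )
    return float(ssim_map.mean())


# Synthetic example: a horizontal ramp corrupted by Gaussian noise, sigma = 20.
rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0.0, 255.0, 64), (64, 1))
noisy = np.clip(clean + rng.normal(0.0, 20.0, clean.shape), 0.0, 255.0)
print(f"PSNR = {psnr(clean, noisy):.2f} dB, SSIM = {ssim(clean, noisy):.4f}")
```

Higher values of both measures indicate a result closer to the reference image, so a denoised image would be passed as `estimate` with the original image as `reference`.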

To further validate the effectiveness of ITVWF, Gaussian noise with standard deviations σ = 5, 10, 15, 20 and 25 is added to the images Camera, Boat, Plane and Test, and the PSNR and SSIM obtained by ITVWF, TV model and NLM method are shown in Figs. 14 and 15.

Fig. 14 Comparisons of PSNR for Camera, Boat, Plane and Test images

Fig. 15 Comparisons of SSIM for Camera, Boat, Plane and Test images

From Figs. 14 and 15, we find that ITVWF always obtains the highest PSNR and SSIM for the noisy Camera, Boat, Plane and Test images, which shows that, in terms of PSNR and SSIM, ITVWF achieves better denoising performance than TV model and NLM method for Gaussian noise of various levels.

From both subjective and objective results, ITVWF not only obtains better denoising results for various levels of Gaussian noise, but also produces more realistic visual effects relative to the original image. In most cases, the denoising performance of ITVWF best matches human visual perception.

5.2 Experimental results and analysis of adding other noises

Although Gaussian noise is the most common type of noise, it is not the only one. To examine the performance of ITVWF more fully, other noises, namely salt-and-pepper noise with density d = 0.03 or d = 0.1, Poisson noise, and Speckle noise with deviation 0.04, are added to four images, Bridge, Barbara, Peppers and Male, each of size 256 × 256 with 256 gray levels. These images are shown in Fig. 16.

Fig. 16 Original images Bridge, Barbara, Peppers and Male

In Fig. 17, the columns from left to right show the noisy images obtained by adding salt-and-pepper noise with d = 0.03, salt-and-pepper noise with d = 0.1, Poisson noise, and Speckle noise to the original images Bridge, Barbara, Peppers and Male, respectively.

Fig. 17 Noisy images of Bridge, Barbara, Peppers and Male
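For readers who wish to reproduce comparable test inputs, the three noise types can be simulated roughly as follows. This is a sketch under stated assumptions, not the procedure used in the paper: the function names are hypothetical, the Speckle noise is modeled as multiplicative zero-mean Gaussian noise, and the value 0.04 is interpreted as a variance on the [0, 1] intensity scale, which the text does not specify.

```python
import numpy as np


def add_salt_and_pepper(image, density, rng):
    """Set a fraction `density` of pixels to 0 or 255 with equal probability."""
    noisy = image.copy()
    u = rng.random(image.shape)
    noisy[u < density / 2] = 0                            # pepper
    noisy[(u >= density / 2) & (u < density)] = 255       # salt
    return noisy


def add_poisson(image, rng):
    """Treat each gray level as a Poisson rate and draw a noisy count."""
    counts = rng.poisson(image.astype(np.float64))
    return np.clip(counts, 0, 255).astype(np.uint8)


def add_speckle(image, variance, rng):
    """Multiplicative noise I + n * I, with n ~ N(0, variance) (assumption)."""
    x = image.astype(np.float64) / 255.0
    n = rng.normal(0.0, np.sqrt(variance), image.shape)
    return (np.clip(x + n * x, 0.0, 1.0) * 255.0).round().astype(np.uint8)


# A constant 256 x 256 gray image stands in for Bridge, Barbara, Peppers or Male.
rng = np.random.default_rng(0)
image = np.full((256, 256), 128, dtype=np.uint8)
noisy_sp_low = add_salt_and_pepper(image, 0.03, rng)
noisy_sp_high = add_salt_and_pepper(image, 0.1, rng)
noisy_poisson = add_poisson(image, rng)
noisy_speckle = add_speckle(image, 0.04, rng)
```

Passing an explicit random generator keeps the noisy inputs reproducible, so that every compared model is evaluated on exactly the same corrupted image.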

To show the denoising performance of ITVWF, in addition to TV model, other state-of-the-art TV-based models such as FBD [6], TSM [1], TV-FF [9] and TV-FBD [31] are also compared on images with salt-and-pepper, Poisson, and Speckle noises. Tables 7 and 8 list the PSNR and SSIS of the six compared denoising models, respectively. In these tables, bold numbers indicate values better than those of ITVWF.

Table 7 The PSNRs of the six denoising models for images Bridge, Barbara, Peppers and Male
Table 8 The SSISs of the six denoising models for images Bridge, Barbara, Peppers and Male

In Table 7, for the image Peppers with salt-and-pepper noise of density d = 0.1, TV-FBD achieved the highest PSNR of 30.28, while the PSNR of ITVWF is 30.21; the difference of 0.07 is not significant.

For the image Male with Poisson noise, FBD achieved the highest PSNR of 30.56, while the PSNR of ITVWF is 30.45; the difference of 0.11 is not significant.

Apart from these two cases, ITVWF achieved the highest PSNR in all other cases, which indicates that, in most cases, ITVWF achieves better denoising performance than the other five compared models in terms of PSNR.

In Table 8, for the image Barbara with Poisson noise, FBD achieved the highest SSIS of 0.9836, while the SSIS of ITVWF is 0.9831; the difference of 0.0005 is not significant.

For the image Male with salt-and-pepper noise of density d = 0.1, TV-FBD achieved the highest SSIS of 0.9586, while the SSIS of ITVWF is 0.9572; the difference of 0.0014 is not significant.

Apart from these two cases, ITVWF achieved the highest SSIS in all other cases, which indicates that, in most cases, ITVWF achieves better denoising performance than the other five compared models in terms of SSIS.

From Tables 7 and 8, we notice that, for the four images with Speckle noise, PSNR and SSIS are lower than those for the other two types of noise, i.e., salt-and-pepper noise with density d = 0.03 and Poisson noise, and are only higher than some of those for salt-and-pepper noise with density d = 0.1, which indicates that the denoising effect of ITVWF is not very satisfactory for Speckle noise.

From Tables 7 and 8, we further notice that PSNR and SSIS of ITVWF are always higher than those of TV model, which shows that improving TV model with the weighting function is very effective.

5.3 Comparison of CPU time

In addition to denoising performance, another important issue for image denoising is the CPU time. Table 9 shows the CPU time of the different image denoising models for images Bridge, Barbara, Peppers and Male with salt-and-pepper, Poisson, and Speckle noises. In Table 9, bold numbers indicate the cases in which ITVWF has the least CPU time for the specific noise.

Table 9 CPU time of different models (in seconds)

From Table 9, we find that ITVWF has the least CPU time for the images Bridge with salt-and-pepper noise d = 0.1, Barbara with salt-and-pepper noise d = 0.03 and Peppers with Speckle noise. In the remaining cases, although ITVWF is not the denoising model with the least CPU time, it requires less CPU time than all four other compared models except TV model. This shows that ITVWF not only provides satisfactory denoising performance, but also has a smaller computational cost in most cases, so it is a practical image denoising model.
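The text does not state how the CPU times in Table 9 were measured. One common way to obtain such figures is sketched below, using Python's `time.process_time`, which counts processor time of the current process rather than wall-clock time. The helper `cpu_time` and the 3 × 3 mean filter are hypothetical: the filter is only a stand-in workload and is not one of the six compared models.

```python
import time

import numpy as np


def cpu_time(func, *args, repeats=3):
    """Return the smallest CPU time in seconds over `repeats` runs of func(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.process_time()
        func(*args)
        best = min(best, time.process_time() - start)
    return best


def mean_filter_3x3(image):
    """Stand-in denoiser (3 x 3 mean filter); NOT one of the compared models."""
    padded = np.pad(image.astype(np.float64), 1, mode="edge")
    h, w = image.shape
    acc = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            acc += padded[dy:dy + h, dx:dx + w]  # sum the nine shifted copies
    return acc / 9.0


image = np.random.default_rng(0).integers(0, 256, size=(256, 256)).astype(np.uint8)
seconds = cpu_time(mean_filter_3x3, image)
print(f"CPU time: {seconds:.4f} s")
```

Taking the minimum over several runs reduces the influence of background activity on the machine, which matters when the differences between models are small.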

6 Conclusions

Because the traditional TV-based image denoising model uses only the gradient information and ignores the local variance of the image, it easily generates obvious staircase effects. To make better use of the advantages of TV model and overcome its disadvantages, in this paper we propose a novel image denoising model, ITVWF, which introduces a weighting function into the regularization term of TV model. TV model is applied to transform image denoising into the problem of minimizing an energy function, and the weighting function is used to calculate the gradient magnitude and local variance of each pixel and to analyze their characteristics in the different regions of an image. The experimental results show that the proposed image denoising model ITVWF has several main advantages.

  1. (1)

    ITVWF is unsupervised and simple to operate, and its denoising process is fully adaptive and requires no manual intervention. During image denoising, different regions of the image are processed differently, depending only on the content of the image.

  2. (2)

    ITVWF can effectively suppress the staircase effect of the traditional TV model in most cases, and image details and local information can be better protected.

  3. (3)

    ITVWF can remove various types of noise with lower CPU time.