1 Introduction

In recent years, with the rapid development of digital equipment, images have become an important medium for conveying information. Digital images can be obtained through different sensors such as photographic cameras, medical scanners and weather satellites. Images are generally at risk of contamination by noise during acquisition, transmission and compression. It is therefore fundamental to suppress the noise while preserving the important features of the image. The main purpose of image denoising is to recover a digital image that has been corrupted by noise.

Over the last years, a variety of methods have been proposed to deal with the denoising problem [45, 47, 48, 49]. Some linear filtering methods [38, 39] have been suggested to remove Gaussian and uniform noise in images. Other commonly used linear filters are the Wiener filter [18] and the mean filter [23, 44]. Nonlinear image filters [11, 12] have emerged to improve on the effectiveness of linear filters, the median filter being the most widely used nonlinear filter [42]. Various wavelet-based techniques have also been proposed for image denoising [17, 21, 34, 36]. Sparse representation has received a lot of attention from the image processing community, resulting in many practical approaches [1, 13, 15, 29]. Image denoising techniques based on partial differential equations and Computational Fluid Dynamics (CFD) have also been developed, such as Total Variation (TV) methods [8, 14, 26, 37], level set methods [40], essentially non-oscillatory schemes [9], and nonlinear diffusion algorithms [7, 22, 24, 31].

Recently, Partial Differential Equation (PDE) approaches to image denoising, such as linear and nonlinear diffusion algorithms, have become important. The linear diffusion methods are derived from the use of the Gaussian filter in multi-scale image analysis [20]. To eliminate the adverse effects of linear PDE-based techniques, such as blurring, nonlinear PDE-based approaches have attracted a lot of attention in image denoising and enhancement. The most frequently applied nonlinear PDE denoising method is the diffusion scheme developed by Perona and Malik [33], and many related techniques have been devised since [43]. The nonlinear PDE-based approaches are able to smooth images while preserving the edges and avoiding the localization problems of linear filtering.

There are a variety of ways to derive nonlinear PDEs. In image processing, it is common practice to obtain them from variational problems, and minimizing an energy function is the essential basis of any variational PDE technique [4, 10]. The model known as TV was developed by Rudin, Osher and Fatemi and is based on the minimization of the TV norm. Many PDE approaches improving this model have also appeared and have been thoroughly studied in recent years [10].

For image denoising purposes, several approaches that use the Genetic Algorithm (GA) have been suggested [19, 25, 41]. Stochastic optimization is a generic term for optimization heuristics that include such approaches as genetic algorithms and simulated annealing [16]. These techniques apply algorithms that mimic natural processes, such as selection and mutation in natural evolution, or metallurgical processes, such as the annealing of metals, to evolve solutions for large and difficult problems.

Partial differential equations have proven to be a useful tool in image denoising. The main idea is to deform an image with a PDE and obtain the desired image as a solution of this equation. Since noise is associated with high frequencies, as are edges, it is difficult to remove the noise while preserving important features such as edges. Recently, some denoising methods that combine different PDE-based models have been proposed [46]. They often perform more diffusion in the flat areas of the image and less diffusion at the edges. Different PDE-based denoising models behave differently as time evolves, and a combination of suitable PDE-based denoising models often yields images of higher quality. In order to obtain an image of higher quality than those obtained by two individual PDE-based models, we propose a novel image denoising algorithm that uses a stochastic optimization algorithm to combine them. The algorithm emphasizes the role of the better model at each time step. Compared with the PDE-based denoising models it combines, the new method has greater denoising ability in terms of Peak Signal to Noise Ratio (PSNR), Blind/Referenceless Image Spatial QUality Evaluator (BRISQUE) and visual quality.

The rest of this paper is organized as follows. The PDE-based approaches are briefly described in Section 2. The proposed algorithm is described in Section 3. In Section 4, we present experimental results that confirm the efficiency of the proposed method. Finally, some concluding remarks are presented in Section 5.

2 PDE-based models

The use of PDEs for image denoising has become a major research topic in the past few years, and a large number of PDE-based methods have been proposed. The Isotropic Diffusion (ID) model, Anisotropic Diffusion (AD) models such as the Perona-Malik (PM) model, and the Total Variation (TV) model are well-known examples. In this section, we briefly describe these models and also the well-balanced anisotropic scheme [5], all of which can be used as the PDE-based methods in Sections 3 and 4.

The ID model is a linear diffusion model that is usually used to smooth an image. This model removes noise very well but unfortunately blurs the edges of the image in the process. The ID model [46] is described as follows:

$$ \left\{\begin{array}{c}\hfill \frac{\partial u}{\partial t}=\nabla .\left(\nabla u\right)+\lambda \left({u}_0- u\right),\hfill \\ {}\hfill \begin{array}{cc}\hfill \frac{\partial u}{\partial n}=0\hfill & \hfill \mathrm{on}\hfill \end{array}\partial \Omega \times \left(0, T\right),\hfill \\ {}\hfill \begin{array}{cc}\hfill u\left( x, y, t\right)\left|{}_{t=0}={u}_0\left( x, y\right)\right.\hfill & \hfill \mathrm{in}\kern0.24em \Omega, \hfill \end{array}\hfill \end{array}\right. $$
(1)

where \( u(x, y, t)|_{t=0} = u_0(x, y) \) is the initial condition, \( u(x, y, t) \) is the restored version of the initial degraded image \( u_0(x, y) \), ∇ is the gradient operator with respect to the spatial variables x, y, and Ω is an open bounded domain in \( {\mathbb{R}}^2 \).
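As an illustration, one explicit time step of Eq. (1) can be sketched as follows. This is a minimal sketch and not the discretization used in the experiments: the five-point Laplacian, the explicit Euler update, the function names `laplacian_neumann` and `id_step`, and the default values of `dt` and `lam` are all assumptions made here for illustration. The zero-flux boundary condition is imitated by replicating the border pixels.

```python
import numpy as np

def laplacian_neumann(u):
    """Five-point Laplacian; edge padding imitates the zero-flux boundary condition."""
    p = np.pad(u, 1, mode="edge")
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * u

def id_step(u, u0, dt=0.1, lam=0.05):
    """One explicit Euler step of u_t = div(grad u) + lam * (u0 - u).

    dt and lam are illustrative values (assumptions), not the paper's settings.
    """
    u = np.asarray(u, dtype=float)
    return u + dt * (laplacian_neumann(u) + lam * (u0 - u))
```

With `lam = 0` the update only redistributes intensity between neighboring pixels, so the mean gray level of the image is unchanged by a step.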

In order to avoid the blurring and localization problems of linear diffusion filtering, Perona and Malik [33] proposed a nonlinear diffusion method based on the following equation

$$ \frac{\partial u}{\partial t}=\nabla .\left( g\left(\left|\nabla u\right|\right)\nabla u\right). $$
(2)

In [33] the following two diffusion functions are considered

$$ \begin{array}{cc}\hfill {g}_1(s)=\frac{1}{1+{\left(\frac{s}{k}\right)}^2},\hfill & \hfill {g}_2(s)\hfill \end{array}= \exp \left(-{\left(\frac{s}{k}\right)}^2\right), $$
(3)

where k > 0 is the contrast parameter. The choice of the diffusion function g strongly influences how the smoothing is controlled. This function is defined to satisfy \( {\lim}_{s\to 0} g(s)=1 \) and \( {\lim}_{s\to \infty } g(s)=0 \), so that the diffusion is strong where the gradient is small and weak where the gradient is large. As a result, the diffusion is maximal within uniform regions and stops across edges.
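The two diffusivities in (3) translate directly into code. The sketch below is only illustrative; the function names `g1` and `g2` follow the notation of (3), and the sample value `k = 10` in the usage note is an assumption.

```python
import numpy as np

def g1(s, k):
    """Rational Perona-Malik diffusivity: 1 / (1 + (s/k)^2)."""
    return 1.0 / (1.0 + (np.asarray(s, dtype=float) / k) ** 2)

def g2(s, k):
    """Exponential Perona-Malik diffusivity: exp(-(s/k)^2)."""
    return np.exp(-(np.asarray(s, dtype=float) / k) ** 2)
```

For example, with `k = 10` both functions equal 1 at `s = 0`, while at `s = k` the rational form gives 0.5 and the exponential form gives exp(−1), so the exponential form switches the diffusion off more sharply at strong gradients.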

Using \( g_1 \) as the diffusivity function, the Perona-Malik model is equivalent to minimizing the energy

$$ E(u)={\int}_{\Omega}\frac{k^2}{2}\ln \left({k}^2+{\left|\nabla u\right|}^2\right) dxdy, $$
(4)

where \( \Omega \subset {\mathbb{R}}^2 \) is the image domain.
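The link between the energy (4) and the diffusivity \( g_1 \) can be checked in a few lines. Writing the integrand as \( \rho(|\nabla u|) \), where ρ is notation introduced here only for this check, the gradient descent flow of the energy reproduces Eq. (2) with \( g = g_1 \):

```latex
\begin{aligned}
\rho(s) &= \frac{k^2}{2}\ln\left(k^2+s^2\right),
\qquad
\rho'(s) = \frac{k^2 s}{k^2+s^2} = s\, g_1(s), \\
\frac{\partial u}{\partial t}
 &= \nabla\cdot\left(\rho'\left(|\nabla u|\right)\frac{\nabla u}{|\nabla u|}\right)
  = \nabla\cdot\left(g_1\left(|\nabla u|\right)\nabla u\right).
\end{aligned}
```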

The PM model is known to be an ill-posed problem, which means that the existence and uniqueness of the solution of (2) cannot be guaranteed. In particular, when noise and edges have gradients of the same magnitude, the model cannot distinguish between them, and it can therefore cause Gibbs-type artifacts.

A general model of the anisotropic diffusion equation which was first proposed by Perona and Malik can be expressed [3] as follows:

$$ \left\{\begin{array}{c}\hfill \frac{\partial u}{\partial t}=\nabla .\left( g\left(\left|\nabla u\right|\right)\nabla u\right)+\lambda \left({u}_0- u\right),\hfill \\ {}\hfill \begin{array}{cc}\hfill \frac{\partial u}{\partial n}=0\hfill & \hfill \mathrm{on}\hfill \end{array}\partial \Omega \times \left(0, T\right),\hfill \\ {}\hfill \begin{array}{cc}\hfill u\left( x, y, t\right)\left|{}_{t=0}={u}_0\left( x, y\right)\right.\hfill & \hfill \mathrm{in}\kern0.24em \Omega, \hfill \end{array}\hfill \end{array}\right. $$
(5)

where \( u(x, y, t)|_{t=0} = u_0(x, y) \) is the initial condition and Ω is an open bounded domain in \( {\mathbb{R}}^2 \).

The diffusion-reaction Eq. (5) consists of the PM process with an additional term \( \lambda (u_0 - u) \), which penalizes deviations of u from \( u_0 \). This term helps retain the characteristics of the original image and reduces distortion.
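A single explicit step of Eq. (5) can be sketched with the classical four-neighbor discretization of Perona and Malik. This is a minimal sketch under stated assumptions: the function name `pm_step`, the use of \( g_1 \) as the diffusivity, the edge-replicating boundary handling, and the default values of `dt`, `k` and `lam` are illustrative choices, not the settings of the experiments.

```python
import numpy as np

def pm_step(u, u0, dt=0.1, k=10.0, lam=0.05):
    """One explicit step of u_t = div(g(|grad u|) grad u) + lam * (u0 - u)."""
    u = np.asarray(u, dtype=float)
    p = np.pad(u, 1, mode="edge")      # replicated border: zero flux at the boundary

    def g(d):
        # Diffusivity g1 from (3), evaluated on a neighbor difference.
        return 1.0 / (1.0 + (d / k) ** 2)

    # Differences to the north, south, west and east neighbors.
    dn = p[:-2, 1:-1] - u
    ds = p[2:, 1:-1] - u
    dw = p[1:-1, :-2] - u
    de = p[1:-1, 2:] - u
    flux = g(dn) * dn + g(ds) * ds + g(dw) * dw + g(de) * de
    return u + dt * (flux + lam * (u0 - u))
```

Across a jump of height 100 with `k = 10`, the flux is 100 / 101 ≈ 0.99, whereas a jump of height 10 gives a flux of 5: large gradients are smoothed far less than small ones, which is the edge-preserving behavior described above.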

The total variation model was proposed by Rudin et al. [37] for edge-preserving noise removal. The authors take the energy function of the image to be

$$ E(u)={\iint}_{\Omega}\left(\left|\nabla u\right|+\frac{\lambda}{2}{\left({u}_0- u\right)}^2\right) dxdy. $$
(6)

The first term in this equation is a smoothing term, while the second term is a fidelity term that keeps u close to \( u_0 \) and so preserves the edges and details. The total variation model is capable of handling edges while removing noise in a given image [6]. The TV denoising model [2] can be written as follows:

$$ \left\{\begin{array}{c}\hfill \frac{\partial u}{\partial t}=\nabla .\left(\frac{\nabla u}{\left|\nabla u\right|}\right)+\lambda \left({u}_0- u\right),\hfill \\ {}\hfill \begin{array}{cc}\hfill \frac{\partial u}{\partial n}=0\hfill & \hfill \mathrm{on}\hfill \end{array}\partial \Omega \times \left(0, T\right),\hfill \\ {}\hfill \begin{array}{cc}\hfill u\left( x, y, t\right)\left|{}_{t=0}=\right.{u}_0\left( x, y\right)\hfill & \hfill \mathrm{in}\kern0.24em \Omega, \hfill \end{array}\hfill \end{array}\right. $$
(7)

where \( u(x, y, t)|_{t=0} = u_0(x, y) \) is the initial condition and Ω is an open bounded domain in \( {\mathbb{R}}^2 \).
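One explicit step of the TV flow (7) can be sketched as follows. Because \( |\nabla u| \) vanishes in flat regions, the sketch replaces it by \( \sqrt{|\nabla u|^2+\varepsilon^2} \); this regularization, the forward/backward difference pair, the function name `tv_step` and the default parameter values are assumptions made for illustration and are not taken from the experiments.

```python
import numpy as np

def tv_step(u, u0, dt=0.1, lam=0.05, eps=1.0):
    """One explicit step of u_t = div(grad u / |grad u|_eps) + lam * (u0 - u)."""
    u = np.asarray(u, dtype=float)
    # Forward differences; the last column/row stays zero (zero-flux boundary).
    ux = np.zeros_like(u)
    uy = np.zeros_like(u)
    ux[:, :-1] = u[:, 1:] - u[:, :-1]
    uy[:-1, :] = u[1:, :] - u[:-1, :]
    mag = np.sqrt(ux ** 2 + uy ** 2 + eps ** 2)   # regularized gradient magnitude
    px, py = ux / mag, uy / mag
    # Backward-difference divergence (negative adjoint of the forward gradient).
    div = px.copy()
    div[:, 1:] -= px[:, :-1]
    div += py
    div[1:, :] -= py[:-1, :]
    return u + dt * (div + lam * (u0 - u))
```

With `eps = 1` the effective diffusivity is bounded by 1, so a step size of `dt = 0.1` keeps the explicit scheme stable; a smaller `eps` follows the TV flow more closely but requires a smaller step.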

The TV model is a successful approach for recovering images with sharp edges. Nevertheless, it produces a block effect when applied to flat areas, so that local details of the original image are lost [27, 28].

Anisotropic diffusion is a key concept in digital image denoising. In order to develop the idea of removing noise without losing the boundaries or edges, the authors in [5] have proposed an anisotropic nonlinear diffusion equation that has two terms: the diffusion and the forcing term. The balance between these terms has been made in a selective manner, in which boundary points and interior points of the objects that make up the image are treated differently. The following Well-Balanced Flow (WBF) equation has been considered

$$ \left\{\begin{array}{c}\hfill \begin{array}{cc}\hfill \frac{\partial u}{\partial t}= g\left|\nabla \mathrm{u}\right|\nabla .\left(\frac{\nabla u}{\left|\nabla u\right|}\right)-\lambda \left(1- g\right)\left( u-{u}_0\right),\hfill & \hfill x\in \Omega, t>0\hfill \end{array}\hfill \\ {}\hfill \begin{array}{ccc}\hfill \frac{\partial u}{\partial n}=0\hfill & \hfill x\in \partial \Omega \hfill & \hfill, t>0\hfill \end{array}\hfill \\ {}\hfill \begin{array}{cc}\hfill u\left( x, y, t\right)\left|{}_{t=0}={u}_0\left( x, y\right)\right.\hfill & \hfill \mathrm{in}\kern0.24em \Omega \hfill \end{array}\hfill \end{array}\right. $$
(8)

where \( g = g(|\nabla G_{\sigma} \ast u|) \), \( u_0(x, y) \) is the image to be processed, \( u(x, y, t) \) represents its smoothed version at scale t, \( G_{\sigma} \) is a convolution kernel (here, a Gaussian function), and \( \nabla G_{\sigma} \ast u \) is the local estimate of ∇u used for noise elimination. The function g(s) ≥ 0 is nonincreasing and satisfies g(0) = 1 and g(s) → 0 as s → ∞.

3 New model

Motivated by the fact that minimizing a combination of suitable denoising energy functions (or combining PDE-based models) often yields images of higher quality, we present a hybrid model with random weights that emphasizes the role of the better model at each time step. In addition, a stochastic algorithm is employed so that any two PDE-based models can be used (rather than fixed models with perfect information). The method gradually evolves a population of solutions with the goal of steadily improving the best solution.

Here we assume that a solution (individual) is a matrix of pixels whose entries are integer values in the interval [0, 255]. Before describing the stages of the algorithm in detail, we introduce some notation. \( u_0 \) denotes the input noisy image. We denote by \( M_i \) a PDE-based denoising model, such as ID, PM, TV, WBF or any other suitable PDE-based model. For simplicity, we assume that i = 1, 2. \( {u}_{M_i}^k \) denotes the solution of the \( M_i \) model at time level k. For small k (k = 1, 2, 3), we consider \( {u}_{M_i}^k \) to be a neighbor of the solution \( {u}_{M_i}^0 \), which is the initial solution for the \( M_i \) model.

The algorithm (whose processing steps are presented in Fig. 1) starts by generating some solutions randomly, evaluating them and inserting them into the population. To construct a new population, a population generator is formed by selecting the half of the current population with the highest fitness values. The individuals of the new population are derived from the solutions in the population generator in two ways. In the first, two solutions in the neighborhood of an existing solution are generated and evaluated, and the better one is inserted into the new population. The second generates a new solution randomly, as was done during initialization. The purpose of generating a solution randomly is to introduce new solutions, possibly different from those that exist in the population, in order to prevent the population from converging prematurely [32]. After the construction of the new population is completed, its best solution is taken as the current best solution. Finally, the algorithm terminates when the current best solution has not improved for a predetermined number of iterations or when a predefined maximum number of iterations is reached.

Fig. 1 Flow diagram

In detail, the new algorithm has five stages:

  1. Initialization

  2. Evaluation

  3. Selection

  4. Solution generation

  5. Termination

3.1 Initialization

The algorithm starts by constructing a group of individuals (images) known as the initial population. The population has \( N_{pop} \) individuals (\( N_{pop} \) is provided by the user, for example \( N_{pop} = 30 \)). By taking \( {u}_{M_1}^0={u}_0 \) and \( {u}_{M_2}^0={u}_0 \), the individuals of the initial population are defined as follows:

$$ \begin{array}{cc}\hfill {u}_n={w}_n\ast {u}_{M_1}^1+\left(1-{w}_n\right)\ast {u}_{M_2}^1,\hfill & \hfill n=1,2,\dots, {N}_{pop},\hfill \end{array} $$
(9)

where \( w_n \) is a random number between 0 and 1.
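Equation (9) amounts to drawing \( N_{pop} \) random convex combinations of the two one-step solutions. A minimal sketch, assuming the two images \( {u}_{M_1}^1 \) and \( {u}_{M_2}^1 \) have already been computed by whichever models are chosen; the function and argument names are hypothetical and the uniform distribution of the weights is an assumption:

```python
import numpy as np

def init_population(u1_m1, u1_m2, n_pop=30, rng=None):
    """Build the initial population of Eq. (9) from two one-step solutions.

    u1_m1, u1_m2: images after one time step of models M1 and M2.
    Each individual is w * u1_m1 + (1 - w) * u1_m2 with w drawn uniformly in [0, 1).
    """
    rng = np.random.default_rng() if rng is None else rng
    weights = rng.random(n_pop)
    return [w * u1_m1 + (1.0 - w) * u1_m2 for w in weights]
```

Since every weight lies between 0 and 1, each pixel of an individual lies between the corresponding pixels of the two model outputs.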

3.2 Evaluation

In order to determine the quality of the individuals in a population, the fitness values are computed by using the no-reference image quality assessment model BRISQUE (http://live.ece.utexas.edu/research/quality/BRISQUE_release.zip). This evaluator uses scene statistics of locally normalized luminance coefficients to quantify possible losses of naturalness in the image due to the presence of distortion, and it can be considered a holistic measure of quality.

3.3 Selection

In order to improve the current best solution, in each iteration we construct a generator as follows. First, the individuals are ranked by their fitness values, from highest to lowest. Then, the best \( N_{pop}/2 \) members of the population are selected to form the generator G. Since the BRISQUE evaluator is a holistic measure of quality [30], this step of the algorithm aims to keep low-quality images out of the new population.
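The selection step can be sketched as a sort followed by truncation. Here `fitness` is a hypothetical callable supplied by the user for which larger values mean better quality; how the BRISQUE score (for which lower values indicate better quality) is mapped to such a fitness value is an assumption left open in this sketch.

```python
def select_generator(population, fitness):
    """Return the better half of the population, ranked from highest to lowest fitness."""
    ranked = sorted(population, key=fitness, reverse=True)
    return ranked[: len(population) // 2]
```

For instance, with the toy population `[3, 1, 4, 1, 5, 9]` and the identity as fitness, the generator is `[9, 5, 4]`.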

3.4 Solution generation

By considering each \( u_s \in G \) as an initial condition for the PDE-based models (i.e., \( {u}_{M_1}^0={u}_s \) and \( {u}_{M_2}^0={u}_s \)), two individuals are generated for the new population in the following ways:

  1. The first way computes the solutions \( {u}_{M_1}^k \) and \( {u}_{M_2}^k \) (the neighbors of \( u_s \)) by using the denoising models \( M_1 \) and \( M_2 \), respectively. The new individual \( u_{new1} \) for the new population is then the better of \( {u}_{M_1}^k \) and \( {u}_{M_2}^k \) according to the fitness function.

  2. The second way generates a new individual \( u_{new2} \), which is defined as the better of

    $$ w\ast {u}_{M_1}^k+\left(1- w\right)\ast {u}_{M_2}^k \quad \mathrm{and} \quad \left(1- w\right)\ast {u}_{M_1}^k+ w\ast {u}_{M_2}^k, $$

    where w is a random number between 0 and 1. By using the weights w and (1 − w) and selecting the better solution, the image generated by the model with higher quality (in terms of BRISQUE) has more influence in constructing the new image.
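The two ways of generating individuals can be sketched together. In this sketch `step_m1` and `step_m2` are hypothetical callables that advance an image by one time step of the models \( M_1 \) and \( M_2 \), and `fitness` is a hypothetical callable for which larger values mean better quality; none of these names come from the original implementation.

```python
import numpy as np

def generate_offspring(u_s, step_m1, step_m2, fitness, k=3, rng=None):
    """Generate u_new1 and u_new2 from one member u_s of the generator G."""
    rng = np.random.default_rng() if rng is None else rng
    a, b = u_s, u_s
    for _ in range(k):                 # k time steps of each model, starting from u_s
        a, b = step_m1(a), step_m2(b)
    # First way: keep the better of the two model outputs.
    u_new1 = max((a, b), key=fitness)
    # Second way: keep the better of the two mirrored random blends.
    w = rng.random()
    blends = (w * a + (1.0 - w) * b, (1.0 - w) * a + w * b)
    u_new2 = max(blends, key=fitness)
    return u_new1, u_new2
```

Evaluating both `w` and `1 − w` means that whichever model output is better receives the larger of the two weights in at least one of the candidate blends.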

3.5 Termination

The process terminates when either of the following two conditions is met:

  (i) No improvement has taken place in the best solution for a predetermined number of iterations,

  (ii) A predefined maximum number of iterations is reached.
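The two stopping conditions can be sketched as a loop with a stall counter. The names `i_max` and `i_local` are hypothetical and are interpreted here, as an assumption, as the maximum number of iterations and the number of iterations allowed without improvement; `improve` stands for one full iteration that proposes a new best candidate, and `fitness` is a callable for which larger values are better.

```python
def evolve(best, improve, fitness, i_max=50, i_local=5):
    """Iterate until i_max iterations are done or the best solution stalls i_local times."""
    best_fit = fitness(best)
    stalled = 0        # consecutive iterations without improvement
    iteration = 0
    while iteration < i_max and stalled < i_local:
        iteration += 1
        candidate = improve(best)
        cand_fit = fitness(candidate)
        if cand_fit > best_fit:
            best, best_fit, stalled = candidate, cand_fit, 0
        else:
            stalled += 1
    return best, iteration
```

The counter is reset whenever an improvement occurs, so condition (i) refers to consecutive iterations without improvement.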

The processing steps of the algorithm are presented in Fig. 1.

4 Experimental results and analysis

For the experiments, the ID, PM, TV, and WBF models were used as the PDE-based models. In this section, we compare the results obtained by the new method with those obtained by the underlying PDE-based denoising models in terms of the visual quality of the denoised image, the PSNR according to (10), and the BRISQUE score described in Section 3.2. The PSNR is defined in decibels for 8-bit gray-scale images as follows:

$$ PSNR=10{\log}_{10}\frac{255^2\times M\times N}{\sum_{i=1}^M{\sum}_{j=1}^N{\left[{I}_{or}\left( i, j\right)-{I}_{de}\left( i, j\right)\right]}^2}, $$
(10)

where M and N are the image dimensions, \( I_{or} \) is the original image, \( I_{de} \) is the denoised image, and 255 is the peak signal value at 8-bit resolution. A higher PSNR usually indicates that the image is of higher quality. BRISQUE is a no-reference image quality assessment model that assigns the image a quality score, typically between 0 and 100 (0 represents the best quality, 100 the worst).
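Equation (10) can be computed in a few lines. This is a minimal sketch; the function name `psnr` and the convention of returning infinity for identical images are choices made here for illustration.

```python
import numpy as np

def psnr(original, denoised, peak=255.0):
    """PSNR in decibels, Eq. (10): 10 * log10(peak^2 / mean squared error)."""
    diff = np.asarray(original, dtype=float) - np.asarray(denoised, dtype=float)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")            # identical images: the ratio is unbounded
    return float(10.0 * np.log10(peak ** 2 / mse))
```

As a sanity check, an error of exactly one gray level at every pixel gives a mean squared error of 1 and hence a PSNR of 10 log10(255²) ≈ 48.13 dB.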

The proposed method, with the parameters \( N_{pop} = 30 \), \( I_{max} = 50 \), \( I_{local} = 5 \) and k = 3, has been tested on several images, some of which are selected to illustrate the results. The commonly used 256 × 256 pixel Cameraman, Lena, and House images are used in the figures.

In Fig. 2, we present the original Cameraman image, the image corrupted by speckle noise of variance 0.01, and the results obtained by the TV model, the PM model, and our new model (hybrid of TV and PM). Table 1 contains the PSNRs and BRISQUEs of the PM, TV, and new models. Figure 3 shows the PSNR and BRISQUE graphs of these models and indicates that the new model has a higher PSNR and a lower BRISQUE than the TV and PM models.

Fig. 2 Denoising results of the 15th iteration of the denoising models

Table 1 The PSNRs and BRISQUEs of the 15th iteration (∆t = 0.1) of the PM, TV, and the new model (hybrid of TV and PM). Input Cameraman image is corrupted by speckle noise of variance 0.01

Fig. 3 The PSNR and BRISQUE graphs of the TV model, PM model, and new model (hybrid of TV and PM) for various numbers of iterations (∆t = 0.1). Input Cameraman image is corrupted by speckle noise of variance 0.01

In Fig. 4, we present the original Lena image, the image corrupted by Gaussian noise of variance 0.01, and the results obtained by the ID model, the PM model, and the new model (hybrid of ID and PM). The PSNRs and BRISQUEs of the ID model, PM model, and new model are shown in Table 2. In Fig. 5, we have plotted the PSNR and BRISQUE of these models. It can be seen that the new method has a higher PSNR than the other models and that, after a few iterations, the BRISQUE of the hybrid model becomes lower than the BRISQUE of the ID and PM models.

Fig. 4 Denoising results of the 30th iteration of the denoising models

Table 2 The PSNRs and BRISQUEs of the 30th iteration (∆t = 0.1) of the ID, PM, and new model (hybrid of ID and PM). Input Lena image is corrupted by Gaussian noise of variance 0.01

Fig. 5 The PSNR and BRISQUE graphs of the ID model, PM model, and the new model (hybrid of ID and PM) for various numbers of iterations (∆t = 0.1). Input Lena image is corrupted by Gaussian noise of variance 0.01

Tables 3 and 4 present the PSNR and BRISQUE, respectively, for different variances of the Gaussian noise. As expected, the new method has the highest PSNR and the lowest BRISQUE for different variances of the noise.

Table 3 The PSNR of the 30th iteration (∆t = 0.1) of different algorithms with different variances of Gaussian noise
Table 4 The BRISQUE of the 30th iteration (∆t = 0.1) of different algorithms with different variances of Gaussian noise

Figure 6 shows the original House image, the image corrupted by Gaussian noise of variance 0.01, and the results obtained by the WBF model, the PM model, and the new model (hybrid of WBF and PM). Table 5 contains the PSNRs and BRISQUEs of the WBF model, PM model, and the new model. Figure 7 shows the PSNR and BRISQUE graphs of these models. It can be seen that the results are similar to those for the Cameraman and Lena images.

Fig. 6 Denoising results of the 8th iteration of the denoising models

Table 5 The PSNR and BRISQUE of the 8th iteration (∆t = 0.4) of the WBF, PM, and new model (hybrid of WBF and PM). Input House image is corrupted by Gaussian noise of variance 0.01

Fig. 7 The PSNR and BRISQUE graphs of the PM model, WBF model, and the new model (hybrid of PM and WBF) for various numbers of iterations (∆t = 0.4). Input House image is corrupted by Gaussian noise of variance 0.01

To compare the new model with the different diffusion-based schemes presented in [35], we used as noisy images the images obtained by adding Gaussian noise of strength \( \sigma_n = 25 \) to the original Lena and House images. Table 6 shows the PSNR of some diffusion schemes presented in [35] and of the new model for the Lena and House images. These diffusion schemes are Anisotropic Diffusion (AD), Smoothed Gradient based anisotropic diffusion (SG), Total Variation (TV), Mean Curvature Motion (MCM), Well-Balanced Flow (WBF), Modified Smoothed Gradient (MSG) based anisotropic diffusion, Edge Enhancing Diffusion (EED), Coherence Enhancing Diffusion (CED), Slowed Anisotropic Diffusion (SAD), Adaptive TV (ATV), Adaptive Linear Diffusion (ALD), Edge detector based Anisotropic Diffusion (EAD), Weighted Linear Diffusion (WLD) and Weighted and Well-Balanced Flow (WWBF). It can be seen that the new model performs better than most of the other methods and on a par with a few of them.

Table 6 PSNR comparison for different diffusion-based schemes. Noisy images are obtained by adding Gaussian noise of strength \( \sigma_n = 25 \) to the original Lena and House images

From the experimental results, we can conclude that the new method is an efficient model for image denoising.

5 Conclusion

We proposed a new approach to image denoising by introducing a stochastic optimization algorithm for combining PDE-based denoising methods such as the ID model, the PM model, the TV model, and the WBF model. The new denoising model is more efficient in image denoising than the PDE-based denoising methods it combines. To illustrate the superiority of the proposed model, we have used the Peak Signal to Noise Ratio (PSNR) and the Blind/Referenceless Image Spatial QUality Evaluator (BRISQUE) as objective criteria. Numerical experiments show that our algorithm has a higher PSNR and a lower BRISQUE than the underlying denoising methods, which confirms the high performance of the proposed model.