Introduction

Digital image processing is a crucial component of many domains across science and engineering, playing an indispensable role in their advancement and innovation. It is used in medical imaging, remote sensing, multimedia applications, and computer vision [1]. However, images acquired and transmitted in real-world scenarios are often corrupted by various forms of noise, which degrades image quality and hampers subsequent analysis [1]. Among these types of noise, impulse noise poses a particularly challenging problem due to its irregular and abrupt occurrence, which replaces pixel values with corrupt ones and disrupts the underlying image structure.

In recent years, researchers have devoted extensive effort to developing denoising techniques for impulse noise. Classical methods include median filtering and adaptive filtering, which achieve some success in reducing noise artifacts.
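To make the classical baseline concrete, the following minimal NumPy sketch applies a plain 3×3 median filter to a single impulse pixel. The function name and window size are our own illustrative choices, not part of the methods compared later:

```python
import numpy as np

def median_filter_3x3(img):
    """Classical 3x3 median filter: replace each pixel by the median of
    its 3x3 neighborhood (image edges replicated by padding)."""
    padded = np.pad(img, 1, mode="edge")
    out = np.empty_like(img)
    rows, cols = img.shape
    for i in range(rows):
        for j in range(cols):
            out[i, j] = np.median(padded[i:i + 3, j:j + 3])
    return out

# A single salt-and-pepper corrupted pixel in a flat region is removed.
img = np.full((5, 5), 100.0)
img[2, 2] = 255.0          # impulse ("salt") pixel
restored = median_filter_3x3(img)
print(restored[2, 2])      # -> 100.0
```

The median of the eight clean neighbors and the one corrupted value is the clean gray level, which is exactly why median-type filters suppress impulses well; their weakness, noted below, is that they also smooth fine detail.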

An adaptive median filter (AMF) [2] detects and filters impulse noise in two phases, using two equations to predict which pixels are likely to be corrupted. The true image is denoted by \(X\), and \({\rm K} = \left\{ {1,2,3, \ldots M} \right\} \times \left\{ {1,2,3, \ldots N} \right\}\) is the index set of \(X\); \({\rm N} \subset {\rm K}\) is the set of indices of noise pixels detected in the first phase; \(P_{w,v}\) is the set of the four closest neighbors of the pixel at position \(\left( {w,v} \right) \in {\rm K}\); \(y_{w,v}\) is the observed pixel value of the image at position \(\left( {w,v} \right)\); and \(u_{w,v} = \left[ {\mu_{w,v} } \right]_{{\left( {w,v} \right) \in {\rm N}}}\) is a column vector of length \(c\) ordered lexicographically, see [3], where \(c\) is the number of elements in \({\rm N}\). In the second phase, the following function is minimized to recover the noise pixels:

$$ f_{\alpha } \left( \mu \right) = \mathop \sum \limits_{{\left( {w,v} \right) \in {\rm N}}} \left[ {\left| {\mu_{w,v} - y_{w,v} } \right| + \frac{\beta }{2}\left( {2 \times S_{w,v}^{1} + S_{w,v}^{2} } \right)} \right] $$
(1)

where \(\beta \) denotes the regularization parameter, \({S}_{w,v}^{1}=2{\sum }_{(m,n)\in {\rm P}_{w,v}\cap {\rm N}^{c}}{\phi }_{\alpha }({\mu }_{w,v}-{y}_{m,n})\), and \({S}_{w,v}^{2}={\sum }_{(m,n)\in {\rm P}_{w,v}\cap {\rm N}}{\phi }_{\alpha }({\mu }_{w,v}-{y}_{m,n})\). Here \({\phi }_{\alpha }(x)=\sqrt{\alpha +{x}^{2}}\), with \(\alpha >0\), is a potential function; it allows the noise to be reduced by minimizing a smooth functional, without the need for a non-smooth data-fitting term:

$$ f_{\alpha } \left( \mu \right) = \mathop \sum \limits_{{\left( {w,v} \right) \in {\rm N}}} \left[ {\left( {2 \times S_{w,v}^{1} + S_{w,v}^{2} } \right)} \right]. $$
(2)
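To make the role of the potential function concrete, the sketch below evaluates \(\phi_\alpha\) and the bracketed term \(2 S_{w,v}^{1} + S_{w,v}^{2}\) of (2) for a single noisy pixel, with the factors taken exactly as printed above. The value of \(\alpha\) and the function names are illustrative assumptions, not choices from the paper:

```python
import numpy as np

ALPHA = 1e-2  # smoothing parameter alpha > 0 (illustrative value)

def phi(x, alpha=ALPHA):
    """Edge-preserving potential phi_alpha(x) = sqrt(alpha + x**2)."""
    return np.sqrt(alpha + x ** 2)

def pixel_term(mu_wv, clean_nbrs, noisy_nbrs):
    """Contribution of one noisy pixel (w, v) to functional (2):
    2*S^1 + S^2, where S^1 sums phi over neighbors detected as clean
    (with its printed factor of 2) and S^2 sums phi over neighbors
    detected as noisy."""
    s1 = 2.0 * sum(phi(mu_wv - y) for y in clean_nbrs)
    s2 = sum(phi(mu_wv - y) for y in noisy_nbrs)
    return 2.0 * s1 + s2

# With mu equal to all neighbor values, only the smoothing floor
# sqrt(alpha) = 0.1 remains in each phi term.
print(pixel_term(100.0, [100.0, 100.0], [100.0]))  # -> 0.9
```

Because \(\phi_\alpha\) is smooth everywhere (unlike \(|x|\)), the functional can be minimized with gradient-based methods such as the conjugate gradient scheme developed below.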

However, traditional methods often have difficulty striking a balance between noise suppression and preservation of image details. Hence, there is a need for more advanced and adaptive methods that can efficiently address the problems posed by impulse noise in digital images.

Gradient methods are known to be vital in solving these kinds of problems. One popular example is the conjugate gradient method, a widely used optimization algorithm with remarkable performance and good convergence properties. Consider the problem

$$ \min f_{\alpha } \left( \mu \right),\quad \mu \in R^{n} $$
(3)

where \({f}_{\alpha }(\mu ):{R}^{n}\to R\). The conjugate gradient method generates a sequence of iterates of the form:

$$ \mu_{k + 1} = \mu_{k} + \alpha_{k} d_{k} . $$
(4)

where \(d_{k}\) is the search direction and \(\alpha_{k}\) is the step length obtained through a line search; for a quadratic model with Hessian \(Q\), the exact step length is:

$$ \alpha_{k} = - \frac{{g_{k}^{T} d_{k} }}{{d_{k}^{T} Qd_{k} }}. $$
(5)
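A minimal sketch of the exact step length (5) for a quadratic model; the function name and the toy problem are illustrative assumptions:

```python
import numpy as np

def exact_step(g, d, Q):
    """Exact step length (5) for a quadratic model with Hessian Q:
    alpha_k = -(g^T d) / (d^T Q d)."""
    return -(g @ d) / (d @ Q @ d)

# For f(x) = 0.5 * x^T Q x with Q = I, one exact step along the
# steepest-descent direction d = -g reaches the minimizer.
Q = np.eye(2)
x = np.array([3.0, -4.0])
g = Q @ x              # gradient of the quadratic at x
d = -g
alpha = exact_step(g, d, Q)
print(x + alpha * d)   # -> [0. 0.]
```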

For more details, the interested reader may refer to [4]. Alternatively, the step length \({\alpha }_{k}\) can be determined by the Wolfe conditions:

$$ f\left( {\mu_{k} + \alpha_{k} d_{k} } \right) \le f\left( {\mu_{k} } \right) + \delta \alpha_{k} g_{k}^{T} d_{k} $$
(6a)
$$ d_{k}^{T} g\left( {\mu_{k} + \alpha_{k} d_{k} } \right) \ge \sigma d_{k}^{T} g_{k} $$
(6b)

where \(0<\delta <\sigma <1\); see [5,6,7] for more information. The search direction in the conjugate gradient algorithm is computed through the following equation:

$$ d_{k + 1} = - g_{k + 1} + \beta_{k} d_{k} $$
(7)

where \({\beta }_{k}\) is a scalar called the conjugate gradient parameter. Many conjugate gradient parameters have been developed by different researchers, such as those of Dai-Yuan (DY) [8] and Fletcher-Reeves (FR) [9], which are defined as follows (with \(y_k = g_{k+1} - g_k\)):

$$ \beta_{k}^{FR} = \frac{{\left\| {g_{k + 1} } \right\|^{2} }}{{\left\| {g_{k} } \right\|^{2} }},\quad \beta_{k}^{DY} = \frac{{\left\| {g_{k + 1} } \right\|^{2} }}{{d_{k}^{T} y_{k} }}. $$
(8)
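The update (7) and the two parameters in (8) can be sketched in a few lines of NumPy; the toy gradient values are our own and serve only to exercise the formulas:

```python
import numpy as np

def beta_fr(g_new, g_old):
    """Fletcher-Reeves parameter from (8)."""
    return (g_new @ g_new) / (g_old @ g_old)

def beta_dy(g_new, g_old, d_old):
    """Dai-Yuan parameter from (8), with y_k = g_{k+1} - g_k."""
    y = g_new - g_old
    return (g_new @ g_new) / (d_old @ y)

def next_direction(g_new, d_old, beta):
    """Search-direction update (7): d_{k+1} = -g_{k+1} + beta_k * d_k."""
    return -g_new + beta * d_old

g_old = np.array([2.0, 0.0])
d_old = -g_old                       # first direction: steepest descent
g_new = np.array([1.0, 1.0])
print(beta_fr(g_new, g_old))         # -> 0.5
print(next_direction(g_new, d_old, beta_fr(g_new, g_old)))  # -> [-2. -1.]
```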

Much research has been conducted on the convergence properties of conjugate gradient techniques [8,9,10,11,12,13,14,15]. Among these results, Zoutendijk [16] demonstrated the global convergence of the FR method under exact line search. Several researchers have developed formulae for the conjugate gradient parameter with good numerical performance and convergence properties. Examples of work on the nonlinear conjugate gradient method include Hideaki and Yasushi [17] and Basim [18], in which the parameters are defined as follows:

$$ \beta_{k}^{HY} = \frac{{\left\| {g_{k + 1} } \right\|^{2} }}{{2/\alpha_{k} \left( {f_{k} - f_{k + 1} } \right)}},\beta_{k}^{B} = \frac{{\left\| {g_{k + 1} } \right\|^{2} }}{{\left( {f_{k} - f_{k + 1} } \right)/\alpha_{k} - g_{k}^{T} d_{k} /2}}. $$
(9)

These algorithms demonstrate some nice properties and good performance [19], and are also quite efficient computationally. In the context of unconstrained problems, the update proposed in this work aims to retain the benefits of the original conjugate gradient techniques while utilizing a quadratic model to enhance performance.

In this work, we propose a novel approach for impulse noise reduction in digital images using the conjugate gradient method. Our idea is to exploit the method's iterative nature and adaptability, tailoring it for noise reduction in images with the aim of an optimal trade-off between noise removal and image fidelity. We propose a new conjugate gradient parameter and perform a convergence analysis, followed by numerical experiments whose results are compared with an existing method.

Deriving the New Conjugate Gradient Parameter

We begin with the following Taylor series expansion:

$$ f\left( \mu \right) = f\left( {\mu_{k + 1} } \right) + g_{k + 1}^{T} \left( {\mu - \mu_{k + 1} } \right) + \frac{1}{2}(\mu - \mu_{k + 1} )^{T} Q\left( {\mu_{k} } \right)\left( {\mu - \mu_{k + 1} } \right) $$
(10)

where \(Q\) is the Hessian matrix. Differentiating (10) and setting \(\mu ={\mu }_{k}\), we obtain:

$$ g_{k + 1} = g_{k} + Q\left( {\mu_{k} } \right)s_{k} $$
(11)

Similarly, evaluating (10) at \(\mu = \mu_k\) with \(s_k = \mu_{k+1} - \mu_k\) and rearranging gives:

$$ s_{k}^{T} Q\left( {\mu_{k} } \right)s_{k} = 2\left( {f_{k} - f_{k + 1} } \right) + 2g_{k + 1}^{T} s_{k} , $$
(12)

applying \({g}_{k+1}={g}_{k}+{y}_{k}\), where \({y}_{k}=Q({\mu }_{k}){s}_{k}\) from (11), in (12), we have:

$$ s_{k}^{T} Q\left( {\mu_{k} } \right)s_{k} = 2\left( {f_{k} - f_{k + 1} } \right) + 2y_{k}^{T} s_{k} + 2g_{k}^{T} s_{k} . $$
(13)

Applying the conjugacy condition \({d}_{k+1}^{T}{y}_{k}=0\) to the search direction \({d}_{k+1}\) leads to:

$$ \beta_{k} = \frac{{g_{k + 1}^{T} Qs_{k} }}{{d_{k}^{T} Qs_{k} }} $$
(14)

where \(Q\) is the Hessian matrix. Substituting (13) into (14) and using \({s}_{k}={\alpha }_{k}{d}_{k}\), it holds that:

$$ \beta_{k} = \frac{{g_{k + 1}^{T} y_{k} }}{{2\left( {f_{k} - f_{k + 1} } \right)/\alpha_{k} + 2y_{k}^{T} d_{k} + 2g_{k}^{T} d_{k} }} $$
(15)

Using an exact line search in (15), it reduces to:

$$ \beta_{k} = \frac{{\left\| {g_{k + 1} } \right\|^{2} }}{{2\left( {f_{k} - f_{k + 1} } \right)/\alpha_{k} + 2y_{k}^{T} d_{k} + 2g_{k}^{T} d_{k} }} $$
(16)

which is the new proposed conjugate gradient parameter.
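The proposed parameter (16) is straightforward to transcribe into code. The sketch below follows the formula literally; practical safeguards (such as restarting when the denominator is near zero) are omitted, and the sample inputs are illustrative, not taken from the paper's experiments:

```python
import numpy as np

def beta_new(g_new, g_old, d_old, f_old, f_new, alpha_k):
    """Proposed conjugate gradient parameter (16):
    ||g_{k+1}||^2 / (2(f_k - f_{k+1})/alpha_k + 2 y_k^T d_k + 2 g_k^T d_k),
    with y_k = g_{k+1} - g_k."""
    y = g_new - g_old
    denom = (2.0 * (f_old - f_new) / alpha_k
             + 2.0 * (y @ d_old) + 2.0 * (g_old @ d_old))
    return (g_new @ g_new) / denom

# Illustrative values only.
g_old = np.array([2.0, 0.0])
g_new = np.array([0.0, 1.0])
d_old = np.array([-2.0, 0.0])
print(beta_new(g_new, g_old, d_old, f_old=2.0, f_new=1.0, alpha_k=1.0))  # -> 0.5
```

Unlike the FR parameter, the denominator here uses function values and gradient-direction products rather than \(\left\| {g_{k} } \right\|^{2}\), which is what ties the update to the quadratic model of the previous section.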

Global Convergence

To establish convergence, we make the following assumptions:

  1. \(f(\mu )\) is bounded below on the level set \(\mathcal{H}=\left\{\mu :\mu \in {R}^{n},f(\mu )\le f({\mu }_{1})\right\}\).

  2. The gradient \(\nabla f(\mu )\) is Lipschitz continuous with constant \(L>0\); i.e., for all \(\tau ,\upsilon \in {R}^{n}\), the following inequality holds:

    $$ \|g\left( \tau \right) - g\left( \upsilon \right)\| \le L \|\tau - \upsilon\| $$
    (17)

A key tool for assessing the global convergence of conjugate gradient methods is the Zoutendijk condition [16], stated in the following lemma.

Lemma 1

Suppose Assumptions 1 and 2 hold, \({\alpha }_{k}\) satisfies the Wolfe conditions, and \(d_{k}\) is a descent direction. Then:

$$ \sum\limits_{k = 1}^{\infty } {\frac{{\left( {g_{k}^{T} d_{k} } \right)^{2} }}{{\left\| {d_{k} } \right\|^{2} }}} < \infty $$
(18)

Theorem 1

Suppose the assumptions and Lemma 1 hold, and let \(\left\{{\mu }_{k}\right\}\) be the sequence generated by the method. Then:

$$ \mathop {lim}\limits_{k \to \infty } \left( {\inf \left\| {g_{k} } \right\|} \right) = 0 $$
(19)

Theorem 2

If the step length \({\alpha }_{k}\) is obtained via the Wolfe conditions, then:

$$ { }\mathop {lim}\limits_{k \to \infty } \,inf\left\| {g_{k} } \right\| = 0. $$
(20)

Proof

Suppose, by contradiction, that Eq. (20) does not hold. Then there exists \(r > 0\) such that for all \(k\):

$$ \left\| {g_{k + 1} } \right\| > r $$
(21)

Rearranging the search direction (7) as \({d}_{k+1}+{g}_{k+1}={\beta }_{k}{d}_{k}\) and taking squared norms on both sides, we obtain:

$$ \left\| {d_{k + 1} } \right\|^{2} + \left\| {g_{k + 1} } \right\|^{2} + 2d_{k + 1}^{T} g_{k + 1} = (\beta_{k} )^{2} \left\| {d_{k} } \right\|^{2} . $$
(22)

Applying the relation \({\beta }_{k}={d}_{k+1}^{T}{g}_{k+1}/({d}_{k}^{T}{g}_{k})\) in (22), we get:

$$ \left\| {d_{k + 1} } \right\|^{2} = \frac{{\left( {d_{k + 1}^{T} g_{k + 1} } \right)^{2} }}{{\left( {d_{k}^{T} g_{k} } \right)^{2} }}\left\| {d_{k} } \right\|^{2} - 2d_{k + 1}^{T} g_{k + 1} - \left\| {g_{k + 1} } \right\|^{2} . $$
(23)

Dividing (23) by \(({d}_{k+1}^{T}{g}_{k+1}{)}^{2}\), we obtain:

$$ \begin{aligned} \frac{{\left\| {d_{k + 1} } \right\|^{2} }}{{(d_{k + 1}^{T} g_{k + 1} )^{2} }} & = \frac{{\left\| {d_{k} } \right\|^{2} }}{{(d_{k}^{T} g_{k} )^{2} }} - \frac{{\left\| {g_{k + 1} } \right\|^{2} }}{{(d_{k + 1}^{T} g_{k + 1} )^{2} }} - \frac{2}{{d_{k + 1}^{T} g_{k + 1} }} \\ & = \frac{{\left\| {d_{k} } \right\|^{2} }}{{(d_{k}^{T} g_{k} )^{2} }} - \left( {\frac{{\left\| {g_{k + 1} } \right\|}}{{d_{k + 1}^{T} g_{k + 1} }} + \frac{1}{{\left\| {g_{k + 1} } \right\|}}} \right)^{2} + \frac{1}{{\left\| {g_{k + 1} } \right\|^{2} }} \\ & \le \frac{{\left\| {d_{k} } \right\|^{2} }}{{(d_{k}^{T} g_{k} )^{2} }} + \frac{1}{{\left\| {g_{k + 1} } \right\|^{2} }}. \\ \end{aligned} $$
(24)

Applying this bound recursively, and noting that \({d}_{1}=-{g}_{1}\) gives \(\Vert {d}_{1}{\Vert }^{2}/({d}_{1}^{T}{g}_{1}{)}^{2}=1/\Vert {g}_{1}{\Vert }^{2}\), we arrive at:

$$ \frac{{\left\| {d_{k + 1} } \right\|^{2} }}{{(d_{k + 1}^{T} g_{k + 1} )^{2} }} \le \mathop \sum \limits_{i = 1}^{k + 1} \frac{1}{{\left\| {g_{i} } \right\|^{2} }}. $$
(25)

Now assume there exists \({c}_{1}>0\) such that \(\Vert {g}_{k}\Vert \ge {c}_{1}\) for all \(k\). Then:

$$ \frac{{\left\| {d_{k + 1} } \right\|^{2} }}{{\left( {d_{k + 1}^{T} g_{k + 1} } \right)^{2} }} < \frac{k + 1}{{c_{1}^{2} }}. $$
(26)

Since the harmonic series diverges, we finally get:

$$ \mathop \sum \limits_{k = 1}^{\infty } \frac{{\left( {g_{k}^{T} d_{k} } \right)^{2} }}{{\left\| {d_{k} } \right\|^{2} }} = \infty $$
(27)

which contradicts Lemma 1. Hence, \(\underset{k\to \infty }{\mathrm{lim}}\, \mathrm{inf}\left(\Vert {g}_{k}\Vert \right)=0\) holds.

Numerical Experiments

In this section, we present numerical experiments demonstrating the performance of the proposed method; the primary objective is to show its efficacy in removing salt-and-pepper noise from images. The entire experiment is performed in MATLAB R2017a, and a comprehensive comparison is made between the newly presented method and the FR method over a diverse set of parameters. We use:

$$ \frac{{\left| {f\left( {\mu_{k} } \right) - f\left( {\mu_{k - 1} } \right)} \right|}}{{\left| {f\left( {\mu_{k} } \right)} \right|}} \le 10^{ - 4} \quad {\text{and}}\quad \left\| {\nabla f\left( {\mu_{k} } \right)} \right\| \le 10^{ - 4} \left( {1 + \left| {f\left( {\mu_{k} } \right)} \right|} \right) $$
(28)

as a stopping criterion.
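The stopping criterion (28) can be transcribed directly; in this sketch the second condition is read as a bound on the gradient norm, which is an interpretation on our part, and the tolerance is a parameter:

```python
def should_stop(f_curr, f_prev, grad_norm, tol=1e-4):
    """Stopping test (28): the relative change in f and the gradient
    norm (relative to 1 + |f|) are both below tolerance."""
    rel_change = abs(f_curr - f_prev) / abs(f_curr)
    return rel_change <= tol and grad_norm <= tol * (1.0 + abs(f_curr))

print(should_stop(1.0, 1.0 + 1e-6, 1e-5))   # -> True
print(should_stop(1.0, 2.0, 1e-5))          # -> False
```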

The test images include the popular benchmarks Lena, House, Cameraman, and Elaine. To evaluate the restoration quality achieved by the methods, we use the peak signal-to-noise ratio (PSNR) as the indicator, given by

$$ PSNR = 10\log_{10} \frac{{255^{2} }}{{\frac{1}{MN}\mathop \sum \nolimits_{i,j} \left( {\mu_{i,j}^{r} - \mu_{i,j}^{*} } \right)^{2} }} $$
(29)

where \({\mu }_{i,j}^{r}\) and \({\mu }_{i,j}^{*}\) denote the pixel values of the original image and the restored image, respectively; see [20,21,22,23]. For each restored image, we report the PSNR, the number of function evaluations (NF), and the number of iterations (NI) required for the entire denoising process. As Table 1 shows, the proposed method outperforms the FR method on most of the test images in terms of iterations, function evaluations, and PSNR. This indicates that the proposed approach is the more efficient and effective of the two, and that, as a denoising strategy, it is superior to the FR method (Figs. 1, 2, 3 and 4).
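The PSNR measure (29) is easy to compute directly; the following NumPy sketch uses toy images of our own to exercise the formula:

```python
import numpy as np

def psnr(original, restored):
    """Peak signal-to-noise ratio (29) in dB for 8-bit images:
    10 * log10(255^2 / MSE)."""
    diff = original.astype(np.float64) - restored.astype(np.float64)
    mse = np.mean(diff ** 2)
    return 10.0 * np.log10(255.0 ** 2 / mse)

# A uniform error of 5 gray levels on every pixel gives MSE = 25.
a = np.zeros((4, 4))
b = np.full((4, 4), 5.0)
print(round(psnr(a, b), 2))   # -> 34.15
```

Higher PSNR means the restored image is closer to the original, which is why it serves as the restoration-quality indicator in Table 1.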

Table 1 Numerical results of FR and proposed method
Fig. 1

Demonstration of FR and the new method on the Lena image

Fig. 2

Demonstration of FR and the new method on the House image

Fig. 3

Demonstration of FR and the new method on the Elaine image

Fig. 4

Demonstration of FR and the new method on the Cameraman image

Conclusions

In this work, we presented a new conjugate gradient method for impulse noise reduction in digital images. The proposed method demonstrated remarkable denoising capability, effectively restoring corrupted pixel values while preserving essential image features. Our experiments showed the efficiency of the proposed method, which outperformed the existing FR method. This work contributes to the growing interest in optimization-based methods for image restoration and motivates further exploration of conjugate gradient methods for other image-related problems.