1 Introduction

In recent years, many interesting scientific problems have increasingly involved analyzing and manipulating structured data. Such data often consist of sampled real-valued functions defined on irregular domains such as graphs. Since many traditional signal processing methods are designed for data defined on regular Euclidean domains, as in image and video processing, developing mathematical tools and methods that can accommodate complicated data domains is an important topic. In practical applications, many data sets, such as mesh surfaces, point clouds and data defined on network-like structures, can naturally be modeled as scalar functions defined on the vertices of graphs, which are often viewed as discretizations of, or random samples from, some Riemannian manifold [1, 10, 16, 19, 31, 37].

Image processing on the graph domain is interesting, for it arises in many practical applications and provides new insights into signal processing (see e.g. [12, 25, 38]). Graphs provide a flexible generalization of regular Euclidean and surface domains [2, 4, 15, 20, 28, 35, 37], and they can approximate arbitrary topological and geometric structures. For example, Niyobuhungiro et al. [25] considered an analogue of the well-known ROF denoising model [28] on a general finite, directed and connected graph. Zosso et al. [38] considered a graph-based approach to image segmentation. More recently, Dong [12] introduced a tight wavelet frame transform on graphs and discussed graph data denoising.

A graph is denoted by \(G=\{V,E,\omega \}\), where \(V:=\{v_k:k=0, \ldots , K-1\}\) is the set of vertices, \(E\subset V \times V\) is the set of edges and \(\omega :E\mapsto \mathbb {R}^+\) is a weight function on the edges. Let \(u:V\mapsto \mathbb {R}\) be an image function defined on the graph G, which can be viewed as a vector in \(\mathbb {R}^K\). Due to measurement instruments and the sampling process, noise is inevitable [29]. Thus, a fundamental problem is to remove noise to obtain high-quality images before further processing. Let \(f_G\) be an observed image on graph G which is formulated as

$$\begin{aligned} f_G=u+\eta , \end{aligned}$$
(1.1)

where u is the clean image and \(\eta \) is the perturbation noise. The case where \(\eta \) in (1.1) is additive white Gaussian noise is the one mostly considered in the early literature, because it characterizes system noise well; however, non-Gaussian types of noise are also encountered in many real observations due to noisy sensors and channel transmission (see e.g. [18]).

An important variant is Poisson noise, which is generally produced by a low number of photons, as in fluorescence microscopy and emission tomography. For Poisson data, the probability of receiving k particles is given by

$$\begin{aligned} \mathbf {P}(k)=\frac{e^{-\tau }\tau ^k}{k!}, k=0,1,2, \ldots , \end{aligned}$$
(1.2)

where \(\tau \) is the expected value and the variance of the random counts. When images are defined on Euclidean space, many studies have been made on Poisson noise removal in the past decade (see e.g. [9, 18, 21, 22, 36]). For example, based on the statistics of Poisson noise, the generalized Kullback–Leibler (KL)-divergence fidelity was used in a variational model for Poisson noise removal [11]. Luisier et al. in [23] constructed a Stein’s unbiased risk estimator (SURE) in the wavelet domain for removal of mixed Poisson–Gaussian noise. Gong et al. in [18] proposed a universal \(\ell _1 + \ell _2\) fidelity term for mixed or unknown noise removal. Recently, Staglianò et al. [33] and Li et al. [22] studied Poisson noise removal by approximating the generalized KL-divergence in terms of a weighted least-squares function.
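As a quick numerical sanity check of (1.2) (a minimal Python sketch; truncation point and \(\tau\) are arbitrary choices), the probabilities sum to one and the mean and variance both equal \(\tau\):

```python
import numpy as np
from math import factorial

def poisson_pmf(k, tau):
    """Probability (1.2) of receiving k particles with expected count tau."""
    return np.exp(-tau) * tau ** k / factorial(k)

tau = 4.0
ks = np.arange(60)                                  # tail beyond is negligible
p = np.array([poisson_pmf(int(k), tau) for k in ks])
mean = ks @ p                                       # approximately tau
var = (ks ** 2) @ p - mean ** 2                     # approximately tau as well
```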

The main focus of this paper is to extend Poisson noise removal of images on regular Euclidean space to images defined on graphs. We assume that the observation \(f_G\) on graph is corrupted by Poisson noise, i.e.

$$\begin{aligned} f_G \sim Poisson(u), \end{aligned}$$

and the noise on each pixel is independent. Inspired by some recent wavelet frames-based image restoration methodologies (see [8, 14, 18, 22] and the references therein), we propose the following variational model to remove noise:

$$\begin{aligned} \min \limits _u E(u):= \frac{1}{2}\Vert \frac{u-f_G}{\sqrt{u}}\Vert _{\ell _2(G)}^2+\lambda \Vert \mathcal {W}u\Vert _{\ell _1(G)}. \end{aligned}$$
(1.3)

Here, the first term, which can be rewritten as \(\frac{1}{2}\Vert u-f_G\Vert _{\Sigma ^{-1}}^2\) with \(\Sigma =\text {diag}(u)\), is a reweighted \(\ell _2\) fidelity term. This term was first introduced in [22, 33] for approximating the KL-divergence fidelity. In practical implementation, in order to guarantee numerical stability, we add a very small positive constant (the fixed background image, see [3, 6, 30]) to u in model (1.3). The second term is a regularization term, where \(\mathcal {W}\) is the tight wavelet frame transform on graphs, \(\Vert \cdot \Vert _{\ell _1 (G)}\) denotes the \(\ell _1\) vector norm on graphs, and \(\lambda \) is a positive parameter balancing the two terms. The detailed derivation of model (1.3) will be given in Sect. 3.

The regularization term is designed based on an a priori assumption on u. One commonly used assumption is the sparsity of the underlying solution in the wavelet frame domain [7, 8, 14, 22]. The effectiveness of wavelet tight frames has been demonstrated in many applications in signal and image processing [8, 14, 18, 22], since they sparsely approximate piecewise smooth functions in an efficient way and provide fast decomposition and reconstruction algorithms. We will show that such a simple system can also be used to effectively restore images defined on graphs from noisy data. The extension of image denoising models on flat domains to graph data denoising is not trivial because of the nonlinear nature of graphs and the corresponding algorithms [12, 13]. Furthermore, because of the \(\ell _1\)-norm, the regularization term \(\Vert \mathcal {W}u\Vert _{\ell _1(G)}\) favors a solution u whose wavelet coefficient sequence is sparse, which helps keep important features of the image data, such as textures and edges, while removing spurious information.

The difficulties in solving (1.3) are the nonlinear nature of the graph domain and the need to either approximate or directly solve the weighted least-squares problem involving the unknown u in the fidelity term. We are interested in taking advantage of the weighted least-squares structure and utilizing popular, efficient sparse regularization schemes, such as the alternating direction method of multipliers (ADMM) [5, 8, 17], to solve the model.

The rest of this paper is organized as follows: in Sects. 2.1 and 2.2, we give a brief review of spectral graph theory and the wavelet frame transform on graphs. Next, in Sect. 3.1, we propose a wavelet frame-based variational model for removal of Poisson noise of images on graphs. Then, we present an algorithm to solve the model. In Sect. 3.2, we consider the case of removal of mixed Poisson–Gaussian noise of images on graphs. In the last section, we present some numerical experiments and compare with other denoising methods.

2 Background

2.1 Graph Fourier transform

To understand and analyze the data on graphs, we first review the spectral graph theory, especially the graph Laplacian, which is widely used to reveal the geometric properties of the graph. Let \(G:=\{V,E,\omega \}\) be a graph, where \(V:=\{v_k:k=0, \ldots ,K-1\}\) is the set of vertices, \(E\subset V \times V\) is the set of edges, and \(\omega :E\mapsto \mathbb {R}^+\) denotes a weight function. Here, we choose the following commonly used weight function

$$\begin{aligned} \omega (v_k,v_{k'}):=e^{-\Vert v_k-v_{k'}\Vert _2^2/\rho },\quad \rho >0. \end{aligned}$$
(2.1)

Let \(A:=(a_{k,k'})\) be the adjacency matrix of G with

$$\begin{aligned} a_{k,k'}:=\left\{ \begin{aligned}&\omega (v_k,v_{k'})&\quad (v_k,v_{k'})\in E,\\&0&\quad \text {otherwise}. \end{aligned} \right. \end{aligned}$$

Let \(D:=\text {diag}\{d[0],d[1], \ldots ,d[K-1]\}\) be the degree matrix of G, where d[k] is the degree of node \(v_k\) defined by \(d[k]:=\Sigma _{k'}a_{k,k'}\). Then the (unnormalized) graph Laplacian \(\mathcal {L}\) is defined as

$$\begin{aligned} \mathcal {L}:=D-A. \end{aligned}$$
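As a concrete illustration, the weight function (2.1), the adjacency and degree matrices, and the Laplacian can be assembled as follows (a minimal Python sketch on a fully connected toy graph; in practice one usually keeps only nearest-neighbor edges):

```python
import numpy as np

def graph_laplacian(vertices, rho):
    """Unnormalized graph Laplacian L = D - A with weights (2.1)."""
    V = np.asarray(vertices, dtype=float)
    # Pairwise squared Euclidean distances ||v_k - v_k'||^2.
    d2 = np.sum((V[:, None, :] - V[None, :, :]) ** 2, axis=2)
    A = np.exp(-d2 / rho)
    np.fill_diagonal(A, 0.0)            # no self-loops
    D = np.diag(A.sum(axis=1))          # degrees d[k] = sum_k' a_{k,k'}
    return D - A
```

The resulting matrix is symmetric positive semidefinite with zero row sums, so its smallest eigenvalue is 0 with the constant vector as eigenfunction.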

With eigenvalue decomposition, we denote the set of pairs of eigenvalues and eigenfunctions of \(\mathcal {L}\) as \(\{(\lambda _k,u_k)\}_{k=0}^{K-1}\) with \(0=\lambda _0<\lambda _1\le \lambda _2\le \cdots \le \lambda _{K-1}\). The eigenfunctions form an orthonormal basis for all functions on the graph:

$$\begin{aligned} \langle u_k,u_{k'}\rangle :=\sum _{n=0}^{K-1} u_k[n]u_{k'}[n]=\delta _{k,k'}. \end{aligned}$$

Let \(f_G:V\mapsto \mathbb {R}\) be a function on the graph G. Then its Fourier transform is defined by

$$\begin{aligned} \widehat{f_G}[k]:=\sum _{n=0}^{K-1} f_G[n]u_k[n], \quad \quad k=0, 1, \ldots , K-1. \end{aligned}$$
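In matrix terms, with \(U=[u_0,\ldots ,u_{K-1}]\) the orthonormal eigenvectors of \(\mathcal {L}\), the transform is \(\widehat{f_G}=U^T f_G\) and its inverse is \(f_G=U\widehat{f_G}\). A minimal sketch (the path-graph Laplacian below is only an illustrative choice):

```python
import numpy as np

def graph_fourier(L, f):
    """Graph Fourier transform: coefficients of f in the eigenbasis of L."""
    lam, U = np.linalg.eigh(L)      # eigenvalues ascending, lam[0] ~ 0
    return lam, U, U.T @ f          # f_hat[k] = <f, u_k>

# Unweighted path graph on 5 vertices as a toy Laplacian.
K = 5
A = np.diag(np.ones(K - 1), 1) + np.diag(np.ones(K - 1), -1)
L = np.diag(A.sum(axis=1)) - A
lam, U, f_hat = graph_fourier(L, np.arange(K, dtype=float))
```

Since U is orthonormal, the transform preserves the \(\ell _2\) norm (Parseval) and `U @ f_hat` recovers the original signal.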

2.2 Wavelet frame transform on graphs

Wavelet frames have proved over the past decade to be an exceptionally useful tool for image denoising, inpainting, function and surface reconstruction, etc. (see [7, 8, 13, 14, 22, 34, 35] and the many references therein). Much of the power of wavelet methods comes from their ability to represent both smooth and locally bumpy functions in an efficient way and to provide time and frequency localization. In this subsection, we introduce the wavelet frame transform on graphs. Interested readers should consult [12] for more details.

A countable set \(X \subset L_2(\mathbb R)\) is called a tight frame of \(L_2(\mathbb R)\) if

$$\begin{aligned} f=\sum _{g \in X} \langle f, g \rangle g,\quad \forall \, f \in L_2(\mathbb R), \end{aligned}$$

where \(\langle \cdot , \cdot \rangle \) denotes the inner product of \(L_2(\mathbb R)\). A wavelet system \(X(\Psi )\) is defined to be a collection of dilations and shifts of a finite set \(\Psi :=\{\psi _1, \dots , \psi _r \} \subset L_2(\mathbb R)\),

$$\begin{aligned} X(\Psi ):=\{ \psi _{\ell ,j,k}=2^{j/2}\psi _{\ell } (2^j \cdot -k), 1\le \ell \le r, \, j, k \in \mathbb Z \}. \end{aligned}$$

When \(X(\Psi )\) forms a tight frame, it is called a wavelet tight frame.

To construct wavelet tight frames, one usually starts with a refinable function \(\phi \) with a refinement mask \(a_0\) satisfying

$$\begin{aligned} \phi =2 \sum _{k \in \mathbb Z} a_0[k] \phi (2\cdot - k). \end{aligned}$$

The idea of an MRA-based construction of a wavelet tight frame is to find masks \(a_{\ell }\), which are finite sequences, such that

$$\begin{aligned} \psi _{\ell }=2 \sum _{k \in \mathbb Z} a_{\ell }[k] \phi (2\cdot - k), \quad 1\le \ell \le r. \end{aligned}$$

The sequences \(a_1, \ldots , a_r\) are the high pass filters of the system, and \(a_0\) is the low pass filter.

The Unitary Extension Principle (UEP) of [26, 27] provides a general theory for the construction of tight wavelet frames. Roughly speaking, \(X(\Psi )\) forms a tight frame provided that

$$\begin{aligned} \sum _{\ell =0}^r |\widehat{a}_{\ell }(\xi ) |^2 =1 \quad \text{ and } \quad \sum _{\ell =0}^r \widehat{a}_{\ell }(\xi ) \overline{\widehat{a}_{\ell } (\xi +\pi )} =0, \end{aligned}$$

where the Fourier series of \(a_{\ell }\) is denoted as

$$\begin{aligned} \widehat{a}_{\ell }(\xi ):=\sum _{k\in \mathbb {Z}}a_{\ell }[k]e^{-ik\xi }. \end{aligned}$$

As an application of the UEP, a family of wavelet tight frame systems is derived in [26] by using uniform B-splines as the refinable function \(\phi \). The simplest system in this family is the piecewise linear B-spline frame, where \(\phi =B_2=\max (1-|x|, 0)\) with the refinement mask \(a_0=\frac{1}{4}[1, 2, 1]\), and \(a_1=\frac{\sqrt{2}}{4} [1, 0, -1 ]\), \(a_2=\frac{1}{4} [-1, 2, -1]\). Then, the system \(X(\{\psi _1, \psi _2\})\) defined above is a tight wavelet frame of \(L_2(\mathbb R)\).
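The UEP conditions can be verified numerically for these filters. In the sketch below (a standalone Python check; the centering of the filters at \(k=-1,0,1\) is the standard convention), the Fourier series of each mask is evaluated on a grid and both conditions are tested:

```python
import numpy as np

# Piecewise linear B-spline frame filters, centered at k = -1, 0, 1.
FILTERS = {0: np.array([1.0, 2.0, 1.0]) / 4,
           1: np.array([1.0, 0.0, -1.0]) * np.sqrt(2) / 4,
           2: np.array([-1.0, 2.0, -1.0]) / 4}

def a_hat(ell, xi):
    """Fourier series a_hat_ell(xi) = sum_k a_ell[k] exp(-i k xi)."""
    k = np.array([-1, 0, 1])
    return FILTERS[ell] @ np.exp(-1j * np.outer(k, np.atleast_1d(xi)))

xi = np.linspace(-np.pi, np.pi, 201)
# UEP condition 1: sum of squared magnitudes equals 1.
cond1 = sum(np.abs(a_hat(l, xi)) ** 2 for l in range(3))
# UEP condition 2: sum of a_l(xi) * conj(a_l(xi + pi)) equals 0.
cond2 = sum(a_hat(l, xi) * np.conj(a_hat(l, xi + np.pi)) for l in range(3))
```

One can also confirm here that \(\widehat{a}_0(\xi )=\cos ^2(\xi /2)\), the form used later in the experiments.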

For a graph \(G:=\{V,E,\omega \}\), we represent a function \(f_G:V\mapsto \mathbb {R}\) by a K-dimensional vector defined on the vertices. Let \(\{ \lambda _k: {k=0}, \ldots , {K-1} \}\) be the eigenvalues of \(\mathcal {L}\) defined in Sect. 2.1. In the following, we introduce the discrete tight wavelet frame transform of \(f_G\) in the Fourier domain.

Let \(\{a_{\ell }: 0 \le \ell \le r \}\) be the masks of a tight frame system \(X(\Psi )\) and \(\widehat{a}_{\ell }^*\) be the complex conjugate of \(\widehat{a}_{\ell }\). The (undecimated) L-level tight wavelet frame decomposition \(\mathcal {W}\) is defined as

$$\begin{aligned} \mathcal {W}f_G := \{\mathcal {W}_{\ell , p}f_G: 0\le \ell \le r,\, 1 \le p \le L\} \end{aligned}$$

with

$$\begin{aligned}&\widehat{\mathcal {W}_{\ell , p}f_G}[k]\nonumber \\&\quad :=\left\{ \begin{aligned}&\widehat{a}_{\ell }^*(2^{-N}\lambda _k)\widehat{f_G}[k]&p=1\\&\widehat{a}_{\ell }^*(2^{-N+p-1}\lambda _k)\widehat{a}_0^*(2^{-N+p-2}\lambda _k)\cdots \widehat{a}_0^*(2^{-N}\lambda _k)\widehat{f_G}[k] \quad&2\le p \le L. \end{aligned} \right. \nonumber \\ \end{aligned}$$
(2.2)

Here, the index \(\ell , 0 \le \ell \le r\), denotes the band of the transform, and the index p denotes the level of the transform. The dilation scale N is chosen as the smallest integer such that \(\lambda _{{\max }}:=\lambda _{K-1}\le 2^N \pi \). Note that the scale N is selected such that \(2^{-N}\lambda _k\in [0,\pi ]\) for \(0 \le k \le {K-1}\).

Let \(\varvec{\alpha }:= \mathcal {W}f_G :=\{\alpha _{\ell , p}:0\le \ell \le r,\, 1 \le p \le L\}\), with \(\alpha _{\ell , p} := \mathcal {W}_{\ell , p} f_G\), denote the tight wavelet frame coefficients of \(f_G\). We denote the tight wavelet frame reconstruction by \(\mathcal {W}^T\varvec{\alpha }\), which is defined by the following iterative procedure

$$\begin{aligned} \widehat{\alpha }_{0,p-1}[k]=\sum _{\ell =0}^{r}\widehat{a}_{\ell }(2^{-N+p-1}\lambda _k)\widehat{\alpha }_{\ell , p}[k] \quad \text {for}\,\, p = L,L-1, \ldots , 1, \end{aligned}$$

where \(\alpha _{0,0}:= \mathcal {W}^T\varvec{\alpha }\) is the reconstructed graph data from \(\varvec{\alpha }\). By [12, Theorem 3.1], we have \(\mathcal {W}^T\mathcal {W} = \mathcal {I}\), i.e. \(\mathcal {W}^T\mathcal {W}{f}_{G}={f}_{G}\) for any function \({f}_{G}\) defined on graph G.
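To make the transform concrete, the following sketch (hypothetical helper names) implements a one-level (\(L=1\)) decomposition and reconstruction through the full eigendecomposition, using the real-valued piecewise linear B-spline masks in their Fourier-domain form, so the conjugation in (2.2) is trivial; a practical implementation would use the Chebyshev approximation described next:

```python
import numpy as np

# Piecewise linear B-spline masks in the Fourier domain (real-valued).
MASKS = [lambda x: np.cos(x / 2) ** 2,
         lambda x: np.sin(x) / np.sqrt(2),
         lambda x: np.sin(x / 2) ** 2]

def frame_decompose(L_mat, f):
    """One-level tight wavelet frame decomposition (2.2) on a graph."""
    lam, U = np.linalg.eigh(L_mat)
    # Smallest N with lam_max <= 2^N * pi, so 2^-N lam_k lies in [0, pi].
    N = max(0, int(np.ceil(np.log2(max(lam[-1], 1e-12) / np.pi))))
    f_hat = U.T @ f
    coeffs = [U @ (m(lam / 2 ** N) * f_hat) for m in MASKS]
    return coeffs, N

def frame_reconstruct(L_mat, coeffs, N):
    """Reconstruction W^T: apply each synthesis filter and sum the bands."""
    lam, U = np.linalg.eigh(L_mat)
    return sum(U @ (m(lam / 2 ** N) * (U.T @ c)) for m, c in zip(MASKS, coeffs))
```

Since \(\sum _{\ell }|\widehat{a}_{\ell }(\xi )|^2=1\), composing the two functions returns the input exactly, mirroring \(\mathcal {W}^T\mathcal {W}=\mathcal {I}\).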

In practical computations, it is very expensive to obtain the full set of eigenvectors and eigenvalues of the graph Laplacian of large graphs. A solution to this challenge is to approximate the masks by Chebyshev polynomials [24, 32], so that the eigenvalue decomposition of the graph Laplacian is not needed. The masks \(a_{\ell }, 0 \le \ell \le r\), that we use are finitely supported sequences; thus, \(\widehat{a}_{\ell }\) are trigonometric polynomials and can be accurately approximated by low-degree Chebyshev polynomials. The Chebyshev polynomial approximation of the mask \(\widehat{a}_{\ell }(\xi )\), \(\xi \in [0,\pi ]\), can be formulated as

$$\begin{aligned} \widehat{a}_{\ell }(\xi )\approx \mathcal {T}_{\ell }^{n}(\xi )= \frac{1}{2}c_{\ell ,0}+\sum _{k=1}^{n-1}c_{\ell ,k}T_k(\xi ), \end{aligned}$$

where

$$\begin{aligned} c_{\ell ,k}=\frac{2}{\pi }\int _0^\pi \cos (k\theta ) \widehat{a}_{\ell }\left( \frac{\pi }{2}(\cos (\theta )+1)\right) \text {d}\theta , \end{aligned}$$

and

$$\begin{aligned} T_k(\xi )=\left\{ \begin{aligned}&1&\quad k=0,\\&\frac{\xi -\pi /2}{\pi /2}&\quad k=1, \\&\frac{4}{\pi }(\xi -\pi /2)T_{k-1}(\xi )-T_{k-2}(\xi )&\quad k=2,3, \ldots . \end{aligned} \right. \end{aligned}$$

Note that the graph Laplacian \({\mathcal {L}}\) admits the eigenvalue decomposition \({\mathcal L} =U \Lambda U^T\), where \(\Lambda :=\text {diag}\{\lambda _0, \lambda _1,\ldots ,\lambda _{K-1}\}\) and columns of U are the eigenvectors. Then the wavelet frame transform (2.2) can be rewritten in the matrix form in time domain:

$$\begin{aligned}&\mathcal {W}_{\ell , p}f_G\nonumber \\&\quad :=\left\{ \begin{aligned}&U {{\widehat{\mathbf {a}}}_{\ell }^*} (2^{-N} \Lambda ) U^T f_G&\quad p=1\\&U {{\widehat{\mathbf {a}}}_{\ell }^*} (2^{-N+p-1} \Lambda ) {{\widehat{\mathbf {a}}}_{0}^*} (2^{-N+p-2} \Lambda ) \cdots {{\widehat{\mathbf {a}}}_{0}^*} (2^{-N} \Lambda ) U^T f_G&\quad 2\le p\le L \end{aligned} \right. , \end{aligned}$$

where \({{\widehat{\mathbf {a}}}_{\ell }^*} (\gamma \Lambda ):= \text {diag} \{ \widehat{a}_{\ell }^*(\gamma \lambda _0), \widehat{a}_{\ell }^*(\gamma \lambda _1), \ldots , \widehat{a}_{\ell }^*(\gamma \lambda _{K-1}) \}\). If we substitute \(\widehat{a}_{\ell }\) by the polynomial \(\mathcal {T}_{\ell }^{n}\), then

$$\begin{aligned}&U {{\widehat{\mathbf {a}}}_{\ell }^*} (2^{-N} \Lambda ) U^T f_G \approx U \mathcal {T}_{\ell }^{n*} (2^{-N} \Lambda ) U^T f_G \nonumber \\&\quad =\mathcal {T}_{\ell }^{n*} (2^{-N} U \Lambda U^T ) f_G= \mathcal {T}_{\ell }^{n*} (2^{-N} \mathcal L) f_G \end{aligned}$$
(2.3)

and for \(2 \le p \le L\),

$$\begin{aligned}&U {{\widehat{\mathbf {a}}}_{\ell }^*} (2^{-N+p-1} \Lambda ) {{\widehat{\mathbf {a}}}_{0}^*} (2^{-N+p-2} \Lambda ) \cdots {{\widehat{\mathbf {a}}}_{0}^*} (2^{-N} \Lambda ) U^T f_G \nonumber \\&\quad \approx \mathcal {T}_{\ell }^{n*}(2^{-N+p-1}\mathcal {L})\mathcal {T}_0^{n*}(2^{-N+p-2}\mathcal {L})\cdots \mathcal {T}_0^{n*}(2^{-N}\mathcal {L})f_G. \end{aligned}$$
(2.4)

Thus, by the iterative definition of the Chebyshev polynomials, only matrix–vector multiplications are involved in the computations of (2.3) and (2.4), and we do not need the full set of eigenvectors and eigenvalues of \(\mathcal L\). Similarly, the wavelet frame reconstruction \(\mathcal {W}^T\) can also be computed approximately.
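A sketch of this matrix-free evaluation (function names are hypothetical): the coefficients \(c_{\ell ,k}\) are computed by discretizing the integral above with the midpoint rule in \(\theta\) (spectrally accurate here, since the integrand extends to a smooth periodic function), and \(\mathcal {T}^{n}(2^{-N}\mathcal {L})f\) is evaluated with the three-term recursion, so only matrix–vector products with \(\mathcal {L}\) are needed:

```python
import numpy as np

def cheb_coeffs(a_hat, n, quad_pts=200):
    """Coefficients c_k of the degree-(n-1) Chebyshev expansion of a_hat on [0, pi]."""
    theta = (np.arange(quad_pts) + 0.5) * np.pi / quad_pts   # midpoint nodes
    vals = a_hat(np.pi / 2 * (np.cos(theta) + 1))
    return np.array([2.0 / quad_pts * np.sum(np.cos(k * theta) * vals)
                     for k in range(n)])

def cheb_apply(L_op, f, c):
    """Evaluate (c_0/2 + sum_{k>=1} c_k T_k)(L_op) @ f via the recursion
    T_k = (4/pi)(xi - pi/2) T_{k-1} - T_{k-2}; only mat-vecs with L_op."""
    shift = lambda v: (4.0 / np.pi) * (L_op @ v - (np.pi / 2) * v)
    t_prev, t_cur = f, 0.5 * shift(f)          # T_0 f and T_1 f
    out = 0.5 * c[0] * t_prev + c[1] * t_cur
    for k in range(2, len(c)):
        t_prev, t_cur = t_cur, shift(t_cur) - t_prev
        out = out + c[k] * t_cur
    return out
```

For the smooth masks used here, a degree-7 expansion (\(n=8\)) already matches the exact filter to high accuracy on a test matrix with spectrum in \([0,\pi ]\).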

3 Model and algorithm

3.1 Poisson noise removal of images on graphs

For a graph \(G:=\{V,E,\omega \}\), we represent a function \(f_G:V\mapsto \mathbb {R}\) by a K-dimensional vector defined on the vertices. Let \(f_G\) be the observed noisy image on the graph, which is formulated as

$$\begin{aligned} f_G=u+\eta , \end{aligned}$$

where \(\eta \) is the perturbation noise. We assume that \(f_G\) is corrupted by Poisson noise (see (1.2)), i.e.

$$\begin{aligned} f_G\sim \mathbf {P}(u), \end{aligned}$$

and the noise on each pixel is independent.

Then, given u, we have the likelihood of observing \(f_G\)

$$\begin{aligned} \mathbf {P}(f_G|u)=\prod _{i=0}^{K-1} \frac{u_i^{f_i}e^{-u_i}}{f_i !}, \end{aligned}$$

where \(u_i\) and \({f}_i\) denote the ith elements of u and \(f_G\), respectively. By the properties of the Poisson distribution, the mean and variance of \(f_G\) are

$$\begin{aligned} \mathbf E (f_G|u)=\mathbf Var (f_G|u)=u. \end{aligned}$$
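A quick empirical check of this mean–variance identity (a standalone Python sketch; the intensity and sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
u = 40.0                                  # true intensity at one vertex
f = rng.poisson(u, size=200_000)          # repeated Poisson observations
eta = f - u                               # "additive" noise f_G - u
# Empirically, E(f|u) ~ u and Var(f|u) ~ u, so eta has mean ~0, variance ~u.
```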

Let

$$\begin{aligned} \eta :=f_G-u \end{aligned}$$

be the “additive” random noise of the underlying image u on graph G. In the following, we approximate \(\eta \) by a normal distribution.

Given u, we have

$$\begin{aligned} \mathbf E (\eta |u)=\mathbf E (f_G|u)-u=0, \,\, \mathbf Var (\eta |u)=\mathbf Var (f_G|u)=u. \end{aligned}$$

Then, we approximate \(\eta \) by the additive Gaussian noise \(\mathcal {N}(0,u)\), i.e.,

$$\begin{aligned}&\mathbf {P}(\eta |u) \propto \exp \left\{ -\frac{1}{2} \eta ^T\Sigma ^{-1} \eta \right\} \nonumber \\&\quad = \exp \left\{ -\frac{1}{2}(f_G-u)^T\Sigma ^{-1}(f_G-u)\right\} , \end{aligned}$$
(3.1)

where \(\Sigma \) is the covariance matrix. Due to the independence of noise at each element, we have

$$\begin{aligned} \Sigma =\text {diag}(u). \end{aligned}$$

We then take negative \(\log \) of the normal distribution (3.1) and have

$$\begin{aligned} -\log \mathbf {P}(\eta |u)\propto \frac{1}{2}(f_G-u)^T\Sigma ^{-1}(f_G-u). \end{aligned}$$

Thus, by maximum likelihood estimation, we obtain the following fidelity term

$$\begin{aligned} \frac{1}{2}\Vert u-f_G\Vert _{\Sigma ^{-1}}^2:=\frac{1}{2}\left\| \frac{u-f_G}{\sqrt{u}}\right\| _{\ell _2(G)}^2. \end{aligned}$$
(3.2)

Here, the weighted \(\ell _2\) norm of a vector \(x\in \mathbb {R}^{K}\) is defined as \(\Vert x\Vert _{\Sigma ^{-1}}^2:=x^{T}{\Sigma ^{-1}}x\). The fidelity term (3.2) was introduced in [22, 33] for Poisson data reconstruction. Here, we extend it to removing Poisson noise from images on graphs.

Under the assumption that the underlying solution is sparse in the wavelet frame domain, we propose the following model for removing Poisson noise of images on graphs

$$\begin{aligned} \min \limits _u \, \frac{1}{2}\left\| \frac{u-f_G}{\sqrt{u}}\right\| _{\ell _2(G)}^2+\lambda \Vert \mathcal {W}u\Vert _{\ell _1(G)}. \end{aligned}$$
(3.3)

Here, the first term is a fitting term based on the noise characteristic and derived by the likelihood function discussed above; the second term is a regularization term, and \(\lambda \) is a parameter to balance the two terms.

Note that the division and square root operators in the fitting term are both elementwise. Since u may vanish at some vertices, in practical implementation, in order to guarantee numerical stability, we add a very small positive constant (the fixed background image [3, 6, 30]) to u in model (3.3). To handle the unknown weight in the fitting term, following the line of [22], we apply the following iteration to solve the model.

Let \(u_0=f_G\). For \(k=0,1,2, \ldots \), let

$$\begin{aligned} u_{k+1}:=\arg \,\min \limits _u \frac{1}{2}\left\| \frac{u-f_G}{\sqrt{u_k}}\right\| _{\ell _2(G)}^2+\lambda \Vert \mathcal {W}u\Vert _{\ell _1(G)}. \end{aligned}$$
(3.4)

The functional in (3.4) is an \(\ell _1\)-regularized least-squares problem, which can be solved by iterative solvers such as the alternating direction method of multipliers (ADMM) [5, 8, 17] to obtain \(u_{k+1}\), i.e.

$$\begin{aligned} \left\{ \begin{aligned} u_{k+1}&=\arg \min _u\frac{1}{2}\left\| \frac{u-f_G}{\sqrt{u_k}}\right\| _{\ell _2(G)}^2+\frac{\mu }{2}\Vert \mathcal {W}u-d_{k}+b_{k}\Vert _{\ell _2(G)}^2 \\ d_{k+1}&=\arg \min _{d}\lambda \Vert d\Vert _{\ell _1(G)}+\frac{\mu }{2}\Vert \mathcal {W}u_{k+1}-d+b_{k}\Vert _{\ell _2(G)}^2&\\ b_{k+1}&=b_{k}+\mathcal {W}u_{k+1}-d_{k+1}&\end{aligned}. \right. \end{aligned}$$
(3.5)

Let \(\Sigma _k=\text {diag}(u_k)\). Then, the solution to the first subproblem of (3.5) can be determined by solving the system of equations

$$\begin{aligned} (\Sigma _k^{-1}+\mu \mathcal {W}^{T}\mathcal {W}) u_{k+1}= \mu \mathcal {W}^T(d_k-b_k)+\Sigma _k^{-1} f_G, \end{aligned}$$
(3.6)

which, because of \(\mathcal {W}^T \mathcal {W}=\mathcal I \), can be simplified to

$$\begin{aligned} (\Sigma _k^{-1}+\mu \mathcal I) u_{k+1}= \mu \mathcal {W}^T(d_k-b_k)+\Sigma _k^{-1} f_G. \end{aligned}$$

The second subproblem of (3.5) has a closed form solution, and \(d_{k+1}\) is given by the soft-shrinkage operator (see [5, 8, 17])

$$\begin{aligned} d_{k+1}=\text {sign}(\mathcal {W}u_{k+1}+b_k)\cdot \max (|\mathcal {W}u_{k+1}+b_k|-{\lambda /\mu },0), \end{aligned}$$
(3.7)

where each operation is performed componentwise.

Now, combining (3.6) and (3.7), we obtain Algorithm 1 for Poisson noise removal of images on graphs.

Algorithm 1 Poisson noise removal of images on graphs
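A compact Python sketch of this scheme (hypothetical function name; a random column-orthonormal matrix stands in for the graph frame transform \(\mathcal {W}\), which shares the key property \(\mathcal {W}^T\mathcal {W}=\mathcal {I}\)). The u-update (3.6) reduces to an elementwise division because the system matrix is diagonal, and `eps` plays the role of the small background constant:

```python
import numpy as np

def admm_poisson_denoise(f, W, lam, mu, n_outer=10, n_inner=20, eps=1e-8):
    """Sketch of Algorithm 1: outer reweighting (3.4) + inner ADMM (3.5).
    W is any matrix with W.T @ W = I, standing in for the frame transform."""
    soft = lambda x, t: np.sign(x) * np.maximum(np.abs(x) - t, 0.0)
    u = f.astype(float).copy()                   # u_0 = f_G
    for _ in range(n_outer):
        w = 1.0 / (np.maximum(u, 0.0) + eps)     # diagonal of Sigma_k^{-1}
        d, b = W @ u, np.zeros(W.shape[0])       # warm start the splitting
        for _ in range(n_inner):
            # u-update (3.6): (Sigma_k^{-1} + mu I) u = mu W^T(d - b) + Sigma_k^{-1} f,
            # a diagonal system since W^T W = I.
            u = (mu * (W.T @ (d - b)) + w * f) / (w + mu)
            d = soft(W @ u + b, lam / mu)        # shrinkage (3.7)
            b = b + W @ u - d                    # multiplier update
    return u
```

With \(\lambda =0\) only the fidelity term remains and the scheme returns \(f_G\) unchanged; a positive \(\lambda \) shrinks the frame coefficients and thus denoises.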
Fig. 1

The figure shows three images (first row), ‘Slope’, ‘Eric’ and ‘Earth’, which are mapped onto the graph of the unit sphere (second row)

Fig. 2

The figure shows two graphs, ‘flower’ and ‘cup’, which are generated by mapping the image ‘Slope’

Fig. 3

Denoising results of the simulated noisy images on graphs corrupted by Poisson noise. From left to right: noise-free images on graphs, noisy images, denoised images. Parameters in the algorithm (Slope: \(\lambda =0.035\), \(\mu =0.35\); Eric: \(\lambda =0.035\), \(\mu =0.7\); Earth: \(\lambda =0.01\), \(\mu =0.1\); flower: \(\lambda =0.035\), \(\mu =0.875\); cup: \(\lambda =0.1\), \(\mu =1\))

3.2 Mixed Poisson–Gaussian noise

Previously, we discussed Poisson noise removal. In real graph data observation, there may exist other system-borne noise, such as mixed Poisson–Gaussian noise. Suppose the observed image \(f_G\) on the graph is corrupted by mixed Poisson–Gaussian noise following the distribution

$$\begin{aligned} f_G\sim \mathcal {P}(u)+\mathcal {N}(0,\sigma ^2), \end{aligned}$$

where \(\sigma ^2\) is the variance of the additive Gaussian noise. Similar to the approach in Sect. 3.1, by applying the normal distribution (3.1) with covariance matrix \(\Sigma =\text {diag}(u)+\sigma ^{2}I\), the probability density function of the observed image data \(f_G\) can again be approximated. Then, we have a new weighted \(\ell _2\) fidelity for removing mixed Poisson–Gaussian noise on graphs,

$$\begin{aligned} \frac{1}{2}\left\| \frac{u-f_G}{\sqrt{u+\sigma ^{2}}}\right\| _{\ell _2(G)}^2. \end{aligned}$$

Combining with the tight wavelet frame regularization, we propose the following denoising model

$$\begin{aligned} \min _{u}\frac{1}{2}\left\| \frac{u-f_G}{\sqrt{u+\sigma ^{2}}}\right\| _{\ell _2(G)}^2+\lambda \Vert \mathcal {W}u\Vert _{\ell _1(G)}. \end{aligned}$$
(3.8)

Compared with model (3.3), we extend the weighted \(\ell _2\) fidelity term to mixed Poisson–Gaussian noise by a small modification of the Poisson noise version. Thus, the corresponding algorithm for solving (3.8) is similar to Algorithm 1, except that \(\sigma ^2\) is added when estimating and updating the covariance matrix.
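The modification amounts to one line in a sketch of the algorithm: the diagonal Poisson weights \(1/u_k\) become \(1/(u_k+\sigma ^2)\). As a hypothetical helper (the small `eps` again guards against division by zero):

```python
import numpy as np

def inv_covariance_diag(u, sigma=0.0, eps=1e-8):
    """Diagonal of Sigma^{-1}. sigma = 0 gives the pure Poisson weights
    1/u; sigma > 0 gives the mixed Poisson-Gaussian weights 1/(u + sigma^2)."""
    return 1.0 / (np.maximum(u, 0.0) + sigma ** 2 + eps)
```

Larger \(\sigma \) downweights the fidelity term uniformly, reflecting the extra Gaussian uncertainty at every vertex.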

Algorithm 2 Mixed Poisson–Gaussian noise removal of images on graphs
Table 1 Data summary and computational efficiency of Algorithm 1

4 Numerical results and discussions

In the previous section, we discussed the model and algorithm for removing Poisson and mixed Poisson–Gaussian noise of images on graphs. In this section, we provide numerical experiments to test the performance of our method and compare it with certain existing methods, such as the classical Laplacian smoothing method [4], the KL-divergence model [33] and Gong et al.’s method [18]. Here, the KL-divergence model [33] and Gong et al.’s model [18] are, respectively, formulated as

$$\begin{aligned} \min \limits _u \, \mathbf 1 ^Tu-{f_G}^T\log (u)+\lambda \Vert \nabla u\Vert _{\ell _1(G)}, \end{aligned}$$

and

$$\begin{aligned} \min \limits _u \, \Vert u-f_G\Vert _{\ell _1(G)}+\nu \Vert u-f_G\Vert _{\ell _2(G)}^2+\lambda \Vert \mathcal {W}u\Vert _{\ell _1(G)}. \end{aligned}$$

In the first three experiments, the image functions defined on a graph are generated by mapping three images, ‘Slope’, ‘Eric’ and ‘Earth’, onto the unit sphere (see Fig. 1). Here, a unit sphere with 16,728 sampled vertices is selected as the graph. In the fourth and fifth experiments, two graphs named ‘flower’ and ‘cup’, with 7919 and 10,840 sampled vertices, are considered. The ‘Slope’ image function is mapped onto these graphs to test the performance of our approach (see Fig. 2). The two graphs are borrowed from the public 3D model database: http://3dmdb.com/.

In the computation of graph Laplacian, we choose parameter \(\rho = 10\) in the weight function (2.1). For wavelet frame transform on graphs, we use the piecewise linear B-spline tight wavelet frame [26],

$$\begin{aligned}&\widehat{a}_0(\xi )=\cos ^2(\xi /2),\quad \widehat{a}_1(\xi )=\frac{1}{\sqrt{2}}\sin (\xi ),\\&\quad \widehat{a}_2(\xi )=\sin ^2(\xi /2), \end{aligned}$$

and we use Chebyshev polynomials of degree 7 to approximate the masks, i.e., \(n=8\) in (2.3). For simplicity, we fix the level of the wavelet frame transform to 1, since using higher decomposition levels only slightly improves the denoising quality while noticeably increasing the computational cost.

Algorithms 1 and 2 are implemented in MATLAB and tested on a laptop with an Intel Core i3-2310M (2.10 GHz) CPU and 6.0 GB RAM. The parameters in our algorithms are tuned to get the best visual outcome for the simulated noisy images on the graphs. The denoised results \(\overline{u}\) are evaluated quantitatively by the mean squared error (MSE), the signal-to-noise ratio (SNR) and the normalized cross-correlation (NCC), defined by

$$\begin{aligned} \text {MSE}(\overline{u}, u):=\frac{\Vert \overline{u}-u\Vert _{\ell _2(G)}^2}{K},\quad \text {SNR}(\overline{u}, u):=-20\log _{10}\frac{\Vert \overline{u}-u\Vert _{\ell _2(G)}}{\Vert u\Vert _{\ell _2(G)}},\quad \text {NCC}(\overline{u}, u):=\frac{\overline{u}^T u}{u^T u}, \end{aligned}$$

where u denotes the noise-free image on the graph, and K is the number of vertices.
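These three metrics translate directly to code (a small Python sketch; the paper's own implementation is in MATLAB):

```python
import numpy as np

def mse(u_bar, u):
    """Mean squared error over the K vertices."""
    return np.sum((u_bar - u) ** 2) / u.size

def snr(u_bar, u):
    """Signal-to-noise ratio in dB."""
    return -20.0 * np.log10(np.linalg.norm(u_bar - u) / np.linalg.norm(u))

def ncc(u_bar, u):
    """Normalized cross-correlation against the clean signal."""
    return (u_bar @ u) / (u @ u)
```

For instance, scaling the clean signal by 1.1 gives SNR = 20 dB and NCC = 1.1 regardless of the signal.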

Fig. 4

Denoising results of the noisy images on graphs corrupted by Poisson noise. From left to right: noise-free images on graphs, Laplacian, KL-divergence, Gong et al.’s results, ours

Fig. 5

Denoising results of the noisy images on graphs corrupted by mixed Poisson–Gaussian noise. From left to right: noise-free images on graphs, noisy images on graphs, denoised images on graphs. Parameters in the algorithm (Slope: \(\lambda =0.035\), \(\mu =0.35\); Eric: \(\lambda =0.035\), \(\mu =0.7\); Earth: \(\lambda =0.01\), \(\mu =0.1\); flower: \(\lambda =0.035\), \(\mu =0.875\); cup: \(\lambda =0.1\), \(\mu =1\))

4.1 Poisson noise removal

In this subsection, we test the performance of our approach for Poisson noise removal of images on graphs. Here, the clean image u is rescaled to an intensity range of 0 to 1200, and Poisson noise is then added in MATLAB using the function ‘poissrnd’. The denoised results are shown in Fig. 3. The denoising error, number of iterations and computational time for each image are given in Table 1.

Then we compare the results of our approach with other denoising methods. In Fig. 4, we visually show the difference between our method and the classical Laplacian smoothing method [4], the KL-divergence model [33] and Gong et al.’s method [18]. It can be seen that our results preserve local features better. Furthermore, we quantitatively compare these methods by MSE, SNR and NCC; see Tables 2, 3 and 4 for the results.

Table 2 MSEs comparison for removal of Poisson noise (\(\times 10^{-4}\))
Table 3 SNRs comparison for removal of Poisson noise
Table 4 NCCs comparison for removal of Poisson noise

4.2 Mixed Poisson–Gaussian noise removal

In this subsection, we test the performance of mixed Poisson–Gaussian noise removal of images on graphs. Here, Poisson noise is first added to the original image as in Sect. 4.1. Then, Gaussian noise with distribution \(\mathcal {N}(0, \sigma ^2)\) is added to the image. We choose \(\sigma =6\). The denoised results are shown in Fig. 5, and the denoising error, number of iterations and computational time for each image are given in Table 5.

Finally, we compare the results of our approach with the Laplacian smoothing method [4], the KL-divergence model [33] and Gong et al.’s method [18], both visually and quantitatively. Figure 6 shows that our results preserve most textures and edges on graphs better. Furthermore, we compare these methods by MSE, SNR and NCC; see Tables 6, 7 and 8 for the results.

Fig. 6

Denoising results of the noisy images on graphs corrupted by mixed Poisson–Gaussian noise. From left to right: noise-free images on graphs, Laplacian, KL-divergence, Gong et al.’s results, ours

Table 5 Data summary and computational efficiency of Algorithm 2
Table 6 MSEs comparison for removal of mixed Poisson–Gaussian noise (\(\times 10^{-4}\))
Table 7 SNRs comparison for removal of mixed Poisson–Gaussian noise
Table 8 NCCs comparison for removal of mixed Poisson–Gaussian noise

5 Conclusion

In this paper, we considered Poisson and mixed Poisson–Gaussian noise removal of images on graphs. Based on the statistical characteristics of the noise, we approximated the probability density function of the observed images by normal distributions. Then, a fidelity term was derived from the likelihood function. In addition, under the assumption that the underlying image is sparse in the wavelet frame domain, we proposed a variational model to denoise images on graphs. We then applied the popular iterative algorithm ADMM to solve the model. Finally, we presented numerical experiments to verify the practicability and effectiveness of our approach, and compared it with the classical Laplacian smoothing method [4], the KL-divergence model [33] and Gong et al.’s method [18]. The results on several image denoising tasks indicate the effectiveness of our method.