1 Introduction

Image restoration refers to the recovery of a clean, sharp image from a noisy, and potentially blurred, observation. In this paper, we consider the problem of restoring images corrupted by blur and additive white noise. Several image restoration algorithms, such as nonlinear-diffusion partial differential equation (PDE)-based methods and TV-regularized strategies, succeed in obtaining good quality edge-preserving restorations, especially for noise removal. However, they push images towards piecewise constant functions, so that important information, encoded in image features like textures and details, is often compromised in the restoration process.

We propose a novel variational formulation which exploits the relevant information on the whiteness of the noise, thus resulting in a method that is particularly suitable for the restoration of partly-textured perturbed images, since it preserves fine-scale features in the restoration process. In particular, we take advantage of the fact that while textures are characterized by repeated and meaningful structures of small patterns, noise is characterized by uncorrelated random patterns.

One aspect generally missing from state-of-the-art image restoration methods is a full exploitation of all the available information about the noise. More precisely, most algorithms recover an estimate of the clean image by using prior knowledge only about the variance of the noise, but totally neglect the significantly more information-rich property of the noise being the realization of a white random process. This is to some extent surprising given that, on the other hand, the whiteness of the residue image is often used as an a-posteriori criterion for evaluating the performance of restoration algorithms (see, e.g., [18]). In particular, by evaluating the resemblance of the residue image to a white noise realization, one can check, to some extent, the quality of the restored image. In [1, 10, 17, 18] measures of residual spectral whiteness have been exploited for adjusting the regularization parameter and/or the number of iterations of the algorithms for deconvolution problems. Comparisons among several state-of-the-art methods have been documented in [2].

The whiteness constraint constitutes the key novelty behind our approach. A penalty term favoring whiteness of the residue image was proposed in [14] for the image denoising problem. The first contribution of this paper is to incorporate this relevant information into the restoration process when the image is both blurred and corrupted by additive white noise. The second is the proposal of an efficient algorithm for the solution of the new model.

Without loss of generality, we consider grayscale images with a square \(d \times d\) domain. Let \(u \in {\mathbb R}^{d^2}\) be the unknown \(d \times d\) clean image concatenated into a \(d^2\)-vector, \(K \in {\mathbb R}^{d^2 \times d^2}\) be a known blurring operator and \(n \in {\mathbb R}^{d^2}\) be an unknown realization of the random noise process, which we assume white with zero mean and standard deviation \(\sigma \). The discrete imaging model of the degradation process, which relates the observed degraded image \(g \in {\mathbb R}^{d^2}\) to \(u\), can be expressed as follows:

$$\begin{aligned} g \,=\, K u \,\,{+}\,\, n \, . \end{aligned}$$
(1.1)

Given \(K\) and \(g\), our goal is to solve the inverse problem of recovering an estimate \(u^*\) of \(u\) from \(g\), which is known as deconvolution or deblurring. When \(K\) is the identity operator, recovering \(u\) is referred to as denoising.

It is well known that deblurring by direct inversion of \(K\), i.e. \(u^* = K^{-1}g\), can yield meaningless results due to the fact that the linear blur operator \(K\) is typically ill-conditioned or singular [11]. To obtain a meaningful restoration, some prior information is commonly included in the formulation. A classical regularization approach consists in solving the optimization problem

$$\begin{aligned} u^* \,\,{\leftarrow }\,\, \mathrm {arg} \min _{u \in {\mathbb R}^{d^2}} \left\{ \mathcal {R}(u) + \mu \mathcal {F}(u,g) \right\} \,, \end{aligned}$$
(1.2)

where the regularization term \(\mathcal {R}(u)\) enforces certain prior constraints on the clean image \(u\), the fidelity term \(\mathcal {F}(u,g)\) measures how \(u\) fits the observed image \(g\), and \(\mu \) is a positive regularization parameter balancing the two terms. In particular, the fidelity term is typically related to the type of noise corrupting the image, while the regularization term commonly enforces smoothness of the solution.

Many different regularization functionals have been proposed, ranging from the classical Tikhonov [3, 11], to total variation (TV) [16], Mumford-Shah [12, 13], and many other variants of these. In this paper, we consider TV regularization simply for its popularity; other regularizers could be substituted as well. A popular and effective choice for images corrupted by additive Gaussian noise is the TV-\(L_2\) formulation:

$$\begin{aligned} u^* \,\,{\leftarrow }\,\, \mathrm {arg} \min _{u \in {\mathbb R}^{d^2}} \left\{ \, TV(u) + \frac{\mu }{2} \, \Vert K u - g \Vert _2^2 \, \right\} \!, \end{aligned}$$
(1.3)

where \(TV(u)\) denotes the TV semi-norm of \(u\) defined as

$$\begin{aligned} TV(u) \,=\, \sum _{i=1}^{d^2} \Vert (\nabla u)_i \Vert _2 \, , \end{aligned}$$
(1.4)

with \(( \nabla u)_i := (D_{x,i} u,D_{y,i} u)\) denoting the discrete gradient of \(u\) at pixel \(i\) and \(D_{x,i}\), \(D_{y,i}\) representing the \(i\)-th rows of the \(x\) and \(y\)-directional first order finite difference operators \(D_x, D_y \in {\mathbb R}^{d^2 \times d^2}\), respectively. The regularization parameter \(\mu \) plays a critical role in the success of the restoration process [21].
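
For concreteness, the discrete TV in (1.4) can be computed in a few lines of NumPy. The following sketch is ours, not the authors' implementation, and assumes forward-difference operators \(D_x, D_y\) with periodic boundary conditions, consistent with the FFT-based solver used later in Sect. 3.5.

```python
import numpy as np

def tv(u):
    """Discrete isotropic TV of (1.4): sum over pixels of the gradient 2-norm.

    u: (d, d) image; forward differences with periodic (circular) boundaries.
    """
    dx = np.roll(u, -1, axis=1) - u  # D_x u, x-directional differences
    dy = np.roll(u, -1, axis=0) - u  # D_y u, y-directional differences
    return np.sum(np.sqrt(dx**2 + dy**2))
```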

When the standard deviation \(\sigma \) of the noise is available, under the discrepancy principle, the unconstrained TV-\(L_2\) model in (1.3) can be equivalently reformulated as the following constrained optimization problem:

$$\begin{aligned} u^* \,\,{\leftarrow }\,\, \mathrm {arg} \min _{u \in U} TV(u) \, , \end{aligned}$$
(1.5)

with the feasible set defined as

$$\begin{aligned} U = \left\{ u \in {\mathbb R}^{d^2} : \Vert K u - g \Vert _2^2 \,{\le }\, \tau ^2 \, d^2 \, \sigma ^2 \right\} \,, \end{aligned}$$
(1.6)

where \(\tau \) is a pre-determined scalar parameter controlling the variance of the residue image \(K u - g\).

In this paper, we propose to explicitly incorporate the constraints on the whiteness of the residue image by modifying the feasible set defined in (1.6). The proposed TV-W model is

$$\begin{aligned} u^* \,\,{\leftarrow }\,\, \mathrm {arg} \min _{u \in W} TV(u) \, , \end{aligned}$$
(1.7)

where the new feasible set \(W \subset {\mathbb R}^{d^2}\) contains solutions \(u\) such that the corresponding residue image \(K u - g\) satisfies the discrepancy principle and resembles a white noise realization. We will propose a particular whiteness set \(W\) and, accordingly, we will present an efficient minimization method based on the ADMM strategy which was originally developed in the 1970s [6, 7], and recently applied to the image restoration problem [5].

More recently, a fast TV deconvolution algorithm called FTVd, based on a quadratic penalty method, was proposed in [20] for image restoration with an L\(_2\) fidelity. In [8] an augmented Lagrangian method was successfully applied to overcome the difficulties due to the penalty parameter. The alternating direction method (ADM), a variant of the classic augmented Lagrangian method for structured optimization, was introduced in [19]. Finally, in [5] the ADM strategy was extended to the solution of constrained TV problems and named ADMM.

The paper is organized as follows. In Sect. 2 we present and motivate our choice for the whiteness set \(W\). In Sect. 3 we illustrate in detail the ADMM-based algorithm used to minimize the functional together with a discussion of the computational aspects. In Sect. 4 we present experimental results assessing the performance of the proposed model. In Sect. 5 we draw conclusions.

2 Imposing Whiteness Constraints

The statistical version of the degradation model in (1.1) with images in matrix-form is

$$\begin{aligned} G(i,j) = \bar{v}\,(i,j) + \bar{N}(i,j) ,\quad (i,j) \in \mathrm {\Omega } = \{1,\ldots ,d\}^2,\nonumber \\ \end{aligned}$$
(2.1)

with capital letters indicating random quantities and where \(\bar{v}(i,j) = (K \, u)(i,j)\) denotes the value of the blurred image at pixel \((i,j)\). In particular, the additive noise is modeled as a \(d\times d\) discrete random process \(\bar{\mathcal {N}} := \{\, \bar{N}(i,j):\, (i,j) \in \mathrm {\Omega }\,\}\) with \(\bar{N}(i,j)\) denoting the scalar random variable modeling noise at pixel \((i,j)\). This means that different images of the same subject under the same blurring operator will differ due to the inherently random nature of the additive noise. The ensemble auto-correlation of \(\bar{\mathcal {N}}\) is a function \(\rho _{\bar{\mathcal {N}}}\) which maps pairs of pixel locations \((i_{1},j_{1})\), \((i_{2},j_{2}) \in \mathrm {\Omega }\) into a scalar value given by

$$\begin{aligned} \rho _{\bar{\mathcal {N}}}(i_1,j_1,i_2,j_2)\,\,{:=}\,\, E\,\big [\, \bar{N}(i_1,j_1)\,\bar{N}(i_2,j_2)\,\big ] \,, \end{aligned}$$
(2.2)

where \(E\) is the expectation operator [9].

Since we assume that the noise is zero-mean and white, i.e. wide-sense stationary and uncorrelated, the auto-correlation of \(\bar{\mathcal {N}}\) depends only on the lag between the two pixel locations \((l,m) = (i_2 - i_1,j_2 - j_1)\) and, with a slight abuse of notation, (2.2) can be rewritten as follows

$$\begin{aligned}&\rho _{\bar{\mathcal {N}}}(i_1,j_1,i_2,j_2) \,=\, \rho _{\bar{\mathcal {N}}}(0,0,i_2 - i_1,j_2 - j_1) \nonumber \\&\quad {=}\,\, \rho _{\bar{\mathcal {N}}}(l,m) \,=\, E\,\big [\, \bar{N}(i,j)\,\bar{N}(i+l,j+m)\,\big ] \end{aligned}$$
(2.3)
$$\begin{aligned}&\quad {=}\,\, \left\{ \begin{array}{ll} \sigma ^2, &{}\quad \,\mathrm {if} \,\; (l,m) = (0,0) \\ 0, &{}\quad \,\mathrm {otherwise} \end{array} \right. \!, \quad \quad \nonumber \\&\quad \quad (l,m) \in \mathrm {\Theta } = \{ -(d -1) , \,\ldots \, , d - 1 \}^2 \,. \end{aligned}$$
(2.4)

Equation (2.3) expresses stationarity of the noise process and holds independently for every \((i,j) \in \mathrm {\Omega }\) such that \((i+l,j+m) \in \mathrm {\Omega }\), whereas (2.4) says that a zero-mean white noise process is characterized by zero values of the auto-correlation function at all non-vanishing lags.

Given a single realization \(n := \{\, n(i,j) \in {\mathbb R}:\, (i,j) \in \mathrm {\Omega }\,\}\) of the noise process \(\bar{\mathcal {N}}\), that is the series of noise values corrupting the particular observed image according to the deterministic degradation model in (1.1), the sample auto-correlation of \(n\) is a function \(r_{n}\) mapping all the possible lags \((l,m) \in \mathrm {\Theta }\) into a scalar value given by

$$\begin{aligned} r_{n}(l,m)&:= \frac{1}{\,d^{\,2}} \, \big ( \, n \,\,{\star }\,\, n \, \big )_{l,m} = \frac{1}{\,d^{\,2}} \, \big ( \, n \,\,{*}\,\, n^{\prime } \, \big )_{l,m} \nonumber \\&= \frac{1}{\,d^{\,2}} \!\! \sum _{\,(i,j)\in \,\mathrm {\Omega }} \! n(i,j) \, n(i+l,j+m) ,\quad \, (l,m) \!\in \! \mathrm {\Theta },\nonumber \\ \end{aligned}$$
(2.5)

where \(\,\star \,\) and \(\,*\,\) denote the 2-D discrete correlation and convolution operators, respectively, and where \(n^{\prime }(i,j) = n(-i,-j)\). Clearly, for (2.5) to be defined for all lags \((l,m) \in \mathrm {\Theta }\), the noise realization \(n\) must be padded with at least \(d - 1\) samples in all directions.
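
As an illustration, the sample auto-correlation (2.5) can be evaluated directly with SciPy; this is a sketch under our conventions, where the 'full' correlation mode implicitly provides the required zero-padding.

```python
import numpy as np
from scipy.signal import correlate2d

def sample_autocorrelation(n):
    """Sample auto-correlation r_n of (2.5) for a (d, d) noise image n.

    Returns a (2d-1, 2d-1) array, one entry per lag (l, m) in Theta; for a
    white noise realization it is close to sigma^2 at lag (0, 0) and to 0
    at all other lags.
    """
    return correlate2d(n, n, mode='full') / n.size  # n.size = d^2
```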

It can be demonstrated that, under the further assumption that the noise process is also ergodic [9], the sample auto-correlation is a good estimate of the ensemble auto-correlation. In particular, we have:

$$\begin{aligned} \lim _{d \rightarrow \infty } r_{n}(l,m) \,=\, \rho _{\bar{\mathcal {N}}}(l,m). \end{aligned}$$
(2.6)

A set of white noise realizations could thus be defined in the spatial domain by constraining the values of the sample auto-correlation function \(r_{n}\) in (2.5) to lie within a band around the theoretical limit in (2.6). In [14] the authors proposed a denoising method based on a novel fidelity term penalizing the auto-correlation function of the residual. The numerical solution of the resulting integro-differential equation was computed by an explicit finite difference method. The extension of such a strategy to our setting, due to the presence of the blur operator, would lead to very strict stability conditions and consequently to a very slow procedure. In this paper we follow a new strategy, considering a whiteness definition in the frequency domain, and, accordingly, we propose a novel whiteness set.

The periodogram \(p_{n} \in {\mathbb R}^{d^2}\) of the noise realization \(n\) is defined as [9]

$$\begin{aligned} p_{n} = | T n | \, , \end{aligned}$$
(2.7)

where \(| \cdot |\) denotes the component-wise modulus of the complex vector \(T n\), and \(T \in {\mathbb C}^{d^2 \times d^2}\) is obtained from the unitary matrix \(T^{(1)} \in {\mathbb C}^{d \times d}\) representing the 1D discrete Fourier transform (DFT) as

$$\begin{aligned} T = T^{(1)} \otimes T^{(1)} \, , \end{aligned}$$
(2.8)

and where \(\otimes \) denotes the Kronecker product operator.

It can be demonstrated that the elements of the vector \(p_{n}\) are independent random variables. Moreover, the first element of \(p_{n}\) follows a \(\chi \) distribution with 1 degree of freedom, while all the other elements follow a \(\chi \) distribution with 2 degrees of freedom [9]. We recall that the probability density function \(p_k(x)\) and the mean value \(\mu _k\) of a \(\chi \)-distributed random variable with \(k\) degrees of freedom are, respectively:

$$\begin{aligned} p_k(x) = \frac{2^{1-k/2} x^{k-1} e^{-x^2/2}}{\Gamma (k/2)} \quad \! \mathrm {and}\! \quad \mu _k = \sqrt{2} \frac{\Gamma ( (k+1)/2 ) }{\Gamma (k/2)} , \!\!\!\!\nonumber \\ \end{aligned}$$
(2.9)

where \(\Gamma \) is the Gamma function. To obtain a vector of random variables all having mean equal to one, and exploiting the properties of the \(\chi \) distribution, we scale the vector \(p_{n}\) by means of a diagonal \(d^2 \times d^2\) normalization matrix \(M\), i.e. \(\hat{p}_{n} = M p_{n}\), with \(M\) defined as

$$\begin{aligned} M \,=\, \frac{1}{\tau \, \sigma \, d^2} \, \mathrm {diag} \,\left( \sqrt{2} \, \Gamma (1.5),\frac{1}{\Gamma (1.5)} , \ldots , \frac{1}{\Gamma (1.5)}\right) . \nonumber \\ \end{aligned}$$
(2.10)

Next we sort the elements of the normalized periodogram \(\hat{p}_{n}\) in increasing order of spatial frequency. To this aim, we introduce the \(d \times d\) matrix \(\Psi \) with elements \(\Psi _{i,j} = i^2+j^2\) and the \(d^2 \times d^2\) permutation matrix \(\Pi \) such that the elements of \(\Pi \xi \) appear in nondecreasing order, with \(\xi \in {\mathbb R}^{d^2}\) denoting the column-vector form of \(\Psi \). Then the vector \(\Pi \hat{p}_{n}\) holds all the elements of \(\hat{p}_{n}\) in order of increasing spatial frequency. We define the normalized cumulative periodogram (NCP) of \(n\) as the vector \(c_{n} \in {\mathbb R}^{d^2}\) given by

$$\begin{aligned} c_{n} \,\,{:=}\,\, S \, \Pi \hat{p}_{n}, \end{aligned}$$
(2.11)

where \(S\) is the \(d^2 \times d^2\) partial sums matrix, i.e. a lower triangular matrix with all nonzero entries equal to one.
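
The whole construction (2.7)–(2.11) is summarized by the following sketch (ours, not the original implementation); the unitary 2-D FFT plays the role of \(T\), and the indices of \(\Psi\) are taken 0-based here.

```python
import numpy as np
from scipy.special import gamma

def ncp(n, sigma, tau=1.0):
    """NCP vector c_n = S Pi M |T n| of (2.11) for a (d, d) image n."""
    d = n.shape[0]
    p = np.abs(np.fft.fft2(n, norm='ortho')).ravel()  # periodogram (2.7)
    # normalization M of (2.10): chi_1 mean for the DC term, chi_2 otherwise
    m = np.full(d * d, 1.0 / gamma(1.5))
    m[0] = np.sqrt(2.0) * gamma(1.5)
    p_hat = m * p / (tau * sigma * d**2)
    # permutation Pi: reorder by spatial frequency Psi_ij = i^2 + j^2
    i, j = np.meshgrid(np.arange(d), np.arange(d), indexing='ij')
    order = np.argsort((i**2 + j**2).ravel(), kind='stable')
    return np.cumsum(p_hat[order])  # partial sums S
```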

In Sect. 4 we verify experimentally, by a Monte Carlo simulation, that when \(n\) is the realization of a zero-mean white noise process, as the image dimension \(d\) increases the associated NCP vector \(c_{n}\) tends to a limit vector \(c_w\), which we call the theoretical whiteness NCP vector, given by the discretization of the straight line between \((1,d^{-2})\) and \((d^2,1)\), that is:

$$\begin{aligned} c_{w} \,=\, d^{-2} \, ( \,1, 2, \ldots , d^2 \, )^T. \end{aligned}$$
(2.12)

In this paper, we will exploit such a property of white noise to constrain the NCP vector \(c_n\) of the restored residue image \(n = K u - g\) to lie within a narrow band around the theoretical whiteness NCP vector \(c_{w}\). In particular, we define the whiteness set \(W\) to be used in our model (1.7) as follows:

$$\begin{aligned} W = \left\{ \, u \in {{\mathbb R}}^{d^2} \, : \,\, b^{-} \,\,{\le }\,\, S \Pi M \, | T (K u - g) | \,\,{\le }\,\, b^{+} \, \right\} \, ,\nonumber \\ \end{aligned}$$
(2.13)

where the inequalities must be interpreted component-wise and \(b^{+},b^{-} \in {\mathbb R}^{d^2}\) are vectors containing the upper and lower limits of the whiteness set, respectively. In Sect. 4, we will illustrate how the values of \(b^{+}\) and \(b^{-}\) can be selected according to probabilistic arguments. By constraining the restored image \(u\) to belong to the set \(W\) in (2.13), with appropriate set limits \(b^-\) and \(b^+\), we are thus implicitly forcing the restoration residual to resemble the realization of a white noise process.

Moreover, from the definition of the whiteness set in (2.13), we notice that the complicated constraint on \(u\) can be turned into a simple box-constraint on the NCP of the residue image \(c_n\), that is, \(c_n\) must belong to the following whiteness set:

$$\begin{aligned} B = \left\{ \, c_n \in {{\mathbb R}}^{d^2} \, : \,\, b^{-} \,\,{\le }\,\, c_n \,\,{\le }\,\, b^{+} \, \right\} \, , \end{aligned}$$
(2.14)

where \(b^{-}\) and \(b^{+}\) are the pre-computed limits of the box-constraint.

In Fig. 1 we illustrate the capability of the set \(W\) in discriminating white signals, by showing the NCP vectors for two different 1D signals: the first signal in Fig. 1a is the realization of a white Gaussian process with zero mean, standard deviation \(\sigma = 4\), and number of samples \(d = 1000\); the second signal in Fig. 1b is obtained from the first by setting one-fifth of the samples to a constant value equal to \(4\). In Fig. 1c, d we show the NCP vectors \(c\) of the 1D sampled signals depicted in Fig. 1a, b, respectively. We note that the NCP vector \(c\) of the white signal depicted in Fig. 1c is very similar to the theoretical whiteness NCP vector \(c_w\), while a significant deviation occurs for the partially modified signal, as visible in Fig. 1d.

Fig. 1 Sample 1D signals (top) and associated NCP vectors (bottom)

3 Applying ADMM to the Proposed Model

Constrained problems are in general much more difficult to solve than unconstrained ones. Moreover, for constrained TV problems, the singularity of the TV functional prohibits the application of Newton-like methods [15]. Recently, Chan et al. [5] successfully adapted the ADMM algorithm for solving both the constrained TV-L\(_2\) and TV-L\(_1\) problems. Thus, in the following, we introduce a suitable variant of the basic ADMM approach to solve the proposed constrained minimization problem (1.7) with the whiteness set \(W\) defined in (2.13).

To this aim, we first introduce two auxiliary variables \(t\) and \(z\) to reformulate (1.7) into the following equivalent form:

$$\begin{aligned} u^*&\,{\leftarrow } \, \mathrm {arg} \min _{u,t,z} \, \left\{ \, \sum _{i=1}^{d^2} \Vert \, t_i \, \Vert _2 \,{+}\,\, \imath _{W} (z) \right\} \nonumber \\ \mathrm {s.t.}\, :&\,\;\, t = D u , \, z = u, \end{aligned}$$
(3.1)

where \(D = (D_x;D_y) \in {\mathbb R}^{2d^2 \times d^2}\), \(\imath _{W}\) is the indicator function of the whiteness set \(W\), with the convention that \(\imath _{W}(z)\) takes the value \(0\) for \(z \in W\) and \(\infty \) otherwise. The auxiliary variable \(t\) is introduced to transfer the discrete gradient operator \((\nabla u)_i\) out of the non-differentiable term \(\Vert \,\! \cdot \,\! \Vert _2\). The variable \(z\) plays the role of \(u\) within the whiteness constraint so that the constraint is now imposed on \(z\) instead of \(u\).

As already noticed, the whiteness constraint on \(u\), i.e. on \(z\), can be reformulated as a simple box-constraint set \(B\) on \(c_n\), where \(B\) is defined in (2.14). To exploit this simpler set, we replace \(z\) with three new auxiliary variables \(f = F(Ku -g)\), \(p = \Pi M | f |\) and \(c = S p\), where \(F = (T_{real};T_{imag}) \in {\mathbb R}^{2d^2 \times d^2}\).

Then the minimization problem replacing (3.1) is written as:

$$\begin{aligned} u^*&\,{\leftarrow } \, \mathrm {arg} \min _{u,t,f,p,c} \, \left\{ \, \sum _{i=1}^{d^2} \Vert \, t_i \, \Vert _2 \,{+}\,\, \imath _{B} (c) \right\} \nonumber \\ \mathrm {s.t.}\, :&\;\, t = D u, \,\, f = F(Ku - g), \,\, p = \Pi M |f|, \,\, c = S p ,\nonumber \\ \end{aligned}$$
(3.2)

where \(\imath _{B}\) is the indicator function of the whiteness set \(B\).

The augmented Lagrangian functional associated with (3.2) is

$$\begin{aligned}&\mathcal {L}(u,t,f,p,c;\lambda _t,\lambda _f,\lambda _p,\lambda _c) = \sum _{i=1}^{d^2} \Vert \, t_i \, \Vert _2 \,{+}\, \imath _{B} (c) \nonumber \\&\quad \,{-}\, \langle \, \lambda _{t} \, , \, t - D u \, \rangle \,\,{+}\,\, \frac{\beta _t}{2} \, \Vert \, t - D u \, \Vert _2^2 \nonumber \\&\quad \,{-}\, \langle \, \lambda _{f} \, , \, f - F(Ku - g) \, \rangle \,\,{+}\,\, \frac{\beta _f}{2} \, \Vert \, f - F(Ku - g) \, \Vert _2^2 \nonumber \\&\quad \,{-}\, \langle \, \lambda _{p} \, , \, p - \Pi M |f| \, \rangle \,\,{+}\,\, \frac{\beta _p}{2} \, \Vert \, p - \Pi M |f| \, \Vert _2^2 \nonumber \\&\quad \,{-}\, \langle \, \lambda _{c} \, , \, c - S p \, \rangle \,\,{+}\,\, \frac{\beta _c}{2} \, \Vert \, c - S p \, \Vert _2^2 \, , \end{aligned}$$
(3.3)

where \(\beta _t, \beta _f, \beta _p, \beta _c > 0\) are scalar penalty parameters and \(\lambda _t, \lambda _f \in Q\), \(\lambda _p, \lambda _c \in V\) are the vectors of Lagrangian multipliers, with \(V = {\mathbb R}^{d^2}\), \(Q = {\mathbb R}^{2 d^2}\).

Solving (3.2) is thus equivalent to search for the solutions of the following saddle point problem:

$$\begin{aligned} \mathrm {Find}&\,\, (x^*;\lambda ^*) \,\;{\in }\,\; X \times \Lambda \nonumber \\ \mathrm {s.t.}&\, \mathcal {L}(x^*;\lambda ) \,\,{\le }\,\, \mathcal {L}(x^*;\lambda ^*) \,\,{\le }\,\, \mathcal {L}(x;\lambda ^*) \nonumber \\&\, \forall \, (x;\lambda ) \,\;\,{\in }\,\; X \times \Lambda \, , \end{aligned}$$
(3.4)

with \(\mathcal {L}\) defined in (3.3) and where, for simplicity of notation, we set \(x = (u,t,f,p,c)\), \(\lambda = (\lambda _t,\lambda _f,\lambda _p,\lambda _c)\), \(X = V \times Q \times Q \times V \times V\) and \(\Lambda = Q \times Q \times V \times V\).

Starting at \(u = u^k\), \(\lambda _t = \lambda _t^k\), \(\lambda _f = \lambda _f^k\), \(\lambda _p = \lambda _p^k\), \(\lambda _c = \lambda _c^k\), the ADMM iterative scheme applied to the solution of (3.2) reads as follows:

$$\begin{aligned}&t^{k+1} {\leftarrow }\,\, \mathrm {arg} \, \min _{t \in Q} \, \mathcal {L}(u^k,t,f^k,p^k,c^k;\lambda _t^k,\lambda _f^k,\lambda _p^k,\lambda _c^k) \end{aligned}$$
(3.5)
$$\begin{aligned}&f^{k+1} {\leftarrow }\,\, \mathrm {arg} \, \min _{f \in Q} \, \mathcal {L}(u^k,t^{k+1},f,p^k,c^k;\lambda _t^k,\lambda _f^k,\lambda _p^k,\lambda _c^k) \end{aligned}$$
(3.6)
$$\begin{aligned}&p^{k+1} {\leftarrow }\,\, \mathrm {arg} \, \min _{p \in V} \, \mathcal {L}(u^k,t^{k+1},f^{k+1},p,c^k;\lambda _t^k,\lambda _f^k,\lambda _p^k,\lambda _c^k) \nonumber \\\end{aligned}$$
(3.7)
$$\begin{aligned}&c^{k+1} {\leftarrow }\,\, \mathrm {arg} \, \min _{c \in V} \, \mathcal {L}(u^k,t^{k+1},f^{k+1},p^{k+1},c;\lambda _t^k,\lambda _f^k,\lambda _p^k,\lambda _c^k) \nonumber \\\end{aligned}$$
(3.8)
$$\begin{aligned}&u^{k+1} {\leftarrow }\,\, \mathrm {arg} \, \min _{u \in V} \, \mathcal {L}(u,t^{k+1},f^{k+1},p^{k+1},c^{k+1};\lambda _t^k,\lambda _f^k,\lambda _p^k,\lambda _c^k) \nonumber \\\end{aligned}$$
(3.9)
$$\begin{aligned}&\left( \begin{array}{c} \lambda _t^{k+1} \\ \lambda _f^{k+1} \\ \lambda _p^{k+1} \\ \lambda _c^{k+1} \end{array} \right) {\leftarrow }\,\, \left( \, \begin{array}{l} \lambda _t^{k} \,\!\,\,{-}\,\,\!\,\! \gamma \, \beta _t \, \big ( \, t^{k+1} \,\,\;{-}\,\, D u^{k+1} \, \big ) \\ \lambda _f^{k} \,\!\,\,{-}\,\,\!\,\! \gamma \, \beta _f \, \big ( \, f^{k+1} \,\,\;{-}\,\, F(K u^{k+1} - g) \, \big ) \\ \lambda _p^{k} \,\!\,\,{-}\,\,\!\,\! \gamma \, \beta _p \, \big ( \, p^{k+1} \,\,\;{-}\,\, \Pi M |f^{k+1}| \, \big ) \\ \lambda _c^{k} \,\!\,\,{-}\,\,\!\,\! \gamma \, \beta _c \, \big ( \, c^{k+1} \,\,\;{-}\,\, S p^{k+1} \, \big ) \end{array} \right) ,\nonumber \\ \end{aligned}$$
(3.10)

where \(\gamma \) is a relaxation parameter chosen in the interval \((0,(\sqrt{5} + 1) / 2)\), as analyzed in [5].

In the following subsections we show in detail how to solve the five minimization sub-problems (3.5)–(3.9), then we present the overall iterative ADMM-based minimization algorithm. As we will see, the first two sub-problems admit a very efficient closed-form solution based on the following proposition, whose proof is reported in the Appendix.

Proposition 3.1

Let \(\alpha \in {\mathbb R}, \beta \in {\mathbb R}^+\) and \(\lambda , v \in {\mathbb R}^n\) with \(n \ge 1\) be given constants. Then, the solution \(x^*\) of the \(n\)-dimensional minimization problem

$$\begin{aligned} \mathrm {arg} \min _{x \in {\mathbb R}^n} \left\{ \, \alpha \, \Vert \, x \, \Vert _2 \,{-}\, \langle \, \lambda \, , \, x \, \rangle \,{+}\, \frac{\beta }{2} \left\| \, x - v \, \right\| _2^2 \,\right\} \end{aligned}$$
(3.11)

is given by:

$$\begin{aligned} \begin{array}{lll} \mathrm {a)} & x^* = \max \left\{ 1 - \displaystyle {\frac{ \alpha }{ \beta \, \Vert \, q \, \Vert _2 }} \, , \, 0 \, \right\} q & \quad \mathrm {if} \;\, \Vert \, q \, \Vert _2 > 0 , \;\, \forall \alpha \\ \mathrm {b)} & x^* = 0 & \quad \mathrm {if} \;\, \Vert \, q \, \Vert _2 = 0 \;\;\mathrm {and}\;\; \alpha \ge 0 \\ \mathrm {c)} & x^* \in \left\{ x \in {\mathbb R}^n : \,\, \Vert \, x \, \Vert _2 = - \displaystyle { \frac{\alpha }{\beta } } \, \right\} & \quad \mathrm {if} \;\, \Vert \, q \, \Vert _2 = 0 \;\;\mathrm {and}\;\; \alpha < 0 \end{array} \nonumber \\ \end{aligned}$$
(3.12)

where we set

$$\begin{aligned} q = v + \frac{1}{\beta } \lambda \, . \end{aligned}$$
(3.13)

3.1 Solving the Sub-problem for \(t\)

Given the definition of the augmented Lagrangian functional in (3.3), the minimization sub-problem for \(t\) in (3.5) can be written as follows:

$$\begin{aligned}&t^{*} \,{\leftarrow }\, \mathrm {arg} \, \min _{t \in Q}\, \left\{ \, \sum _{i=1}^{d^2} \Vert \, t_i \, \Vert _2 \,{-}\, \langle \, \lambda _{t} , t - D u \, \rangle \,{+}\, \frac{\beta _t}{2} \, \Vert \, t - D u \, \Vert _2^2 \,\right\} \nonumber \\&\quad \,{\leftarrow }\, \mathrm {arg} \, \min _{t \in Q}\, \left\{ \, \sum _{i=1}^{d^2} \Vert \, t_i \, \Vert _2 \,{-}\, \langle \, \lambda _{t} , t \, \rangle \,{+}\, \frac{\beta _t}{2} \, \Vert \, t - D u \, \Vert _2^2 \,\right\} \end{aligned}$$
(3.14)
$$\begin{aligned}&\quad \,{\leftarrow }\, \mathrm {arg} \, \min _{t \in Q}\, \left\{ \, \sum _{i=1}^{d^2} \left( \Vert \, t_i \, \Vert _2 \,{-}\, \langle \, \lambda _{t,i} \, , \, t_i \, \rangle \,{+}\, \frac{\beta _t}{2} \, \Vert \, t_i - D_i u \, \Vert _2^2 \right) \,\right\} .\nonumber \\ \end{aligned}$$
(3.15)

Note that in (3.14) we omitted the constant terms while in (3.15) the functional is written in an explicit component-wise form. The minimization in (3.15) is equivalent to the following \(d^2\) problems:

$$\begin{aligned}&t_i^* {\leftarrow }\, \mathrm {arg} \min _{t_i \in {\mathbb R}^2} \left\{ \, \Vert \, t_i \, \Vert _2 \,{-}\, \langle \, \lambda _{t,i} \, , \, t_i \, \rangle \,{+}\, \frac{\beta _t}{2} \left\| \, t_i - D_i u \, \right\| _2^2 \,\right\} ,\nonumber \\&\quad i = 1, \ldots , d^2. \end{aligned}$$
(3.16)

Based on Proposition 3.1 and setting

$$\begin{aligned} q_i \,\,{:=}\,\, D_i u + \frac{1}{\beta _t} \lambda _{t,i} , \quad i = 1, \ldots , d^2, \end{aligned}$$
(3.17)

the solution of (3.16) is given explicitly by the following \(d^2\) operations:

$$\begin{aligned} t_i^* = \max \left\{ \Vert \, q_i \, \Vert _2 - \frac{1}{\beta _t} \, , \, 0 \right\} \frac{q_i}{\Vert \, q_i \, \Vert _2} \, , \quad i = 1, \ldots , d^2 .\nonumber \\ \end{aligned}$$
(3.18)

where \(0 \cdot ( 0 / 0) = 0\) is assumed. We notice that the computational cost of (3.17)–(3.18) is linear in the number of pixels \(d^2\).
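
The update is easily vectorized over all pixels. The following NumPy sketch (ours) applies (3.17)–(3.18), with the convention \(0 \cdot (0/0) = 0\) enforced by a safe division.

```python
import numpy as np

def update_t(Du, lam_t, beta_t):
    """Shrinkage solution (3.17)-(3.18) of the t sub-problem.

    Du, lam_t: (d*d, 2) arrays whose i-th rows hold (D_{x,i}u, D_{y,i}u)
    and lambda_{t,i}, respectively.
    """
    q = Du + lam_t / beta_t                        # q_i of (3.17)
    qn = np.linalg.norm(q, axis=1, keepdims=True)  # ||q_i||_2
    return np.maximum(qn - 1.0 / beta_t, 0.0) * q / np.where(qn > 0, qn, 1.0)
```

The same routine, with the per-component threshold \(\alpha_i / \kappa_i\) and \(q_i = \rho_i / \kappa_i\) of the next subsection, also yields the \(f\) update (3.27).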

3.2 Solving the Sub-problem for \(f\)

The minimization sub-problem for \(f\) in (3.6) is as follows:

$$\begin{aligned} f^*&{\leftarrow } \mathrm {arg} \, \min _{f \in Q}\, \Big \{ \, \,{-}\, \langle \, \lambda _{f} , f - F(K u - g) \, \rangle \nonumber \\&{-}\langle \, \lambda _{p} , p - \Pi M |f| \, \rangle \,\,{+}\, \frac{\beta _f}{2} \, \Vert \, f - F(K u - g) \, \Vert _2^2 \,\nonumber \\&{+}\, \frac{\beta _p}{2} \, \Vert \, p - \Pi M |f| \, \Vert _2^2 \,\Big \} \nonumber \\&{\leftarrow } \mathrm {arg} \, \min _{f \in Q}\, \Big \{ \, \,{-}\, \langle \, \lambda _{f} , f \, \rangle \,{+}\, \langle \, \lambda _{p} , \Pi M |f| \, \rangle \nonumber \\&{+}\, \frac{\beta _f}{2} \, \Vert \, f - F(K u - g) \, \Vert _2^2 \,{+}\, \frac{\beta _p}{2} \, \Vert \, p- \Pi M |f| \, \Vert _2^2\,\Big \} \nonumber \\ \end{aligned}$$
(3.19)
$$\begin{aligned}&{\leftarrow } \mathrm {arg} \, \min _{f \in Q}\, \Big \{ \, \,{-}\, \langle \, \lambda _{f} , f \, \rangle \,{+}\, \langle \, \Pi ^T \lambda _{p} , M |f| \, \rangle \nonumber \\&{+}\, \frac{\beta _f}{2} \, \Vert \, f \!-\! F(K u \!-\! g) \, \Vert _2^2 \,{+}\, \frac{\beta _p}{2} \, \Vert \, \Pi ^T p\!-\! M |f| \, \Vert _2^2 \Big \} .\nonumber \\ \end{aligned}$$
(3.20)

In (3.19) we omitted the constant terms, while in (3.20) we rearranged the second and fourth terms of the functional by exploiting the orthogonality of \(\Pi \).

To simplify notation, we denote by \(\tilde{\lambda }_p = \Pi ^T \lambda _p\) and \(\tilde{p} = \Pi ^T p\) the inversely-permuted versions of the vectors \(\lambda _p\) and \(p\), respectively, by \(M_i\) the \(i\)-th element on the main diagonal of \(M\) in (2.10), and by \(w_i = F_i(Ku - g)\) the two-dimensional vector containing the real and imaginary parts of the \(i\)-th component of the 2D-DFT of the residue image. The component-wise form of (3.20) thus reads as follows:

$$\begin{aligned} f^*&{\leftarrow } \mathrm {arg} \, \min _{f \in Q}\, \sum _{i=1}^{d^2} \Big \{ \, \,{-}\, \langle \, \lambda _{f,i} , f_i \, \rangle \,{+}\, \tilde{\lambda }_{p,i} M_i \left\| \, f_i \, \right\| _2 \nonumber \\&{+}\, \frac{\beta _f}{2} \, \Vert \, f_i - w_i \, \Vert _2^2 \,{+}\, \frac{\beta _p}{2} \,\, \left( \tilde{p}_i - M_i \left\| \, f_i \, \right\| _2 \right) ^2 \, \Big \} \nonumber \\\end{aligned}$$
(3.21)
$$\begin{aligned}&{\leftarrow } \mathrm {arg} \, \min _{f \in Q}\, \sum _{i=1}^{d^2} \Big \{ \, \,{-}\, \langle \, \lambda _{f,i} , f_i \, \rangle \,{+}\, \tilde{\lambda }_{p,i} M_i \left\| \, f_i \, \right\| _2 \nonumber \\&{+}\, \frac{\beta _f}{2} \, \Big ( \, \Vert \, f_i \, \Vert _2^2 \,{+}\, \Vert \, w_i \, \Vert _2^2 \,{-}\, 2 \langle \, f_i \, , \, w_i \, \rangle \, \Big ) \nonumber \\&{+}\, \frac{\beta _p}{2} \, \Big ( \, \tilde{p}_i^2 + M_i^2 \left\| \, f_i \, \right\| _2^2 - 2 \tilde{p}_i M_i \left\| \, f_i \, \right\| _2 \, \Big ) \, \Big \} \end{aligned}$$
(3.22)
$$\begin{aligned}&{\leftarrow } \mathrm {arg} \, \min _{f \in Q}\, \sum _{i=1}^{d^2} \Big \{ \, M_i \big ( \, \tilde{\lambda }_{p,i} \,{-}\, \beta _p \tilde{p}_i \, \big ) \left\| \, f_i \, \right\| _2 \nonumber \\&{-}\,\langle \, \lambda _{f,i} + \beta _f w_i , f_i \, \rangle {+}\, \frac{\beta _f + M_i^2 \beta _p}{2} \, \Vert \, f_i \, \Vert _2^2 \, \Big \} ,\nonumber \\ \end{aligned}$$
(3.23)

where the constant terms have been omitted in (3.23). The minimization in (3.23) is equivalent to the following \(d^2\) problems:

$$\begin{aligned}&f_i^* {\leftarrow }\, \mathrm {arg} \min _{f_i \in {\mathbb R}^2} \left\{ \, \alpha _i \Vert \, f_i \, \Vert _2 \,{-}\, \langle \, \rho _i \, , \, f_i \, \rangle \,{+}\, \frac{\kappa _i}{2} \left\| \, f_i \, \right\| _2^2 \,\right\} \, , \nonumber \\&\,\; i = 1, \ldots , d^2 \, , \end{aligned}$$
(3.24)

where the constants \(\alpha _i \in {\mathbb R}\), \(\rho _i \in {\mathbb R}^2\) and \(\kappa _i \in {\mathbb R}\) in (3.24) are defined as

$$\begin{aligned}&\left\{ \begin{array}{rcl} \alpha _i &{} = &{} M_i ( \tilde{\lambda }_{p,i} - \beta _p \tilde{p}_i ) \\ \rho _i &{} = &{} \lambda _{f,i} + \beta _f w_i\\ \kappa _i &{} = &{} \beta _f + \beta _p M_i^2 \end{array} \right. , \quad i = 1, \ldots , d^2 \, . \end{aligned}$$
(3.25)

Similarly to the solution of (3.16), based on Proposition 3.1 and setting

$$\begin{aligned} q_i \,\,{:=}\,\, \frac{1}{\kappa _i} \, \rho _i \,\, , \quad i = 1, \ldots , d^2 \, , \end{aligned}$$
(3.26)

the solution of (3.24) is given explicitly by the following \(d^2\) operations:

$$\begin{aligned} f_i^* = \max \left\{ \Vert \, q_i \, \Vert _2 - \frac{\alpha _i}{\kappa _i} \, , \, 0 \right\} \frac{q_i}{\Vert \, q_i \, \Vert _2} \, , \quad i = 1, \ldots , d^2. \nonumber \\ \end{aligned}$$
(3.27)

where \(0 \cdot ( 0 / 0) = 0\) is assumed. The computational cost of (3.25)–(3.27) is dominated by the 2D-DFT of the residue image \(F(K u - g)\) required to compute \(w_i\).

3.3 Solving the Sub-problem for \(p\)

Given \(f\) and \(c\), the minimization sub-problem for \(p\) in (3.7) is as follows:

$$\begin{aligned} p^*&{\leftarrow } \mathrm {arg} \, \min _{p \in V}\, \left\{ \, \,{-}\, \langle \, \lambda _{p} , p - \Pi M |f| \, \rangle \,{-}\, \langle \, \lambda _{c} , c - S p \, \rangle \right. \nonumber \\&\left. \,{+}\, \frac{\beta _p}{2} \, \Vert \, p - \Pi M |f| \, \Vert _2^2 \,{+}\, \frac{\beta _c}{2} \, \Vert \, c - S p \, \Vert _2^2\,\right\} \nonumber \\&{\leftarrow } \mathrm {arg} \, \min _{p \in V}\, \left\{ \, \,{-}\, \langle \, \lambda _{p} , p \, \rangle \,{+}\, \langle \, \lambda _{c} , S p \, \rangle \right. \nonumber \\&\left. \,{+}\, \frac{\beta _p}{2} \, \Vert \, p - \Pi M |f| \, \Vert _2^2 \,{+}\, \frac{\beta _c}{2} \, \Vert \, c - S p \, \Vert _2^2 \,\right\} .\nonumber \\ \end{aligned}$$
(3.28)

The problem (3.28) is a quadratic optimization problem whose optimality conditions read as follows:

$$\begin{aligned} \,{-}\, \lambda _p \,{+}\, S^T \lambda _c \,{+}\, \beta _p \big ( p - \Pi M | f | \big ) \,{-}\, \beta _c S^T \big ( c - S p \big ) = 0\nonumber \\ \end{aligned}$$
(3.29)

that is

$$\begin{aligned} \bigg ( S^T S + \frac{\beta _p}{\beta _c} I \bigg ) p = \frac{1}{\beta _c} \, \Big ( \lambda _p \,{+}\, \beta _p \Pi M | f | \,{+}\, S^T \big ( \beta _c c \!-\! \lambda _c \big ) \Big ).\!\!\!\nonumber \\ \end{aligned}$$
(3.30)

Recalling that \(S\) is a lower triangular matrix with all nonzero entries equal to one, the expressions for the \(d^2 \times d^2\) matrix \(S^T S\) and its inverse \(\big (S^T S\big )^{-1}\) are quite straightforward to derive:

$$\begin{aligned} S^T \! S = \left[ \begin{array}{cccccc} d^2 & \ldots & 4 & 3 & 2 & 1 \\ \vdots & \ddots & \vdots & \vdots & \vdots & \vdots \\ 4 & \ldots & 4 & 3 & 2 & 1 \\ 3 & \ldots & 3 & 3 & 2 & 1 \\ 2 & \ldots & 2 & 2 & 2 & 1 \\ 1 & \ldots & 1 & 1 & 1 & 1 \end{array} \right] , \quad \big (S^T \! S\big )^{-1} = \left[ \begin{array}{ccccccc} 1 & -1 & 0 & \ldots & \ldots & \ldots & 0 \\ -1 & 2 & -1 & \ddots & & & \vdots \\ 0 & -1 & 2 & \ddots & 0 & & \vdots \\ \vdots & \ddots & \ddots & \ddots & \ddots & \ddots & \vdots \\ \vdots & & 0 & \ddots & 2 & -1 & 0 \\ \vdots & & & \ddots & -1 & 2 & -1 \\ 0 & \ldots & \ldots & \ldots & 0 & -1 & 2 \end{array}\right] \nonumber \\ \end{aligned}$$
(3.31)

Since \(S^T S\) is symmetric positive definite and \(\frac{\beta _p}{\beta _c} > 0\), the coefficient matrix of the linear system in (3.30) is symmetric positive definite. Moreover, to exploit the sparsity of the matrix \(\big ( S^T S \big )^{-1} \) both sides of (3.30) are multiplied by \(\frac{\beta _c}{\beta _p}\big ( S^T S \big )^{-1}\) thus obtaining the following equivalent linear system:

$$\begin{aligned}&\bigg ( \big ( S^T S \big )^{-1} \,{+}\, \frac{\beta _c}{\beta _p} I \bigg ) p \,\;\nonumber \\&\quad = \frac{1}{\beta _p} \, S^{-1} \Big ( S^{-T} \big ( \lambda _p \,{+}\, \beta _p \Pi M | f | \big ) \,{+}\, \beta _c c - \lambda _c \Big ) \,. \end{aligned}$$
(3.32)

The coefficient matrix in (3.32) is tridiagonal, symmetric positive definite and constant over the iterations, hence its bidiagonal Cholesky factorization can be computed as a preliminary step. At each iteration, the computational cost for solving (3.32) is twice the cost of solving a bidiagonal system.
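
A sketch of this preliminary factorization, using SciPy's banded Cholesky routines (our illustrative choice of solver, not the original implementation):

```python
import numpy as np
from scipy.linalg import cholesky_banded, cho_solve_banded

def factor_p_matrix(d2, beta_p, beta_c):
    """Banded Cholesky factor of the coefficient matrix of (3.32).

    (S^T S)^{-1} + (beta_c/beta_p) I is tridiagonal SPD, with diagonal
    (1, 2, ..., 2) + beta_c/beta_p and off-diagonals -1; it is constant
    over the iterations, so it is factorized once.
    """
    r = beta_c / beta_p
    ab = np.zeros((2, d2))   # upper banded storage expected by SciPy
    ab[0, 1:] = -1.0         # superdiagonal
    ab[1, :] = 2.0 + r       # diagonal ...
    ab[1, 0] = 1.0 + r       # ... whose first entry is 1 + beta_c/beta_p
    return cholesky_banded(ab)

# at each iteration, with rhs the right-hand side of (3.32):
#   p = cho_solve_banded((factor, False), rhs)
```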

3.4 Solving the Sub-problem for \(c\)

Given \(p\), the minimization sub-problem for \(c\) in (3.8) is as follows:

$$\begin{aligned} c^*&{\leftarrow } \mathrm {arg} \, \min _{c \in V}\, \left\{ \, \imath _{B} (c) \,{-}\, \langle \, \lambda _{c} , c - S p \, \rangle \,{+}\, \frac{\beta _c}{2} \, \Vert \, c - S p \, \Vert _2^2 \,\right\} \nonumber \\&{\leftarrow } \mathrm {arg} \, \min _{c \in B}\, \left\{ \, \frac{\beta _c}{2} \left\| c - \left( S p + \frac{\lambda _c}{\beta _c} \right) \right\| _2^2 \,\right\} \, . \end{aligned}$$
(3.33)

The solution of (3.33) is thus given by a simple Euclidean projection of the vector \(Sp + \lambda _c / \beta _c \) onto the box-constraints defined by the whiteness set \(B\) in (2.14):

$$\begin{aligned} c^* \,=\, \mathcal {P}_{B} \! \left( S p + \frac{\lambda _c}{\beta _c} \right) \, . \end{aligned}$$
(3.34)

Such a projection can be obtained straightforwardly by computing the following \(d^2\) component-wise projections:

$$\begin{aligned} c^*_i&= \mathrm {max} \Bigg \{ \mathrm {min} \left\{ \left( S_i p + \frac{\lambda _{c,i}}{\beta _c} \right) \, , \, b_{i}^+ \right\} \, , \, b_{i}^- \Bigg \},\nonumber \\&i = 1, \ldots , d^2 \, , \end{aligned}$$
(3.35)

where \(S_i p\) is the \(i\)-th component of the vector of partial sums of \(p\), and where \(b_{i}^+,b_{i}^- \in {\mathbb R}\) represent the selected box-constraint limits of the \(i\)-th component of the NCP vector.

The computational complexity of this sub-problem is linear in the number of pixels \(d^2\). In fact, the matrix-vector product \(S p\) in (3.34) can be efficiently obtained by computing the partial sums of the vector \(p\) incrementally.
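
In NumPy terms, the whole \(c\) update reduces to a cumulative sum followed by a clip (a sketch under our naming):

```python
import numpy as np

def update_c(p, lam_c, beta_c, b_minus, b_plus):
    """c update (3.34)-(3.35): projection onto the box B of (2.14)."""
    v = np.cumsum(p) + lam_c / beta_c   # S p + lambda_c / beta_c, O(d^2)
    return np.clip(v, b_minus, b_plus)  # component-wise projection (3.35)
```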

3.5 Solving the Sub-problem for \(u\)

The minimization sub-problem for \(u\) in (3.9) can be re-written as follows:

$$\begin{aligned} u^*&{\leftarrow }\mathrm {arg} \min _{u \in V}\, \Big \{ \, \,{-}\, \langle \, \lambda _{t} , t - D u \, \rangle \,\,\;{-}\, \langle \, \lambda _{f} , f - F(K u - g) \, \rangle \nonumber \\&{+}\, \frac{\beta _t}{2} \, \Vert \, t - D u \, \Vert _2^2 \,{+}\, \frac{\beta _f}{2} \, \Vert \, f - F(K u - g) \, \Vert _2^2\,\Big \} \nonumber \\&{\leftarrow } \mathrm {arg} \, \min _{u \in V}\, \Big \{ \, \,{+}\, \langle \, \lambda _{t} , D u \, \rangle \,\,\;{+}\, \langle \, \lambda _{f} , F(K u) \, \rangle \nonumber \\&{+}\, \frac{\beta _t}{2} \, \Vert \, t - D u \, \Vert _2^2 \,{+}\, \frac{\beta _f}{2} \, \Vert \, f - F(K u - g) \, \Vert _2^2 \,\Big \} .\nonumber \\ \end{aligned}$$
(3.36)

The problem (3.36) is a quadratic optimization problem whose optimality conditions read as follows:

$$\begin{aligned} D^T \lambda _t + K^T F^T \lambda _f - \beta _t D^T (t - D u)\nonumber \\ - \beta _f K^T F^T \big ( f - F (K u - g) \big ) = 0 \end{aligned}$$
(3.37)

that is

$$\begin{aligned}&\left( \beta _t D^T D +\beta _f K^T F^T F K \right) u\nonumber \\&\quad = -D^T \lambda _t- K^T F^T \lambda _f + \beta _t D^T t + \beta _f K^T F^T (f + F g)\nonumber \\ \end{aligned}$$
(3.38)

By dividing by \(\beta _t\) and using \(F^TF=I\), we obtain

$$\begin{aligned}&\left( D^T D + \frac{\beta _f}{\beta _t} K^T K \right) u \nonumber \\&\quad = D^T \left( t - \frac{1}{\beta _t} \lambda _t \right) + \frac{\beta _f}{\beta _t} K^T \left( g + F^T \left( f - \frac{1}{\beta _f} \lambda _f \right) \right) .\nonumber \\ \end{aligned}$$
(3.39)

The coefficient matrix of the linear system (3.39) does not change with the iterations. Moreover, since under periodic boundary conditions for \(u\) both \(D^T D\) and \(K^T K\) are block circulant matrices with circulant blocks, the coefficient matrix can be diagonalized once and for all by the 2D-DFT using the FFT implementation. Therefore, at each iteration the linear system (3.39) can be solved by one forward FFT and one inverse FFT, each at a cost of \(O(d^2 \log d)\). For the computation of the right-hand side, we point out that the matrix-vector product \(F^T x\) can be obtained by taking the real part of the inverse FFT of the vector \(x\).
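
A sketch of the FFT-based solve (ours, assuming periodic boundary conditions and forward-difference operators; the hypothetical argument k_hat denotes the 2-D DFT of the circularly shifted blur kernel, i.e. the eigenvalues of \(K\)):

```python
import numpy as np

def update_u(rhs, k_hat, beta_t, beta_f):
    """FFT solve of the u sub-problem linear system (3.39).

    rhs: (d, d) right-hand side of (3.39). The eigenvalues of
    D^T D + (beta_f/beta_t) K^T K are assembled in the Fourier domain;
    each call costs one forward and one inverse FFT, i.e. O(d^2 log d).
    """
    d = rhs.shape[0]
    w = np.abs(1.0 - np.exp(-2j * np.pi * np.arange(d) / d))**2
    lam_D = w[None, :] + w[:, None]  # eigenvalues of D^T D (periodic BC)
    lam = lam_D + (beta_f / beta_t) * np.abs(k_hat)**2
    return np.real(np.fft.ifft2(np.fft.fft2(rhs) / lam))
```

Note that for a normalized blur kernel the \((0,0)\) eigenvalue of \(K^T K\) is nonzero, so the division is well defined everywhere.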

3.6 ADMM Iterative Scheme

To solve the proposed whiteness constrained minimization problem (3.2), we use the ADMM iterative scheme reported in Algorithm 1.

Algorithm 1 ADMM-based iterative scheme (3.5)–(3.10) for the TV-W model (3.2)

The overall computational cost of the algorithm is dominated by step 5, which computes the \(u\) update by solving the linear system (3.39) of dimension \(d^2\). However, we should notice that the cost of steps 2, 3 and 4 is strongly reduced in our implementation thanks to the conjugate symmetry of the 2D-DFT of real images. In particular, the dimension of the variables \(f,p,c\) is reduced from \(d^2\) to \(s \simeq \frac{d^2}{2}\). Clearly, in step 5, we then need to expand the \(f\) variable to cover the entire frequency range according to this symmetry.
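
The dimension reduction follows from the conjugate symmetry of the DFT of real images, as the following toy snippet (ours) illustrates:

```python
import numpy as np

d = 8
x = np.random.randn(d, d)
full = np.fft.fft2(x)    # d*d complex coefficients
half = np.fft.rfft2(x)   # d*(d//2+1): the redundant half is dropped
print(full.shape, half.shape)  # (8, 8) (8, 5)
```

Steps 2, 3 and 4 can thus work on \(s \simeq d^2/2\) components, and the full spectrum is recovered by Hermitian symmetry before the solve in step 5.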

4 Numerical Examples

In this section we first discuss the approach we followed to select the limits \(b^+,b^-\) for the box-constraint in (2.14), then we present experimental results to test the proposed algorithm.

For the selection of the whiteness set limits \(b^+,b^- \in {\mathbb R}^{s}\), we propose the following simple Monte Carlo approach. Given a standard deviation \(\sigma \) for the noise and a selected image dimension \(d\), we generate a large number of \(d \times d\) images containing different realizations of zero-mean white Gaussian noise with standard deviation \(\sigma \). For each realization, we compute the NCP vector \(c\) according to the definition in (2.11). Finally, we compute the component-wise minimum (\(b^-\)) and maximum (\(b^+\)) of the NCP vectors over all the realizations. In Fig. 2 we show the results obtained by using four different image dimensions \(d = 10, 20, 50, 100\). It is also worth pointing out that, thanks to the normalization adopted in the computation of the NCP vector, see (2.10), the upper and lower limits \(b^+,b^-\) do not depend on the standard deviation \(\sigma \) of the noise.
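
A sketch of this procedure, reusing the ncp() sketch given in Sect. 2 (both are our illustrative code, not the original implementation):

```python
import numpy as np

def whiteness_band(d, sigma=4.0, n_trials=10000, seed=0):
    """Monte Carlo estimate of the box limits b^-, b^+ of (2.14).

    Thanks to the normalization (2.10), the band does not depend on sigma.
    """
    rng = np.random.default_rng(seed)
    curves = np.stack([ncp(sigma * rng.standard_normal((d, d)), sigma)
                       for _ in range(n_trials)])
    return curves.min(axis=0), curves.max(axis=0)  # component-wise extrema
```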

Fig. 2 Some results of the Monte Carlo simulations: in dashed blue we depict the computed upper and lower limits \(b^+,b^- \in {\mathbb R}^{s}\) of the whiteness box-constraints on the NCP vector \(c \in {\mathbb R}^{s}\) for zero-mean white Gaussian noise with standard deviation \(\sigma = 4\), obtained for \(d \times d\) images with dimension a \(d = 10\), b \(d = 20\), c \(d = 50\), d \(d = 100\). In solid red the theoretical whiteness NCP vector \(c_w\) defined in (2.12) is shown (Color figure online)

We notice that the band gets narrower as the image dimension \(d\) increases. In the limit, as \(d\) tends to infinity, the NCP vector \(c\) converges to the theoretical whiteness NCP vector \(c_w\) defined in (2.12).

In the rest of this section we evaluate the performance of the proposed restoration algorithm when applied to images synthetically corrupted by blur and white Gaussian noise.

We compare the proposed algorithm, referred to as TV-W, with the well-known Rudin-Osher-Fatemi (ROF) model [16], based on the minimization of the TV-L\(_2\) functional (1.3). The TV-L\(_2\) approach is implemented by the ADM, a variant of the classic augmented Lagrangian method for structured optimization which reformulates a TV problem as a linear equality constrained problem. The ADM algorithm is stable, efficient and, in particular, faster than most state-of-the-art restoration algorithms [19]. The software package is freely available at http://www.caam.rice.edu/~optimization/L1/ftvd/v4.0/ and described in detail in [19].

For the proposed TV-W model, we used the ADMM minimization procedure illustrated in Algorithm 1, which requires a number of tuning parameters. We used the following parameter setting: \(\beta _t = \beta _c=\beta _p =\beta _f = 10\), \(\gamma = 1.618\), which is a fairly standard choice. In fact, we observed that the restoration results are relatively insensitive to the choice of these parameters. However, for faster convergence the \(\beta _t\) parameter should be tuned towards higher values.

In the extensive experiments we ran to test the iterative algorithms for TV-L\(_2\) and TV-W, we noticed that after a few iterations both approach a stable solution; thus we report in this section the results of the two algorithms obtained as soon as the relative difference between two successive iterates satisfies \(\, e^k:= \Vert u^k - u^{k-1} \Vert _{2} \,{/}\, \Vert u^{k-1}\Vert _{2} \,\,{<}\,\,10^{-4}\), or after a maximum of 100 iterations.

For a fair comparison between the TV-L\(_2\) and TV-W algorithms, which are both implemented as constrained models, we set \(\tau =1\) both in (1.6) and in (2.10), where \(\sigma \) is the available standard deviation of the noise.

The accuracy of the methods is evaluated by the improved signal-to-noise ratio (ISNR) and the blurred signal-to-noise ratio (BSNR), defined by

$$\begin{aligned}&\text{ ISNR }(u^*,u)= 10\log _{10}\frac{\Vert g-u\Vert _2^2}{\Vert u^*-u\Vert _2^2}\,\text{ dB },\nonumber \\&\quad \text{ BSNR }(g,n)= 10\log _{10}\frac{\Vert g\Vert _2^2}{\Vert n\Vert _2^2}\,\text{ dB }, \end{aligned}$$
(4.1)

where \(u^* \,{\in }\, {{\mathbb R}}^{d^{2}}\) is the computed estimate of the uncorrupted image \(u \,{\in }\, {{\mathbb R}}^{d^{2}}\). The ISNR quantity provides a quantitative measure of the improvement in the quality of the restored image: a high ISNR value indicates that \(u^*\) is an accurate approximation of \(u\).
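
Both figures of merit are one-liners (a sketch under our naming):

```python
import numpy as np

def isnr(u_star, u, g):
    """ISNR of (4.1), in dB: improvement of u_star over the corrupted g."""
    return 10.0 * np.log10(np.sum((g - u)**2) / np.sum((u_star - u)**2))

def bsnr(g, n):
    """BSNR of (4.1), in dB: energy of g relative to that of the noise n."""
    return 10.0 * np.log10(np.sum(g**2) / np.sum(n**2))
```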

The corrupted images taken as input by the algorithms have been obtained as follows. First, the original \(d \times d\) image \(u\) is blurred by a Gaussian kernel characterized by two parameters, band and sigma: the former specifies the half-bandwidth of the Toeplitz blocks, the latter the standard deviation of the Gaussian point spread function; the larger sigma is, the stronger the blurring. The kernel is generated through the MATLAB command fspecial('Gaussian',band,sigma). Then, the blurred image \(K u\) is corrupted by additive zero-mean white Gaussian noise with standard deviation \(\sigma \).
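
A Python counterpart of this setup (a sketch: SciPy's gaussian_filter with mode='wrap' stands in for the Toeplitz-block operator built by fspecial, with band corresponding to the kernel support, controlled here by the truncate parameter):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def degrade(u, blur_sigma, noise_sigma, seed=0):
    """Synthesize g = K u + n as in (1.1)."""
    rng = np.random.default_rng(seed)
    Ku = gaussian_filter(u, sigma=blur_sigma, mode='wrap')  # blur: K u
    return Ku + noise_sigma * rng.standard_normal(u.shape)  # add white noise
```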

Example 1. We consider the restoration of three different images: barbara (\(d=512\)), skyscraper (\(d=256\)) and geometry (\(d=256\)), which present interesting mixtures of textures, flat regions and shaded areas; the noise-free versions of the images are depicted in Figs. 3a, 4a, and 5a, respectively.

Fig. 3 Example 1. Restoration results obtained by the TV-L\(_2\) and TV-W models on the original barbara image degraded by Gaussian blur with \(\mathtt{band} = 5\), \(\mathtt{sigma} = 1.0\) and additive zero-mean white Gaussian noise with standard deviation \(\sigma = 5\)

Fig. 4 Example 1. Restoration results obtained by the TV-L\(_2\) and TV-W models on the original skyscraper image degraded by Gaussian blur with \(\mathtt{band} = 5\), \(\mathtt{sigma} = 1.5\) and additive zero-mean white Gaussian noise with standard deviation \(\sigma = 10\)

Fig. 5 Example 1. Restoration results obtained by the TV-L\(_2\) and TV-W models on the original geometry image degraded by Gaussian blur with \(\mathtt{band} = 7\), \(\mathtt{sigma} = 1.5\) and additive zero-mean white Gaussian noise with standard deviation \(\sigma = 10\)

In the first part of this example we compare the restorations obtained by the TV-L\(_2\) and TV-W algorithms for fixed additive Gaussian noise with standard deviation \(\sigma =5\) and variable Gaussian blur degradations, characterized by \(\mathtt{band}\) and \(\mathtt{sigma}\), which significantly worsen the ill-conditioning of the image restoration problems. Table 1 reports in the first and second columns the parameters \(\mathtt{band}\) and \(\mathtt{sigma}\) of the Gaussian blur kernel, while the third column, labeled BSNR, reports the BSNR-values of the available corrupted image \(g\). The columns labeled TV-L\(_2\) and TV-W display the ISNR-values for the restored images obtained by the two algorithms. For increasing \(\mathtt{band}\) and \(\mathtt{sigma}\) parameters, the textured parts of the images tend to disappear; consequently, the benefits of the whiteness constraint decrease, until the worst case with \(\mathtt{band}=7\) and \(\mathtt{sigma}=2.5\), where the ISNR-values for the two algorithms are almost the same.

Table 1 Example 1. ISNR values obtained by restoring the images barbara, skyscraper and geometry corrupted by additive zero-mean white Gaussian noise with fixed standard deviation \(\sigma = 5\) and Gaussian blur with increasing strength

In the second part of this example, we quantitatively evaluate the robustness of the restoration performance for a fixed Gaussian blur and increasing standard deviation \(\sigma \) of the additive noise when the restoration methods are applied to the same images. Table 2 shows the resulting ISNR values for \(\sigma \) ranging from 5 to 40. For each image we chose different blur intensities such that the textured parts are not completely filtered out and the effect of noise can be better highlighted.

Table 2 Example 1. ISNR values obtained by restoring the images barbara, skyscraper and geometry corrupted by fixed Gaussian blur and additive zero-mean white Gaussian noise with increasing standard deviation \(\sigma \)

In Figs. 3, 4 and 5 we display the degraded images and the results obtained by applying the two image restoration methods.

A visual inspection of the figures of this example, together with the results in Tables 1 and 2, allows us to conclude that by constraining the whiteness of the residual, we can go beyond the use of just the residual norm or the variance of the noise model, thus providing better restorations. In all the tables the highest ISNR values are in boldface.

A proof of theoretical convergence for the ADMM algorithm proposed for the TV-W model is beyond the scope of this paper and will be investigated in future work, since the convergence of ADMM with several blocks of variables is, to the best of our knowledge, still an open problem. However, in Fig. 6 we provide empirical evidence of the numerical convergence of the proposed algorithm for the example reported in the third row of Table 1 (band = 5, sigma = 1.0). In particular, in Fig. 6 the relative change of the approximate solution \(e^k\) is shown as a function of the iteration count \(k\). The same convergent behavior can be observed for all the other tested examples.

Fig. 6 Plot of the relative change of the approximate solution \(e^k\) computed by the proposed TV-W algorithm as a function of the iteration count \(k\) for the example reported in the third row of Table 1 (band = 5, sigma = 1.0)

Example 2. In this example we show the good performance of Algorithm 1 with whiteness constraints in recovering perturbed textured images with increasing size of the texture pattern. In [4] a strategy for the restoration of textured images was proposed, where a preliminary texture detection procedure was required to set the fractional order of the regularization. By contrast, in the TV-W method the textured parts are automatically handled by the constraint in the variational formulation.

Synthetic \(200 \times 200\) checkerboard images are created with increasing pattern size, as depicted in Fig. 7 (first column). The checkerboard with the finest pattern size can be considered a classical textured image, while for increasing pattern size an automatic texture detection algorithm might fail to classify the images as textured. Therefore, all those algorithms that take advantage of a preliminary detection of textured regions to improve the restoration would fail in these cases. Moreover, even for the checkerboard image with the largest pattern size, which is notoriously well recovered by the TV-\(L_2\) algorithm, the TV-W algorithm provides significantly better results.

Fig. 7 Restoration results obtained by the TV-L\(_2\) (third column) and TV-W (fourth column) models on the degraded images shown in the second column, obtained by corrupting the original images in the first column by a Gaussian blur with \(\mathtt{band} = 7\), \(\mathtt{sigma} = 1.5\), and additive white Gaussian noise with zero mean and \(\sigma =20\)

The checkerboard images corrupted by white Gaussian noise with \(\sigma =20\) and Gaussian blur with \(\mathtt{band}=7\) and \(\mathtt{sigma}=1.5\) are displayed in Fig. 7 (second column); the restored images obtained by the TV-L\(_2\) and TV-W algorithms are depicted in the third and fourth columns of Fig. 7, respectively.

In Table 3 we report the results of the restorations for the images depicted in Fig. 7 (second column). The initial BSNR values and the ISNR values obtained by each algorithm for different noise levels (\(\sigma =10,20\)) are reported. The ISNR values in Table 3 show that the TV-W algorithm significantly improves the restoration quality compared to TV-L\(_2\), and the improvement increases with the pattern size.

Table 3 Example 2. ISNR values obtained by restoring the synthetic checkerboard images, with different pattern sizes, corrupted by additive zero-mean white Gaussian noise of standard deviation \(\sigma \)

As previously stated, in all the experiments reported, we used the parameter value \(\tau =1\) for both approaches, that is we imposed that the variance of the residue image is equal to the true noise variance.

However, it is known that using the discrepancy principle with \(\tau =1\) in the TV-L\(_2\) model tends to lead to over-regularized solutions. Therefore, the value of the \(\tau \) parameter is usually adjusted for better performance.

In Table 4 the examples shown in Table 3 have been rerun with the \(\tau \) parameters of both approaches tuned. The corresponding \(\tau \) values are reported in the fourth and sixth columns for TV-L\(_2\) and TV-W, respectively.

Table 4 Example 2. ISNR values obtained by restoring the synthetic checkerboard images, with different pattern sizes, corrupted by additive zero-mean white Gaussian noise of standard deviation \(\sigma \), using properly tuned parameter values \(\tau \)

We observe that, after parameter tuning of both approaches, the proposed model leads to better results than the TV-L\(_2\) model; moreover, it is worth noticing that the optimal \(\tau \) values are less than \(1\) for TV-L\(_2\), as expected, while they are greater than \(1\) in our model.

5 Conclusions

We have presented a new variational model for the restoration of images corrupted by blur and zero-mean white Gaussian noise. The proposed model takes advantage of the residual components, rather than only the residual norm, to determine when the approximate solution is adequately separated from the noise. In particular, the whiteness of the residual image has been integrated into the model as a new constraint on the solution. The constrained TV-W model has been efficiently solved by combining the effective ADMM method with a carefully chosen variable splitting strategy. Experimental comparison with the classical TV-L\(_2\) model for image restoration demonstrates the effectiveness of the proposed model, in particular for textured images. However, we notice that the present implementation is not as optimized as the downloaded TV-L\(_2\) code; therefore, TV-W presents an overhead with respect to TV-L\(_2\) of between \(30\,\%\) and \(50\,\%\). The method can also be used in conjunction with other fidelity terms or different regularization operators, provided that the noise is not correlated. Future work will investigate how this could improve the overall performance. Moreover, the present approach could easily be extended to non-white noise perturbations for which the periodogram curve is known or can be inferred, without any particular modifications of the algorithm.