1 Introduction

Image denoising is a fundamental low-level computer vision task with a long history. In this paper, we focus on the classical problem of removing additive white Gaussian noise. Mathematically, the observed noisy image can be modeled as \(g=f+n\), where f is the latent clean image and n is zero-mean Gaussian noise. To restore f from g, a great number of methods have been proposed over the past several decades.

To preserve the discontinuities of f, total variation (TV) regularization was proposed in [1]. The bounded variation space admits piecewise-constant functions, so TV regularization is very effective for denoising cartoon-like images. However, small structures such as textures cannot be distinguished from noise by TV, and these repetitive structures may be removed together with the noise. To distinguish textures from noise, self-similarity information was introduced into denoising methods. The nonlocal means method [2] averaged pixels according to the self-similarity of image blocks and greatly improved texture preservation. Nonlocal means triggered the study of self-similarity in recent years, and a variety of methods based on different mathematical tools were designed, such as nonlocal TV [3], block nonlocal TV [4], BM3D [5] and so on.

Sparsity regularization has been an active research topic in recent years. The key assumption of sparsity methods is that the image signal can be represented sparsely under a suitable basis. Usually, the basis vectors or functions are chosen from well-known orthogonal systems such as the FFT, DCT, wavelets, SVD and so on. One of the representative sparsity regularization methods is BM3D [5]. In this method, similar image blocks are stacked together into a 3-D array to reinforce the image information and its sparsity. With suitably chosen basis vectors, this 3-D array has a very sparse representation, and each image block can then be estimated by thresholding the sparse coefficients. BM3D produces state-of-the-art denoising results thanks to nonlocal self-similarity. However, the method has two main flaws. The first is that the basis vectors are fixed, so some image blocks may not admit a sufficiently sparse representation; we design a numerical experiment below to show this. The second is that artificial ringing effects appear in the restoration due to the block stacking. In fact, these artifacts can be reduced by using a global TV regularization.

Low-rank methods for image denoising have received much attention in recent years. It is not difficult to observe that a matrix whose columns are vectorized nonlocal similar patches of a natural image is approximately low-rank. By integrating this self-similarity property, such methods can produce good restorations [6]. In fact, low-rank regularization is a variant of sparsity regularization and is associated with an \(l_0\) minimization problem on a matrix's singular values; in low-rank methods, the basis vectors are given by the SVD of a matrix. However, the general low-rank problem is nonconvex and difficult to optimize. A good approximation of the rank is the nuclear norm, defined as the sum of the singular values of a matrix, which can be minimized easily by a singular value thresholding method [7]. It is well known that the \(l_1\) norm is the tightest convex relaxation of \(l_0\). Thus, in matrix completion, nuclear norm minimization can be regarded as a convex relaxation of the low-rank problem, since the nuclear norm is the \(l_1\) norm of the vector of a matrix's singular values. Many methods such as reweighting [8], the truncated nuclear norm [9], the weighted nuclear norm [10], the Schatten norm [11] and others [12] have been proposed to enhance sparsity. However, these methods only strengthen the sparsity and do not consider the basis. In fact, a proper system of basis functions is very important for a sparse representation [13].

In this paper, we propose a sparsity and TV regularization method based on local SVD operators. It is developed by reformulating the local nuclear norm denoising method [6] as a variational problem, which is easily extended to many other problems. In fact, local image block processing is closely related to the Kronecker product of matrices; by vectorization, we obtain local SVD based operators. These local SVD operators help us find good basis functions and obtain sparse and redundant representations of image blocks. To reduce the artificial ringing caused by block processing, a global TV regularization is integrated into the cost functional. The proposed minimization problem can be solved efficiently by splitting schemes. Experimental results show that the method provides impressive denoising results: it outperforms BM3D, often regarded as a state-of-the-art denoising algorithm, both in PSNR and visually.

The main contributions of this paper are as follows:

  • We build a variational formulation for the block-matching-based low-rank method. This new formulation can be easily extended to many other image processing problems such as deblurring.

  • We introduce TV into the block-matching-based method and integrate it naturally into the cost functional.

  • We propose a splitting optimization algorithm for denoising and achieve state-of-the-art performance. The convergence of the proposed algorithms can be obtained under some mild conditions.

Let us point out that the proposed method is self-contained and does not need training, which distinguishes it from data-driven machine learning algorithms such as convolutional neural network based techniques [14,15,16].

The rest of the paper is organized as follows: In Sect. 2, we will briefly introduce related work such as BM3D and some SVD based methods. We present our proposed model in Sect. 3. In Sect. 4, the optimization scheme to solve the proposed model and some of the related details will be described. The convergence results are contained in Sect. 5. Section 6 includes some experimental results and comparisons with related methods. Finally, we will conclude the paper in Sect. 7.

2 Related Work

2.1 Notations

We first fix some notation. Throughout this paper, matrices are written in bold, such as \(\varvec{U}, \varvec{V}, \varvec{P}\), and lowercase letters stand for column vectors. Let \(f,g\in \mathbb {R}^{N}\) be images. We sometimes use a lowercase letter f or \(vec(\varvec{F})\) to denote the column vector obtained by stacking the columns of a matrix \(\varvec{F}\); the inverse operator of vec is denoted array(f), i.e., \(array(f)=\varvec{F}\). Superscripts such as ij in \(\varvec{P}^{ij}\) index different matrices. As in [13], \(\varvec{P}^{ij}\) is an extraction matrix, each row of which is zero except for a single entry equal to 1. The symbol \(\otimes \) stands for the Kronecker product.

2.2 BM3D and Low Rank

The BM3D method [5] is a well-known denoising method and has become a baseline for testing the performance of denoising algorithms. BM3D consists of the following steps. The first step is block matching: for each image block located at j, similar image blocks of size \(\sqrt{n}\times \sqrt{n}\) are collected into a group with \(I_j\) members, and the blocks in each group are stacked together to form a \(\sqrt{n}\times \sqrt{n}\times I_j\) 3-D data array. The second step is sparsity regularization: the 3-D arrays are decorrelated by an invertible 3-D transform such as the DCT and then filtered by thresholding. Finally, the restoration is obtained by aggregating all the estimated image patches. These steps were formulated as a variational problem by Danielyan et al. in [17], from which we borrow some notation. Building on block matching, Ji et al. [6] proposed a nuclear norm based video denoising algorithm, which decorrelates the redundant information by thresholding the singular values of the SVD. Since there is no good SVD for arrays of dimension higher than 2 [18], a common choice in SVD based methods is to form an \(n\times I_j\) 2-D matrix whose columns are the vectorized patches, i.e.

$$\begin{aligned} \varvec{M}^j=\left[ \varvec{P}^{1j} g,\varvec{P}^{2j} g,\ldots ,\varvec{P}^{I_jj} g\right] , \end{aligned}$$

where \(\varvec{P}^{ij}\in \mathbb {R}^{n\times N},i=1,2,\ldots ,I_j\) are extraction matrices with binary entries and \(\varvec{P}^{ij} g\in \mathbb {R}^{n}\) is the vectorized i-th most similar image block to the patch at j. Locally, the restoration problem can be written as

$$\begin{aligned} \underset{\varvec{X}^j}{\min }\left\{ \frac{1}{2}||\varvec{M}^j-\varvec{X}^j||_F^2+\mu _j||\varvec{X}^j||_{*}\right\} , \end{aligned}$$
(1)

where \(||\cdot ||_{*}\) is the nuclear norm, defined as the sum of the singular values of \(\varvec{X}^j\) [6], and \(\mu _j\) is a regularization parameter. In order to extend this method, we reformulate problem (1) in terms of basis functions and sparse representations. Let the SVD of \(\varvec{M}^j\) be

$$\begin{aligned} \varvec{M}^j =\varvec{U}^j\Sigma _{\varvec{M}^j}(\varvec{V}^j)^{'}, \end{aligned}$$

where \(\Sigma _{\varvec{M}^j}\in \mathbb {R}^{n\times I_j}\) is a diagonal matrix and \(\varvec{U}^j\in \mathbb {R}^{n\times n}, \varvec{V}^j\in \mathbb {R}^{I_j\times I_j}\) are orthogonal matrices. If we choose \(\varvec{U}^j\) and \(\varvec{V}^j\) as bases, then for any \(\varvec{X}^j\in \mathbb {R}^{n\times I_j}\) we can write

$$\begin{aligned} \varvec{X}^j=\varvec{U}^j\Sigma _{\varvec{X}^j}(\varvec{V}^j)^{'}. \end{aligned}$$

Note that the coefficient matrix \(\Sigma _{\varvec{X}^j}\) need not be diagonal. If we require this representation to be sparse under the constraint \(||\varvec{M}^j-\varvec{X}^j||_F^2=c\sigma ^2\), we obtain the Lagrangian version of the problem

$$\begin{aligned} \underset{\varvec{\Sigma }_{\varvec{X}^j}}{\min }\left\{ \frac{1}{2}||\varvec{\Sigma }_{\varvec{M}^j}-\varvec{\Sigma }_{\varvec{X}^j}||_F^2+\mu _j||\varvec{\Sigma }_{\varvec{X}^j}||_1\right\} . \end{aligned}$$
(2)

Here \(||\cdot ||_1\) represents the “entry-wise” \(l_1\) norm, i.e. \(||\varvec{A}||_1=\sum _{ij}|a_{ij}|\) for any matrix \(\varvec{A}=(a_{ij}).\)

It is not difficult to show that the problem (1) is equivalent to the sparsity regularization problem (2).

Proposition 1

Suppose \(\varvec{U}^j,\varvec{V}^j\) are the left and right singular matrices of \(\varvec{M}^j\), i.e. \((\varvec{U}^j)^{'} \varvec{M}^j \varvec{V}^j =\varvec{\Sigma }_{\varvec{M}^j},\) then we have that \(\varvec{\Sigma }_{\varvec{X}^j}^{*}\) is a minimizer of (2) if and only if \((\varvec{X}^j)^*=\varvec{U}^j\varvec{\Sigma }_{\varvec{X}^j}^{*}(\varvec{V}^j)^{'}\) is a minimizer of (1).

Proof

To simplify notation, we omit the superscript and subscript j on all matrices and variables in (1), (2) and Proposition 1. Suppose \(\varvec{\Sigma }_{\varvec{M}}=diag(\{\sigma _i\}_{1\leqslant i\leqslant \min (n,I)})\). Then the minimizer \(\varvec{X}^*\) of (1) is given by a singular value shrinkage operator [7], i.e. \(\varvec{X}^*=\varvec{U} diag(\{\max (0,\sigma _i-\mu )\}) \varvec{V}^{'}\). We only need to prove that \(diag(\{\max (0,\sigma _i-\mu )\})\) is a minimizer of (2). Write \(\varvec{\Sigma }_{\varvec{M}}=((\sigma _M)_{ij})_{n\times I}, \varvec{\Sigma }_{\varvec{X}}=((\sigma _X)_{ij})_{n\times I}\); then \((\sigma _M)_{ij}=\delta _{ij}\sigma _i\), and problem (2) becomes the \(l_1-l_2\) minimization problem

$$\begin{aligned} \underset{(\sigma _X)_{ij}}{\min }\left\{ \sum _{i=1}^n\sum _{j=1}^I\left[ \frac{1}{2}((\sigma _X)_{ij}-(\sigma _M)_{ij})^2+\mu |(\sigma _X)_{ij}|\right] \right\} . \end{aligned}$$

It is well known that the above problem has the closed-form minimizer \((\sigma ^*_X)_{ij}=\frac{(\sigma _M)_{ij}}{|(\sigma _M)_{ij}|}\max (0,|(\sigma _M)_{ij}|-\mu )\). Since \((\sigma _M)_{ij}=\delta _{ij}\sigma _i\geqslant 0\) and \(\mu >0\), we have \((\sigma ^*_X)_{ij}=\max (0,(\sigma _M)_{ij}-\mu )=\delta _{ij}\max (0,\sigma _i-\mu ).\) Thus \(\varvec{\Sigma }^*_{\varvec{X}}=((\sigma ^*_X)_{ij})_{n\times I}=diag(\{\max (0,\sigma _i-\mu )\})\) is a minimizer of problem (2). Recall that problems (1) and (2) are strictly convex, so each has a unique minimizer; thus \(\varvec{X}^*\) and \(\varvec{\Sigma }_{\varvec{X}}^{*}\) are their respective unique minimizers. \(\square \)
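Proposition 1 is easy to check numerically; the following sketch (ours, using the reduced SVD for brevity) verifies the equivalence on a random matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 4))
mu = 0.5

U, s, Vt = np.linalg.svd(M, full_matrices=False)
# Singular value thresholding solves (1) [7]:
X_star = U @ np.diag(np.maximum(s - mu, 0.0)) @ Vt
# Soft-thresholding the transform coefficients solves (2):
Sigma_M = U.T @ M @ Vt.T                        # equals diag(s)
Sigma_X = np.sign(Sigma_M) * np.maximum(np.abs(Sigma_M) - mu, 0.0)
assert np.allclose(X_star, U @ Sigma_X @ Vt)    # Proposition 1 in action
```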

From the above analysis, we see that the nuclear norm minimization problem is closely related to an \(l_1\) minimization in a transformed domain with transformation bases \(\varvec{U}^j\) and \(\varvec{V}^j\). However, we now show that these bases are not always good enough to ensure that \(\varvec{X}^j\) has a sparse representation. For a simple comparison, we let \(\varvec{M}^j\) be the noisy image displayed in Fig. 1b; the restoration produced by the solution of problem (2) with \(\mu _j=2230\) is shown in Fig. 1c. We tested several values of \(\mu _j\) and chose the result with the highest PSNR, 16.26 dB, for comparison. Next, we replace the bases \(\varvec{U}^j, \varvec{V}^j\) with \(\hat{\varvec{U}}^j, \hat{\varvec{V}}^j\), the singular matrices of the clean image displayed in Fig. 1a, i.e. \(\varvec{F}=\hat{\varvec{U}}^j\Sigma _{\varvec{F}}(\hat{\varvec{V}}^j)^{'}.\) We then calculate \(\varvec{\Sigma }_{\varvec{M}^j}=(\hat{\varvec{U}}^j)^{'}\varvec{M}^j\hat{\varvec{V}}^j\) in problem (2), solve the problem again with \(\mu _j=260\), and obtain the result shown in Fig. 1d. There is a large improvement in both PSNR and visual quality in Fig. 1d. The reason is simple: we used better basis vectors for the transformation. In the denoising problem, \(\hat{\varvec{U}}^j, \hat{\varvec{V}}^j\) are not available since the latent clean image \(\varvec{F}\) is unknown. However, this observation inspires us to update the basis functions iteratively. We formulate a local SVD operator based method in the next section.
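The experiment of Fig. 1 can be reproduced schematically in a few lines; the toy image, noise level and thresholds below are placeholders of ours, not the exact settings used for the figure:

```python
import numpy as np

def denoise_in_basis(M, U, V, mu):
    """Solve problem (2): soft-threshold the coefficients of M in the
    basis (U, V), then transform back."""
    Sigma = U.T @ M @ V
    Sigma = np.sign(Sigma) * np.maximum(np.abs(Sigma) - mu, 0.0)
    return U @ Sigma @ V.T

rng = np.random.default_rng(0)
F = np.zeros((128, 128)); F[32:96, 32:96] = 255.0   # toy piecewise-constant image
G = F + 25.0 * rng.standard_normal(F.shape)         # noisy observation

# Basis from the noisy image itself:
Un, _, Vnt = np.linalg.svd(G)
rec_noisy_basis = denoise_in_basis(G, Un, Vnt.T, mu=2230.0)

# Oracle basis from the clean image, as in Fig. 1d:
Uc, _, Vct = np.linalg.svd(F)
rec_clean_basis = denoise_in_basis(G, Uc, Vct.T, mu=260.0)
```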

Fig. 1

Denoising results with different basis functions. a Clean, b Noisy, 8.10 dB, c 16.26 dB, d 27.01 dB

3 Proposed Model

Our method is mainly based on (2). First, we reformulate (2) in terms of a linear operator. For an image \(g\in \mathbb {R}^N\), recall that \(\varvec{M}^j=[\varvec{P}^{1j} g,\varvec{P}^{2j} g,\ldots ,\varvec{P}^{I_jj} g]\) and \( \varvec{\Sigma }_{\varvec{M}^j} =(\varvec{U}^j)' \varvec{M}^j \varvec{V}^j\). Let \((\varvec{V}^j)^{'}=[v_1^j,v_2^j,\ldots ,v_{I_j}^j]\); then

$$\begin{aligned} \varvec{\Sigma }_{\varvec{M}^j}&=(\varvec{U}^j)'\left[ \varvec{P}^{1j} g,\varvec{P}^{2j} g,\ldots ,\varvec{P}^{I_jj} g\right] \left[ v_1^j,v_2^j,\ldots ,v_{I_j}^j\right] ^{'}\\ &=\sum _{i=1}^{I_j}(\varvec{U}^j)'\varvec{P}^{ij}g(v_{i}^j)^{'}, \end{aligned}$$

and obtain

$$\begin{aligned} vec(\varvec{\Sigma }_{\varvec{M}^j})=\sum _{i=1}^{I_j}\left( v_{i}^j\otimes \left( (\varvec{U}^j)'\varvec{P}^{ij}\right) \right) g. \end{aligned}$$

We denote the local SVD operator as

$$\begin{aligned} \varvec{T}^j=\sum _{i=1}^{I_j} v_i^j \otimes ((\varvec{U}^j)^{'}\varvec{P}^{ij}), \end{aligned}$$
(3)

thus we have

$$\begin{aligned} vec(\varvec{\Sigma }_{\varvec{M}^j})=\varvec{T}^jg. \end{aligned}$$
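In implementation, the \(nI_j\times N\) matrix \(\varvec{T}^j\) is never formed explicitly: by the identity \(vec(\varvec{A}\varvec{X}\varvec{B})=(\varvec{B}^{'}\otimes \varvec{A})vec(\varvec{X})\), it can be applied patch-wise. A minimal NumPy sketch (the function names and coordinate bookkeeping are ours; the adjoint follows Proposition 3 below):

```python
import numpy as np

def apply_T(g_img, U, V, coords, n_sqrt):
    """Apply T^j of (3) to an image: extract the group's patches,
    form M^j, and return vec(U' M^j V) = T^j g."""
    M = np.stack([g_img[r:r + n_sqrt, c:c + n_sqrt].ravel()
                  for r, c in coords], axis=1)           # n x I_j
    return (U.T @ M @ V).reshape(-1, order='F')          # vec(): stack columns

def apply_Tt(y, U, V, coords, n_sqrt, img_shape):
    """Adjoint (T^j)' y = sum_i (P^{ij})' U array(y) v_i^j (Proposition 3
    below): transform back and scatter each column to its patch location."""
    I_j = len(coords)
    back = U @ y.reshape(n_sqrt * n_sqrt, I_j, order='F') @ V.T
    out = np.zeros(img_shape)
    for i, (r, c) in enumerate(coords):
        out[r:r + n_sqrt, c:c + n_sqrt] += back[:, i].reshape(n_sqrt, n_sqrt)
    return out
```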

Based on the above analysis, we propose the following general model for denoising:

$$\begin{aligned} \underset{f}{\min } \left\{ \frac{1}{2}\sum _{j=1}^{J}||\varvec{T}^j(f-g)||_2^2+\sum _{j=1}^{J}\mu _{j}||\varvec{T}^j f||_p+\mu TV(f)\right\} , \end{aligned}$$
(4)

where \(\varvec{T}^j\) is the block-matching local SVD operator defined in (3), \(j\in \{1,2,\ldots ,J\}\) indexes pixel locations, and \(||\cdot ||_p\) is the p-norm, where p can be chosen as 0 (low rank) or 1 (nuclear norm). TV is the discrete isotropic TV operator with the discrete expression

$$\begin{aligned} TV(f)=||\nabla f||_{2,1}, \end{aligned}$$

where \(\nabla f=((\varvec{I}\otimes \varvec{D}^1)f,(\varvec{D}^2\otimes \varvec{I})f)\) and \(\varvec{D}^1\), \(\varvec{D}^2\) are 1-D difference matrices in the x- and y-directions, respectively. Here \(||\cdot ||_{2,1}\) has the precise form \(||\varvec{A}||_{2,1}=\sum _{i=1}^N\sqrt{a_{i1}^2+a_{i2}^2 }\) for \(\varvec{A}=(a_{ij})_{N\times 2}\in \mathbb {R}^{N\times 2}\), and \(\mu _j\geqslant 0, \mu \geqslant 0\) are regularization parameters.
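For a small image, the discrete gradient can be assembled literally from these Kronecker products. A sketch with SciPy sparse matrices; the forward differences with a zeroed last row are our boundary assumption:

```python
import numpy as np
from scipy.sparse import eye, kron, diags

def gradient_operators(H, W):
    """Build (I (x) D1) and (D2 (x) I) for a column-stacked H x W image."""
    def diff1d(m):
        D = diags([-np.ones(m), np.ones(m - 1)], [0, 1],
                  shape=(m, m)).tolil()
        D[m - 1, :] = 0.0                      # simple boundary handling
        return D.tocsr()
    Dx = kron(eye(W), diff1d(H)).tocsr()       # I (x) D1
    Dy = kron(diff1d(W), eye(H)).tocsr()       # D2 (x) I
    return Dx, Dy

def tv(f_vec, Dx, Dy):
    """Isotropic TV(f) = ||nabla f||_{2,1}."""
    gx, gy = Dx @ f_vec, Dy @ f_vec
    return float(np.sum(np.sqrt(gx ** 2 + gy ** 2)))
```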

The first term in (4) is a fidelity term in the transform domain, requiring that the transform coefficients of the clean image f be close to those of the noisy image g. The second term is a prior term that controls the sparsity of the transform coefficients of f. Finally, the third term is a spatial-domain constraint that controls the smoothness of the image reconstructed from the small patches.

Theoretically, \(\varvec{U}^j, \varvec{V}^j\) in the local SVD operator \(\varvec{T}^j\) can be any orthogonal matrices. However, as mentioned earlier, inappropriate basis matrices may prevent the image patch groups from having a sparse representation.

For the local SVD operator \(\varvec{T}^j\), we have the following properties:

Proposition 2

\((\varvec{T}^j)^{'}\varvec{T}^j=\sum _{i=1}^{I_j}(\varvec{P}^{ij})^{'}\varvec{P}^{ij},\) and \(\sum _{j=1}^J(\varvec{T}^j)^{'}\varvec{T}^j=\sum _{j=1}^J\sum _{i=1}^{I_j}(\varvec{P}^{ij})^{'}\varvec{P}^{ij}\) is invertible.

Proposition 3

For any \(y\in \mathbb {R}^{nI_j}\), \((\varvec{T}^j)^{'} y=\sum _{i=1}^{I_j}(\varvec{P}^{ij})^{'}\varvec{U}^{j}\,array(y)\,v^j_i.\)

Proposition 4

\(ker(\sum _{j=1}^J(\varvec{T}^j)^{'}\varvec{T}^j)=\{0\}.\)

The proofs of these propositions are straightforward; we omit the details and leave them to the reader.
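For instance, Proposition 2 can be verified numerically by forming \(\varvec{T}^j\) explicitly at toy size (the construction below is ours):

```python
import numpy as np

def dense_T(U, V, P_list):
    """Explicit T^j = sum_i v_i (x) (U' P^{ij}) for a toy-sized problem;
    the columns of V' are the vectors v_i."""
    Vt = V.T
    return sum(np.kron(Vt[:, [i]], U.T @ P) for i, P in enumerate(P_list))

rng = np.random.default_rng(1)
n, I_j, N = 4, 3, 10
U, _ = np.linalg.qr(rng.standard_normal((n, n)))       # orthogonal U^j
V, _ = np.linalg.qr(rng.standard_normal((I_j, I_j)))   # orthogonal V^j
# Random 0/1 extraction matrices P^{ij} (n rows of the N x N identity):
P_list = [np.eye(N)[rng.choice(N, n, replace=False), :] for _ in range(I_j)]

T = dense_T(U, V, P_list)
assert np.allclose(T.T @ T, sum(P.T @ P for P in P_list))  # Proposition 2
```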

Problem (4) is nonsmooth, and solving it directly would often be slow. However, it can be optimized efficiently by popular splitting methods such as the augmented Lagrangian method (ALM) [19], ADMM and split Bregman [20].

4 Algorithm

To solve (4), we introduce auxiliary variables \(\alpha _j\in \mathbb {R}^{nI_j}\) and \(\varvec{\alpha }=[\alpha _1,\alpha _2,\ldots ,\alpha _J]\in \mathbb {R}^{nI_j\times J}\), obtaining the constrained minimization problem

$$\begin{aligned} \begin{array}{l} \underset{\varvec{\alpha },f}{\min } \left\{ \frac{1}{2}\sum _{j=1}^{J}||\alpha _j-\varvec{T}^jg||^2+\sum _{j=1}^{J}\mu _{j}||\alpha _j||_p+\mu TV(f)\right\} ,\\ s.t. ~~\alpha _j=\varvec{T}^j f,\;\; for\;\; j=1,2,\ldots ,J. \end{array} \end{aligned}$$

The standard Augmented Lagrangian method produces the following scheme:

$$\begin{aligned} \left\{ \begin{array}{ll} &(\varvec{\alpha }^k,f^k)=\underset{f,\varvec{\alpha }}{\arg \min } \left\{ \frac{1}{2}\sum _{j=1}^{J}||\alpha _j-\varvec{T}^jg||^2+\sum _{j=1}^{J}\mu _{j}||\alpha _j||_p\right. \\ &\qquad \qquad \qquad \left. +\,\mu TV(f)+\frac{\eta }{2}\sum _{j=1}^J \left| \left| \alpha _j-\varvec{T}^j f-\lambda _j^{k-1}\right| \right| ^2\right\} ,\\ &\lambda _j^{k}=\lambda _j^{k-1}+\delta \left( \varvec{T}^j f^{k}-\alpha _j^k\right) .\\ \end{array} \right. \end{aligned}$$
(5)

By applying alternating minimization, it becomes

$$\begin{aligned} \left\{ \begin{array}{rl} \varvec{\alpha }^k&=\underset{\varvec{\alpha }}{\arg \min } \left\{ \frac{1}{2}\sum _{j=1}^{J}||\alpha _j-\varvec{T}^jg||^2+\sum _{j=1}^{J}\mu _{j}||\alpha _j||_p\right. \\ &\left. \quad +\,\frac{\eta }{2}\sum _{j=1}^J \left| \left| \alpha _j-\varvec{T}^j f^{k-1}-\lambda _j^{k-1}\right| \right| ^2\right\} ,\\ f^k&=\underset{f}{\arg \min } \left\{ \mu TV(f)+\frac{\eta }{2}\sum _{j=1}^J \left| \left| \alpha _j^k-\varvec{T}^j f-\lambda _j^{k-1}\right| \right| ^2\right\} ,\\ \lambda _j^{k}&=\lambda _j^{k-1}+\delta \left( \varvec{T}^j f^{k}-\alpha _j^k\right) ,\quad j=1,2,\ldots ,J. \end{array} \right. \end{aligned}$$
(6)

We note that if the local SVD operator \(\varvec{T}^j\) is fixed as a patch based SVD of g, then \(\varvec{T}^j f^{k-1}\) may not be sparse enough, and consequently \(\varvec{\alpha }\) will not be sparse. To make the transform coefficients \(\varvec{\alpha }\) sparser, we must adjust the local SVD operator \(\varvec{T}^j\). Inspired by the solution of the \(\varvec{\alpha }^k\) subproblem (Propositions 5 and 6), we choose \(\varvec{T}^j\) as follows:

$$\begin{aligned} \varvec{T}^{j,k}=\sum _{i=1}^{I_j} v_i^{j,k} \otimes \left( (\varvec{U}^{j,k})^{'}\varvec{P}^{ij}\right) \end{aligned}$$

Here, the local left-singular matrix \(\varvec{U}^{j,k}\in \mathbb {R}^{n\times n}\) and right-singular matrix \(\varvec{V}^{j,k}\in \mathbb {R}^{I_j\times I_j}\) come from the SVD

$$\begin{aligned} \left[ \varvec{P}^{1j} \hat{g},\varvec{P}^{2j} \hat{g},\ldots ,\varvec{P}^{I_jj} \hat{g}\right]= & {} \varvec{U}^{j,k}\Sigma ^j_{\hat{g}}(\varvec{V}^{j,k})^{'},\\ (\varvec{V}^{j,k})^{'}= & {} \left[ v_1^{j,k},v_2^{j,k},\ldots ,v_{I_j}^{j,k}\right] . \end{aligned}$$

where \(\hat{g}=\frac{g}{1+\eta }+\frac{\eta f^{k-1}}{1+\eta }\). Thus, in our method, the bases \(\varvec{U}^{j,k}\) and \(\varvec{V}^{j,k}\) are updated during the iteration. This updating makes the transform coefficients sparser, yielding improved restoration results. This is different from BM3D, where the basis functions are chosen as fixed DCT or wavelet bases.
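In code, this basis update amounts to one SVD per patch group of \(\hat{g}\); a sketch, with `coords` holding the group's patch positions (names ours):

```python
import numpy as np

def update_basis(g_img, f_prev, coords, n_sqrt, eta):
    """Basis update of (7): take the SVD of the patch group of
    g_hat = g/(1+eta) + eta*f^{k-1}/(1+eta)."""
    g_hat = (g_img + eta * f_prev) / (1.0 + eta)
    M_hat = np.stack([g_hat[r:r + n_sqrt, c:c + n_sqrt].ravel()
                      for r, c in coords], axis=1)        # n x I_j
    U, _, Vt = np.linalg.svd(M_hat, full_matrices=True)   # M_hat = U S V'
    return U, Vt.T                                        # U^{j,k}, V^{j,k}
```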

Therefore, the iteration becomes

$$\begin{aligned} \varvec{T}^{j,k}&=\sum _{i=1}^{I_j} v_i^{j,k} \otimes \left( (\varvec{U}^{j,k})^{'}\varvec{P}^{ij}\right) , \end{aligned}$$
(7)
$$\begin{aligned} \varvec{\alpha }^k&=\underset{\varvec{\alpha }}{\arg \min } \left\{ \frac{1}{2}\sum _{j=1}^{J}||\alpha _j-\varvec{T}^{j,k}g||^2+\sum _{j=1}^{J}\mu _{j}||\alpha _j||_p\right. \\ &\left. \quad +\,\frac{\eta }{2}\sum _{j=1}^J \left| \left| \alpha _j-\varvec{T}^{j,k} f^{k-1}-\lambda _j^{k-1}\right| \right| ^2\right\} , \end{aligned}$$
(8)
$$\begin{aligned} f^k&=\underset{f}{\arg \min } \left\{ \mu TV(f)+\frac{\eta }{2}\sum _{j=1}^J \left| \left| \alpha _j^k-\varvec{T}^{j,k} f-\lambda _j^{k-1}\right| \right| ^2\right\} , \end{aligned}$$
(10)
$$\begin{aligned} \lambda _j^{k}&=\lambda _j^{k-1}+\delta \left( \varvec{T}^{j,k} f^{k}-\alpha _j^k\right) ,\quad j=1,2,\ldots ,J. \end{aligned}$$
(11)

Both minimization subproblems can be solved efficiently. The \(\varvec{\alpha }\) subproblem has a closed-form solution given by the soft and hard thresholding operators for \(p=1\) and \(p=0\), respectively.

Proposition 5

For \(p=1\) in subproblem (8), \(\alpha _j^k=\mathcal {S}(\varvec{T}^{j,k} \hat{g}+\frac{\eta \lambda _j^{k-1}}{1+\eta },\frac{\mu _j}{1+\eta }),\) where \(\mathcal {S}\) is the shrinkage operator \(\mathcal {S}(f,\mu )=\frac{f}{|f|}\max \{|f|-\mu ,0\}.\)

Proposition 6

For \(p=0\) in subproblem (8), then \(\alpha _j^k=\mathcal {H}(\varvec{T}^{j,k} \hat{g}+\frac{\eta \lambda _j^{k-1}}{1+\eta },\sqrt{\frac{2\mu _j}{1+\eta }})\) is a minimizer of subproblem (8), where \(\mathcal {H}\) is a hard thresholding operator and \(\mathcal {H}(f,\mu )=\left\{ \begin{array}{rl} 0,&{}\quad |f|\leqslant \mu ,\\ f,&{}\quad |f|>\mu . \end{array} \right. \)

Remark: subproblem (8) is strictly convex when \(p=1\), so the minimizer is unique. When \(p=0\), however, the subproblem is nonconvex and may have many minimizers, as can be seen from the proof.
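A sketch of the two thresholding operators and of the resulting \(\varvec{\alpha }\) update (the comments restate Propositions 5 and 6; the names are ours):

```python
import numpy as np

def soft_threshold(x, mu):
    """Shrinkage operator S of Proposition 5 (applied elementwise)."""
    return np.sign(x) * np.maximum(np.abs(x) - mu, 0.0)

def hard_threshold(x, mu):
    """Hard thresholding operator H of Proposition 6 (applied elementwise)."""
    return np.where(np.abs(x) > mu, x, 0.0)

# alpha_j update for one group, following Propositions 5 and 6,
# with y_j = T^{j,k} g_hat + eta * lambda_j^{k-1} / (1 + eta):
#   p = 1:  alpha_j = soft_threshold(y_j, mu_j / (1 + eta))
#   p = 0:  alpha_j = hard_threshold(y_j, np.sqrt(2 * mu_j / (1 + eta)))
```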

The f subproblem is a ROF model [1] with local SVD operators. It can be solved quickly by split Bregman iteration [20]. We list the iteration scheme below:

$$\begin{aligned} \left\{ \begin{array}{ll} &(f^{k,l},\varvec{q}^{l})=\underset{f,\varvec{q}}{\arg \min } \left\{ \begin{array}{l} \frac{\eta }{2}\sum _{j=1}^J \left| \left| \alpha _j^k-\varvec{T}^{j,k} f-\lambda _j^{k-1}\right| \right| ^2\\ \quad +\,\frac{\eta _1}{2}\left| \left| q_1-(\varvec{I}\otimes \varvec{D}^1)f- b_1^{l-1}\right| \right| ^2\\ \quad +\,\frac{\eta _1}{2}\left| \left| q_2-(\varvec{D}^2\otimes \varvec{I})f- b_2^{l-1}\right| \right| ^2\\ \quad +\,\mu ||\varvec{q}||_2\\ \end{array} \right\} ,\\ &b_1^{l+1}=b_1^{l}+(\varvec{I}\otimes \varvec{D}^1)f^{k,l}-q_1^{l},\\ &b_2^{l+1}=b_2^{l}+(\varvec{D}^2\otimes \varvec{I})f^{k,l}-q_2^{l}.\\ \end{array} \right. \end{aligned}$$

By applying the alternating algorithm again, we have the TV inner iteration:

$$\begin{aligned} \left\{ \begin{array}{l} \left( \eta \sum _{j=1}^{J} (\varvec{T}^{j,k})^{'}\varvec{T}^{j,k}+\eta _1 \varvec{\triangle }\right) f^{k,l}=h^{k,l},\\ q_1^{l}=\mathcal {S}_1\left( (\varvec{I}\otimes \varvec{D}^1)f^{k,l}+ b_1^{l-1},\;(\varvec{D}^2\otimes \varvec{I})f^{k,l}+ b_2^{l-1},\;\frac{\mu }{\eta _1}\right) ,\\ q_2^{l}=\mathcal {S}_1\left( (\varvec{D}^2\otimes \varvec{I})f^{k,l}+ b_2^{l-1},\;(\varvec{I}\otimes \varvec{D}^1)f^{k,l}+ b_1^{l-1},\;\frac{\mu }{\eta _1}\right) ,\\ b_1^{l+1}=b_1^{l}+(\varvec{I}\otimes \varvec{D}^1)f^{k,l}-q_1^{l},\\ b_2^{l+1}=b_2^{l}+(\varvec{D}^2\otimes \varvec{I})f^{k,l}-q_2^{l},\\ \end{array} \right. \end{aligned}$$
(12)

where \(\varvec{\triangle }=\varvec{I}\otimes ((\varvec{D}^1)^{'}\varvec{D}^1)+((\varvec{D}^2)^{'}\varvec{D}^2)\otimes \varvec{I}\) is the discrete Laplacian matrix, and

$$\begin{aligned} h^{k,l}=\eta \sum _{j=1}^{J} (\varvec{T}^{j,k})^{'}\left( \alpha _j^k-\lambda _j^{k-1}\right) +\eta _1\left( \left( \varvec{I}\otimes (\varvec{D}^1)^{'}\right) \left( q_1^{l-1}-b_1^{l-1}\right) +\left( (\varvec{D}^2)^{'}\otimes \varvec{I}\right) \left( q_2^{l-1}-b_2^{l-1}\right) \right) , \end{aligned}$$

\(\mathcal {S}_1\) is an isotropic soft thresholding operator which has the expression

$$\begin{aligned} \mathcal {S}_1(x,y,\mu )=\frac{x}{\sqrt{x^2+y^2}}\max \left\{ \sqrt{x^2+y^2}-\mu ,0\right\} . \end{aligned}$$

The linear equation for \(f^{k,l}\) in (12) can be solved efficiently by Gauss–Seidel iteration, since \(\sum _{j=1}^{J} (\varvec{T}^{j,k})^{'}\varvec{T}^{j,k}=\sum _{j=1}^J\sum _{i=1}^{I_j}(\varvec{P}^{ij})^{'}\varvec{P}^{ij}\) is an invertible diagonal matrix.
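One pass of the inner iteration (12) might look as follows; this sketch substitutes a direct sparse solve for the Gauss–Seidel sweep, and all names are ours:

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import spsolve

def shrink_iso(x, y, mu):
    """Isotropic soft thresholding S_1 (returns the shrunk first argument)."""
    norm = np.sqrt(x ** 2 + y ** 2)
    return np.maximum(norm - mu, 0.0) / np.maximum(norm, 1e-12) * x

def tv_inner_iteration(q1, q2, b1, b2, Dx, Dy, data_term, diag_TtT,
                       eta, eta1, mu):
    """One pass of (12). `data_term` stands for
    eta * sum_j (T^{j,k})'(alpha_j^k - lambda_j^{k-1}) as a vector, and
    `diag_TtT` for the diagonal of sum_j (T^{j,k})'T^{j,k} (Proposition 2)."""
    A = diags(eta * diag_TtT) + eta1 * (Dx.T @ Dx + Dy.T @ Dy)
    h = data_term + eta1 * (Dx.T @ (q1 - b1) + Dy.T @ (q2 - b2))
    f = spsolve(A.tocsr(), h)                 # in place of Gauss-Seidel
    gx, gy = Dx @ f + b1, Dy @ f + b2
    q1 = shrink_iso(gx, gy, mu / eta1)
    q2 = shrink_iso(gy, gx, mu / eta1)
    b1, b2 = gx - q1, gy - q2                 # Bregman updates for b_1, b_2
    return f, q1, q2, b1, b2
```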

We summarize the proposed BMLSVDTV method in Algorithm 1.

Algorithm 1

BMLSVDTV denoising algorithm

Set the initial \(f^0=g\) and the regularization parameters \(\eta , \eta _1,\mu , \mu _j\); for \(k=1,2,\ldots \)

Step 1:

  Block matching: for each image block of \(\hat{g}=\frac{g}{1+\eta }+\frac{\eta f^{k-1}}{1+\eta }\) at j, find the \(I_j\) most similar image blocks (a brute-force sketch of this step is given after the algorithm). This is equivalent to obtaining the extraction matrices \(\varvec{P}^{ij}, j=1,2,\ldots ,J,~i=1,2,\ldots ,I_j.\)

Step 2:

  Basis updating: Get the local SVD transform operator \(\varvec{T}^{j,k}\) by (7).

Step 3:

  Sparsity regularization: compute \(\varvec{\alpha }^k\) with the soft or hard thresholding operator via Proposition 5 or 6.

Step 4:

  TV regularization: solve the TV subproblem (10) with several Bregman iterations (12) to obtain \(f^k\). If \(f^k\) satisfies the stopping criterion \(\frac{||f^k-f^{k-1}||}{||f^{k-1}||}<\epsilon \) or the maximum iteration number is reached, stop and output the restoration; otherwise, go to the next step.

Step 5:

   Lagrangian multiplier updating: calculate (11) and go to Step 1.
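A brute-force sketch of the block matching in Step 1 (function name, search radius and window handling are ours):

```python
import numpy as np

def block_match(img, row, col, n_sqrt, I_j, search_rad=19):
    """Find the I_j patches most similar (in squared Euclidean distance)
    to the reference patch at (row, col), searching a local window."""
    ref = img[row:row + n_sqrt, col:col + n_sqrt]
    H, W = img.shape
    scored = []
    for r in range(max(0, row - search_rad),
                   min(H - n_sqrt, row + search_rad) + 1):
        for c in range(max(0, col - search_rad),
                       min(W - n_sqrt, col + search_rad) + 1):
            patch = img[r:r + n_sqrt, c:c + n_sqrt]
            scored.append((np.sum((patch - ref) ** 2), r, c))
    scored.sort(key=lambda t: t[0])            # reference itself comes first
    coords = [(r, c) for _, r, c in scored[:I_j]]
    # Columns of M^j: the vectorized members of the group.
    M_j = np.stack([img[r:r + n_sqrt, c:c + n_sqrt].ravel()
                    for r, c in coords], axis=1)
    return M_j, coords
```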

5 Convergence Analysis

If we set \(p=1\), the proposed model (4) is strictly convex, since the first term is strictly convex by Proposition 2. With a fixed basis \(\varvec{T}^j\), we can show the convergence of the iteration schemes (5) and (6) as follows.

Theorem 1

For \(p=1\), suppose \(f^*\) is the minimizer of problem (4) and let \(0<\delta <2\). Then the sequence \(\{f^k\}\) produced by the iteration scheme (5) converges and \(\lim _{k\rightarrow \infty }f^k=f^*.\)

Proof

The proof follows from standard arguments for the augmented Lagrangian method with operator splitting, as in [21,22,23]. \(\square \)

The difference between schemes (5) and (6) is that in the alternating scheme (6), \(\alpha ^k\) is updated from the given \(f^{k-1}\), while in (5) \(\alpha ^k\) and \(f^k\) are updated simultaneously. We have a similar convergence result for the alternating scheme:

Theorem 2

For \(p=1\), suppose \(f^*\) is the minimizer of problem (4) and let \(\delta =1\). Then the sequence \(\{f^k\}\) produced by the iteration scheme (6) converges and \(\lim _{k\rightarrow \infty } f^k=f^*.\)

Proof

The proof is very similar to that of Theorem 1. \(\square \)

Similar convergence results can be found in [24] for the split Bregman method, in [25] for Douglas–Rachford splitting and in [23] for the augmented Lagrangian method.

Let us point out that if the basis \(\varvec{T}^j\) is updated, the problem becomes nonconvex, and we have not yet established its convergence. However, numerical experiments show that better restorations are obtained when \(\varvec{T}^j\) is updated; thus, in practical computations, we update the bases.

6 Experiments

6.1 Implementation Details

Let us mention that both the sparsity regularization and the basis updating in the proposed algorithm can be implemented block by block, thanks to the linear structure of the energy.

There are several parameters in the proposed method, and generally speaking the reconstructed results are affected by them. For all cases, we set the penalty parameter \(\eta =9\) and the time step \(\delta =0.01\). The image block size \(\sqrt{n}\) is set empirically according to the noise level: for \(\sigma =10,20,30,50,75,100\), we set \(\sqrt{n}=6,6,7,8,8,9\), respectively. For ease of computation, we take the group size \(I_j\) to be constant, simply letting \(I_j=40,40,50,60,70,90\) for noise with \(\sigma =10,20,30,50,75,100\). The TV regularization parameter is set to \(\mu =0.1\) when \(\sigma \leqslant 20\) and \(\mu =0.2\) when \(\sigma >20\). As for the sparsity regularization parameters \(\mu _j\), we let \(\mu _j=n\sigma _{local}^2(1+\eta )\) when \(p=0\); here the local noise variance \(\sigma _{local}^2\) can be determined by a simple local noise estimation technique [26, 27]. As for the sparse norm p, we tested several images under different noise levels, and the \(l_0\) norm produced better results than \(l_1\); thus, in the following experiments, we only list the results produced by \(l_0\).
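For reference, the parameter settings above can be collected in a small lookup table (a convenience of ours, not part of the method):

```python
# Noise level sigma -> (patch side sqrt(n), group size I_j), from Sect. 6.1.
PARAMS = {10: (6, 40), 20: (6, 40), 30: (7, 50),
          50: (8, 60), 75: (8, 70), 100: (9, 90)}
ETA, DELTA = 9.0, 0.01

def tv_weight(sigma):                               # mu in (4)
    return 0.1 if sigma <= 20 else 0.2

def sparsity_weight(n, sigma_local_sq, eta=ETA):    # mu_j for p = 0
    return n * sigma_local_sq * (1.0 + eta)
```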

Table 1 The PSNR values of different methods under several levels of Gaussian noise

6.2 Numerical Results

We use 8 images to test the performance of the proposed algorithm. To demonstrate its denoising results, the closely related TV, KSVD and BM3D methods are used for comparison. The PSNR values for the four methods can be found in Table 1. It is well known that denoising results may be affected by the parameters of these regularization methods. For KSVD and BM3D, we use the codes provided by the authors and adopt the suggested parameters. For TV, we test several regularization parameters under each noise level for one image ('Monarch') and take the results with the highest PSNR for comparison; we then use the same parameters for the other 7 images. One can see that the proposed method outperforms the other methods in almost all cases. Compared to BM3D, a state-of-the-art denoising method, the PSNR is improved by about 0.35 dB on average (Fig. 2).

Fig. 2

The 8 test images. From left to right and top to bottom: Peppers, Monarch, Cameraman, Square, House, Barbara, Boat and Hill

Fig. 3

The comparison of the performance on the geometry structure image ’Square’ contaminated by the low level Gaussian noise with standard deviation \(\sigma =10\). For the details in the blue and red square areas, please see Fig. 4. a Clean, b Noisy, 28.10 dB, c TV, 39.55 dB, d KSVD, 42.76 dB, e BM3D, 44.45 dB, f Proposed, 47.17 dB (Color figure online)

Fig. 4

The local details of the blue and red square areas in Fig. 3. a Clean, b Noisy, c TV, d KSVD, e BM3D, f Proposed, g Clean, h Noisy, i TV, j KSVD, k BM3D, l Proposed

Fig. 5

The comparison of the performance on the geometry structure image 'Square' contaminated by the high level Gaussian noise with standard deviation \(\sigma =100\). For the details in the blue and red square areas, please see Fig. 6. a Clean, b Noisy, 8.10 dB, c TV, 27.15 dB, d KSVD, 25.91 dB, e BM3D, 28.38 dB, f Proposed, 29.11 dB (Color figure online)

Fig. 6

The local details of the blue and red square areas in Fig. 5. a Clean, b Noisy, c TV, d KSVD, e BM3D, f Proposed, g Clean, h Noisy, i TV, j KSVD, k BM3D, l Proposed

Fig. 7

The first column of the final basis functions \(\varvec{U}\) in the proposed method

Fig. 8

The performance comparison on the texture structure image ’Barbara’ contaminated by Gaussian noise with standard deviation \(\sigma =100\). For the details in the blue and red square areas, please see Fig. 9. a Clean, b Noisy, 8.14 dB, c TV, 21.75 dB, d KSVD, 21.87 dB, e BM3D, 23.62 dB, f Proposed, 24.28 dB (Color figure online)

Fig. 9

The local details of the blue and red square areas in Fig. 8. a Clean, b Noisy, c TV, d KSVD, e BM3D, f Proposed, g Clean, h Noisy, i TV, j KSVD, k BM3D, l Proposed

Fig. 10

The removed noise in Fig. 8 by the four methods. a TV, b KSVD, c BM3D, d Proposed

As to the visual effects, our method reduces the artificial ringing caused by stacking image blocks in BM3D, thanks to the TV term in the proposed model. Also, similar to KSVD, the basis functions in our method are adaptive, which makes the sparse representation better than with fixed bases; our model can therefore preserve texture better. We take 2 example images: the simple image 'Square', which is almost piecewise constant with strong geometric structures, and 'Barbara', which contains much repeated texture. To show the performance under different noise levels, for the 'Square' image we show results under low level noise with standard deviation \(\sigma =10\) and high level noise with \(\sigma =100\); for 'Barbara', we only show the image contaminated by heavy Gaussian noise with \(\sigma =100\). Figures 3, 5 and 8 display the restored results produced by the four methods. TV (Fig. 3c) keeps the strong edges well under low level noise, but produces some false edges under heavy noise (see Figs. 6c and 9c); moreover, the repeated texture details are almost entirely removed due to its weak texture preservation, see Figs. 8c and 9i for examples. KSVD and BM3D perform better on texture restoration. However, in smooth areas, BM3D may produce ringing artifacts, as displayed in Figs. 6e, k and 9e. These artifacts are well controlled in the proposed method by the TV regularization; see the enlarged areas in Figs. 6f, l and 9f, whose restorations contain the cleanest strong edges and smooth areas. Moreover, our model improves texture preservation by basis updating; compare Fig. 9k and l, where the restored textures in Fig. 9l are cleaner than those in Fig. 9k. For further comparison, Fig. 10 shows the noise removed from Fig. 8 by the four methods: the noise removed by both BM3D and the proposed method contains little image information, and ours contains less than BM3D's.

As mentioned earlier, the basis functions \(\varvec{U}^{j,k},\varvec{V}^{j,k}\) in the proposed method are adaptive. Thus, similar to KSVD, we can show the basis functions for each image. However, the number of basis functions \(\varvec{U}^{j,k}=(u^{j,k}_{1},u^{j,k}_{2},\ldots ,u^{j,k}_{n})\) in our method is very large, and we cannot show all of them. From the SVD, the left singular vector \(u^{j,k}_{1}\) associated with the largest singular value plays the most important role in the sparse representation, so we show only this vector for each image block group. For the \(256\times 256\) image shown in Fig. 5, using the sliding window technique adopted by KSVD and BM3D with a step of 3 pixels, we get \(J=86\times 86\) image patches of size \(9\times 9\); for each \(9\times 9\) patch and its similar block group, we display \(u^{j,k}_{1}\in \mathbb {R}^{81}, j=1,2,\ldots ,J\) in Fig. 7. In this figure, each red image block is a \(9\times 9\) array of \(u^{j,k}_{1}\), and the total number of image blocks is \(86\times 86\). From these basis functions, one can see that near strong edges the basis vectors are almost binary and can represent the discontinuous edges well, while around tiny edges or in smooth areas the basis functions oscillate and can represent textures well. This is totally different from BM3D, which employs a fixed dictionary such as the DCT or wavelets.

7 Conclusion and Discussion

We have proposed a sparsity and TV regularization method for image denoising based on local SVD operators. Sparsity and TV are naturally unified in one variational energy and produce some impressive restoration results. The local SVD basis functions improve the texture recovery ability, and the global TV reduces artificial ringing effects in the restoration. However, the computational cost is heavy due to the block matching and local SVD. Generally speaking, with our unoptimized Matlab code on a PC with a 3.2 GHz CPU, for a \(256\times 256\) image, each outer iteration takes about 16 s for the block matching step and about 20 s for the basis updating and sparsity regularization, while the TV step is fast, taking less than 0.3 s. The efficiency of our code can be greatly improved by optimization, since the same block matching step in BM3D costs less than 1 s; it can be further improved by parallel processing on a GPU. We do not discuss implementation efficiency in this paper.

The proposed method can easily be extended to image deblurring, inpainting and even segmentation. Due to limited space, we could not include these extensions here; we will consider them in an upcoming paper.