
1 Introduction

Image restoration is considered one of the crucial ingredients of medical image analysis systems. Possible sources of noise include various parameters of the acquisition process such as flip angle, scan time, coil resistance, dielectric and inductive losses in the sample, and patient movement [12]. MRI, being a non-invasive technique, offers many advantages in clinical analysis, but the disturbances or noise induced in the acquisition process degrade the quality of the signal. In the MR image denoising problem, the noise model is found to be Rician in nature, which differs from commonly used distributions such as Gaussian and Poisson [8].

It has been shown that the intensities of MR images represent the magnitude of underlying complex data, which follows a Rice distribution [7]. The real and imaginary parts are modeled as independent Gaussian variables with means \(a_r\) and \(a_i\) respectively and the same variance \(\sigma ^2\). The probability density function (pdf) of a Rician random variable \(y\) is defined as follows:

$$\begin{aligned} f_Y(y|a,\sigma )=\frac{y}{\sigma ^2}e^{\left( -\frac{y^2+a^2}{2\sigma ^2} \right) }I_0\left( \frac{ya}{\sigma ^2}\right) , y>0 \end{aligned}$$
(1)

where \(a=\sqrt{a_r^2+a_i^2}\) is the underlying noise-free signal amplitude and \(I_n(z)\) is the \(n^{th}\) order modified Bessel function of the first kind. When the Signal to Noise Ratio (SNR, here \(a/\sigma \)) is high, the Rician distribution approaches a Gaussian; when the SNR approaches zero (that is, only noise is present, \(a\rightarrow 0\)), the Rician distribution becomes a Rayleigh distribution and the pdf turns out to be

$$\begin{aligned} f_Y(y|a\rightarrow 0,\sigma )=\frac{y}{\sigma ^2}e^{\left( -\frac{y^2}{2\sigma ^2} \right) } \end{aligned}$$
(2)
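The two limiting cases can be checked numerically. The following is a minimal sketch, assuming NumPy is available: it simulates Rician data as the magnitude of a complex signal with independent Gaussian noise in both channels. The function name `add_rician_noise`, the sample size, the amplitude of 150 and the seed are illustrative assumptions. Placing the whole amplitude in the real channel (\(a_r=a\), \(a_i=0\)) loses no generality, since the magnitude distribution depends only on \(a\).

```python
import numpy as np


def add_rician_noise(amplitude, sigma, rng):
    """Magnitude of complex data with independent N(0, sigma^2) noise per channel."""
    amplitude = np.asarray(amplitude, dtype=float)
    # Assumption for illustration: a_r = amplitude, a_i = 0.
    real = amplitude + rng.normal(0.0, sigma, amplitude.shape)
    imag = rng.normal(0.0, sigma, amplitude.shape)
    return np.hypot(real, imag)


rng = np.random.default_rng(0)
sigma = 15.0

# a = 0: only noise is present, so the samples follow a Rayleigh distribution.
background = add_rician_noise(np.zeros(200_000), sigma, rng)
# a / sigma = 10: high SNR, so the samples are close to Gaussian around a.
tissue = add_rician_noise(np.full(200_000, 150.0), sigma, rng)

rayleigh_mean = sigma * np.sqrt(np.pi / 2.0)  # theoretical mean of Eq. 2
print("background mean:", background.mean(), "Rayleigh mean:", rayleigh_mean)
print("tissue mean:", tissue.mean(), "tissue std:", tissue.std())
```

In the background the sample mean should sit near \(\sigma \sqrt{\pi /2}\) rather than zero, which is the bias that makes Rician noise harder to handle than additive Gaussian noise.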

Hence, conventional methods for Rician noise removal first try to find the background portion of the medical image, where no signal is assumed. One can then use the Rayleigh distribution in the background portion and the Gaussian distribution in the rest (where the SNR is assumed to be high enough) [9, 16]. However, under noisy conditions it is difficult to find a proper background in the image.

Recent methods use the principle of non-local self-similarity for the image restoration task. The first step finds patches in the image that are similar, in terms of a predefined criterion such as Euclidean distance, to a given reference patch [1]. Thereafter, an orthonormal basis is inferred for each patch, the patch is projected onto that basis, and shrinkage is performed on the resulting coefficients, which are sparse in nature as described in [4, 6, 14].

Of the recently proposed techniques, BM3D [4] is the most popular. BM3D creates a 3D stack of similar patches, projects it onto a 3D basis (the tensor product of 2D-DCT and 1D-Haar), and performs hard thresholding of the coefficients followed by basis inversion, thereby allowing a coupled update of the coefficients [4]. Another class of methods, such as [5, 13], first clusters similar patches and then learns a basis for each cluster instead of searching for similar patches for each reference patch. However, due to the nature of the noise, straightforward application of natural image denoising methods has not been advocated for medical images. The Non-Local Means (NLM) method has been extended to the medical image denoising problem in [11], where bias correction needs to be considered. BM3D has been extended by applying a suitable invertible transformation that maps the medical data into another domain in which the data behave as if Gaussian distributed. The best-known transformation of this kind is Anscombe's transformation, an instance of the Variance Stabilization Technique (VST). Recently, a VST for Rician distributed data has been proposed in [7], and BM3D combined with it is referred to as the BM3D+VST method. The BM3D+VST method can be summarized mathematically as follows:

$$\begin{aligned} \hat{y} = VST^{-1}(BM3D(VST(z,\sigma ),\sigma _{VST}),\sigma ) \end{aligned}$$
(3)

where \(VST^{-1}\) denotes the inverse VST, \(\sigma _{VST}\) is the stabilized standard deviation induced by the VST, \(z\) denotes the noisy Rician observation, and \(\hat{y}\) is the estimate of its true intensity \(y\). BM3D+VST has also been extended to 3D medical data as the BM4D method in [10]. This manuscript focuses on 2D data denoising methods only.

The aim of this article is to explore a direct technique that handles Rician noise suitably, giving noise removal as good as BM3D+VST, if not better. We have extended the PCA based method using Rough Set based clustering proposed in [13] to the Rician noise model, including bias term correction; this method is referred to as ER-PCA in the paper. We also propose a new Kernel PCA (KPCA) based method for Rician noise, which adopts the clustering strategy used in [13] and is therefore a non-local approach in the true sense. To the best of our knowledge, KPCA has not yet been applied to Rician noise removal in medical images. Kernel based methods can capture non-linearity of the data in Feature Space and have recently been used in medical imaging in [2, 15, 19]. However, the choice of an appropriate kernel for given data remains an open question. In the current proposal, the Gaussian kernel is used, and the performance of the noise removal technique is on par with the state-of-the-art methods.

The paper is organized as follows: Sect. 2 presents the proposed method using KPCA. Section 3 compares the proposed method with other state-of-the-art methods. The manuscript is concluded in Sect. 4.

2 Proposed Method Using KPCA

A non-parametric variant of PCA, known as Kernel Principal Component Analysis (KPCA), has been explored for Rician noise removal. KPCA explores structure in the data in Feature Space instead of Image Space and tries to capture higher-order dependencies in the data. In Fig. 1, two-class data arranged in circular form is transformed to a higher dimension for classification, where the transformation is \(\phi (x) : (x_1,x_2) \rightarrow (x_1,x_2,x^2_1+x^2_2)\). Hence, one can find a discriminating plane (linear surface) in the higher-dimensional space, which is not possible in two dimensions for the given data points.
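The effect of this lifting map can be illustrated with a small sketch, assuming NumPy. The radii (1 and 3), the number of points and the threshold value are hypothetical choices made for illustration, and `lift` is an illustrative name for \(\phi \).

```python
import numpy as np


def lift(points):
    """phi(x1, x2) = (x1, x2, x1^2 + x2^2) applied to each row."""
    points = np.asarray(points, dtype=float)
    return np.column_stack([points, np.sum(points**2, axis=1)])


rng = np.random.default_rng(1)
angles = rng.uniform(0.0, 2.0 * np.pi, 100)
circle = np.column_stack([np.cos(angles), np.sin(angles)])

inner = 1.0 * circle  # class 1: circle of radius 1 (hypothetical)
outer = 3.0 * circle  # class 2: circle of radius 3 (hypothetical)

lifted_inner = lift(inner)
lifted_outer = lift(outer)

# In the lifted space the plane x3 = threshold separates the classes;
# any threshold between 1^2 and 3^2 works.
threshold = 5.0
print(np.all(lifted_inner[:, 2] < threshold), np.all(lifted_outer[:, 2] > threshold))
```

No straight line in the original plane separates the two circles, whereas a single threshold on the third coordinate does so after lifting.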

Fig. 1. Transformation of two circular data sets into higher dimension space using kernel method where separation between them is more prominent and can be classified using linear hyper-surface.

Fig. 2. Reconstruction using PCA and KPCA over synthetic data with Rician noise. (a) Synthetic Data, (b) Rician Noisy Data, (c) Reconstruction using PCA and (d) Reconstruction using KPCA.

In KPCA, this nonlinearity is introduced by first mapping the data into another space \(F\) using a nonlinear map \(\phi : \mathbb {R}^N \rightarrow F\), before standard linear PCA is carried out in \(F\) using the mapped samples \(\phi (x_k)\). The map \(\phi \) and the space \(F\) are determined implicitly by the choice of a kernel function \(k\), which acts as a similarity measure. The kernel computes the dot product between two input samples \(x\) and \(y\) mapped into \(F\):

$$\begin{aligned} k(x, y) = \phi (x) \cdot \phi (y) \end{aligned}$$
(4)

One can show that if \(k\) is a positive definite kernel, then there exists a map \(\phi \) into a dot product space \(F\) such that Eq. 4 holds. The space \(F\) then has the structure of a so-called Reproducing Kernel Hilbert Space (RKHS) [2].
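Eq. 4 can be verified directly for a kernel whose feature map is known in closed form. The sketch below, assuming NumPy, uses the homogeneous polynomial kernel of degree 2 on two-dimensional inputs, for which \(\phi (x) = (x_1^2, \sqrt{2}x_1x_2, x_2^2)\). This kernel and the two sample vectors are chosen only for illustration and are not the kernel used in the experiments.

```python
import numpy as np


def poly2_kernel(x, y):
    """k(x, y) = (x . y)^2, evaluated in the input space."""
    return float(np.dot(x, y)) ** 2


def poly2_feature_map(x):
    """Explicit feature map for the degree-2 homogeneous polynomial kernel in 2D."""
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])


x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

lhs = poly2_kernel(x, y)                                         # kernel evaluation
rhs = float(np.dot(poly2_feature_map(x), poly2_feature_map(y)))  # phi(x) . phi(y)
print(lhs, rhs)
```

Both sides agree up to floating-point rounding, although the left-hand side never forms the three-dimensional vectors.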

The identity in Eq. 4 is important for KPCA since PCA in \(F\) can be formulated entirely in terms of inner products of the mapped samples. Thus, we can replace all inner products by evaluations of the kernel function. This has two important consequences. First, inner products in \(F\) can be evaluated without computing \(\phi (x)\) explicitly, which allows working with a very high-dimensional, possibly infinite-dimensional, RKHS \(F\). Second, if a positive definite kernel function is specified, we need to know neither \(\phi \) nor \(F\) explicitly to perform KPCA, since only inner products are used in the computations. Commonly used positive definite kernel functions are the polynomial kernels of degree \(d \in \mathbb {N}\), \(k(\mathbf x ,\mathbf y )=(\mathbf x \cdot \mathbf y )^d\) or \(k(\mathbf x ,\mathbf y )=(\mathbf x \cdot \mathbf y +1)^d\), and the Gaussian kernel of width \(\sigma > 0\), \(k(\mathbf x ,\mathbf y ) = \exp \left( -\left\| \mathbf x -\mathbf y \right\| ^2 /2\sigma ^2 \right) \). In all the experiments, the Gaussian kernel has been used, which is isotropic and stationary in nature and also satisfies Mercer’s Theorem [19].
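As a minimal sketch of how PCA in \(F\) reduces to operations on kernel evaluations, the following code builds a Gaussian Gram matrix, centres it in feature space and takes its eigendecomposition. It assumes NumPy; the function names, the random sample matrix (25 samples of dimension 40) and the width value are illustrative assumptions. The centring step is the usual KPCA preprocessing and is assumed here, since the text does not state it explicitly.

```python
import numpy as np


def gaussian_gram(samples, width):
    """K[i, j] = exp(-||x_i - x_j||^2 / (2 * width^2)) for rows x_i of samples."""
    sq_norms = np.sum(samples**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * samples @ samples.T
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * width**2))


def center_gram(K):
    """Gram matrix of the mapped samples after subtracting their mean in F."""
    n = K.shape[0]
    ones = np.full((n, n), 1.0 / n)
    return K - ones @ K - K @ ones + ones @ K @ ones


rng = np.random.default_rng(2)
samples = rng.normal(size=(25, 40))  # hypothetical data: 25 samples, 40 features

K = gaussian_gram(samples, width=5.0)
Kc = center_gram(K)

# eigh returns eigenvalues in ascending order; reorder so the leading
# kernel principal components come first.
eigvals, eigvecs = np.linalg.eigh(Kc)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
print(eigvals[:5])
```

Only the Gram matrix enters the computation, so neither \(\phi \) nor \(F\) is ever constructed.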

A synthetic experiment has been performed, as shown in Fig. 2, where Rician noise is added to the synthetic data. KPCA (with Gaussian kernel) is able to preserve the orientation of the data better than PCA based reconstruction.

Fig. 3. Difference comparison of KPCA with reference to BM3D+VST method (at zero level vertically) for 50 slices for noise standard deviation equal to 15 (a) T1 images with PSNR difference values, (b) T1 images with MSSIM difference values, (c) T2 images with PSNR difference values and (d) T2 images with MSSIM difference values.

The outline of the present work can be described as follows:

  1. Get the clusters of patches from the given noisy image using the Rough Set based method (as described in [13]).

  2. For each cluster, obtain the basis vectors using the KPCA method along pixel positions. For patches of size \(p \times p\), the kernel matrix is of size \(p^2 \times p^2\). Since the basis is learned from the patches of each cluster, the method is data adaptive in nature.

  3. Project the noisy image patches on the obtained basis vectors in the KPCA domain.

  4. Apply a coefficient shrinkage method on these projected patches to get the denoised patches. Transform them back to image space.

  5. Remove the bias term from each pixel of the denoised image:

    $$\begin{aligned} I_{unbiased}(i,j) = \sqrt{\max \left( \hat{I}(i,j)^2-2h^2,\,0\right) } \end{aligned}$$
    (5)

    where \(h\) is the standard deviation of noise and \(\hat{I}\) is the image obtained by step (4).
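The bias correction of Eq. 5 follows from the second moment of a Rician variable, \(E[y^2] = a^2 + 2\sigma ^2\): subtracting \(2h^2\) from the squared intensity removes the noise contribution, and the clamp at zero keeps the square root real. A minimal sketch of this step, assuming NumPy, is given below; the function name `remove_rician_bias` and the small test image are illustrative assumptions.

```python
import numpy as np


def remove_rician_bias(denoised, h):
    """Eq. 5: sqrt(max(I^2 - 2 h^2, 0)) applied to every pixel."""
    denoised = np.asarray(denoised, dtype=float)
    return np.sqrt(np.maximum(denoised**2 - 2.0 * h**2, 0.0))


h = 15.0  # noise standard deviation
denoised = np.array([[10.0, 30.0],
                     [50.0, 150.0]])  # hypothetical output of step (4)

unbiased = remove_rician_bias(denoised, h)
print(unbiased)
```

Pixels whose squared intensity is below \(2h^2\), typically background pixels, are set to zero, while high-intensity pixels are only slightly reduced.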

Fig. 4. (a) Synthetic Noisy T1 Image with Rician noise standard deviation = 15 and PSNR = 22.7220 dB, Denoised image using (b) UNLM method, PSNR = 34.4622 dB, (c) BM3D+VST method, PSNR = 34.2393 dB, (d) RS-NLM method, PSNR = 32.5856 dB, (e) ER-PCA method, PSNR = 33.8155 dB, (f) KPCA method, PSNR = 34.0241 dB.

Table 1. Performance comparison of proposed denoising strategy with different approaches on various quantitative measures under Rician Noise assumption in Brain Web database (slice = 70 & 100, Modality = T1, image size \(=181 \times 217\) and patch size \(= 5 \times 5\)). Best figures are shown in Bold.
Table 2. Performance comparison of proposed denoising strategy with different approaches on various quantitative measures under Rician Noise assumption in Brain Web database (slice = 70 & 100, Modality = T2, image size \(=181 \times 217\) and patch size \(= 5 \times 5\)). Best figures are shown in Bold.

3 Experimental Results

This section presents the qualitative and quantitative evaluations of the proposed method along with some of the state-of-the-art methods. The experiments have been carried out on 2D monochrome phantom human brain MRI images obtained from the Brain Web Database [3]. The parameters are as follows: RF = 20, protocol = ICBM, slice thickness = 1 mm, volume size = \(181 \times 217 \times 181\). The experimental setup considers the Rician noise model at different noise levels along with two modalities, namely T1 and T2. The simulated database provides the ground truth image for evaluating denoising performance, which is unavailable most of the time with real databases. The Rician noise addition and bias correction are done as suggested in [10] and [11] respectively. The evaluation measures used are Peak Signal-to-Noise Ratio (PSNR), Root Mean Square Error (RMSE), Mean Structural Similarity Index (MSSIM) [17] and Feature Similarity Index (FSIM) [18].
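The two error-based measures can be sketched as follows, assuming NumPy. The peak value of 255 is an assumption for 8-bit intensities, since the text does not state the peak used, and the small test images are hypothetical. MSSIM and FSIM are omitted because they require the full definitions in [17] and [18].

```python
import numpy as np


def rmse(reference, estimate):
    """Root mean square error between the ground truth and the estimate."""
    diff = np.asarray(reference, dtype=float) - np.asarray(estimate, dtype=float)
    return float(np.sqrt(np.mean(diff**2)))


def psnr(reference, estimate, peak=255.0):
    """PSNR in dB; peak=255 is an assumption for 8-bit data."""
    error = rmse(reference, estimate)
    if error == 0.0:
        return float("inf")
    return float(20.0 * np.log10(peak / error))


reference = np.full((4, 4), 100.0)  # hypothetical ground truth
estimate = reference + 5.0          # hypothetical denoised image, off by 5 everywhere

error_value = rmse(reference, estimate)
psnr_value = psnr(reference, estimate)
print(error_value, psnr_value)
```

A higher PSNR and a lower RMSE both indicate a denoised image closer to the ground truth.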

For comparison purposes, several state-of-the-art methods are considered: the Unbiased Non Local Means (UNLM) method presented in [11], the BM3D+VST method proposed in [4], the Rough Set based Non Local Means (RS-NLM) method proposed in [13], and the PCA based method proposed in [13], which has been extended in this work for Rician noise and is referred to as the Extended Rough set based PCA method (ER-PCA). The parameters of all methods are kept at the defaults suggested by the respective authors. In all the experiments, the patch size is kept as \(5 \times 5\). The proposed KPCA method does not use the VST. Tables 1 and 2 present quantitative results for slices 70 and 100 of the T1 and T2 MR images respectively. The ER-PCA performance is comparable to the UNLM and BM3D+VST methods. The proposed KPCA method outperforms ER-PCA and preserves structure better than the other state-of-the-art methods. Figure 3 shows the difference in the PSNR and MSSIM measures for the KPCA method with reference to BM3D+VST (zero level on the vertical axis) over 50 slices (from the \(61^{st}\) to the \(110^{th}\) slice of the database mentioned above) with noise standard deviation equal to 15 for both T1 and T2 modalities. A negative value indicates that BM3D+VST performs better and, conversely, a positive value indicates better performance of the KPCA method. From Fig. 3, the PSNR of KPCA falls below that of the BM3D+VST method, whereas KPCA better preserves the structure of the image in terms of the MSSIM measure. This is also visually evident in Fig. 4 for slice 100 of the T1 modality at noise level 15.

4 Conclusion

In this paper, an approach for the removal of Rician noise from brain MR images using Kernel PCA has been proposed. Being a manifold learning method, KPCA explores a suitable transformation for image representation through sparse bases. This method learns basis vectors from the data itself, unlike the BM3D+VST method where the basis vectors are kept fixed. The limitation of the KPCA method is the selection of a suitable kernel, which remains an open question. If the nature of the data is not known a priori, then one can try various kernels to find a suitable one. However, the commonly used Gaussian kernel in KPCA is found to perform comparably with other state-of-the-art methods. The PCA based method proposed in [13] has also been extended to remove Rician noise, but it fails to attain superior performance over KPCA. The proposed method is evaluated on synthetic data for quantitative evaluation since ground truth data is available for it.