1 Introduction

Magnetic resonance imaging (MRI) is a powerful diagnostic technique. However, noise introduced during image acquisition degrades image quality and hampers both human interpretation and computer-aided analysis. Recent technological developments in image acquisition systems have benefited MRI, and MR images can now be obtained with increased resolution, better signal-to-noise ratio (SNR), and higher acquisition speed. However, resolution, acquisition speed, and SNR are coupled, and scientific, clinical, and financial pressures to obtain data more quickly force researchers to make trade-offs among them. For instance, the need for shorter acquisition times in certain clinical studies often undermines the ability to obtain images with both high resolution and high SNR. Because the magnitude of the MRI signal is the square root of the sum of the squares of Gaussian-distributed real and imaginary parts, it follows a Rician distribution [34]. In low-intensity regions of the image the Rician distribution tends to a Rayleigh distribution, while in high-intensity regions it approaches a Gaussian distribution; the result is a reduction in image contrast [23]. The effects of Rician noise on MR images are particularly pronounced because of the inherent nature of the process: higher tissue anisotropy produces progressively lower intensities in MR images, which increases the influence of Rician noise [6]. Several higher-level post-processing procedures for MR images, e.g. segmentation and tractography, assume specific models for regions of interest, e.g. homogeneous regions, and these techniques are impaired by even moderate noise levels. Denoising MR images therefore remains an important problem, and it should be performed to improve image quality for more accurate diagnosis.

The literature shows that Gaussian filters have been widely used for noise removal [28]; however, they do not perform well at edges because averaging dissimilar patterns causes blurring. To address this problem, many edge-preserving filters have been proposed. One example is the Anisotropic Diffusion Filter (ADF) [3, 24, 27]. This type of filter preserves edges by averaging pixels in the direction orthogonal to the local gradient. However, such filters usually erase small features, and their edge-enhancement effect also changes the image statistics, which results in unnatural-looking images.

Many wavelet-based techniques have also been applied to denoise MR images [17, 25, 29, 36]. However, these methods are prone to producing significant artifacts in the processed images, caused by the structure of the underlying wavelets, that can hamper the image analysis process. These methods are referred to as local because they exploit the spatial redundancy in a local neighborhood.

Recently, another important family of denoising filters has been developed, based on Non-Local Means [10, 11]. The Non-Local Means (NLM) filter is comparatively more robust against noise but has some limitations: both its objective and visual quality are somewhat inferior to other recent techniques, and it has quadratic complexity [18], which makes the technique computationally intensive and even impractical in real applications. It has therefore attracted researchers working on improving its visual quality and reducing its computation time. Examples include improvements in time efficiency [16, 21, 32], adaptive local neighbourhoods [20], refinement of the similarity estimates over successive iterations [9], and acceleration techniques [7, 21, 22]. Several authors have proposed techniques that successfully apply the NL-means filter to Rician noise in magnetic resonance images [13, 14, 23].

In [33] a technique is proposed to measure the similarity between visual data. Its method of correlating the image patch around the central pixel is close to the similarity measure of the NLM filter. In [5] another fully patch-based denoising algorithm is presented, in which the quality of denoising is measured by a confidence term provided with the denoised patches.

We propose a novel selective non-local means filter to suppress Rician noise while preserving important image features. It works adaptively on different frequency regions of the image. Figure 3 (3rd column) shows the difference between the uncorrupted and the corrupted images. The positive bias in the intensity PDF introduced by Rician noise is evident in the lighter background region (higher intensity on average); the background corresponds to low signal intensities. The nature of Rician noise (Fig. 3, 3rd column) suggests that smooth and featured regions should be filtered differently. For this purpose, the high frequency regions of the image have to be enhanced and separated from the smooth regions in the presence of Rician noise. Classical gradients produce poor results under Rician noise, so morphological gradients are used to determine the high frequency areas. A contrast enhancement operation is additionally required when the noise intensity is large. The novel selective NLM filter is then applied to featured areas and an adaptive NLM (ANLM) filter is applied to smooth areas. A method for selecting the weight matrix is also proposed to preserve image features against smoothing.

Fig. 1 Synoptic representation of the proposed method

Fig. 2 (a) Original image; (b) noisy image with 10 % Rician noise; (c) flag image using traditional gradients; (d) flag image using morphological gradients

Fig. 3 (From left to right) 1st column: original images; 2nd column: noisy images with 20 % added Rician noise; 3rd column: difference of the other two (i.e. the added Rician noise)

Fig. 4 An (11 × 11) sample weight window computed by the NL-means algorithm with optimized parameter values (in a high frequency region)

Fig. 5 An (11 × 11) sample weight window computed by the NSNL-means algorithm (in a high frequency region)

The rest of this paper is organized as follows. Section 2.1 gives a brief description of morphological gradients and the NL-means algorithm. Section 2.2 explains how the high and low frequency regions of an MR image are separated. Section 2.3 describes the proposed novel selective NL-means algorithm. Section 2.4 covers the optimization of the filter parameters. Performance evaluation and results are discussed in Section 3. Finally, conclusions are drawn in Section 4.

2 Proposed method

The method proposed in this paper is based on a novel selective non-local means filter, which is a modification of the NLM filter. A synoptic representation of the proposed method is shown in Fig. 1. The main contributions are threefold. First, separating high frequency regions from low frequency regions is difficult in the presence of Rician noise, since high frequency events in an MR image may be due either to edges/image features or to noise; Beucher’s morphological gradients, together with a contrast enhancement technique, are used for this purpose. Second, we suggest a selective weight window method and change the internal structure of the NLM filter to preserve important image features; the modified filter is used adaptively for low and high frequency regions. Third, we optimize the NLM filter parameters separately for low frequency and high frequency regions. For this we run our novel selective NLM algorithm on datasets available online as well as on real datasets. Pseudocode for the proposed method is presented in Fig. 6. The details of the proposed method are described in the following sections.

Fig. 6 Pseudocode algorithm for the proposed method

2.1 Non local mean filter

Intensity variations in an image may be due to edges/image features or to noise. Gradient operators are used to enhance intensity variations (such as edges) in images, and many potentially useful gradients have been presented by researchers. Classical gradients are very sensitive to noise and cannot be applied reliably in its presence; see Fig. 2(c). We found that morphological gradients are very useful for determining high frequency areas in the presence of Rician noise; see Figs. 2(d) and 3. A morphological gradient approach consists in determining the grey level variation within a given neighborhood using extensive and anti-extensive operators [30]. Let f be a differentiable function defined on ℝ². The gradient vector of f in two orthogonal directions \( x_1 \) and \( x_2 \) is:

$$ \nabla f = \left( {\frac{{\partial f}}{{\partial {x_1}}},\frac{{\partial f}}{{\partial {x_2}}}} \right) $$
(1)

In image processing, gradients are handled through their modulus and azimuth (direction) representations. Let f be a function defined on ℝ² and ρβ be a disk of radius ρ. The morphological gradient of f is defined as:

$$ g(f) = \mathop{{\lim }}\limits_{{\rho \to 0}} \frac{{{\delta_{{\rho \beta }}}(f) - {\varepsilon_{{\rho \beta }}}(f)}}{{2\rho }} $$
(2)

where \( {\delta_{{\rho \beta }}}(f) \) and \( {\varepsilon_{{\rho \beta }}}(f) \) are the dilation and erosion [19] of f with a disk β of radius ρ, respectively. This gradient is often called the Beucher gradient [30]. It can easily be shown that:

$$ g\left( {f(x)} \right) = \left| {\nabla f(x)} \right| $$
(3)

Although Eq. (2) can in principle be applied to discrete images, the limit \( \mathop{{\lim }}\limits_{{\rho \to 0}} \) is not accessible there: in the discrete case the smallest accessible value of ρ is 1. Therefore, the morphological gradient is defined by Beucher [30] as:

$$ g(f) = {\delta_{\beta }}(f) - {\varepsilon_{\beta }}(f) $$
(4)

The denominator is dropped because it is a constant and, for finite structuring elements, does not correspond directly to a distance. This distance can only be estimated using statistical models of images, if they are available. Eq. (4) therefore yields the maximum variation of the grey level intensities within an elementary neighbourhood rather than a local slope. We applied Eq. (4) followed by an erosion to compute the flag image.
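As an illustration of Eq. (4), the following minimal sketch computes the discrete Beucher gradient of a small grey-level image stored as a list of rows. It is not the implementation used in this work: the 3 × 3 square structuring element, the clipping of the window at the image border, and the function names are assumptions made for the example.

```python
def _window(img, r, c):
    """Grey levels in the 3x3 window centred on (r, c), clipped at the border."""
    rows, cols = len(img), len(img[0])
    return [img[i][j]
            for i in range(max(r - 1, 0), min(r + 2, rows))
            for j in range(max(c - 1, 0), min(c + 2, cols))]


def dilate(img):
    """Grey-level dilation: local maximum over the structuring element."""
    return [[max(_window(img, r, c)) for c in range(len(img[0]))]
            for r in range(len(img))]


def erode(img):
    """Grey-level erosion: local minimum over the structuring element."""
    return [[min(_window(img, r, c)) for c in range(len(img[0]))]
            for r in range(len(img))]


def beucher_gradient(img):
    """Eq. (4): dilation minus erosion, pixel by pixel."""
    dilated, eroded = dilate(img), erode(img)
    return [[d - e for d, e in zip(d_row, e_row)]
            for d_row, e_row in zip(dilated, eroded)]


# A vertical step edge: the gradient responds only next to the step.
step = [[0, 0, 0, 10, 10, 10] for _ in range(4)]
gradient = beucher_gradient(step)
```

On the step image the response is non-zero only in the two columns adjacent to the edge, which is the "maximum variation within an elementary neighbourhood" described above.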

The NLM filter [11] is an updated form of the Yaroslavsky filter [37], which averages similar image pixels defined according to their local intensity similarity. The main differences between the NLM and Yaroslavsky filters are that the similarity between pixels is made more robust to noise by comparing regions rather than single pixels, and that matching patterns are not restricted to be local; that is, pixels far from the pixel being filtered are not penalized.

Consider an image Y. The filtered value at a point p using the NLM method is calculated as a weighted average of all the pixels in the image:

$$ NLM\left( {Y(p)} \right) = \sum\limits_{{\forall q \in \eta }} {w\left( {p,q} \right)Y(q)} ,\quad 0 \leqslant w\left( {p,q} \right) \leqslant 1,\quad \sum\limits_{{\forall q \in \eta }} {w\left( {p,q} \right)} = 1 $$
(5)

where p is the point being filtered and q ranges over the pixels in the neighbourhood η of radius \( R_{search} \). Although the original method [11] nominally takes the weighted average over every pixel in the image, this is very inefficient, and the search is therefore restricted to a window η. The weights w(p,q) are based on the similarity between the neighborhoods \( N_p \) and \( N_q \) of pixels p and q, where \( N_i \) is a square neighborhood window centered on pixel i with a user-defined radius \( R_{sim} \). The similarity w(p,q) is then calculated as:

$$ w(p,q) = \frac{1}{{Z(p)}}{e^{{ - \frac{{d(p,q)}}{{{h^2}}}}}} $$
(6)

where Z(p) is the normalizing constant, calculated as:

$$ Z(p) = \sum\limits_{{\forall q}} {{e^{{ - \frac{{d\left( {p,q} \right)}}{{{h^2}}}}}}} $$
(7)

Here h is an exponential decay control parameter and d is a Gaussian-weighted Euclidean distance over all the pixels of each neighbourhood:

$$ d(p,q) = {G_{\alpha }}{\left\| {Y({N_p}) - Y({N_q})} \right\|^2}_{{{R_{{sim}}}}} $$
(8)

where \( G_{\alpha} \) is a normalized Gaussian weighting function with zero mean and standard deviation α (usually set to 1), which gives more weight to pixels near the center of the neighborhood window and penalizes pixels far from it. The central pixel of the Gaussian weighting window is set to the value of the pixels at distance 1 to avoid over-weighting effects. Eq. (6) has a special case when p = q: because the self-similarity is very high, it can produce an over-weighting effect. To avoid this, w(p,p) is calculated as follows:

$$ w(p,p) = \max (w(p,q)\forall q \ne p) $$
(9)
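Eqs. (5) to (9) can be read together as a procedure for filtering a single pixel. The sketch below is an illustrative reading, not the implementation used in this work; the clamping of indices at the image border, the guard against numerical underflow, the application of Eq. (9) before normalization, and all function names are assumptions made for the example.

```python
import math


def gaussian_kernel(r_sim, alpha=1.0):
    """Normalised Gaussian weights G_alpha over a (2*r_sim + 1)^2 patch."""
    kernel = {(di, dj): math.exp(-(di * di + dj * dj) / (2.0 * alpha * alpha))
              for di in range(-r_sim, r_sim + 1)
              for dj in range(-r_sim, r_sim + 1)}
    if r_sim >= 1:
        # Centre weight set equal to the weight at distance 1, as in the text.
        kernel[(0, 0)] = kernel[(0, 1)]
    total = sum(kernel.values())
    return {offset: w / total for offset, w in kernel.items()}


def pixel(img, i, j):
    """Image value with indices clamped to the border (an assumption)."""
    return img[min(max(i, 0), len(img) - 1)][min(max(j, 0), len(img[0]) - 1)]


def patch_distance(img, p, q, kernel):
    """Eq. (8): Gaussian-weighted squared distance between two patches."""
    return sum(w * (pixel(img, p[0] + di, p[1] + dj)
                    - pixel(img, q[0] + di, q[1] + dj)) ** 2
               for (di, dj), w in kernel.items())


def nlm_pixel(img, p, r_search, r_sim, h):
    """Filtered value at p = (row, col) and the normalised weight window."""
    kernel = gaussian_kernel(r_sim)
    rows, cols = len(img), len(img[0])
    raw = {}
    for i in range(max(p[0] - r_search, 0), min(p[0] + r_search + 1, rows)):
        for j in range(max(p[1] - r_search, 0), min(p[1] + r_search + 1, cols)):
            if (i, j) != p:
                d = patch_distance(img, p, (i, j), kernel)
                raw[(i, j)] = math.exp(-d / (h * h))      # Eq. (6), unnormalised
    raw[p] = max(raw.values()) if raw else 1.0            # Eq. (9): self-weight
    z = sum(raw.values())                                 # Eq. (7)
    if z == 0.0:                                          # every weight underflowed
        return img[p[0]][p[1]], {p: 1.0}
    weights = {q: v / z for q, v in raw.items()}
    value = sum(w * img[q[0]][q[1]] for q, w in weights.items())  # Eq. (5)
    return value, weights


flat = [[7.0] * 5 for _ in range(5)]
value, weights = nlm_pixel(flat, (2, 2), r_search=2, r_sim=1, h=1.0)
```

On a constant image every patch distance is zero, so the weights are uniform and the filtered value equals the input value; near an edge, pixels on the far side of the edge receive smaller weights than pixels on the same side.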

2.2 Separating high and low frequency regions of MR image

Mathematical morphology offers many kinds of gradients, such as Beucher’s gradient and internal, external, thick, regularized, directional, and thinning/thickening gradients [30]. There are very few theoretical arguments in favor of a given gradient, because theoretical comparisons generally use approximations that are too crude compared to real cases; the reasons for choosing a particular gradient operator are mostly qualitative. We used Beucher’s gradient to separate the high and low frequency regions in noisy MR images, since it is simple to compute and performed better in our case. A binary flag image is computed by thresholding the gradient image obtained from Eq. (4) and then applying an erosion operation. This technique performs well for low and moderate noise levels but produces some unwanted results when the noise intensity is large; in that case morphological gradients alone are not sufficient. MR magnitude images are corrupted by Rician-distributed noise and therefore suffer from a contrast-reducing, signal-dependent bias. In addition, the noise is often assumed to be white, although a widely used technique for decreasing the acquisition time gives rise to correlated noise. The image contrast can be increased by subtracting the bias from each pixel in the squared magnitude image [2]:

$$ \widehat{Y} = \sqrt {{{Y^2} - 2{\sigma^2}}} $$
(10)

where \( \widehat{Y} \) is the bias-corrected image, Y is the noisy image and σ is the standard deviation of the noise. Although this simple operation does not completely remove the bias from the magnitude image [35], it produces a clear contrast enhancement in the processed image. The standard deviation σ can be computed from the background of the squared magnitude image [25] as follows:

$$ \sigma = \sqrt {{\frac{\mu }{2}}} $$
(11)

where μ is the mean value of the background of the squared magnitude image, which can be selected using Otsu thresholding [26].
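A minimal sketch of Eqs. (10) and (11) is given below. It assumes that the background pixels have already been selected (for example by Otsu thresholding, which is not reproduced here) and are passed in as a flat list; clamping negative radicands to zero is likewise an assumption, since the text does not state how such pixels are handled.

```python
import math


def estimate_sigma(background_values):
    """Eq. (11): sigma = sqrt(mu / 2), mu = mean squared background magnitude."""
    mu = sum(v * v for v in background_values) / len(background_values)
    return math.sqrt(mu / 2.0)


def remove_bias(img, sigma):
    """Eq. (10) applied pixel-wise; negative radicands are clamped to zero."""
    return [[math.sqrt(max(v * v - 2.0 * sigma * sigma, 0.0)) for v in row]
            for row in img]


sigma = estimate_sigma([1.0, 1.0, 1.0, 1.0])   # hypothetical background samples
corrected = remove_bias([[1.0, 3.0]], sigma)
```

In this toy case the background-level pixel is mapped to approximately zero while the brighter pixel is only slightly reduced, which is the contrast enhancement referred to above.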

2.3 Applying adapted selective NL-means

The NLM filter estimates the value of a pixel p by taking a weighted average of the neighboring pixels within a window of size η. The weight window used in Eq. (5) and computed by Eq. (6) gives a similarity measure between each pixel q within the window and the central pixel p. We suggest a change to the internal structure of the NLM filter that gives the image a more natural appearance after noise removal. A natural-looking image is an important goal, but the more important objective is to retain the diagnostic credibility of the processed image, and the ability to preserve edge and feature details surely makes the result more credible for diagnosis. The weight window computed by the NLM filter is shown in Fig. 4. The NLM filter assigns some weight to almost every pixel in the neighbourhood, even in high frequency regions, although in such regions it is implausible that every pixel is similar to the pixel being evaluated. This suppresses the noise, but it is not a good choice from a feature-preservation point of view. We therefore suggest a selective weight window method based on thresholding, so that weights are assigned only to highly similar pixels. This selective window is shown in Figs. 5 and 6.
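The idea of a thresholded weight window can be sketched as follows. The exact thresholding rule is not specified in the text above, so the rule used here (keep only weights that are at least a fraction of the largest weight, then renormalize so that the weights again sum to 1 as required by Eq. (5)) and the parameter name `keep_fraction` are hypothetical choices made purely for illustration.

```python
def select_weights(weights, keep_fraction=0.5):
    """Keep only highly similar pixels and renormalise the surviving weights.

    `weights` maps a pixel identifier to its NLM weight. The relative
    threshold `keep_fraction` is a hypothetical parameter, not a value
    taken from the paper.
    """
    top = max(weights.values())
    kept = {q: w for q, w in weights.items() if w >= keep_fraction * top}
    total = sum(kept.values())
    return {q: w / total for q, w in kept.items()}


window = {"a": 0.5, "b": 0.3, "c": 0.2}       # toy weight window
selective = select_weights(window, keep_fraction=0.5)
```

In the toy window the least similar pixel is discarded and the remaining two weights are rescaled, so that the estimate at p is formed from highly similar pixels only.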

The nature of Rician noise suggests that smooth (low spatial frequency) and featured (high spatial frequency) regions should be filtered differently, as can be seen in the 3rd column of Fig. 3. The intensities in this difference image do not appear correlated because Rician noise corrupts each pixel independently. Antoni Buades [11] stated that a similarity window of size 7 × 7 or 9 × 9 can be used for grey level images with little noise, but such fixed-size windows will not yield good results for all kinds of images.

The problem with fixed-size search and similarity windows is the following. With large windows, some details can be removed from the image: singular points (i.e. pixels with no similar patches, such as image corners, peaks or valleys) are blurred by averaging them with dissimilar patches. With a small similarity window, on the other hand, there will be many patches similar to the current patch, resulting in an inaccurate estimate. This means that in flat (low variance) regions large windows are needed to properly remove the noise, whereas in regions containing a lot of detail (high variance) a small similarity window is needed in order to find similar patches and to estimate the current pixel more accurately. Moreover, a small search window is preferable for efficiency reasons.

In the comparison provided by the authors, the NLM algorithm is shown to clearly outperform other classic methods such as the Anisotropic Diffusion Filter (ADF), Total Variation (TV) [31] and wavelet thresholding methods [12], among others. However, the NLM algorithm has three parameters, and the filter results depend strongly on their setting. The first parameter, \( R_{search} \), is the radius of the search window. The second parameter, \( R_{sim} \), is the radius of the neighborhood window used to measure the similarity between two pixels. The third parameter, h, is related to the decay of the exponential curve and controls the degree of smoothing. If h is too small, little noise is removed, while if h is too large, the image becomes blurry.

The optimal values of these parameters can differ with the spatial frequency content of the region and with the noise intensity. Globally optimal values, i.e. values optimized for the whole image as in [23], may therefore not produce the best results, since other values may perform better in particular spatial regions. We optimized the filter parameters separately for high and low spatial frequency regions and for the noise intensity.

2.4 Filter parameters optimization

The optimal parameter estimation is performed for both smooth/homogeneous and featured/textured areas of MR images. To exploit local properties and reduce noise in different regions, we adaptively choose the similarity window size based on the preceding classification result. For edged/featured regions we employ a small similarity window, since the local structure within a neighborhood can be used effectively for similarity matching. In contrast, a larger similarity window is required for smooth regions in order to reduce the risk of misinterpreting noise as local structure during the matching process. The filtering parameter h controls the decay of the exponential in the weighting scheme of Eqs. (6) and (7). A very small h tends to produce noisy results similar to the input, while a very large h gives an over-smoothed image; that is, h controls the degree of smoothing of the filtered image. The exponent of the decaying function varies with the window size for the same pixel, as given in Eq. (7), so changing the window size indirectly changes the effect of h. In [15], the authors stated that h must be independent of the choice of the window size; to achieve this, the Euclidean distance \( \|d\|^2 \) must be normalized. A small similarity window hardly contains any image detail, so the signal-to-noise ratio within it is very low and h needs to be high to smooth the noise strongly and to estimate the correct value of the pixel. In a large similarity window, on the other hand, the signal-to-noise ratio is relatively high because the window contains a lot of image detail, so h needs to be small to preserve that detail [4]. Moreover, the parameter h must be inversely proportional to the similarity window size in order to obtain the best results for any similarity window size. We found that the parameter values estimated in [23], i.e. \( R_{search} = 5 \), \( R_{sim} = 2 \), h = 1.2σ, are quite near to the optimal values; the optimal values for smooth and featured regions taken separately do not deviate much from them. To optimize the filter parameters we used MR images taken from the simulated brain MRI dataset available at BrainWeb [8], real datasets from Abrar MRI & CT Center, Rawalpindi, Pakistan [1], and 27 horizontal slices from the 3D brain MRI dataset available in MATLAB (R2009b). Hundreds of images from these datasets were used to optimize the filter parameter values. We first ran our algorithm on low frequency regions only and determined the parameter values by trial and error, using the improvement (reduction) in root mean square error (RMSE) as the objective function. We then ran our algorithm on high frequency regions only and determined the parameter values using the same procedure. The average improvements in RMSE for noise levels σ = 2, 5, 9, 13, 17, 20 obtained while estimating the parameter values are shown in Tables 1, 2, and 3. The resulting settings are a search window of 11 × 11, a similarity window of 3 × 3 and h = 1.1σ for featured regions, and a search window of 11 × 11, a similarity window of 5 × 5 and h = 1.4σ for smooth regions.

Table 1 Average improvement in RMSE for noise levels σ = 2, 5, 9, 13, 17, 20 while estimating the parameter values (\( R_{search} \) varies while \( R_{sim} \) and h are kept constant)
Table 2 Average improvement in RMSE for noise levels σ = 2, 5, 9, 13, 17, 20 while estimating the parameter values (\( R_{sim} \) varies while \( R_{search} \) and h are kept constant)
Table 3 Average improvement in RMSE for noise levels σ = 2, 5, 9, 13, 17, 20 while estimating the parameter values (h varies while \( R_{search} \) and \( R_{sim} \) are kept constant)
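The objective function used in the parameter search can be sketched as below. The text does not give a formula for the "improvement in RMSE", so the definition used here (the percentage reduction of the RMSE of the denoised image relative to that of the noisy image, both measured against the noise-free original) is an assumption, as are the function names.

```python
import math


def rmse(reference, image):
    """Root mean square error between two equally sized images (lists of rows)."""
    squared = [(a - b) ** 2
               for ref_row, img_row in zip(reference, image)
               for a, b in zip(ref_row, img_row)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_improvement(original, noisy, denoised):
    """Percentage reduction in RMSE achieved by denoising (assumed definition)."""
    before = rmse(original, noisy)
    after = rmse(original, denoised)
    return 100.0 * (before - after) / before


# Toy example: the denoised image halves the error of the noisy image.
improvement = rmse_improvement([[0.0, 0.0]], [[2.0, 2.0]], [[1.0, 1.0]])
```

Under this definition a candidate parameter setting is preferred when it yields a larger improvement averaged over the test images and noise levels.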

3 Results and discussions

The proposed system was implemented in the MATLAB (R2009b) environment. In our study we analyzed three datasets of MR images of the human brain, which represents “the bottleneck” for clinical diagnostics because of very long acquisition times. The MR images analyzed (256 × 256 pixels) were acquired from [8], [1] and the MRI dataset available with MATLAB (R2009b). In a first step, all MR images were corrupted with Rician-distributed noise to simulate low quality images. In particular, the percentage of noise was varied from 1 % to 20 % (a typical range of MR image noise).
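The corruption step can be illustrated with the following sketch, which follows the description of Rician noise in Section 1: independent Gaussian noise is added to the real and imaginary parts and the magnitude is taken. The noise level is given here directly as a standard deviation σ; the mapping from a noise percentage to σ is not specified in the text and is therefore not reproduced. The function name and the optional `rng` argument are assumptions made for the example.

```python
import math
import random


def add_rician_noise(img, sigma, rng=None):
    """Magnitude of the image plus complex Gaussian noise of std sigma."""
    rng = rng or random.Random()
    return [[math.hypot(v + rng.gauss(0.0, sigma),   # noisy real part
                        rng.gauss(0.0, sigma))        # noisy imaginary part
             for v in row]
            for row in img]


# In a zero-signal background the result is Rayleigh distributed, so its
# mean is clearly above zero: this is the positive bias discussed above.
background = add_rician_noise([[0.0] * 2000], 1.0, random.Random(1))
background_mean = sum(background[0]) / len(background[0])
```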

In this section we give a detailed qualitative and quantitative analysis of the proposed MRI denoising algorithm and compare its performance with several other methods, including state-of-the-art techniques from the literature. The filter parameters are optimized for featured regions (high spatial frequency) as well as smooth regions (low spatial frequency). The estimated parameter values are as follows.

  • For featured/high frequency regions: \( R_{search} = 5 \), \( R_{sim} = 1 \), h = 1.1σ.

  • For smooth/low frequency regions: \( R_{search} = 5 \), \( R_{sim} = 2 \), h = 1.4σ.
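The two parameter sets above can be expressed as a small lookup keyed on the flag image value of the pixel being filtered. This is only a sketch of how the region-dependent settings might be dispatched; the function names and the dictionary layout are hypothetical. A radius R corresponds to a window of side 2R + 1, which relates these radii to the 11 × 11, 3 × 3 and 5 × 5 windows quoted in Section 2.4.

```python
def nlm_parameters(is_featured, sigma):
    """Region-dependent NLM parameters, using the values listed above."""
    if is_featured:   # high frequency (featured) region
        return {"r_search": 5, "r_sim": 1, "h": 1.1 * sigma}
    return {"r_search": 5, "r_sim": 2, "h": 1.4 * sigma}   # smooth region


def window_side(radius):
    """Side length of a square window with the given radius."""
    return 2 * radius + 1


featured = nlm_parameters(True, sigma=10.0)
smooth = nlm_parameters(False, sigma=10.0)
```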

The denoising performance of the proposed algorithm is evaluated against three well-known denoising algorithms, namely the Anisotropic Diffusion Filter [27], a wavelet-based denoising algorithm [29] and the NLM-based MRI denoising algorithm with optimized parameter values [23]. Mean Squared Error (MSE) and Peak Signal to Noise Ratio (PSNR) are the usual choice for judging the performance of denoising techniques and are the measures most frequently used in the literature. However, a better PSNR does not imply that the visual quality of the image is good, so a comparison of visual quality is also presented. We assessed the denoising performance of the novel selective non-local means (NSNLM) algorithm against the above-mentioned algorithms on 200 representative test images, 20 of which are shown in Fig. 7. These 200 images are drawn from three different datasets as follows:

  • MATLAB MRI dataset: 27 images

  • BrainWeb Simulated dataset: 23 images

  • Abrar MRI & CT Center datasets: 150 images

Fig. 7 Twenty original MR images out of the 200 images for which test results are reported

For each test condition, a typical range of Rician noise levels for MRI (σ = 1, 2, 3, …, 20) is generated and the MSE, the improvement in RMSE (Root Mean Squared Error) and the PSNR of the 200 denoised images are computed. The averaged MSE and the averaged improvement in RMSE over all cases are presented in Figs. 8 and 9, respectively. According to the experimental results, the performance gap between NSNLM and NLM (with optimized parameters) becomes larger as the Rician noise intensity increases. When the noise standard deviation is about 20, the proposed MRI denoising method reduces the RMSE of the denoised image by about 70 % compared to the noisy image. Averaged PSNR results over the 200 images are summarized in Table 4. A detailed perceptual quality comparison between NLM with optimized parameter values [23] and our proposed method is presented in Figs. 10 and 11. Examples of noisy images, denoised images and residual images are shown in Fig. 12. A further visual/perceptual comparison with [23] is presented in Fig. 13. The images in Figs. 10(f) and 11(f) and the 4th column of Fig. 12 show the difference between the denoised and the noisy images. The low correlation in these images indicates that the proposed method preserves the significant image features even under high-intensity Rician noise. Figure 12 (3rd column) also shows that the proposed method effectively corrects the positive Rician bias in the corrupted-intensity PDF and thereby enhances inter-tissue contrast: the background region is darker than in Fig. 12 (2nd column), implying low error. The test results show that the NSNLM filter performs better than the NLM filter in removing Rician noise from MR images while preserving the important image features. Preserving important image features is very important for MR images, since it not only gives the image a more natural appearance after noise removal but also retains the diagnostic credibility of the processed image. We conclude that NSNLM is superior to many well-known techniques, both quantitatively and qualitatively.
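For reference, the PSNR measure used in the comparison can be computed as in the sketch below. The peak value of 255 assumes 8-bit grey levels, which is not stated in the text; the function name is likewise an assumption.

```python
import math


def psnr(original, denoised, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    squared = [(a - b) ** 2
               for o_row, d_row in zip(original, denoised)
               for a, b in zip(o_row, d_row)]
    mse = sum(squared) / len(squared)
    if mse == 0.0:
        return math.inf   # identical images
    return 10.0 * math.log10(peak * peak / mse)


# An error of 25.5 grey levels at every pixel corresponds to 20 dB.
example_psnr = psnr([[0.0, 0.0]], [[25.5, 25.5]])
```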

Fig. 8 RMSE comparison of three well-known algorithms with our proposed algorithm (NSNLM)

Fig. 9 Improvement in RMSE of three well-known algorithms and our proposed algorithm (NSNLM)

Table 4 Performance of the considered algorithms in terms of averaged PSNR on 200 images, 20 of which are shown in Fig. 7

Fig. 10 (a) Original image; (b) noisy image, σ = 20; (c) denoised image using NLM with optimized parameter values; (d) residual/difference between (b) and (c); (e) denoised image using our proposed algorithm; (f) residual/difference between (b) and (e); (g) added Rician noise/difference between (a) and (b)

Fig. 11 (a) Original image; (b) noisy image, σ = 20; (c) denoised image using NLM with optimized parameter values; (d) residual/difference between (b) and (c); (e) denoised image using our proposed algorithm; (f) residual/difference between (b) and (e); (g) added Rician noise/difference between (a) and (b)

Fig. 12 (From left to right) 1st column: original images; 2nd column: noisy images with 20 % added Rician noise; 3rd column: images denoised by NSNLM; 4th column: difference of the 2nd and 3rd columns (i.e. residuals)

Fig. 13 (From left to right) Column 1: original images; Column 2: noisy images with 18 % added Rician noise; Column 3: images denoised by NLM (18 %); Column 4: images denoised by NSNLM (18 %); Column 5: noisy images with 12 % added Rician noise; Column 6: images denoised by NLM (12 %); Column 7: images denoised by NSNLM (12 %)

4 Conclusions

In this paper we have proposed a novel selective non-local means filter to suppress Rician noise while preserving important image features. This requires classifying the low and high frequency events in the image, for which we propose a simple yet highly effective method based on morphological gradients. The novel selective NLM algorithm is then applied to these regions, with the filter parameters optimized separately for each. A method for selecting the weight matrix is also proposed to preserve image features against smoothing. We demonstrated the performance of the proposed method through extensive simulation experiments conducted on a variety of standard test images and compared it with several other well-known techniques. The experimental results indicate that the proposed method performs significantly better than many existing techniques. The proposed method is simple and easy to implement. As future work, the filtering parameters could be optimized automatically using an optimization technique, since their values are strongly correlated with each other.