1 Introduction

Medical imaging plays an increasingly vital role in healthcare. Different imaging modalities, such as Magnetic Resonance Imaging (MRI), Computed Tomography (CT), Single Photon Emission Computed Tomography (SPECT), and Positron Emission Tomography (PET), provide information about the structure and dynamic behavior of the tissues and organs of the human body. Most of the time, such information is complementary: modalities such as MRI and CT capture the structure of organs, while SPECT and PET capture the dynamic behavior of tissues. Since both types of information are needed for a reliable diagnosis, it is desirable to gather all of it in a single image. Image fusion is the way to achieve this goal.

Image fusion is the process of gathering information from different sources into a single image that combines their complementary content while removing redundant data. It aims at keeping all the salient and complementary information while omitting redundant information, without adding noise or artifacts to the fused image. Because the fused image offers a better representation of the information, it is preferred in many medical applications such as neurology and oncology.

Image fusion can be performed at three levels: pixel, feature, and decision (Pohl and Van Genderen 1998). In pixel-level fusion, the initial information from the source pixels is merged directly; thus, the fused image is more informative (Li et al. 2017). According to James and Dasarathy (2014) and Matsopoulos et al. (1994), pixel-level image fusion methods can be divided into five major groups: knowledge-based methods (Dou et al. 2003; Radhouani et al. 2009), methods based on fuzzy logic (Singh et al. 2004; Singh et al. 2015; Koley et al. 2016; Yang et al. 2016), neural network based methods (Wang and Ma 2008; Liu et al. 2014; Zhao et al. 2014; Ganasala and Kumar 2014, 2016; Tang et al. 2017), morphological methods (Matsopoulos et al. 1994; Jiang and Wang 2014), and methods based on multiresolution analysis (Bhatnagar et al. 2013; Shuaiqi et al. 2014; Prakash et al. 2012).

Knowledge-based methods exploit expert knowledge, which is highly trusted for decision making. Their benefit is that they introduce properties of the human visual system into the task; their shortcoming appears when there is large variability in the image intensities of different parts. Fuzzy logic methods have proved their efficiency in image fusion through their disjunctive and conjunctive properties; however, choosing appropriate membership functions and fuzzy sets remains a challenging problem. Neural networks are able to learn and then form a model for future decision-making tasks. Their ability to work without complex mathematical models is advantageous, but their performance is limited by the nature of the training data and training algorithms. Morphological methods have been used in medical image processing for a long time. In this group of methods, the morphological filters depend strongly on the structuring elements that perform the opening and closing operations, so their accuracy is affected by variations in the image, such as the noise and the shape and size of features.

Methods based on multiscale transforms have proved their efficiency for medical image fusion. In this family of methods, source images are decomposed by means of basis functions to capture important features such as edges and sharpness. The decomposition yields coefficients that are then used to select the desired information.

The discrete wavelet transform (DWT) is one of the transforms most frequently used for medical image fusion. DWT has been combined with different fusion rules and various other methods. It provides an approximation of the image together with horizontal, vertical, and diagonal detail subbands, and it offers good multiresolution and time–frequency localization characteristics. Its main deficiency is the pseudo-Gibbs effect that appears in fused images because of the downsampling performed at each decomposition level; shift variance is another deficiency (Bradley 2003). Curvelets are an efficient model for resolving the disadvantages of DWT: they perform better than wavelets in extracting curvilinear features such as edges (Ali et al. 2008; Himanshi et al. 2015). The curvelet transform (CVT) was initially developed for continuous space, but due to its rotation operation it is difficult to apply to discrete images (Do and Vetterli 2005). Unlike the curvelet transform, the contourlet transform is constructed directly in the discrete domain. It is a real, two-dimensional transform that can extract the basic geometric properties of a shape (Li et al. 2011). Because of its directional and multiscale representation of images, it performs better in extracting edges, textures, and complex contours, but the problem of shift variance remains. The nonsubsampled contourlet transform (NSCT) is the shift-invariant version of the contourlet transform; it exploits nonsubsampled filters and pyramids and thus provides a more precise representation of the image (Li and Wang 2011). Still, high computational cost is a major drawback of NSCT. The nonsubsampled shearlet transform (NSST) was introduced in Easley et al. (2008) to combine low computational cost, shift invariance, and an optimal representation of images.

The general framework for image fusion based on multiresolution analysis involves two key problems (Li et al. 2017): the selection of a proper multiresolution decomposition method and the choice of the rule used to merge the multiscale representations. Many transforms, such as DWT (Rangarajan 2017; Ravichandran et al. 2017; Sanjay et al. 2017), SWT (Indira et al. 2015), CVT (Bhadauria and Dewal 2013; Ali et al. 2008), the contourlet transform (Al-Azzawi et al. 2009; Bhateja et al. 2015), and NSST (Singh et al. 2015; Liu et al. 2018), have been suggested for obtaining the features of source images. The most common rule for fusing the coefficients is to average the approximation coefficients and take the detail coefficient with the larger absolute value (Hill et al. 2002). Other examples include contrast (Bhatnagar et al. 2015), variance (Yang et al. 2010), energy (Yang et al. 2014), spatial frequency (Bhatnagar et al. 2013), and principal component analysis (PCA) (Himanshi et al. 2015; Krishn et al. 2014; Moin et al. 2016).
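As an illustration, the following minimal sketch implements this common averaging/max-absolute rule with a single-level 2-D DWT. It assumes the PyWavelets package and two registered grayscale images of equal size; the wavelet choice `db2` is arbitrary.

```python
# Minimal sketch of the common DWT fusion rule: average the approximation
# coefficients and keep the detail coefficient with the larger magnitude.
import numpy as np
import pywt

def dwt_fuse(img1: np.ndarray, img2: np.ndarray, wavelet: str = "db2") -> np.ndarray:
    # Single-level 2-D DWT of both source images.
    cA1, (cH1, cV1, cD1) = pywt.dwt2(img1, wavelet)
    cA2, (cH2, cV2, cD2) = pywt.dwt2(img2, wavelet)

    # Approximation: average; details: choose the larger absolute value.
    cA = (cA1 + cA2) / 2.0
    fuse = lambda d1, d2: np.where(np.abs(d1) >= np.abs(d2), d1, d2)
    details = (fuse(cH1, cH2), fuse(cV1, cV2), fuse(cD1, cD2))

    # Inverse transform yields the fused image.
    return pywt.idwt2((cA, details), wavelet)
```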

PCA is a standard tool in modern data analysis for extracting important information from complicated datasets (Shlens 2014). It aims at finding the best basis in which to redefine the data. It is simple, non-parametric, and effective in extracting relevant information. Generally, PCA is considered a feature extraction method to be combined with techniques such as wavelets (Zheng et al. 2004; Cui et al. 2009; Al-Azzawi et al. 2009; Krishn et al. 2015; Benjamin and Jayasree 2018).

In Vijayarajan and Muttan (2015), a method is presented that combines DWT and PCA for medical image fusion. Both source images are first transferred to the multiresolution space using DWT; PCA is then performed on each related pair of subbands to obtain the principal components, and the averages of these components yield the weights for the fusion task. However, as mentioned before, DWT lacks shift invariance and suffers from the pseudo-Gibbs effect. Moreover, the selection of a proper fusion rule plays a critical role in the quality of the fused image. According to Li et al. (2013), rules can be classified into two groups: those based on features such as contrast (Bhatnagar et al. 2013) or variance and visibility (Yang et al. 2010), and those that assign weights to each of the features. The major advantage of the first group is that it preserves details, but these rules lack spatial consistency (Li et al. 2013). Assigning weights based on optimization methods can solve this problem. In this paper, particle swarm optimization (PSO) is the selected optimization technique because of its simplicity, ease of use, and fast convergence.

In this paper, a new pixel-level fusion method based on NSST and PSO, named STPCPSO, is proposed for medical image fusion. In this method, a pair of source images is first transferred to the multiresolution space via NSST. PCA is then performed on each related pair of subbands, and fusion coefficients are derived. Instead of averaging, PSO is used to find the optimal weighted combination of these coefficients and thereby the final fusion weights, which are applied to form the fused image. The main contributions of this study are the use of a more efficient multiresolution transform and the use of optimization to obtain better fusion weights.

The rest of this paper is organized as follows. In Sect. 2, the new algorithm is presented, and the preliminaries of NSST, PCA, and PSO are briefly reviewed. The experimental setup, results, and discussions are presented in Sect. 3. Conclusions are presented in Sect. 4.

2 Proposed algorithm

Image fusion is a process for gathering the necessary information from different sources to achieve a more informative image. Methods based on multiresolution analysis have been widely exploited in this area due to their good performance in the spectral domain. PCA is a simple yet effective method for this task. The idea behind the method of Vijayarajan and Muttan (2015) was to benefit from both approaches in medical image fusion, but the deficiencies of DWT and the averaging of weights remain two concerns. This paper proposes a new method to overcome these imperfections.

2.1 Nonsubsampled shearlet transform

The NSST is a multiscale and multidirectional extension of the wavelet transform. First, a nonsubsampled Laplacian pyramid (NSLP) is exploited to decompose the original image into low- and high-frequency subbands. Directional filtering, implemented with shear matrices, is then applied to extract the shearlet coefficients of the different directions from the high-frequency subbands. A three-level decomposition of NSST is depicted in Fig. 1. The process is described briefly as follows.

Fig. 1

Three-level decomposition of nonsubsampled shearlet transform

A two-dimensional affine system with composite dilations is considered as in Easley et al. (2008),

$$ A_{DS} = \left\{ {\psi_{j,k,m} \left( x \right) = \left| {\det D} \right|^{j/2} \psi \left( {S^{k} D^{j} x - m} \right):j,k \in {\mathbb{Z}},m \in {\mathbb{Z}}^{2} } \right\} $$
(1)

where \( \psi \) is the mother function from which the basis functions are generated, and \( A_{DS} \) is the family of basis functions produced by scale, shift, and orientation changes of \( \psi \). \( D \) is the anisotropic matrix, \( S \) is the shear matrix, and \( j,k, \) and \( m \) denote the scale, direction, and shift parameters, respectively. \( D \) and \( S \) are \( 2 \times 2 \) invertible matrices with \( \left| {\det S} \right| = 1 \). \( D \) takes the form \( \left[ {\begin{array}{*{20}c} d & 0 \\ 0 & {d^{1/2} } \\ \end{array} } \right] \) or \( \left[ {\begin{array}{*{20}c} {d^{1/2} } & 0 \\ 0 & d \\ \end{array} } \right] \), in which \( d > 0 \) controls the scale of the shearlets. \( S \) takes the form \( \left[ {\begin{array}{*{20}c} 1 & s \\ 0 & 1 \\ \end{array} } \right] \) or \( \left[ {\begin{array}{*{20}c} 1 & 0 \\ s & 1 \\ \end{array} } \right] \) and controls only the direction of the shearlets. The transform functions are then (Singh et al. 2015):

$$ \psi_{j,k,m}^{\left( 0 \right)} \left( x \right) = 2^{{j\frac{3}{2}}} \psi^{\left( 0 \right)} \left( {S_{0}^{k} D_{0}^{j} x - m} \right) $$
(2)
$$ \psi_{j,k,m}^{\left( 1 \right)} \left( x \right) = 2^{{j\frac{3}{2}}} \psi^{\left( 1 \right)} \left( {S_{1}^{k} D_{1}^{j} x - m} \right) $$
(3)

in which \( j \ge 0 \), \( - 2^{j} \le k \le 2^{j} \), \( m \in {\mathbb{Z}}^{2} \), \( \hat{\psi }^{\left( 0 \right)} \left( \xi \right) = \hat{\psi }^{\left( 0 \right)} \left( {\xi_{1} ,\xi_{2} } \right) = \hat{\psi }_{1} \left( {\xi_{1} } \right)\hat{\psi }_{2} (\xi_{2} /\xi_{1} ) \) and

$$ \hat{\psi }^{\left( 1 \right)} \left( \xi \right) = \hat{\psi }^{\left( 1 \right)} \left( {\xi_{1} ,\xi_{2} } \right) = \hat{\psi }_{1} \left( {\xi_{2} } \right)\hat{\psi }_{2} (\xi_{1} /\xi_{2} ). $$
(4)

where \( \xi = \left( {\xi_{1} ,\xi_{2} } \right) \in {\mathbb{R}}^{2} \) and \( \hat{\psi }^{\left( i \right)} \) denote the basis functions whose supports lie in particular regions of the frequency plane.

The discrete transform is obtained by sampling the continuous shearlet transform on a proper discrete set. This transform is able to cope with discontinuities more effectively (Cao et al. 2011).
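For illustration only, the sketch below implements the nonsubsampled Laplacian pyramid (NSLP) stage of this decomposition with the à trous scheme; the B3-spline kernel is an assumed, commonly used choice, and the directional shearing-filter stage of NSST is omitted. Because no downsampling occurs, every subband keeps the source image size.

```python
# Illustrative sketch of an NSLP decomposition via the "a trous" scheme.
import numpy as np
from scipy.ndimage import convolve

# 2-D B3-spline smoothing kernel (an assumed, common choice for a trous pyramids).
_h = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
_H = np.outer(_h, _h)

def atrous_kernel(level: int) -> np.ndarray:
    """Upsample the base kernel by inserting 2**level - 1 zeros between taps."""
    step = 2 ** level
    k = np.zeros((4 * step + 1, 4 * step + 1))
    k[::step, ::step] = _H
    return k

def nslp_decompose(img: np.ndarray, levels: int = 3):
    """Return [detail_1, ..., detail_L, approximation]; their sum restores img."""
    approx, details = img.astype(float), []
    for j in range(levels):
        smooth = convolve(approx, atrous_kernel(j), mode="nearest")
        details.append(approx - smooth)  # high-frequency subband at scale j
        approx = smooth
    return details + [approx]
```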

2.2 Principal component analysis (PCA)

A detailed description of the PCA method is presented in Ehlers (1991). Fusion based on PCA consists of the following steps (a short code sketch follows the list):

1. Arranging each source image as a vector and merging the two vectors to form a \( 2 \times n \) matrix \( {\mathbf{X}} \).

2. Forming the covariance matrix of \( {\mathbf{X}} \), computing its eigenvectors \( {\mathbf{V}} \) and eigenvalues \( {\mathbf{D}} \), and sorting them from large to small.

3. Computing \( P_{1} \) and \( P_{2} \) as follows:

    $$ P_{1} = \frac{V\left( 1 \right)}{\sum V}\;{\text{and}}\;P_{2} = \frac{V\left( 2 \right)}{\sum V} $$
    (5)

where \( V\left( 1 \right) \) and \( V\left( 2 \right) \) are the first and second components of the eigenvector associated with the largest eigenvalue, and \( \sum V \) is the sum of its components.

4. Finding the fused image using the following equation:

    $$ I_{fus} = P_{1} I_{1} + P_{2} I_{2} $$
    (6)
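A minimal NumPy sketch of these four steps, under the assumption that the two source images are registered and of equal size, could look as follows:

```python
# Minimal sketch of the PCA fusion steps listed above, using NumPy only.
import numpy as np

def pca_fuse(img1: np.ndarray, img2: np.ndarray) -> np.ndarray:
    # Step 1: arrange each source image as one row of a 2 x n data matrix X.
    X = np.stack([img1.ravel(), img2.ravel()]).astype(float)

    # Step 2: covariance of X and its eigen-decomposition (eigh sorts ascending).
    vals, vecs = np.linalg.eigh(np.cov(X))
    # Take magnitudes so the weights are positive regardless of eigenvector sign.
    v = np.abs(vecs[:, np.argmax(vals)])

    # Step 3: normalize the two components of that eigenvector (Eq. 5).
    p1, p2 = v[0] / v.sum(), v[1] / v.sum()

    # Step 4: weighted combination of the source images (Eq. 6).
    return p1 * img1 + p2 * img2
```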

2.3 Particle swarm optimization

Particle swarm optimization (PSO) is a member of the evolutionary computation paradigms (Eberhart and Kennedy 1995). PSO is an off-line, low-cost algorithm suitable for solving complex problems. It is preferred to other evolutionary algorithms for three main reasons (Chen and Leou 2012): (1) PSO needs only basic mathematical operations; (2) each particle is a possible solution, moving in the search space with a specific velocity; (3) the particles and the swarm have their own memories. Each particle represents a possible solution in a complex space. Its position is influenced by its own best position and by the best particle in the swarm; in other words, each particle learns from the experience of all other particles. The performance of each particle is assessed by a fitness function. During the search process, the new velocity of particle \( i \) at dimension \( d \), i.e., \( v_{i}^{d} \), and the new position of particle \( i \) at dimension \( d \), i.e., \( x_{i}^{d} \), are updated by (Chen and Leou 2012):

$$ v_{i}^{d} \left( {t + 1} \right) = \omega \,v_{i}^{d} \left( t \right) + c_{1} \,r_{1} \left( t \right)\left( {p_{i}^{d} \left( t \right) - x_{i}^{d} \left( t \right)} \right) + c_{2} \,r_{2} \left( t \right)\left( {p_{g}^{d} \left( t \right) - x_{i}^{d} \left( t \right)} \right) $$
(7)
$$ x_{i}^{d} \left( {t + 1} \right) = x_{i}^{d} \left( t \right) + v_{i}^{d} \left( {t + 1} \right) $$
(8)

where \( t \) denotes the iteration counter, \( \omega \) is the inertia weight controlling the impact of the previous velocity, \( c_{1} \) and \( c_{2} \) are learning constants, \( r_{1} \) and \( r_{2} \) are random variables in the range \( \left[ {0,1} \right] \), \( p_{i} \) is the best position of particle \( i \), and \( p_{g} \) is the best position found by all particles up to iteration \( t \). In this paper, \( r_{1} \) and \( r_{2} \) are selected randomly and \( c_{1} \) and \( c_{2} \) are set to 1.49.
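A generic sketch of this update loop is given below. Setting c1 = c2 = 1.49 follows the text; the inertia weight, swarm size, iteration count, and search bounds are illustrative assumptions, as the paper does not specify them.

```python
# Generic PSO minimizer implementing the velocity/position updates of
# Eqs. (7) and (8).
import numpy as np

def pso_minimize(fitness, dim, n_particles=30, iters=100,
                 w=0.73, c1=1.49, c2=1.49, bounds=(-1.0, 1.0)):
    rng = np.random.default_rng(0)
    x = rng.uniform(*bounds, size=(n_particles, dim))   # positions
    v = np.zeros_like(x)                                # velocities
    p_best = x.copy()                                   # personal bests
    p_val = np.apply_along_axis(fitness, 1, x)
    g_best = p_best[np.argmin(p_val)]                   # best particle in swarm

    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, 1))
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)  # Eq. (7)
        x = x + v                                                    # Eq. (8)
        val = np.apply_along_axis(fitness, 1, x)
        improved = val < p_val
        p_best[improved], p_val[improved] = x[improved], val[improved]
        g_best = p_best[np.argmin(p_val)]
    return g_best
```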

2.4 Proposed fusion algorithm

In this method, perfectly registered MRI images are first transformed to the spectral domain via NSST. The transform uses one decomposition level, so there are details in eight directions plus one approximation. PCA is then performed on each pair of analogous subbands (and on the approximations), and the principal component values necessary for fusion are derived. The problem is then modeled as an optimization problem whose objective function is a linear combination of eighteen variables; the coefficients of this linear combination are the values derived in the previous step (PCA on the subbands). PSO is used to solve this problem and obtain the optimum values of the variables. The obtained values are multiplied by their matched PCA values and summed to get the final principal components needed for fusion. A block diagram of the proposed algorithm is illustrated in Fig. 2. The new algorithm is formulated as follows (a code sketch of these steps is given after the listing).

Input: source images I1 and I2

1. Decompose images I1 and I2 to obtain subbands S1i and S2i, i = 1, 2, …, 9.

2. Perform PCA on each related subband pair: [P1k, P2k] = PCA(S1k, S2k), k = 1, …, 9.

3. Model the optimization problem as:

\( a_{1} P_{11} + \cdots + a_{9} P_{19} + b_{1} P_{21} + \cdots + b_{9} P_{29} = 1 \), where \( \left\{ {a_{1} , \ldots ,a_{9} ,b_{1} , \ldots ,b_{9} } \right\} \) are the unknowns.

    PSO is exploited to solve the problem.

4. \( PF_{1} = \mathop \sum \limits_{i = 1}^{9} a_{i} P_{1i} ,\;PF_{2} = \mathop \sum \limits_{i = 1}^{9} b_{i} P_{2i} \).

5. \( I_{F} = PF_{1} \times I_{1} + PF_{2} \times I_{2} \)

Output: \( I_{F} \)
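The following sketch outlines these steps under stated assumptions: `decompose` stands for any one-level NSST implementation returning nine subbands (the experiments in this paper use MATLAB tooling), and `pso_solve` for any PSO minimizer such as the sketch in Sect. 2.3. Both names are placeholders for illustration, not a published API.

```python
# Sketch of the proposed STPCPSO pipeline (steps 1-5 above).
import numpy as np

def pca_pair_weights(s1: np.ndarray, s2: np.ndarray):
    """Principal-component weights (P1k, P2k) for one pair of subbands."""
    X = np.stack([s1.ravel(), s2.ravel()]).astype(float)
    vals, vecs = np.linalg.eigh(np.cov(X))
    v = np.abs(vecs[:, np.argmax(vals)])
    return v[0] / v.sum(), v[1] / v.sum()

def stpcpso_fuse(img1, img2, decompose, pso_solve):
    # Step 1: one-level NSST -> nine subbands per source image.
    S1, S2 = decompose(img1), decompose(img2)
    n = len(S1)

    # Step 2: PCA on each related subband pair.
    weights = [pca_pair_weights(a, b) for a, b in zip(S1, S2)]
    P1 = np.array([w[0] for w in weights])
    P2 = np.array([w[1] for w in weights])

    # Step 3: PSO searches the 18 coefficients of the linear constraint
    # a.P1 + b.P2 = 1 by minimizing the squared residual.
    def fitness(ab):
        a, b = ab[:n], ab[n:]
        return (a @ P1 + b @ P2 - 1.0) ** 2
    ab = pso_solve(fitness, 2 * n)

    # Steps 4-5: final fusion weights and fused image.
    pf1, pf2 = ab[:n] @ P1, ab[n:] @ P2
    return pf1 * img1 + pf2 * img2
```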

Fig. 2

Block diagram of the proposed method

3 Experimental results and discussion

To evaluate the performance of the proposed method, experiments were carried out on Windows 8 using MATLAB R2016b. Experiments were performed on two sets of perfectly registered MRI images. The first set contains 10 pairs of T1- and T2-weighted MRIs of healthy slices from the MICCAI 2008 dataset. The second dataset includes a pair of T1-weighted and T2-weighted images acquired at the MRI center of Dr. Alinasab Hospital, Tabriz, Iran. The images of the first dataset are of size 512 × 512, and those of the second dataset are 384 × 273. A sample of the first dataset and the images of the second dataset are shown in Fig. 3.

Fig. 3

MRI slices a dataset 1, b dataset 2

As is known, different parts of the brain appear with different intensities in different MRI sequences; thus, fusing them together provides much more information and aids diagnosis. For comparison purposes, the results of our method are compared to the results obtained by applying our data to the methods of SWT (Indira et al. 2015), DWT (Sharmila et al. 2013), UDWT + PCA (Benjamin and Jayasree 2018), NSCT (Tang et al. 2007), and DWTPCAav (Vijayarajan and Muttan 2015). A brief review of all the methods is presented in Table 1.

Table 1 Brief description of methods to which proposed method is compared

3.1 Objective evaluation metrics

Fusion is carried out for different purposes, so there is no universal assessment of fusion performance. Fusion results can be assessed subjectively or objectively. A subjective evaluation is based on the human visual system and is thus difficult to perform. To overcome this problem, many objective evaluation metrics have been offered; most of them are reviewed in Jagalingam and Hegde (2015). In this paper, fusion performance is evaluated using the following seven metrics:

  • Quality index (QI)

QI models any distortion as the combination of three factors: loss of correlation, luminance distortion, and contrast distortion (Zhou Wang et al. 2004). Its range is between − 1 and 1; higher values indicate less distortion and a higher degree of similarity between the source and fused images. It is calculated as:

$$ QI = \frac{{4\sigma_{xy} \bar{x}\bar{y}}}{{\left( {\sigma_{x}^{2} + \sigma_{y}^{2} } \right)\left( {\left( {\bar{x}} \right)^{2} + \left( {\bar{y}} \right)^{2} } \right)}} $$
(9)

Here, \( \bar{x} \) and \( \bar{y} \) are the means of images \( x \) and \( y \), \( \sigma_{x}^{2} \) and \( \sigma_{y}^{2} \) are their variances, and \( \sigma_{xy} \) denotes their covariance.

  • Entropy (E)

Entropy shows the average amount of information in the fused image (Li et al. 2013). Larger values of E mean a higher amount of information in the fused image. Entropy is computed as follows:

$$ E = - \mathop \sum \limits_{i = 0}^{l - 1} p\left( i \right)\log_{2} p\left( i \right) $$
(10)

where \( l \) is the number of gray levels of the image, and \( p\left( i \right) \) is the probability of a pixel having gray value \( i \), i.e., the number of such pixels over the total number of pixels.

  • Structural similarity (SSIM)

Given images a and b, the SSIM between them is defined as follows:

$$ SSIM\left( {a,b} \right) = \frac{{\sigma_{ab} }}{{\sigma_{a} \sigma_{b} }}.\frac{{2\mu_{a} \mu_{b} }}{{\mu_{a}^{2} + \mu_{b}^{2} }}.\frac{{2\sigma_{a} \sigma_{b} }}{{\sigma_{a}^{2} + \sigma_{b}^{2} }} $$
(11)

where \( \mu_{a} \) and \( \mu_{b} \) denote the average values of a and b, respectively, \( \sigma_{ab} \) is the covariance between the two images, and \( \sigma_{a} \) and \( \sigma_{b} \) are their standard deviations.

  • Peak signal-to-noise ratio (PSNR)

PSNR reflects the quality of the reconstructed image (Cao et al. 2011). A larger PSNR means less distortion in the fused image. It is computed as:

$$ PSNR = 10 \times \log_{10} \left( {\frac{{I_{max}^{2} }}{{MSE}}} \right) $$
(12)

where \( I_{max} \) is the maximum intensity value of the image and \( MSE \) is the mean square error:

$$ MSE = \frac{1}{m \times n}\mathop \sum \limits_{i = 1}^{m} \mathop \sum \limits_{j = 1}^{n} \left( {I_{s} \left( {i,j} \right) - I_{f} \left( {i,j} \right)} \right)^{2} $$
(13)

where \( I_{s } \) and \( I_{f} \) are the source and fused images of size \( m \times n \), respectively.

  • Mutual information (MI)

MI measures the intensity similarity between the source and fused images; a higher value means better performance.

  • Correlation coefficient (CC)

This metric computes the spectral similarity between the reference and fused images; values closer to 1 indicate better performance (Zhu and Bamler 2013).

  • Standard deviation (STD)

STD is a measure of the contrast of the fused image; a higher value means higher contrast. It is calculated as follows:

$$ STD = \sqrt {\frac{1}{M \times N}\mathop \sum \limits_{i = 1}^{M} \mathop \sum \limits_{j = 1}^{N} \left[ {f\left( {i,j} \right) - \mu } \right]^{2} } $$
(14)

where \( M \) and \( N \) are the dimensions of the image and \( \mu \) is the average value of the image intensity.
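For reference, hedged NumPy sketches of four of these metrics (QI, entropy, PSNR, and STD), following Eqs. (9), (10), and (12)–(14) and assuming 8-bit grayscale inputs, are given below:

```python
# Sketches of four evaluation metrics; images are 8-bit grayscale arrays.
import numpy as np

def quality_index(x: np.ndarray, y: np.ndarray) -> float:
    """QI of Eq. (9): correlation, luminance, and contrast distortion combined."""
    x, y = x.astype(float), y.astype(float)
    cov = np.mean((x - x.mean()) * (y - y.mean()))
    return 4 * cov * x.mean() * y.mean() / (
        (x.var() + y.var()) * (x.mean() ** 2 + y.mean() ** 2))

def entropy(img: np.ndarray, levels: int = 256) -> float:
    """Shannon entropy of the gray-level histogram, Eq. (10)."""
    p = np.bincount(img.ravel().astype(int), minlength=levels) / img.size
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def psnr(src: np.ndarray, fused: np.ndarray, i_max: float = 255.0) -> float:
    """PSNR of Eq. (12) with the MSE of Eq. (13)."""
    mse = np.mean((src.astype(float) - fused.astype(float)) ** 2)
    return float(10 * np.log10(i_max ** 2 / mse))

def std_contrast(img: np.ndarray) -> float:
    """Standard deviation of Eq. (14), used as a contrast measure."""
    return float(img.astype(float).std())
```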

3.2 Performance evaluation of first dataset

Perfectly registered T1- and T2-weighted images were tested with the different methods, and the results of the quantitative analysis on the dataset are presented in Table 2. Fusion results for two pairs of source images from dataset 1 are shown in Figs. 4 and 5.

Table 2 Quantitative evaluations of different methods on dataset 1
Fig. 4

An image pair of dataset 1 and average results of different fusion methods: a MR-T1, b MR-T2, c SWT (Indira et al. 2015), d NSCT (Tang et al. 2007), e DWT (Sharmila et al. 2013), f cascade of UDWT and PCA (Benjamin and Jayasree 2018), g DWTPCAav (Vijayarajan and Muttan 2015), h proposed method

Fig. 5

Second image pair of dataset 1 and average results of different fusion methods: a MR-T1, b MR-T2, c SWT (Indira et al. 2015), d NSCT (Tang et al. 2007), e DWT (Sharmila et al. 2013), f cascade of UDWT and PCA (Benjamin and Jayasree 2018), g DWTPCAav (Vijayarajan and Muttan 2015), h proposed method

Source images are depicted in Fig. 4a, b. The methods of Indira et al. (2015) and Tang et al. (2007) returned low-quality images with almost no details, indicating the incapability of these methods to capture details. The results of Sharmila et al. (2013) and Benjamin and Jayasree (2018) have higher contrast and more details, although they are not easily recognizable visually. The method of Vijayarajan and Muttan (2015) reduced the contrast of the image, especially at the brain ventricles. In the tables, cases in which the best method is other than the proposed method are depicted in bold italic, and cases in which the proposed method is the best are depicted in bold.

The values of the quantitative metrics are presented in Table 2. The proposed method has the highest PSNR of all methods; thus, it shows the best performance in constructing the fused image and transferring intensity levels to it. The PSNR is improved with this method compared to the method with the best PSNR among the compared methods, DWTPCAav (Vijayarajan and Muttan 2015).

QI measures the structural similarity of images instead of only intensity similarity. The proposed method has the best QI and, thus, the best performance in gathering structural information; it improves QI compared with the second-best method. SSIM is based on QI but also considers luminance and contrast. The method of Tang et al. (2007) gained the highest SSIM value. Looking at Fig. 4a, b, d, and h, it can be noticed that the result of the proposed method gained most of its visual characteristics from the T1-weighted image; thus, less luminance and contrast similarity with the other source image is inevitable, which lowers the overall SSIM.

Here, the proposed method shows a reduction in SSIM compared with the method with the highest SSIM among the examined methods, NSCT (Tang et al. 2007); more research is needed to resolve this issue. The highest entropy of the proposed method indicates the richest information content in the fused image, and the proposed method also provides the highest contrast among the methods; thus, the fused image is more understandable for the human visual system.

The MI values of the other methods are very high compared with the proposed method, but the standard deviation over the dataset (Table 3) shows that the proposed method performs more consistently: high standard deviation values demonstrate the high sensitivity of those methods to data variations and hence their unreliability. The highest values are depicted in bold.

Table 3 Values of MI and standard deviation on dataset 1

The proposed method performed best in the spatial domain but not in the spectral domain, due to its low CC. This is another issue that needs more research.

3.3 Performance evaluation of second dataset

Perfectly registered T1- and T2-weighted images were tested with the different methods, and the results of the different metrics are presented in Table 4. The highest values are depicted in bold. Fusion results are shown in Fig. 6.

Table 4 Quantitative evaluations of different methods on dataset 2
Fig. 6

Dataset 2 and results of different fusion methods: a MR-T1, b MR-T2, c SWT (Indira et al. 2015), d NSCT (Tang et al. 2007), e DWT (Sharmila et al. 2013), f cascade of UDWT and PCA (Benjamin and Jayasree 2018), g DWTPCAav (Vijayarajan and Muttan 2015), h proposed method

The same behavior is also observed on the second dataset. In the case of MI, more image pairs are needed to evaluate the sensitivity of the methods to data variations. A comparison of the performance of the proposed method against the best of the other methods is presented in Fig. 7.

Fig. 7

Percent change in the values of the metrics compared to the best of the other methods

The proposed method shows improvement over the best of the other methods in the spatial domain, but it fails to show the same performance in the spectral domain, where the percent improvement is negative; this is an issue to address in future work. As shown in Fig. 7, PSNR is improved by about 8.85%, entropy by about 3.48%, STD by 16.3%, and QI by 14.84%.

The proposed method shows improvements in PSNR, QI, and STD and decreases in SSIM and CC on both datasets. In the case of entropy, the proposed method displays opposite behavior on the two datasets. The skull present in the second dataset could have caused these results; this is another issue to examine in the future.

To demonstrate the robustness of the proposed method, Poisson noise was applied to all source images, and the same metrics were calculated as in the noise-free case. The outcomes of all the methods are reported in Table 5. Our method maintains its performance even in the presence of noise, so it is robust: it has the highest PSNR in the presence of noise with an improvement of + 11.1, a QI improvement of + 7.52, and an STD improvement of + 9.13. The quality of the image is preserved and is not distorted in the presence of noise, which is an important point for the next steps of this research.
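The exact noise parameters are not specified here; one plausible way (an assumption for illustration) to impose Poisson noise on an 8-bit image is to treat each pixel value as the mean of a Poisson draw, as sketched below:

```python
# Hypothetical Poisson-noise injection for an 8-bit grayscale image.
import numpy as np

def add_poisson_noise(img: np.ndarray, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    # Each pixel value serves as the mean (lambda) of an independent Poisson draw.
    return np.clip(rng.poisson(img.astype(float)), 0, 255).astype(np.uint8)
```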

Table 5 Quantitative evaluations of different methods on noisy dataset 1

4 Conclusion

In this paper, a new method was presented for the fusion of multimodal MRI, exploiting NSST in combination with PCA and PSO. This process takes place in both the spatial and transform domains. PCA is a computationally simple algorithm that integrates images through a simple covariance-based approach; it was further improved in this study by calculating the principal components on the different NSST subbands of the images. PSO is a well-known optimization tool, used here to find the best combination of PCA values for the fusion process. The performance of the proposed method was tested on two datasets and compared with recent works on multimodal medical image fusion. Quantitative analysis confirmed that the proposed method outperforms the others in terms of standard deviation, peak signal-to-noise ratio, quality index, and entropy.