1 Introduction

Rapid technological development in recent times has led to the creation of a number of sensors capable of capturing different phenomena in the object to be imaged from different points of view. Each individual sensor acquires a particular physical property of the object to produce a two-dimensional image at the surface level or a three-dimensional image at the volume level. Thus, the ability or inability of a particular sensor to display an image of an object depends on how accurately it picks up the variation of signals received from the entire object to be imaged. A wide variety of data acquisition sensors focus on different parts of the same object, and the acquired images are complementary in nature in many ways. No single one of them is sufficient in terms of its information content. The concept of multifocus image fusion is to combine the sharply focused regions from different sensors so that a better decision can be taken than from any single source alone [1–13].

In medical diagnosis, different radiological images are important tools for visual interpretation and evaluation. Integration of information from different modalities may offer physicians a better basis for decisions on treatment procedures and surgical planning. At present, physicians recommend multisensor imaging for identification of diseases of a particular organ. For example, magnetic resonance imaging (MRI) gives better information on soft tissue regions related to normal and abnormal tissues [3]. Dose calculation is based on computed tomography (CT) data, whereas positron emission tomography (PET) images capture metabolic processes of the organs, like blood flow, food activity, etc., with low spatial resolution [14]. Hence, it is natural and desirable to combine different modalities of medical images to increase the examination accuracy and evaluation specificity.

Many image fusion methods have been proposed for combining images from different modalities. Some of them are based on the Bayesian approach [15]. Hurn et al. [15] suggested a hierarchical framework for the estimation of a fused classification of medical images by combining registered images at different resolutions. The authors not only fused functional images representing metabolic activities, but also included structural images to incorporate anatomical properties [16]. The Dempster–Shafer evidence theory [5–7, 17] has been applied to classify multisource data while accounting for the uncertainties related to different data sources. A technique based on multilayer perceptron neural networks [18] has been applied to compute nonparametric estimates of posterior class probabilities for multisource remote sensing images. A novel artificial neural model based on the pulse-coupled neural network (PCNN) has been efficiently applied to the field of multimodality medical image fusion [19]. Wavelet-based multiresolution image fusion has been reported in [4, 20–25].

Among other fusion techniques for multimodal images, pixel-level image fusion is one of the most convenient methods and has been developed for many fusion applications, as reported in [26, 27].

As mentioned earlier, no individual sensor is complete; hence, integration of the salient features of images produced by different modalities serves to enhance the global information. Under this circumstance, the objective of the present work is to introduce an ‘automatic’ multimodal medical image fusion system using multiresolution and genetic algorithm (GA)-based techniques. This process can be used in clinical diagnosis with acceptable fusion accuracy. Before the fusion process is applied, segmentation techniques are used to extract the regions of interest. Segmentation algorithms like fuzzy C-means and Markov random field models, which include both stochastic and deterministic approaches, have been implemented effectively prior to the fusion process. In the proposed fusion scheme, finer details are extracted from the decomposed input images using a multiresolution approach. We have implemented a genetic algorithm to select appropriate complementary features from the input images. The proposed fusion process has been applied to segmented brain images of different modalities, namely PD (proton density)-, T1- and T2-weighted MR. A mutual information (MI)-based similarity metric has been computed as an index for performance evaluation of the proposed fusion scheme on images segmented using the different techniques. The organization of the paper is as follows: Sect. 2 discusses the different segmentation methods implemented on brain images prior to the fusion process; Sect. 3 describes the multiresolution- and genetic algorithm-based fusion process and the measure of its performance using MI; Sect. 4 gives the details of the experimental results of the segmentation as well as the fusion technique implemented on PD-, T1- and T2-weighted MR images of the human brain; and Sect. 5 concludes the paper.

2 Process of segmentation implemented on MR T1, MR T2 and MR PD brain images

In MRI, segmentation is used to determine the volume of different brain tissues such as white matter (WM), gray matter (GM) and cerebrospinal fluid (CSF). The volumetric changes in these brain tissues help in the study of neural disorders. Researchers have focused on the segmentation of multimodal (MR and other) medical imaging [28–30]. The primary difficulty in achieving an accurate segmentation is the intensity inhomogeneity that commonly occurs in MR images. Since each segmentation technique has been proposed to solve a particular problem, no single technique is better than the others for every purpose. We have to find out which technique gives the best result in terms of the given criteria or a combination of them. Evaluation methods for image segmentation are classified into analytical and empirical evaluation methods, as described in [31]. The analytical methods analyze the properties of a segmentation algorithm, such as its processing strategy, complexity and efficiency. Empirical methods are further classified into goodness methods and discrepancy methods.

The empirical goodness methods use the original image and the resulting segmented image. Goodness can be expressed in terms of a statistical measure such as the uniformity within segmented regions [32], inter-region contrast [33] or region shape [34]. The empirical discrepancy methods compute the error between the segmented image and a reference image. These empirical discrepancy methods include the following: (a) accuracy [35, 36], which refers to the degree to which the segmentation results agree with the true segmentation; (b) area- or volume-based metrics, which include two approaches, one using standard statistical methods such as two-way analysis of variance and the t-test [37, 38], and the other borrowed from the object detection literature [30, 39, 40]; and (c) distance-based metrics, which measure the distance between the segmentation-generated boundary and the true boundary.

In the present work on the fusion of PD (proton density)-, T1- and T2-weighted MR images of the human brain, we have segmented the images prior to fusion using both FCM and MRF methods. For the MRF methods, we have used ICM and Gibbs sampling. Finally, an MI-based parameter has been computed to compare the efficacy of all these segmentation techniques for the subsequent information combination, or fusion, of the images.

2.1 Fuzzy C-means clustering

Fuzzy C-means (FCM), also known as fuzzy ISODATA, was first proposed in [41] and later improved as reported in [42]. FCM is a data clustering algorithm in which each data point belongs to a cluster with a degree specified by a membership grade, unlike K-means, in which each observation has a clear-cut binary membership. The data samples may belong to more than one group, with membership values ranging from 0 to 1. The major advantage of FCM over K-means clustering is the formation of new clusters by monitoring data points that have close membership values to the existing classes. The fuzzy algorithm can be outlined as follows:

Let U = {u 1, u 2, …, u n } be a set of given data. A fuzzy c-partition of U is a family of fuzzy subsets of U, denoted by P = {A 1, A 2, …, A c }, which satisfies

$$ \sum\limits_{i = 1}^{C} {A_{i} (u_{k} )} = 1. $$
(1)

The performance index of a fuzzy partition P, I m (P), is defined in terms of cluster centers by the formula

$$ I_{m} \left( {A,v_{1} , \ldots,v_{c} } \right) = \sum\limits_{k = 1}^{n} {\sum\limits_{i = 1}^{C} {\left[ {A_{i} (u_{k} )} \right]}}^{{m}} ||u_{k} - v_{i} ||^{2} $$
(2)

where v i is the cluster center of the ith cluster and ||u k  − v i || represents the distance d ik between u k and v i . Clearly, the smaller the value of I m (P), the better the fuzzy partition P. Thus, the goal of fuzzy partitioning is to minimize the performance index I m (P), which yields

$$ v_{i} = \frac{{\sum\nolimits_{k = 1}^{n} {\left[ {A_{i} (u_{k} )} \right]^{m} u_{k} } }}{{\sum\nolimits_{k = 1}^{n} {\left[ {A_{i} (u_{k} )} \right]^{m} } }} $$
(3)

and

$$ A_{i} (u_{k} ) = \frac{1}{{\sum\nolimits_{j = 1}^{C} {\left( {\frac{{d_{ik} }}{{d_{jk} }}} \right)^{{\frac{2}{m - 1}}} } }} $$
(4)

The FCM algorithm is very similar to the K-means algorithm. It is an iterative procedure, as described below:

  1. Initialize the number of classes and the membership matrix A i (u k ) with random values between 0 and 1 such that (1) is satisfied.

  2. Calculate the fuzzy cluster centers v i , i = 1, 2, …, c, using (3).

  3. Compute the cost function using (2) and iterate until its improvement over the previous iteration is below a certain threshold.

  4. Compute the new membership matrix A i (u k ) using (4) and go to step 2.

Thus, by updating the cluster centers and the membership degrees of each data point, the FCM algorithm iteratively moves toward a local minimum.
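As an illustration, the following minimal Python sketch applies this procedure to one-dimensional data (e.g., MR intensity values); the function name and parameter choices are assumptions for the example, not the settings used in the paper.

```python
import numpy as np

def fcm(U, c, m=2.0, tol=1e-5, max_iter=100, seed=0):
    """Minimal fuzzy C-means sketch for 1-D data (e.g., MR image intensities).
    Returns the cluster centers v_i and the membership matrix A (c x n)."""
    rng = np.random.default_rng(seed)
    n = U.shape[0]
    # Step 1: random memberships normalized so that each column sums to 1 (Eq. 1)
    A = rng.random((c, n))
    A /= A.sum(axis=0, keepdims=True)
    prev_cost = np.inf
    for _ in range(max_iter):
        Am = A ** m
        # Step 2: cluster centers (Eq. 3)
        v = (Am @ U) / Am.sum(axis=1)
        d2 = (U[None, :] - v[:, None]) ** 2            # squared distances ||u_k - v_i||^2
        # Step 3: cost function (Eq. 2) and convergence test
        cost = np.sum(Am * d2)
        if abs(prev_cost - cost) < tol:
            break
        prev_cost = cost
        # Step 4: membership update (Eq. 4)
        d2 = np.fmax(d2, 1e-12)                        # guard against division by zero
        A = 1.0 / np.sum((d2[:, None, :] / d2[None, :, :]) ** (1.0 / (m - 1)), axis=1)
    return v, A

# Example: cluster the intensities of a brain slice into three tissue classes
# centers, memberships = fcm(image.reshape(-1).astype(float), c=3)
# labels = memberships.argmax(axis=0).reshape(image.shape)
```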

2.2 Markov random field model

A segmentation method based on the Markov random field (MRF) [43] provides a convenient way to combine the conditional distribution of pixel intensities with contextual information derived from neighboring pixels. The conditional intensity distribution assumes that the intensities of pixels in a nearly homogeneous region follow a certain statistical distribution. Contextual information is based on the property that pixels lying in each other's neighborhood tend to have similar intensity values. MRF is a powerful method for modeling spatial continuity by using a priori contextual information. The task here is to find the true label of each pixel or voxel, which belongs to the set {GM, WM, CSF}. The following discussion outlines how the two sources of information, intensity distribution and contextual information, are combined using MRF models to obtain a powerful decision rule regarding the true label of a pixel.

Any MRF segmentation algorithm addresses three main aspects: neighborhood correlations, nonparametric statistics and signal inhomogeneities. Let L denote a lattice with dimensions M x × M y , let s be a lattice point (pixel), let N s denote the neighborhood of s and let S be the total set of sites. In an MRF, the neighborhood system should satisfy two conditions: first, a site should not be a neighbor of itself, and second, the neighborhood relation must be symmetric. In graph theory, a clique is a subset of vertices such that every two vertices in the subset are connected by an edge. Hence, from the concept of neighborhood, a clique c is defined as a subset of sites that are all neighbors of each other, and C denotes the set of all cliques.

Gibbs distribution with respect to N s is a probability measure given as

$$ P\left( \omega \right) = \frac{1}{Z}\exp \left\{ { - \frac{U\left( \omega \right)}{T}} \right\} $$
(5)

where ω is a value (configuration) of the random field F, and all possible configurations form the set Ω; T is a positive constant that controls the size of clustering; and Z is a normalizing constant, also known as the partition function, which is given as

$$ Z = \sum\limits_{\omega \in \Upomega } {\exp \left\{ { - \frac{U\left( \omega \right)}{T}} \right\}}. $$
(6)

U(ω) is the energy function, referred to as the Gibbs energy,

$$ U(\omega ) = \sum\limits_{c \in C} {V_{c} (\omega )}. $$
(7)

It is the sum of the potentials associated with all cliques in the graph, where V c is the potential function associated with clique c. Cliques may contain one, two or three nodes; in the present work, two-node cliques are used for experimentation.

MRF is defined by the following two properties:

$$ \begin{gathered} P(\omega ) > 0,\quad \forall \omega \in \Upomega \quad {\text{Positive}}\,{\text{definiteness}}, \hfill \\ P\left( {\omega_{i} |\omega_{{S - \left\{ {s_{i} } \right\}}} } \right) = P\left( {\omega_{i} |\omega_{{N_{i} }} } \right)\quad {\text{Markov}}\,{\text{property}}. \hfill \\ \end{gathered} $$
(8)

The Markov property states that the probability of a pixel's label, given all the other labels in the image, is equal to its probability given only the labels of its neighbors. Thus, the probability distribution of a variable is related only to the random variables within its neighborhood. Bayes' principle, along with the Hammersley–Clifford theorem [43], is used to arrive at a decision rule.

If the observed image y is a realization of a random field Y, and \( \hat{x} \) denotes the estimate of the true unknown labels of the observed pixels, then the main objective is to find \( \hat{x} \) given the observed image y. Assuming that P(X) is our prior knowledge, P(Y|X) is the probability of realizing the observed image and P(X|Y) is the posterior, Bayes' theorem gives

$$ P\left( {X/Y} \right) \propto P\left( {Y/X} \right)P\left( X \right) $$
(9)

A Gaussian distribution is used to model the observed image intensity distribution; hence,

$$ P(Y = y/X) = \frac{1}{{\sqrt {2\pi \sigma_{s}^{2} } }}\exp \left( { - \frac{{\left( {y - \mu_{s} } \right)^{2} }}{{2\sigma_{s}^{2} }}} \right) $$
(10)

where μ s is the mean and σ s is the standard deviation. The prior knowledge is described by the MRF from (5), which is given as

$$ P(X) = \frac{1}{Z}\exp \left\{ { - \frac{U(X)}{T}} \right\}. $$
(11)

The estimate \( \hat{x} \) is obtained by computing the maximum a posteriori (MAP) estimate. Maximizing the logarithm of the posterior gives

$$ \hat{x} = \mathop {\max }\limits_{x} \left\{ {\log p\left( {y/x} \right) + \log p\left( x \right)} \right\}. $$
(12)

From (10)–(12),

$$ \hat{x} = \mathop {\max }\limits_{x} \left\{ { - \frac{{\left( {y - \mu_{s} } \right)^{2} }}{{2\sigma_{s}^{2} }} - \frac{1}{2}\log \left( {2\pi \sigma_{s}^{2} } \right) - \frac{U(x)}{T}} \right\}. $$
(13)

This maximization is an optimization problem, and a number of methods have been proposed to solve it, based on either deterministic or stochastic approaches. We considered (1) iterated conditional modes (ICM), Besag's deterministic approach [44], and (2) a stochastic approach, Gibbs sampling [45, 46].

Deterministic methods depend strongly on the initial segmentation, which is obtained here from thresholding. Such thresholding methods are intensively used for the initial segmentation of images prior to more sophisticated segmentation methods in order to reduce the convergence time.
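For illustration, a minimal Python sketch of the ICM update is given below; the helper name, the per-class Gaussian parameters and the 4-neighborhood, two-node clique prior are assumptions made for the example, so it is a simplified stand-in for the MAP estimation of (13), not the authors' implementation.

```python
import numpy as np

def icm_segmentation(y, init_labels, mu, sigma, beta=1.7, n_iter=10):
    """Minimal ICM sketch for MAP-MRF segmentation.
    y           : 2-D image of observed intensities
    init_labels : initial label image (e.g., from histogram thresholding), values 0..K-1
    mu, sigma   : per-class mean and standard deviation (length-K arrays)
    beta        : weight of the two-node clique potential"""
    x = init_labels.copy()
    K, (H, W) = len(mu), y.shape
    for _ in range(n_iter):
        for i in range(H):
            for j in range(W):
                best_k, best_e = x[i, j], np.inf
                for k in range(K):
                    # Data term: negative log-likelihood of the Gaussian model (Eq. 10)
                    e_data = ((y[i, j] - mu[k]) ** 2) / (2 * sigma[k] ** 2) \
                             + 0.5 * np.log(2 * np.pi * sigma[k] ** 2)
                    # Prior term: two-node clique potentials over the 4-neighborhood
                    e_prior = 0.0
                    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        ni, nj = i + di, j + dj
                        if 0 <= ni < H and 0 <= nj < W and x[ni, nj] != k:
                            e_prior += beta
                    if e_data + e_prior < best_e:
                        best_e, best_k = e_data + e_prior, k
                x[i, j] = best_k   # greedy update: maximize the local posterior (Eq. 13)
    return x
```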

3 Proposed image fusion scheme

Image fusion involves the combination of information from different sensors to obtain more complete information than can be obtained from a single sensor. Naturally, the process of image fusion may be carried out among images of an object acquired by different sources in the same viewing reference frame, or among images of the same object acquired by a single sensor under different imaging conditions and from different viewing frames. Advanced image fusion approaches based on multiscale representation have emerged and received attention from researchers. Most of these approaches are based on multiscale decompositions (MSD) of the source images. In multiscale analysis, integration of complementary features from different input images can be achieved with lower loss of information than in single-resolution processing. The fusion processes described in many papers generally use the choose-max (CM) scheme for the high-frequency subbands of the multiscale-decomposed images. The CM scheme simply picks the coefficient with the larger value and discards the other [47, 48]. Another coefficient-combining scheme is the weighted average (WA) scheme [49, 50].

In the present application, we have experimented with MR PD, MR T1 and MR T2 images of the same cross-section of the human brain to implement the proposed fusion scheme. When images of the same cross-section of the human brain are acquired by different modalities, image registration [51–54] has to be performed as a crucial step prior to the fusion process. In the present application, the registration process [51] is a 2D/2D affine transformation, achieved by maximizing a similarity metric with an optimization-based search strategy.

3.1 Overall procedure for medical image fusion

The framework of a generic image fusion scheme based on multiscale analysis is illustrated in Fig. 1. The basic idea is to perform a multiscale decomposition (MSD) of each source image and to build a composite multilevel representation from them. The fusion procedure then combines the coefficients according to the chosen rules. Finally, the fused image is obtained by taking an inverse multiscale transform (IMST).

Fig. 1 Block diagram of a generic image fusion scheme

The overall image fusion scheme of the proposed algorithm involves the decomposition of the input segmented and registered images, a genetic-algorithm-based selection methodology, the implementation of fusion rules and the reconstruction of the fused image, as shown in Fig. 2.

Fig. 2 Block diagram of the image fusion scheme

Apart from simple CM- or WA-based coefficient-combining schemes, soft computing techniques may improve the robustness and performance of fusion approaches. The proposed fusion algorithm uses the frequently used Haar wavelet-based MSD method on the source images and then implements and evaluates new coefficient-combining approaches. Among them, a fuzzy clustering technique and an evolutionary algorithm are adopted to collect and maximize the appropriate complementary features, respectively.

3.2 The proposed algorithm consists of the following steps:

  • Haar wavelet-based MSD of input registered images

  • Implementation of an appropriate fusion procedure without manual marking of fiducial points

  • Implementation of different soft computing approaches to combine the approximate and detail coefficients obtained from MSD method

  • Reconstruction of the composite fused image by taking the inverse multiscale transform.

The proposed approach starts with two registered images as input. The regions of interest of these input images are then segmented using one of the FCM, Gibbs or ICM approaches. The segmented images (A and B) are decomposed into high- (D A and D B ) and low- (C A and C B ) frequency subbands by the Haar wavelet transform. At a particular decomposition level, both subbands carry the information of that particular resolution. A genetic search algorithm is then applied to collect maximum information from the detail subbands D A and D B . The low-frequency subbands C A and C B , produced by the discrete wavelet transform (DWT), are averaged to accumulate the gross structure of the fused image. Thus, the selection rules considered for this algorithm can be described as

$$ C_{F}^{j} (u,v) = \operatorname{mean}\left\{ {C_{A}^{j} (u,v),\,C_{B}^{j} (u,v)} \right\} $$
(14)

where the superscript j denotes the jth level of resolution.

$$ D_{F}^{j} (u,v) = \max \,{\text{of}}\,\left\{ {D_{A}^{j} (u,v),\,D_{B}^{j} (u,v)} \right\} $$
(15)

Finally, by applying the inverse wavelet transform to the selected image subbands, the fused image is reconstructed. Note that the proposed approach can proceed without any fiducial points; it is an efficient, automatic and robust fusion technique based on soft computing approaches. A compact sketch of this pipeline is given below.
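The sketch below assumes the PyWavelets package is available; for brevity, the GA-based selection of detail coefficients described in Sect. 3.4 is replaced by a simple coefficient-wise choose-max rule, so it is an illustration of the pipeline rather than the exact proposed method.

```python
import numpy as np
import pywt  # PyWavelets, assumed to be installed

def fuse_pair(img_a, img_b, wavelet="haar", levels=2):
    """Sketch of the fusion pipeline: Haar MSD of both (segmented, registered)
    images, averaging of the approximation coefficients (Eq. 14) and a
    choose-max rule on the detail coefficients as a stand-in for the GA-based
    maximization (Eq. 15), followed by the inverse transform."""
    coeffs_a = pywt.wavedec2(np.asarray(img_a, float), wavelet, level=levels)
    coeffs_b = pywt.wavedec2(np.asarray(img_b, float), wavelet, level=levels)
    fused = [(coeffs_a[0] + coeffs_b[0]) / 2.0]          # low-frequency: average
    for da, db in zip(coeffs_a[1:], coeffs_b[1:]):       # per level: (cH, cV, cD)
        fused.append(tuple(np.where(np.abs(ha) >= np.abs(hb), ha, hb)
                           for ha, hb in zip(da, db)))   # high-frequency: choose-max
    return pywt.waverec2(fused, wavelet)

# fused = fuse_pair(segmented_T1, segmented_T2)
```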

3.3 Multiscale/multiresolution decomposition of images

In an image, if both small and large objects or low- and high-contrast objects are present simultaneously, it is advantageous to study them at several resolutions. This is the fundamental motivation for multiresolution processing. Multiscale/multiresolution image processing techniques, as mentioned in the previous section, are the basis for the majority of sophisticated image fusion algorithms. The ideas behind multiresolution approach are described below:

3.3.1 Image pyramid

A powerful but simple structure for representing images at more than one scale/resolution is the image pyramid. An image pyramid is a collection of images of decreasing resolution arranged in the shape of a pyramid, as shown in Fig. 3. Figure 4 shows a simple system for constructing image pyramids. It produces the level J − 1 approximation and the level J prediction residual. For passes j = J − 1, J − 2, …, J − P + 1, the previous iteration's level j − 1 approximation output is used as the input. Each pass is composed of the following three steps (a minimal code sketch follows the list):

Fig. 3
figure 3

A pyramidal image structure

Fig. 4
figure 4

System for constructing image pyramids

  • Compute a reduced-resolution approximation of the input image. This may be done by filtering the input and then downsampling (subsampling) the filtered result by a factor of 2. Different types of filtering operations may be used, including neighborhood averaging, low-pass Gaussian filtering or no filtering. The approximation image contains only the gross structure of the input, and its quality depends on the filter selected.

  • Upsample the output of step 1 by a factor of 2 and filter the result. This creates a prediction image with the same resolution as the input. The interpolation filter determines how accurately the prediction approximates the original input image.

  • Compute the difference between the prediction of step 2 and the original input image. This difference is labeled the level j prediction residual.
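A minimal sketch of these three steps, using 2 × 2 neighborhood averaging as an (assumed) approximation filter and pixel replication as an (assumed) interpolation filter, could look as follows.

```python
import numpy as np

def build_pyramid(image, levels=3):
    """Sketch of pyramid construction: each pass produces a reduced-resolution
    approximation and the corresponding prediction residual."""
    approximations = [np.asarray(image, dtype=float)]
    residuals = []
    for _ in range(levels):
        img = approximations[-1]
        h, w = (img.shape[0] // 2) * 2, (img.shape[1] // 2) * 2
        img = img[:h, :w]                                  # crop to even dimensions
        # Step 1: filter (2x2 averaging) and downsample by 2
        approx = (img[0::2, 0::2] + img[1::2, 0::2] +
                  img[0::2, 1::2] + img[1::2, 1::2]) / 4.0
        # Step 2: upsample by 2 and interpolate (pixel replication) to predict the input
        prediction = np.repeat(np.repeat(approx, 2, axis=0), 2, axis=1)
        # Step 3: prediction residual at this level
        residuals.append(img - prediction)
        approximations.append(approx)
    return approximations, residuals
```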

In multiresolution analysis (MRA), scaling function is used to create a series of approximations of an image, each differing by a factor of 2 from its nearest neighboring approximations. Additional functions, called wavelets, are then used to encode the difference in information between adjacent approximations.

The present methodology uses the Haar transform, the discrete wavelet transform (DWT) and the inverse discrete wavelet transform (IDWT).

The wavelet series expansion maps a function of a continuous variable into a sequence of coefficients. If the function being expanded is a sequence of numbers, such as samples of a continuous function f(x), the resulting coefficients are called the discrete wavelet transform (DWT) of f(x). In this case, the scaling or approximation coefficients and the wavelet or detail coefficients are expressed as

$$ W_{\varphi } (j_{0} ,k) = \left( {\frac{1}{\sqrt M }} \right) \times \sum\limits_{x} {f(x)\varphi_{j0,k} (x)} $$
(16)
$$ W_{\psi } (j,k) = \left( {\frac{1}{\sqrt M }} \right) \times \sum\limits_{x} {f(x)\psi_{j,k} (x)} $$
(17)

for j ≥ j 0 and f(x) is obtained via the inverse discrete wavelet transform (IDWT)

$$ f(x) = \left( {\frac{1}{\sqrt M }} \right) \times \sum\limits_{k} {W_{\varphi } (j_{0} ,k)\varphi_{j0,k} (x) + } \left( {\frac{1}{\sqrt M }} \right) \times \sum\limits_{{j = j_{0} }}^{\infty } {\sum\limits_{k} {W_{\psi } (j,k)\psi_{j,k} (x)} }. $$
(18)

Here, f(x), φ j0,k (x) and ψ j,k (x) are functions of the discrete variable x = 0, 1, 2, …, M − 1. Normally, it is assumed that j 0 = 0, and M is selected as a power of 2 (i.e., M = 2^J) so that the summations are performed over x = 0, 1, 2, …, M − 1, j = 0, 1, 2, …, J − 1 and k = 0, 1, 2, …, 2^j − 1. For Haar wavelets, the discretized scaling and wavelet functions employed in the transform (i.e., the basis functions) correspond to the rows of the M × M Haar transformation matrix. The transform itself is composed of M coefficients, with the minimum scale being 0 and the maximum J − 1. A single-level 1-D Haar DWT/IDWT pair is sketched below.
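To make the Haar case concrete, the following sketch computes a single level of the 1-D Haar DWT and its inverse using the orthonormal Haar filters (even-length input assumed); the function names are illustrative.

```python
import numpy as np

def haar_dwt_1d(f):
    """Single-level 1-D Haar DWT: approximation (scaling) and detail (wavelet)
    coefficients of an even-length sequence f."""
    f = np.asarray(f, dtype=float)
    approx = (f[0::2] + f[1::2]) / np.sqrt(2.0)   # W_phi: scaling coefficients
    detail = (f[0::2] - f[1::2]) / np.sqrt(2.0)   # W_psi: wavelet coefficients
    return approx, detail

def haar_idwt_1d(approx, detail):
    """Inverse of the single-level Haar DWT above (Eq. 18 for the Haar basis)."""
    f = np.empty(2 * approx.size)
    f[0::2] = (approx + detail) / np.sqrt(2.0)
    f[1::2] = (approx - detail) / np.sqrt(2.0)
    return f
```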

The 1D transforms of DWT are easily extended to 2D functions like images. In 2D wavelet transform, a 2D scaling function, φ(x,y), and three 2D wavelets, ΨH(x,y), ΨV(x,y), ΨD(x,y), are required. The separable scaling function and directionally sensitive wavelets are expressed as

$$ \varphi (x,y) = \varphi (x)\,\varphi (y) $$
(19)
$$ \Uppsi^{H} (x,y )= \Uppsi (x)\,\varphi (y) $$
(20)
$$ \Uppsi^{V} (x,y )= \varphi (x)\,\Uppsi (y) $$
(21)
$$ \Uppsi^{D} (x,y )= \Uppsi (x)\,\Uppsi (y). $$
(22)

The wavelet ΨH measures intensity variations along columns (like horizontal edges), ΨV responds to the intensity variations along rows (like vertical edges) and ΨD corresponds to variations along diagonals. For given separable 2D scaling and wavelet functions, extension of 1D DWT to 2D is straightforward and corresponding basis functions are expressed as

$$ \varphi_{j,m,n} (x,y) = 2^{\frac{j}{2}} \varphi (2^{j} x - m,\,2^{j} y - n) $$
(23)
$$ \psi_{j,m,n}^{i} (x,y) = 2^{\frac{j}{2}} \psi^{i} (2^{j} x - m,\,2^{j} y - n),\quad i = \{ H,\,V,\,D\} $$
(24)

where index i indicates the directional wavelets.

The discrete wavelet transform of function f(x,y) of size M × N is then

$$ W_{\varphi } (j_{0} ,m,n) = \left( {\frac{1}{{\sqrt {MN} }}} \right) \times \sum\limits_{x = 0}^{M - 1} {\sum\limits_{y = 0}^{N - 1} {f(x,y)\varphi_{j0,m,n} (x,y)} } $$
(25)
$$ W_{\psi }^{i} (j,m,n) = \left( {\frac{1}{{\sqrt {MN} }}} \right) \times \sum\limits_{x = 0}^{M - 1} {\sum\limits_{y = 0}^{N - 1} {f(x,y)\psi_{j,m,n}^{i} (x,y),\quad i = \{ H,V,D\} } } . $$
(26)

As in the 1D case, j 0 is an arbitrary starting scale and the W φ (j 0 ,m,n) coefficients define an approximation of f(x,y) at scale j 0 . The W ψ i (j,m,n) coefficients represent horizontal, vertical and diagonal details for scales j ≥ j 0 . It is generally assumed that j 0 = 0 and N = M = 2^J, so that j = 0, 1, 2, …, J − 1 and m, n = 0, 1, 2, …, 2^j − 1.

For given W φ and W Ψ, f(x,y) is obtained via IDWT,

$$ f(x,y) = \left( {\frac{1}{{\sqrt {MN} }}} \right) \times \sum\limits_{m} {\sum\limits_{n} {W_{\varphi } (j_{0} ,m,n)\varphi_{j_{0},m,n} (x,y)} } + \left( {\frac{1}{{\sqrt {MN} }}} \right)\sum\limits_{i = H,V,D} {\sum\limits_{{j = j_{0} }}^{\infty } {\sum\limits_{m} {\sum\limits_{n} {W_{\psi }^{i} (j,m,n)\,\psi_{j,m,n}^{i} (x,y)} } } } $$
(27)
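The separable construction above can be illustrated with a single-level 2-D Haar decomposition that applies the 1-D filters along rows and then along columns; this is a minimal sketch, not an optimized implementation.

```python
import numpy as np

def haar_dwt_2d(image):
    """Single-level separable 2-D Haar DWT: approximation (LL) plus the
    horizontal (Psi^H), vertical (Psi^V) and diagonal (Psi^D) detail subbands."""
    img = np.asarray(image, dtype=float)
    # 1-D Haar along rows (x direction)
    lo = (img[:, 0::2] + img[:, 1::2]) / np.sqrt(2.0)
    hi = (img[:, 0::2] - img[:, 1::2]) / np.sqrt(2.0)
    # 1-D Haar along columns (y direction)
    LL = (lo[0::2, :] + lo[1::2, :]) / np.sqrt(2.0)   # approximation
    H  = (lo[0::2, :] - lo[1::2, :]) / np.sqrt(2.0)   # variations along columns: horizontal edges
    V  = (hi[0::2, :] + hi[1::2, :]) / np.sqrt(2.0)   # variations along rows: vertical edges
    D  = (hi[0::2, :] - hi[1::2, :]) / np.sqrt(2.0)   # diagonal details
    return LL, (H, V, D)
```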

3.4 Maximization of HF subband components

In the proposed fusion rule, maximization of the HF components is achieved by a genetic search algorithm, while an averaging technique is applied to the low-frequency subbands. The maximization and averaging processes are independent of manual marking of fiducial points or any prior knowledge of seed points.

3.4.1 Genetic algorithm for optimization

A genetic algorithm is a search algorithm based on the mechanisms of natural selection and natural genetics. The parameters to be optimized are represented as binary coded string structures called chromosomes. A collection of possible chromosomes forms a population, which produces the next generation through a natural search process. This search algorithm applies a ‘survival of the fittest’ rule after a structured yet randomized information exchange within the existing generation to yield a new generation. A genetic algorithm efficiently exploits this information to speculate on new search points with expected improved performance, using three operators (selection or reproduction, crossover and mutation) to achieve the goal of evolution. GA differs from conventional optimization techniques in three ways:

  • GA optimizes through a population of points, not a single point. Thus, the probability of being trapped at a false local peak is reduced compared with conventional methods that perform a point-to-point search.

  • GA has no need for any auxiliary information.

  • To perform an effective optimization, GA only requires objective function values associated with individual strings. This characteristic makes GA a more robust method with respect to many conventional optimization schemes.

In the present approach, GA has been utilized to maximize the detail coefficients of the input images (A, B) at a particular resolution in order to achieve improved results. GA is capable of preserving the important detail features in the fused image. The steps of GA are executed iteratively as described below; a minimal sketch follows the parameter settings.

  1. Generate the initial population of GA by randomly choosing detail coefficients from each of the HF subbands D A and D B (50% of the population from D A and 50% from D B ).

  2. Calculate the average (avg1) and maximum (max1) values of the members of the current population.

  3. Select mating members from the current population based on the fitness value/objective function of the corresponding members (the individual value of each member, i.e., the detail coefficient, plays the role of the objective function).

  4. Encode the eligible members as simple binary numbers to form the chromosomes.

  5. Apply the crossover and mutation operators to create the new-generation chromosomes.

  6. Decode the new-generation chromosomes to form the updated population of detail coefficients.

  7. Calculate the new average (avg2) and maximum (max2) values of the updated population.

  8. If the differences between avg1 and avg2 and between max1 and max2 are both less than a predefined threshold (T), stop the iteration; otherwise, replace avg1 and max1 with avg2 and max2, respectively, and go to step 3.

In the present study, the selected initial population size = 40, crossover rate = 0.95, mutation rate = 0.1 and T = 0.001 to meet the goal of convergence.
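As an illustration only, the following Python sketch runs these GA steps on two hypothetical detail subbands D_A and D_B, using the parameter settings quoted above and treating the coefficient magnitude as the fitness; the function name and the encoding details are assumptions, not the authors' exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def ga_maximize_details(DA, DB, pop_size=40, pc=0.95, pm=0.1, tol=1e-3, n_bits=16):
    """GA sketch: maximize detail-coefficient values drawn from subbands DA and DB."""
    values = np.abs(np.concatenate([DA.ravel(), DB.ravel()]))
    lo, hi = float(values.min()), float(values.max())

    def encode(x):   # real value -> binary chromosome of n_bits
        q = int(round((x - lo) / (hi - lo + 1e-12) * (2 ** n_bits - 1)))
        return np.array([(q >> b) & 1 for b in range(n_bits)], dtype=np.uint8)

    def decode(ch):  # binary chromosome -> real value
        q = sum(int(bit) << b for b, bit in enumerate(ch))
        return lo + q / (2 ** n_bits - 1) * (hi - lo)

    # Step 1: initial population, half drawn from each detail subband
    pop = np.concatenate([rng.choice(np.abs(DA).ravel(), pop_size // 2),
                          rng.choice(np.abs(DB).ravel(), pop_size // 2)])
    avg1, max1 = pop.mean(), pop.max()                      # step 2
    for _ in range(200):                                    # bounded number of generations
        # Step 3: fitness-proportional selection (coefficient value = objective function)
        probs = pop + 1e-12
        parents = rng.choice(pop, size=pop_size, p=probs / probs.sum())
        chroms = [encode(x) for x in parents]               # step 4
        # Step 5: single-point crossover and bit-flip mutation
        for i in range(0, pop_size - 1, 2):
            if rng.random() < pc:
                cut = int(rng.integers(1, n_bits))
                tail_a, tail_b = chroms[i][cut:].copy(), chroms[i + 1][cut:].copy()
                chroms[i][cut:], chroms[i + 1][cut:] = tail_b, tail_a
        for ch in chroms:
            flips = rng.random(n_bits) < pm
            ch[flips] ^= 1
        # Steps 6-8: decode, recompute statistics, test convergence
        pop = np.array([decode(ch) for ch in chroms])
        avg2, max2 = pop.mean(), pop.max()
        if abs(avg2 - avg1) < tol and abs(max2 - max1) < tol:
            break
        avg1, max1 = avg2, max2
    return max2   # maximized detail value retained for the fused subband
```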

3.5 Evaluation of fusion qualities

Mutual information is a basic concept from information theory, measuring the statistical dependence between two random variables, or the amount of information that one variable contains about the other. In this study, MI is used to evaluate the quality of the fused image. It is also used to compare the proposed methodology against the pixel-wise fusion technique.

The joint probability distribution of two images is estimated by calculating a normalized joint histogram of their gray values. The definition of the MI of two images A and B combines their marginal distributions p A (a) and p B (b) and joint distribution p AB (a,b) in the following manner:

$$ I(A,B) = \sum\limits_{a,b} {p_{AB} (a,b)\log \frac{{p_{AB} (a,b)}}{{p_{A} (a)p_{B} (b)}}}. $$
(28)

MI is related to entropy by the equation

$$ I(A,B) = H(A) + H(B)-H(A,B) $$
(29)

where H(A) and H(B) are the marginal entropies of A and B, respectively, and H(A,B) is their joint entropy:

$$ H(A) = - \sum\limits_{a} {p_{A} (a)\log p_{A} (a)} $$
(30)
$$ H(A,B) = - \sum\limits_{a,b} {p_{AB} (a,b)\log p_{AB} (a,b)} $$
(31)
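A compact sketch of this evaluation, estimating p_AB from a normalized joint histogram with an assumed number of gray-level bins, is given below.

```python
import numpy as np

def mutual_information(img_a, img_b, bins=64):
    """MI of two images from a normalized joint histogram (Eqs. 28-31)."""
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    p_ab = joint / joint.sum()           # joint distribution p_AB(a, b)
    p_a = p_ab.sum(axis=1)               # marginal p_A(a)
    p_b = p_ab.sum(axis=0)               # marginal p_B(b)
    nz = p_ab > 0                        # avoid log(0)
    return float(np.sum(p_ab[nz] * np.log(p_ab[nz] / np.outer(p_a, p_b)[nz])))

# Example: score a fused image F against each of its source images
# quality = mutual_information(F, img_T1) + mutual_information(F, img_T2)
```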

4 Experimental results

To obtain the experimental results of the proposed fusion scheme, we have used PD-, T1- and T2-weighted MR images of a section of the human brain. To demonstrate the segmentation process, let us start with an MR T1 slice of the human brain. In the present problem, we have used brain MR images for which the ground truth is available. The original image, its histogram and the ground truth are shown in Fig. 5a–c. In the ground truth, different gray shades represent different cortical tissues: the lightest shade represents white matter, the next darker shade represents gray matter and the darkest shade represents cerebrospinal fluid.

Fig. 5 a Original image used for the segmentation, b histogram of the original image, c ground truth showing WM in yellow, GM in green and CSF in blue (color figure online)

The MR brain image has been segmented into GM, WM and CSF. Figure 6a shows the image segmented using fuzzy C-means clustering, and the corresponding histogram is shown in Fig. 7a. To implement segmentation using iterated conditional modes (ICM), the image is initially segmented by thresholding. This is accomplished by analyzing the histogram of the image and finding the intensity points that divide the image into regions. These values are used to label the pixels of the image into different regions. Thus, the image is initially labeled into the three classes {GM, WM, CSF}. This image, along with the original image, is taken as the input for ICM with the constant T set to 1.7. The segmentation result using ICM and its histogram are shown in Figs. 6b and 7b, respectively. Gibbs sampling is a stochastic approach to the Markov random field model, inspired by the physical annealing process that occurs in matter. The segmented image using Gibbs sampling and its histogram are shown in Figs. 6c and 7c, respectively.

Fig. 6 Segmentation results: a FCM, b ICM, c Gibbs sampling

Fig. 7 Histograms of segmented images: a FCM, b ICM, c Gibbs sampling

For the analytical comparison of the segmentation algorithms, the computation time, user interaction and reproducibility have been considered. The computation times for the three algorithms are shown in Table 1. The algorithms were run on an Intel Core 2 Duo machine with 4 GB of RAM. The times for the Markov random field-based approaches are reported for one iteration each.

Table 1 Computation time for the three methods on an Intel Core 2 Duo machine with 4 GB of RAM

We have presented the results of segmentation of MR T1, MR T2 and MR PD (axial proton density MR image) brain images of the same patient to study the proposed fusion schemes. The images are registered and then segmented using the FCM and MRF (ICM and Gibbs) algorithms. These segmented images are then analyzed using the multiresolution approach to combine the information extracted from each segmented image into the fused image. According to the proposed fusion rule, the biologically inspired genetic evolutionary algorithm searches out the detail information in the high-frequency subbands.

The performances of the different segmentation-based fusion approaches are measured and compared using MI as the similarity metric. Both the ICM and Gibbs segmentation techniques produce comparable performance indices, whereas the FCM segmentation approach exhibits a poorer index. Tables 2, 3 and 4 present the comparative study of the fusion process using ICM, Gibbs and FCM segmentation.

Table 2 MR T1 versus MR T2 fusion
Table 3 MR PD versus MR T2 fusion
Table 4 MR PD versus MR T1 fusion

The results of segmentation of the MR T1, MR T2 and MR PD images using FCM, ICM and Gibbs sampling are shown in Fig. 8. The results of the fusion process on each set of segmented MR T1 and MR T2 images are demonstrated in Fig. 9a–c. Similarly, Fig. 10a–c demonstrates the fusion of segmented MR T2 and MR PD brain images, and Fig. 11a–c demonstrates the same for MR T1 and MR PD brain images.

Fig. 8 Segmentation of the different modality MR images using the FCM, ICM and Gibbs sampling processes

Fig. 9 GA-based fusion of segmented images of MR T1 and MR T2 modalities: a fusion of MR T1 versus MR T2 for ICM segmentation (F1), b fusion of MR T1 versus MR T2 for Gibbs segmentation (F2), c fusion of MR T1 versus MR T2 for FCM segmentation (F3)

Fig. 10 GA-based fusion of segmented images of MR T2 and MR PD modalities: a fusion of MR PD versus MR T2 for ICM segmentation (F1), b fusion of MR PD versus MR T2 for Gibbs segmentation (F2), c fusion of MR PD versus MR T2 for FCM segmentation (F3)

Fig. 11 GA-based fusion of segmented images of MR T1 and MR PD modalities: a fusion of MR PD versus MR T1 for ICM segmentation (F1), b fusion of MR PD versus MR T1 for Gibbs segmentation (F2), c fusion of MR PD versus MR T1 for FCM segmentation (F3)

The MI-based comparative study of the fusion techniques is given in Table 2 for a set of images of MR T1 and MR T2 modalities. I1 and I2 are the computed MI for MR T1 and MR T2 images of the same patient, respectively. F1, F2 and F3 denote the image fusion techniques using ICM, Gibbs and FCM segmentation, respectively.

The MI-based comparative study of fusion techniques is given in Table 3. I1 and I2 are MI for MR PD and MR T2 images of the same patient, respectively. F1, F2 and F3 denote the image fusion techniques using ICM, Gibbs and FCM segmentation, respectively. The MI-based comparative study of fusion techniques is given in Table 4. I1 and I2 are MI of MR PD and MR T1 images of the same patient, respectively. F1, F2 and F3 denote the image fusion techniques using ICM, Gibbs and FCM segmentation, respectively.

5 Conclusion

In the present research, an MSD-based medical image fusion procedure has been described using a genetic algorithm (GA). This study focuses on how to use the MSD coefficients of the source images to produce a composite fused image that is more informative for human interpretation. In the present experiments, two levels of MSD have been applied in every fusion scheme. However, increasing the number of decomposition levels does not necessarily produce better results. Maximization of the HF components has been achieved by a genetic search algorithm, because GAs are evolutionary techniques that are naturally well suited to discrete search spaces. Since the digitized intensity values of the HF subbands play the role of the objective function, GA is efficient at collecting the important complementary information from the multimodal images. In the present study, GA maximizes the coefficients of the HF subbands after 14–16 iterations. Maximization of the detail coefficients collects useful information and reduces the effect of noise. The simple CM combining method just selects the maximum values between the two images. Since the high-frequency wavelet components include image noise along with the useful details, the CM rule incorporates noise into the fused result. On the contrary, the GA-based combining method maximizes the average weight of the population, which in turn reduces the effect of noise in the fused image.

In the present article, the authors have attempted to present fusion schemes to fuse segmented information obtained from different modality brain MR images (PD-, T1- and T2-weighted) using multiresolution- and genetic algorithm-based techniques. Prior to the fusion process, the images are segmented into different brain tissue regions, such as GM, WM and CSF, using FCM and MRF models. Earlier research [29] on data fusion of multimodal brain images shows the results of FCM-based segmentation prior to context-dependent (CD) image fusion. In the present study, we have focused not only on the fusion process but also on the segmentation techniques, which play a great role in a more effective fusion operation for multimodality medical imaging. Our experiments present evidence that MRF-based techniques such as ICM and Gibbs sampling yield better results than FCM-based segmentation for the subsequent implementation of fusion operators on the segmented images. To establish the efficacy of the segmentation processes in the combined fusion scheme, an MI-based index has been computed as a similarity metric. The value of this index shows the efficiency of the techniques for a recommended fusion scheme, which may be considered by physicians in determining their diagnostic procedures and therapeutic planning.