1 Introduction

Image fusion is the process of combining multiple registered images into a composite image that is more suitable for human visual perception [2, 18, 23, 25]. The fused image provides a more accurate description of the scene than any of the individual source images. As a branch of information fusion, image fusion has become a popular research field in recent years [11, 14, 15, 18, 23, 37]. With the development of imaging sensor technologies, abundant images can be acquired easily, and image processing techniques therefore play an important role in extracting the image information of scenes [32,33,34,35]. Image fusion removes the redundant information and retains the complementary information of different source images; as a consequence, image fusion technology has been widely used in military applications, remote sensing, medical imaging, intelligent robots, defect inspection, computer vision, etc. [2, 11, 14, 15, 18, 23, 25, 37].

Image fusion algorithms range from simple pixel-weighted averaging to very complex transform domain methods [37]. Accordingly, image fusion approaches can be classified by whether the images are fused in the spatial domain or the transform domain. Many algorithms are easy to implement, such as the weighted average [29] and principal component analysis (PCA) [25, 29] methods, but their performance may not be satisfactory due to loss of contrast or blurring of details. There are also image fusion methods based on different transforms. The commonly used transform domain methods include not only conventional downsampled multi-scale transforms, such as the pyramid transform [15], discrete wavelet transform [5], stationary wavelet transform [26], curvelet transform [3] and contourlet transform [1], but also nonsubsampled multi-scale transforms such as the nonsubsampled contourlet transform [28] and nonsubsampled shearlet transform [38]. Transform domain based fusion is the most popular research direction in image fusion, and such algorithms are usually applied in pixel-level methods [25]. First, multi-scale decomposition of the source images is performed; then the decomposition coefficients are fused by different rules; finally, the fused image is reconstructed by the inverse multi-scale transform.

Although fusion approaches based on nonsubsampled multi-scale transforms can achieve better fusion performance by combining all of the coefficients in different layers, they usually suffer from high computational complexity [18, 27, 28, 31]. On the other hand, the commonly used transform based fusion approaches have low computational complexity but may not achieve very good performance. Further research is therefore required to design image fusion schemes that achieve better fusion quality with modest computational complexity and satisfy the requirements of practical applications. This work aims to design an eclectic method that improves fusion quality and performance by combining methods that are easy to implement and computationally efficient. The Laplacian pyramid transform (LPT) and the pulse coupled neural network (PCNN) are employed in this work because of their proven performance in the image processing field.

LPT is a popular transform domain image processing method with multi-scale, multi-resolution and multi-level decomposition characteristics; it is also known as band-pass pyramid decomposition [9, 10]. LPT decomposes the important features of a source image (such as edges and texture) into different layers at different scales and has therefore been widely used in the image processing field [4, 20, 30]. Compared with simple transform domain algorithms, LPT achieves a better decomposition effect; compared with complex nonsubsampled multi-scale transforms, it has lower computational complexity. However, LPT alone cannot achieve good fusion quality; combining it with other methods offers the possibility of improving fusion performance [4].

PCNN is a biologically inspired neural network model developed by Eckhorn et al. [6, 7, 19]. It is a single-layer artificial neural network that simulates the information processing mechanism of neurons in the cat's visual cortex. A group of similar neurons issues synchronous pulses under the effect of mutual coupling, and these pulses effectively describe the information of the input signal and can be used as its features [16, 24]. In recent years the PCNN model has been widely used in image processing and has shown outstanding properties [17, 22, 31]. In image fusion, PCNN is a global fusion algorithm that preserves more detail information and conforms in principle to the physiological basis of the human visual system (HVS) [23]. However, many parameters need to be set in the PCNN model. These parameters are usually set manually, which is complex and time-consuming and often causes inconsistency; this shortcoming limits the application of the model. Researchers are working to make some PCNN parameters be set automatically, because there is no widely recognized and consistently effective way of setting all of them.

Spatial frequency (SF) was proposed as a measure of image quality; it represents the regional information of an image and is in accordance with the HVS [8]. In 2001, Li et al. proposed an SF based image fusion method, which showed that SF can be used in image fusion and can achieve good performance [21]. Afterwards, many image fusion methods were designed based on SF [18, 27, 36]. SF is described by a few simple formulas, and its calculation is not complicated. SF is therefore employed in this work to improve the fusion performance of the proposed scheme.

In order to improve image fusion quality and reduce computational complexity, we propose a lightweight image fusion scheme that combines the LPT and PCNN models. This combination exploits the advantages of the two models while restraining the disadvantages of conventional fusion methods. Compared with simple transform and nonsubsampled multi-scale transform based fusion methods, the proposed scheme has modest computational complexity and is easy to implement. The scheme first employs LPT to decompose the source images into their constituent sub-images; this process is easy to implement, has low computational complexity and does not require much storage space. An adaptive PCNN is then employed to obtain each sub-image's oscillation frequency graph (OFG), which contains the texture, edge and regional distribution information of the sub-image. In the adaptive PCNN model, SF is used to set the linking strength β, so β is adjusted automatically according to the sub-image; this allows the proposed method to extract image features more effectively than the general PCNN model. Within the scheme, local spatial frequency (LSF) is employed to enhance the features of the OFG and make the features of the sub-images easier to extract, as LSF effectively describes the clarity of an image according to its regional information. Finally, according to the LSF values, coefficients are chosen from the different sub-images as the coefficients of the fused image. The experimental results indicate that the new scheme is more effective than other commonly used image fusion algorithms.

This paper is organized as follows: Section 2 introduces related theories including LPT and PCNN model. Section 3 presents the detailed processes of the proposed image fusion scheme. The experimental results and discussions about gray and color image fusion are presented in Section 4. Section 5 concludes this paper.

2 Related theories

In this section, we review the LPT algorithm and the PCNN model, which are used in the rest of this paper.

2.1 Laplacian pyramid transform of the image

The Laplacian pyramid transform algorithm is based on Gaussian pyramid decomposition (GPD) and consists of two steps: the first obtains the Gaussian pyramid and the second obtains the Laplacian pyramid [9, 10].

2.1.1 Gaussian pyramid decomposition

In GPD, Gaussian low-pass filtering and down-sampling are repeatedly performed on the source image to obtain all layers of the Gaussian pyramid. Let the source image be G0, regarded as the bottom layer, and let the l-th layer be represented by Gl. The decomposition process is described as follows:

$$ {G}_l\left(i,j\right)=\sum \limits_{m=-2}^2\sum \limits_{n=-2}^2\omega \left(m,n\right){G}_{l-1}\left(2i+m,2j+n\right),\qquad \left(1\le l\le N,\ 0\le i<{R}_l,\ 0\le j<{C}_l\right) $$
(1)

where (i, j) denotes the pixel position; Gl(i, j) is the l-th layer of the GPD; N denotes the total number of GPD layers; Rl and Cl denote the numbers of rows and columns of the l-th layer of the GPD respectively; [ω(m, n)] is a 5 × 5 low-pass window matrix with m, n ∈ [−2, 2], and ω(m, n) is the value of the matrix element at position (m, n). The matrix ω is defined as follows:

$$ \omega =\frac{1}{256}\left[\begin{array}{ccccc}1 & 4 & 6 & 4 & 1\\ 4 & 16 & 24 & 16 & 4\\ 6 & 24 & 36 & 24 & 6\\ 4 & 16 & 24 & 16 & 4\\ 1 & 4 & 6 & 4 & 1\end{array}\right]. $$
(2)
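For illustration, the following is a minimal sketch of the GPD step of Eqs. (1)–(2) in Python (not the authors' original implementation); it assumes a grayscale floating-point image and uses reflective border handling, which the paper does not specify.

```python
import numpy as np
from scipy.ndimage import convolve

# 5 x 5 generating kernel of Eq. (2): the outer product of [1 4 6 4 1]/16 with itself
w1d = np.array([1.0, 4.0, 6.0, 4.0, 1.0])
W = np.outer(w1d, w1d) / 256.0

def gaussian_pyramid(image, levels):
    """Return [G0, G1, ..., G_levels] by repeated low-pass filtering and 2x downsampling (Eq. (1))."""
    G = [np.asarray(image, dtype=np.float64)]
    for _ in range(levels):
        smoothed = convolve(G[-1], W, mode='reflect')  # Gaussian low-pass filtering
        G.append(smoothed[::2, ::2])                   # keep every second row and column
    return G
```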

2.1.2 Laplacian pyramid decomposition

In GPD, each obtained sub-image is 1/4 the size of its preceding layer. A dilation process is applied to the Gaussian pyramid via interpolation so that the dilated Gl has the same size as Gl−1. The dilation and Laplacian pyramid decomposition are described by the following Eqs. (3)–(5), where (i, j) denotes the position in the pyramid:

$$ {G}_l^{\ast}\left(i,j\right)=4\sum \limits_{m=-2}^2\sum \limits_{n=-2}^2\omega \left(m,n\right){G}_l\left(\frac{i+m}{2},\frac{j+n}{2}\right),\left(0<l\le N,0\le i<{R}_l,0\le j<{C}_l\right), $$
(3)

where,

$$ {G}_l\left(\frac{i+m}{2},\frac{j+n}{2}\right)=\left\{\begin{array}{cc}{G}_l\left(\frac{i+m}{2},\frac{j+n}{2}\right), & \frac{i+m}{2},\ \frac{j+n}{2}\ \text{are integers}\\ 0, & \text{otherwise}\end{array}\right. $$
(4)

and,

$$ \left\{\begin{array}{cc}{LP}_l={G}_l-{G}_{l+1}^{\ast }, & 0\le l<N\\ {LP}_N={G}_N, & l=N\end{array}\right. $$
(5)

N denotes the top layer of the pyramid; Gl is the l-th layer of the Gaussian pyramid; \( {G}_{l+1}^{\ast } \) is the dilated version of the (l+1)-th layer of the Gaussian pyramid, which has the same size as Gl; and LPl denotes the l-th layer of the Laplacian pyramid. LP0, LP1, …, LPN make up the Laplacian pyramid.
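As a companion sketch, the dilation of Eqs. (3)–(4) and the Laplacian layers of Eq. (5) could be implemented as follows; this is again an assumption-laden sketch, and the generating kernel is repeated so the snippet runs on its own.

```python
import numpy as np
from scipy.ndimage import convolve

w1d = np.array([1.0, 4.0, 6.0, 4.0, 1.0])
W = np.outer(w1d, w1d) / 256.0   # same generating kernel as in Eq. (2)

def expand(G_next, target_shape):
    """Dilate a layer by zero insertion and filtering (Eqs. (3)-(4))."""
    up = np.zeros(target_shape, dtype=np.float64)
    up[::2, ::2] = G_next                          # non-integer positions stay zero, Eq. (4)
    return 4.0 * convolve(up, W, mode='reflect')   # the factor 4 compensates for the inserted zeros

def laplacian_pyramid(G):
    """LP_l = G_l - expand(G_{l+1}) for l < N, and LP_N = G_N (Eq. (5))."""
    LP = [G[l] - expand(G[l + 1], G[l].shape) for l in range(len(G) - 1)]
    LP.append(G[-1])
    return LP
```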

2.1.3 Inverse Laplacian pyramid transform

We can get the following equations from (5):

$$ \left\{\begin{array}{cc}{G}_N={LP}_N, & l=N\\ {G}_l={LP}_l+{G}_{l+1}^{\ast }, & 0\le l<N\end{array}\right. $$
(6)

It can be seen from (6) that the Laplacian pyramid layers are gradually enlarged by the insertion operation, so that the size of the current layer equals that of the layer below. The source image can be reconstructed by adding all the layers together one by one, starting from the top.
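A corresponding sketch of the inverse transform of Eq. (6) could look like the following; it assumes the `expand` routine sketched after Eq. (5) is in scope.

```python
def inverse_lpt(LP):
    """Reconstruct the source image from a Laplacian pyramid (Eq. (6)).

    Starting from the top layer G_N = LP_N, each lower layer is recovered as
    G_l = LP_l + expand(G_{l+1}) until G_0, the reconstructed image, is reached.
    """
    G = LP[-1]
    for l in range(len(LP) - 2, -1, -1):
        G = LP[l] + expand(G, LP[l].shape)   # expand() as sketched after Eq. (5)
    return G
```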

Figure 1 shows a three-layer LPT decomposition of two source images. It can be seen that LPT effectively extracts the important features of the source images; e.g., in Fig. 1 the edges and regional characteristics of the source images are described by these sub-images. It can also be seen that the difference between the two source images in the 3rd layer is not obvious. Therefore, a 3-layer decomposition is adopted in the proposed image fusion scheme.

Fig. 1
figure 1

An illustration of LPT. a Source image A and sub-images of the detailed coefficients of source image A. b Source image B and sub-images of the detailed coefficients of source image B

2.2 PCNN

Every neuron in the PCNN model consists of three parts: the receptive field, the modulation field and the pulse generator [16, 17], as shown in Fig. 2. The model is described by the following Eqs. (7)–(11), where the subscript ij identifies a neuron and n denotes the current iteration. In particular, "a neuron ignition" means that a PCNN neuron generates a pulse [16, 17].

Fig. 2
figure 2

The diagram of a neuron in PCNN model

In the receptive field, the input of a neuron consists of the linking channel Lij(n) and the feeding channel Fij(n), which receive the neighboring neurons' coupling input Ykl(n − 1); in addition, the Fij(n) channel also receives the external stimulus Sij, as described in (7)–(8). In the Lij(n) and Fij(n) channels, a neuron is linked to its neighborhood via the linking weight matrices Wijkl and Mijkl respectively. Note that Wijkl and Mijkl are equal and are implemented by a convolution operation in the model. The two channels accumulate the previous output with decay exponents αL and αF, and the channel amplitudes are VL and VF respectively.

$$ {F}_{ij}(n)={V}^F\sum \limits_{kl}{M}_{ij kl}{Y}_{kl}\left(n-1\right)+{e}^{-{\alpha}^F}{F}_{ij}\left(n-1\right)+{S}_{ij}, $$
(7)
$$ {L}_{ij}(n)={V}^L\sum \limits_{kl}{W}_{ij kl}{Y}_{kl}\left(n-1\right)+{L}_{ij}\left(n-1\right){e}^{-{\alpha}^L}, $$
(8)

In the modulation field, a constant positive bias of 1 is added to the linking input Lij(n), and the result is multiplied by the feeding input Fij(n); β determines how strongly Lij(n) modulates the internal activity. Accordingly, the internal activity Uij(n) is described as:

$$ {U}_{ij}(n)={F}_{ij}(n)\left[1+\beta {L}_{ij}(n)\right], $$
(9)

The pulse generator part consists of a threshold adjuster, a comparator and a pulse generator, which are described by (10)–(11). The pulse generator produces the pulse output Yij(n), and the threshold adjuster adjusts the threshold θij(n), where \( {V}_{ij}^{\theta } \) is the threshold amplification coefficient. When the internal activity Uij(n) exceeds the threshold θij(n), i.e., Uij(n) > θij(n), the neuron generates a pulse, which is called an ignition.

$$ {\theta}_{ij}(n)={e}^{-{\alpha}^{\theta }}{\theta}_{ij}\left(n-1\right)+{V}_{ij}^{\theta }{Y}_{ij}\left(n-1\right), $$
(10)
$$ {Y}_{ij}(n)=\left\{\begin{array}{l}1\kern0.5em ,{U}_{ij}(n)>{\theta}_{ij}(n)\\ {}0\kern0.5em , otherwise\end{array}\right., $$
(11)

When the PCNN model is used for image processing, the total number of neurons in the network equals the total number of pixels in the input image, with a one-to-one correspondence between pixels and neurons. The output of each neuron has two states, pulse and non-pulse, so the outputs of all neurons form a binary image. More information about PCNN can be found in [16, 17, 22, 24, 31].
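To make the iteration of Eqs. (7)–(11) concrete, the following is a minimal vectorized sketch of the PCNN over a whole image. The 3 × 3 linking kernel (inverse squared distance to the eight neighbours) and the default parameter values follow the settings quoted later in Section 3.1, while the border handling is an assumption.

```python
import numpy as np
from scipy.ndimage import convolve

def pcnn_iterations(S, beta, n_iter=300, VL=0.01, VF=1.0, Vtheta=62.6012,
                    aL=0.7260, aF=0.0164, atheta=0.0637, theta0=1.2):
    """Run the PCNN of Eqs. (7)-(11) on stimulus image S (one neuron per pixel).

    Returns the list of binary pulse images Y(1), ..., Y(n_iter).
    """
    # Assumed 3 x 3 linking kernel W = M = 1 / ((i-k)^2 + (j-l)^2) over the 8 neighbours.
    K = np.array([[0.5, 1.0, 0.5],
                  [1.0, 0.0, 1.0],
                  [0.5, 1.0, 0.5]])
    S = np.asarray(S, dtype=np.float64)
    F = np.zeros_like(S)
    L = np.zeros_like(S)
    Y = np.zeros_like(S)
    theta = np.full(S.shape, theta0)
    pulses = []
    for _ in range(n_iter):
        link = convolve(Y, K, mode='constant')        # coupling from the previous pulses
        F = np.exp(-aF) * F + VF * link + S           # feeding channel, Eq. (7)
        L = np.exp(-aL) * L + VL * link               # linking channel, Eq. (8)
        U = F * (1.0 + beta * L)                      # internal activity, Eq. (9)
        theta = np.exp(-atheta) * theta + Vtheta * Y  # threshold update, Eq. (10)
        Y = (U > theta).astype(np.float64)            # pulse output, Eq. (11)
        pulses.append(Y.copy())
    return pulses
```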

3 The proposed image fusion scheme

In this section, the proposed image fusion scheme is presented in detail; it is shown in Fig. 3. In this scheme, all input images are decomposed into their constituent sub-images by LPT. The adaptive PCNN is employed to extract the features of these sub-images, and LSF is then employed to enhance their feature regions, which makes the features easier to extract. The fusion rule determines which pixels of the sub-images are used in the fused image. Finally, the inverse LPT (ILPT) is applied to the new fused coefficients to obtain the fused image.

Fig. 3
figure 3

The diagram of the proposed image fusion scheme

The main motivation of this work is to propose a lightweight image fusion scheme that balances fusion performance and complexity. We try to extend the vitality of simple and effective methods, improving fusion performance through their organic combination, because the employed algorithms are effective and easy to implement, as many researchers have shown.

In the proposed method, LPT is used to decompose the important features of the source images into different layers at different scales, which is the foundation of the proposed scheme. The sub-images obtained by LPT effectively represent the complementary and redundant information of the input source images, as shown in Figs. 1 and 3. PCNN is therefore employed to extract the complementary and redundant features of these sub-images; the biological characteristics and effective computing mechanism of PCNN provide satisfactory performance for this stage, as shown in Fig. 4. Besides, the SF of each sub-image is calculated and used as a parameter of PCNN to reduce the number of manual parameter settings, which also improves the adaptability and accuracy of feature extraction. LSF makes it possible to accurately represent the detailed information of the sub-images; it is therefore utilized to enhance the regional features of the sub-images according to the OFG, as shown in Fig. 5. The fusion rule is applied to each coefficient to choose the high quality coefficients of the sub-images so that the source images are fused effectively.

Fig. 4
figure 4

OFG of the sub-images. a OFG of the sub-images A. b OFG of sub-images B

Fig. 5
figure 5

LSF of the OFG. a LSF of the OFG A. b LSF of the OFG B

3.1 Adaptive PCNN

The linking strength β of PCNN is the most important parameter and the key determinant of the ignition behavior of PCNN, while SF is an important image definition indicator that represents the quality of an image [8]. In conventional image processing methods, β has to be set manually on the basis of many experiments, which limits the widespread application of PCNN. The proposed scheme uses SF to set β, so the linking strength is determined automatically according to the quality of each sub-image, and the PCNN model adapts itself to the data. A factor η is used to control the magnitude of SF, because the calculated SF may be too large or too small, which could have a negative impact on the proposed method; it is set to 0.01 in the experiments. As a result, the proposed scheme can effectively extract the important features of the source images. Within the scheme, β is calculated as:

$$ \beta =\eta \cdot SF, $$
(12)

where η is an adjustment factor set manually, and

$$ SF=\sqrt{RF^2+{CF}^2}, $$
(13)
$$ RF=\sqrt{\frac{1}{M\times N}\sum \limits_{i=1}^M\sum \limits_{j=2}^N{\left[ im\left(i,j\right)- im\left(i,j-1\right)\right]}^2}, $$
(14)
$$ CF=\sqrt{\frac{1}{M\times N}\sum \limits_{i=2}^M\sum \limits_{j=1}^N{\left[ im\left(i,j\right)- im\left(i-1,j\right)\right]}^2}, $$
(15)

SF depends on the row frequency (RF) and the column frequency (CF), where M is the number of rows of the image, N is the number of columns of the image, and im(i, j) is the image gray level at pixel (i, j).
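As an illustration, the adaptive linking strength of Eqs. (12)–(15) could be computed as in the following sketch; the normalisation by M × N follows the formulas above, and η = 0.01 is the value quoted in the text.

```python
import numpy as np

def spatial_frequency(im):
    """SF = sqrt(RF^2 + CF^2) of a gray image (Eqs. (13)-(15))."""
    im = np.asarray(im, dtype=np.float64)
    M, N = im.shape
    rf2 = np.sum((im[:, 1:] - im[:, :-1]) ** 2) / (M * N)   # squared row frequency, Eq. (14)
    cf2 = np.sum((im[1:, :] - im[:-1, :]) ** 2) / (M * N)   # squared column frequency, Eq. (15)
    return np.sqrt(rf2 + cf2)

def adaptive_beta(sub_image, eta=0.01):
    """Linking strength beta = eta * SF of the sub-image (Eq. (12))."""
    return eta * spatial_frequency(sub_image)
```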

In most applications of PCNN, the parameters are set empirically through repeated experiments. In this paper, the other parameters of the PCNN model are set as follows: the simplest setting of the linking matrices is \( W_{ijkl}=M_{ijkl}=\left[{(i-k)}^2+{(j-l)}^2\right]^{-1} \); the iteration number N of PCNN is 300; VL is generally set to 0.01 and VF to a relatively small value such as 1; αF is set to 0.0164; \( {V}_{ij}^{\theta } \) is set to Vθ = 62.6012; and, because \( {e}^{-{\alpha}^{\theta }}<1 \), αθ is set to 0.0637. θ(0) is the initial threshold value; since the neurons should not ignite at the first iteration, θ(0) must be greater than 1, e.g., θ(0) = 1.2. αL is set to 0.7260. Among these parameters, Wijkl and Mijkl are computed from the positions of the neurons, while N, VL, VF and θ(0) are generally fixed.

3.2 The OFG of PCNN

PCNN has the characteristics of global coupling and pulse synchronization. In the proposed scheme, PCNN is used to extract the complementary and redundant features of the sub-images, and these features contain the detailed information of the sub-images, such as texture, edges and regional distribution. At each iteration, PCNN generates a binary image by recording whether or not each neuron ignites. These binary images can be regarded as features of the source image because they effectively express the detailed information of the sub-images. By accumulating the binary pulse images of the neurons over the iterations, an oscillation frequency graph (OFG) is obtained, as shown in (16). Comparing Fig. 4 with Fig. 1, it is obvious that the OFG effectively represents the detailed information of the source images, but it does not have obvious regional characteristics, because the detail information is carried by isolated pixels; however, the regional information of images is very important for the HVS and for image fusion.

$$ OFG\left(i,j\right)=\sum \limits_{n=1}^N{Y}_{ij}(n). $$
(16)

where N denotes the number of iterations and Yij(n) denotes the pulse output of neuron (i, j) at iteration n.
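In code, the OFG of Eq. (16) is simply the pixel-wise sum of the binary pulse images, for instance those returned by the PCNN sketch in Section 2.2; a minimal sketch:

```python
import numpy as np

def oscillation_frequency_graph(pulses):
    """OFG(i, j) = sum_n Y_ij(n) over all binary pulse images (Eq. (16))."""
    return np.sum(np.stack(pulses, axis=0), axis=0)
```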

3.3 LSF of OFG

Local spatial frequency (LSF) represents the regional detailed information of an image and is composed of the local row frequency (LRF) and local column frequency (LCF) [8, 21, 27]. In the proposed scheme, the LSF of a pixel represents the regional characteristics of the sub-image, because LSF takes the image information of the neighboring pixels within a local window into account, as shown in Eqs. (17)–(19). This makes the features easier to extract. The pixels of the fused image are then obtained by comparing LSF values, as presented in the algorithm subsection. Figure 5 shows the LSF of an OFG from Fig. 4; it can be seen that the regional detail information of the sub-image in the OFG is enhanced compared with Fig. 4. Calculating the LSF of the OFG makes the features of the source image more evident and, in turn, the regional features easier to extract. As a result, the important coefficients are more readily selected as the final coefficients of the fused image.

$$ LSF=\sqrt{LRF^2+{LCF}^2}, $$
(17)
$$ LRF=\sqrt{\frac{1}{w^2}\sum \limits_{i=1}^w\sum \limits_{j=2}^w{\left[ OFG\left(i,j\right)- OFG\left(i,j-1\right)\right]}^2}, $$
(18)
$$ LCF=\sqrt{\frac{1}{w^2}\sum \limits_{i=2}^w\sum \limits_{j=1}^w{\left[ OFG\left(i,j\right)- OFG\left(i-1,j\right)\right]}^2}, $$
(19)

where w is the size of the window, OFG(i, j) is the oscillation frequency graph at pixel (i, j).
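A windowed computation of Eqs. (17)–(19) could look like the following sketch, which uses a uniform (box) filter to obtain the w × w local means; the border handling is an assumption.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_spatial_frequency(ofg, w=9):
    """LSF = sqrt(LRF^2 + LCF^2) over a w x w sliding window (Eqs. (17)-(19))."""
    ofg = np.asarray(ofg, dtype=np.float64)
    dr2 = np.zeros_like(ofg)
    dc2 = np.zeros_like(ofg)
    dr2[:, 1:] = (ofg[:, 1:] - ofg[:, :-1]) ** 2        # squared horizontal differences
    dc2[1:, :] = (ofg[1:, :] - ofg[:-1, :]) ** 2        # squared vertical differences
    lrf2 = uniform_filter(dr2, size=w, mode='reflect')  # local mean = (1/w^2) * local sum
    lcf2 = uniform_filter(dc2, size=w, mode='reflect')
    return np.sqrt(lrf2 + lcf2)
```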

3.4 The rule of fusion

The fusion rule is very important in an image fusion method, as it determines which pixels or coefficients appear in the final result; it therefore has a significant influence on the fused image. In this paper, the fused sub-image coefficients are determined by the fusion rule described in (20), which is applied to each coefficient to select the high quality coefficients according to the LSF value of the OFG. When the LSF value corresponding to a coefficient from sub-image A is larger than that of the coefficient from sub-image B, the coefficient from sub-image A is chosen as the final coefficient of the fused image, and vice versa. When the LSF values from sub-images A and B are equal, the mean of the two coefficients is taken as the final coefficient.

$$ {FC}_{ij}=\left\{\begin{array}{cc}{C}_{Aij},& \left({LSF}_{Aij}>{LSF}_{Bij}\right)\\ {}{C}_{Bij},& \left({LSF}_{Aij}<{LSF}_{Bij}\right)\\ {}\left({C}_{Aij}+{C}_{Bij}\right)/2,& \left({LSF}_{Aij}={LSF}_{Bij}\right)\end{array}\right.. $$
(20)

where the subscript ij denotes the pixel position in the sub-image; FCij is the fused sub-image coefficient; CAij and CBij are the sub-image coefficients from images A and B respectively; and LSFAij and LSFBij are the LSF values of the corresponding OFGs.
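The selection rule of Eq. (20), applied element-wise with NumPy, could be sketched as:

```python
import numpy as np

def fuse_coefficients(CA, CB, lsfA, lsfB):
    """Pick the coefficient whose OFG has the larger LSF; average on ties (Eq. (20))."""
    FC = np.where(lsfA > lsfB, CA, CB)
    return np.where(lsfA == lsfB, (CA + CB) / 2.0, FC)
```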

3.5 Fusion algorithm

The image fusion algorithm based on the proposed scheme shown in Fig. 3 consists of the following steps (a sketch of the full pipeline follows the steps):

  • Step 0: Given source images A and B.

  • Step 1: The images A and B are decomposed by LPT to get several corresponding sub-image sets represented as CAij and CBij.

  • Step 2: Calculate the SF of each sub-image to obtain its β for PCNN.

  • Step 3: Apply PCNN to the sub-images to obtain the OFG of each sub-image.

  • Step 4: Calculate the LSFij of each coefficient in the OFG, with a window size w of 9 × 9 pixels.

  • Step 5: Determine each fused sub-image coefficient FCij by following the fusion rule described in (20).

  • Step 6: Reconstruct the fused image from the fused sub-image coefficients using ILPT.
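The following sketch assembles Steps 1–6, assuming the helper functions sketched in the earlier sections (gaussian_pyramid, laplacian_pyramid, inverse_lpt, pcnn_iterations, adaptive_beta, oscillation_frequency_graph, local_spatial_frequency and fuse_coefficients) are in scope; normalising the LPT coefficients to [0, 1] before the PCNN is an assumption not stated in the text.

```python
import numpy as np

def _normalize(x):
    # map a sub-image to [0, 1] before feeding it to the PCNN (an assumption)
    rng = x.max() - x.min()
    return (x - x.min()) / rng if rng > 0 else np.zeros_like(x)

def fuse_images(A, B, levels=3, window=9):
    """Fuse two registered gray images following Steps 0-6 of Section 3.5."""
    LPA = laplacian_pyramid(gaussian_pyramid(A, levels))   # Step 1
    LPB = laplacian_pyramid(gaussian_pyramid(B, levels))
    fused_layers = []
    for CA, CB in zip(LPA, LPB):
        ofgA = oscillation_frequency_graph(                # Steps 2-3
            pcnn_iterations(_normalize(CA), adaptive_beta(CA)))
        ofgB = oscillation_frequency_graph(
            pcnn_iterations(_normalize(CB), adaptive_beta(CB)))
        lsfA = local_spatial_frequency(ofgA, window)       # Step 4
        lsfB = local_spatial_frequency(ofgB, window)
        fused_layers.append(fuse_coefficients(CA, CB, lsfA, lsfB))  # Step 5
    return inverse_lpt(fused_layers)                       # Step 6
```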

3.6 Color image fusion

The RGB color image is the fundamental and most commonly used image data format, and it can be supplied directly to image display devices. The proposed scheme is therefore performed in RGB color space, which avoids the color-space transform as well as the associated nonlinear operations and transform errors. An RGB color image is composed of R (Red), G (Green) and B (Blue) components. In order to fuse color source images, we regard the three components of an RGB color image as three gray images and apply the proposed fusion scheme to them separately. The final fused color image is then produced from the individually fused component images. Accordingly, the color image fusion framework based on the proposed fusion scheme is shown in Fig. 6.
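A minimal sketch of this per-channel framework, reusing the grayscale fuse_images pipeline sketched in Section 3.5 and assuming 8-bit RGB inputs:

```python
import numpy as np

def fuse_color_images(A_rgb, B_rgb):
    """Fuse two registered RGB images channel by channel (R, G, B treated as gray images)."""
    fused = [fuse_images(A_rgb[..., c].astype(np.float64),
                         B_rgb[..., c].astype(np.float64))
             for c in range(3)]
    return np.clip(np.stack(fused, axis=-1), 0, 255).astype(np.uint8)
```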

Fig. 6
figure 6

The diagram of the proposed color image fusion framework

4 Experiments and analysis

We selected commonly used image fusion methods to compare with our method in order to verify the validity of the proposed scheme. The comparison methods were principal component analysis (PCA), wavelet transform (WT), stationary wavelet transform (SWT), filter-subtract-decimate pyramid (FSDP), gradient pyramid (GP), pulse coupled neural network (PCNN), nonsubsampled contourlet transform (NSCT), nonsubsampled shearlet transform (NSST), NSCT+PCNN-LSF and NSST+PCNN-LSF. The last two schemes are similar to ours. The details of the above methods are shown in Table 1.

Table 1 Fusion methods for comparison and their features (‘--’ represents null)

In order to evaluate the performance of the different image fusion methods, we adopted the commonly used image fusion quality indexes: mutual information (MI), entropy (EN), standard deviation (SD), spatial frequency (SF), edge based similarity measure (Qabf) and feature mutual information (FMI) [12, 13, 25, 28, 38]. Among them, Qabf is the most important index, as it indicates how much edge information is preserved in the fused image. Higher index values indicate a higher quality fused image.

4.1 Gray image fusion experiments and analysis

Several groups of images with different focuses were used to evaluate the performance of the proposed algorithm. These images have different resolutions, focused areas and detailed information, and are often used in image fusion research. The first group of images is Clocks, shown in Fig. 1a, b. Image A focuses on the right and image B focuses on the left.

The fused images generated by the different methods are shown in Fig. 7. As Fig. 7 shows, the proposed algorithm did well in extracting the features of the source images. The fusion effect of the PCA, SWT, FSDP and PCNN methods was poorer than that of the other methods, and the clarity of the PCA, SWT, FSDP, GP and PCNN results degraded to some extent. In contrast, WT, NSCT, NSST, NSCT+PCNN-LSF, NSST+PCNN-LSF and the proposed method achieved a good fusion effect, although there were some blurry regions on the big clock for the NSST, NSCT+PCNN-LSF and NSST+PCNN-LSF methods. Overall, the result generated by the proposed scheme was better than those of the competing image fusion methods, including the NSCT+PCNN-LSF and NSST+PCNN-LSF schemes, which are similar to ours.

Fig. 7
figure 7

Fused images from different methods. a PCA. b WT. c SWT. d FSDP. e GP. f PCNN.g NSCT. h NSST. i NSCT+PCNN-LSF. j NSST+PCNN-LSF. k The proposed

It can be seen in Fig. 7k that the fused image generated by the proposed scheme is clear. It retains most details of the source images; moreover, the edges are clearer and the definition is higher than in most of the other results. In other words, the proposed scheme performed image fusion better than the conventional transform domain methods, and it achieved a similar or even better fusion effect compared with the complicated transform domain methods.

Table 2 lists the fusion quality index values of all the compared methods for this group of images. It can be seen in Table 2 that the fused image generated by the proposed method contains more information: the MI, Qabf and FMI values of this method are much higher than those of the other methods. The SD and SF of the proposed method are close to those of the WT method and slightly better than those of the other methods, while the EN values of all methods are very close. MI and Qabf are the most important quality indexes, as they effectively indicate how much source image information is preserved in the fused image, and SF indicates the definition of an image. The higher MI, Qabf and SF values of the proposed method indicate that it outperforms the other methods in terms of fused image quality.

Table 2 Fusion quality indexes of Fig. 7

The second group of images is Boats, shown in Fig. 8a, b. Image A focuses on the left and image B focuses on the right. The fused images generated by the different methods are shown in Fig. 9. The clarity of the PCA, SWT, FSDP, GP and PCNN results degraded to some extent, while the fused images of WT, NSCT, NSST, NSCT+PCNN-LSF, NSST+PCNN-LSF and the proposed method showed a good fusion effect and were clearer than those generated by the conventional transform domain methods.

Fig. 8
figure 8

Source image. a Source image A. b Source image B

Fig. 9
figure 9

Fused images from different methods. a PCA. b WT. c SWT. d FSDP. e GP. f PCNN. g NSCT. h NSST. i NSCT+PCNN-LSF. j NSST+PCNN-LSF. k The proposed

Table 3 lists the fusion quality index values of all the compared methods for this group of images. It can be seen in Table 3 that the fused image generated by the proposed method contains more information: the MI, Qabf and FMI values of this method are much higher than those of the other methods. The SD and SF values of the proposed method and the WT method are very close and slightly better than those of the other methods, and the EN values of all methods are very close. The conclusion drawn from Table 3 is the same as that from Table 2.

Table 3 Fusion quality indexes of Fig. 9

It can be seen in Figs. 7 and 9 that the edges of the fused images generated by the proposed method are clear and that most of the textures and details of the source images are retained. Tables 2 and 3 show that the values of the evaluation indexes are similar, so the experiments on the two groups of images give consistent results, which supports the reasonableness of the experiments and the effectiveness of the proposed method. Overall, the experimental results show that the proposed scheme is an effective fusion method for obtaining a better fused image, from both the subjective and objective points of view. The method is able to extract the main features of the source images and achieves good results for gray images. It also achieves better results than the conventional transform domain fusion methods, and results that are similar to or better than those of the complicated transform domain methods.

4.2 Color image fusion experiments and analysis

Several groups of color images with different focuses were used to evaluate the performance of the proposed algorithm. The first group of color images is Cups, shown in Fig. 10. Image A focuses on the left and image B focuses on the right, and the images contain many words as details.

Fig. 10
figure 10

Source image. a Source image A. b Source image B

The fused images generated by the different methods are shown in Fig. 11. The fused image generated by the proposed algorithm is better than the others, except for a slight color distortion near the black word "Flora". The proposed algorithm did well in extracting the features of the source images, and its fused image is the closest one to the source images, with natural colors, more edges, textures and details, and fewer artifacts. In contrast, there are some ghost artifacts in the fused images generated by the WT, NSCT, NSST, NSCT+PCNN-LSF and NSST+PCNN-LSF methods, and the clarity of PCA, SWT, FSDP, GP and PCNN degraded to some extent. From the fused images in Fig. 11, we can conclude that the proposed algorithm is more effective than most of the competing fusion methods for this group of color images.

Fig. 11
figure 11

Fused images from different methods. a PCA. b WT. c SWT. d FSDP. e GP. f PCNN. g NSCT. h NSST. i NSCT+PCNN-LSF. j NSST+PCNN-LSF. k The proposed

Table 4 lists the fusion quality index values of all the compared methods for this group of color images. We can see in Table 4 that the fused image generated by the proposed method contains more information: its Qabf is higher than that of the other methods, and its MI is very close to the best value, which is achieved by PCNN. The EN of the proposed method is close to that of FSDP, and its SF, SD and FMI are also at a high level. Since Qabf and MI are two important quality indexes on which the proposed method achieves high values, we can say that the proposed method generally achieves a better effect than the simple image fusion methods and a satisfactory effect that is close to that of the complex algorithms.

Table 4 Fusion quality indexes of Fig. 11

The second group of color images is shown in Fig. 12. Image A focuses on the left and image B focuses on the right, and the images contain many words as details.

Fig. 12
figure 12

Source image. a Source image A. b Source image B

The fused images generated by the different methods are shown in Fig. 13. The clarity of the PCA, SWT, FSDP, GP and PCNN results degraded to some extent. The proposed fusion method achieves a better fusion result than the conventional transform domain methods, and a satisfactory result similar to that of the complicated transform domain methods, for this group of color images. The experiment shows that the proposed algorithm did well in extracting the features of the source images, such as edges, textures and details.

Fig. 13
figure 13

Fused images from different methods. a PCA. b WT. c SWT. d FSDP. e GP. f PCNN. g NSCT. h NSST. i NSCT+PCNN-LSF. j NSST+PCNN-LSF. k The proposed

Table 5 lists the fusion quality index values of all the compared methods for this group of color images. Table 5 shows that, overall, the results are similar to those of the first group of color images. The Qabf of the proposed method is much higher than that of the other methods, which indicates that more edge information of the source images is preserved in the fused image. Most of the evaluation indexes of the proposed method are better than those of the simple transform domain methods and very close to those of the complicated transform domain methods.

Table 5 Fusion quality indexes of Fig. 13

The experimental results on color image fusion show that the color of the images generated by the proposed method is well preserved and that there is little color distortion in the final fused image. It can be seen in Tables 4 and 5 that most evaluation indexes of the proposed method are better than those of the often-used image fusion methods and similar to those of the complicated fusion methods. The experimental results therefore show that the proposed method is generally effective for color image fusion. Moreover, the evaluation index values in Tables 4 and 5 are quite similar, which supports the reasonableness of the experiments.

It can be seen from Table 1 that the proposed method only needs to deal with 3 sub-images, while most of the other methods need to deal with many more; besides, the sub-images generated by the proposed method are smaller than those of most other methods, so the proposed method requires less memory when performing image fusion. In addition, LPT, the key process of the proposed scheme, is easy to implement and has low computational complexity. As a result, the proposed image fusion scheme has the advantages of low space requirements, easy implementation and modest computational complexity.

From Tables 2 to 5, we can see that the effect of color image fusion and gray image fusion is different, for the proposed method as well as for the others. This implies that directly applying gray image fusion methods to color image fusion may not be the best option. Generally speaking, a color image is the combination of three channels, i.e., R, G and B, and the gray level distribution and features of these channels are quite different from those of a normal gray image. Therefore, directly applying a gray image fusion method to the three color channels as if they were three gray images might not properly capture the main features of color images or produce the best results.

5 Conclusions

We have proposed an effective image fusion scheme based on LPT and an adaptive PCNN with LSF. The LPT algorithm decomposes important image features into sub-images at different scales and levels, and the proposed scheme employs it to decompose the source images into their constituent sub-images. SF makes the PCNN adaptive and effective in extracting image features, and LSF is employed to enhance the features of the sub-images. Compared with commonly used image fusion algorithms, the proposed scheme only needs to deal with a few sub-images and is easy to implement. The experimental results on gray and color images show that the proposed method can fuse source images with different focus positions and that the fused image contains more information of the source images than those produced by conventional methods. Compared with the commonly used algorithms, the proposed scheme achieves better fusion performance. The experiments also indicate that color image fusion should be treated specially, as color images have their own characteristics that differ from those of gray images.