1 Introduction

Image fusion is the process of combining multiple registered images into a composite image that is more suitable for human visual perception [2, 18, 23, 25]. The fused image provides a more accurate description of the scene than any of the individual source images. As a branch of information fusion, image fusion has become a popular research field in recent years [11, 14, 15, 18, 23, 37]. With the development of imaging sensor technologies, abundant images can be acquired easily, and image processing techniques therefore play an important role in extracting the image information of scenes [32,33,34,35]. Image fusion removes the redundant information and retains the complementary information of different source images; as a consequence, image fusion technology has been widely used in military applications, remote sensing, medical imaging, intelligent robots, defect inspection, computer vision, etc. [2, 11, 14, 15, 18, 23, 25, 37].

Image fusion algorithms range from simple pixel-weighted averaging to very complex transform domain methods [37]. Accordingly, image fusion approaches can be classified by whether the images are fused in the spatial domain or the transform domain. Many algorithms are easy to implement, such as the weighted average [29] and principal component analysis (PCA) [25, 29] methods, but their performance may not be satisfactory due to loss of contrast or blurring of details. There are also image fusion methods based on different transforms. The commonly used transform domain methods include not only conventional downsampled multi-scale transforms, such as the pyramid transform [15], discrete wavelet transform [5], stationary wavelet transform [26], curvelet transform [3] and contourlet transform [1], but also nonsubsampled multi-scale transforms such as the nonsubsampled contourlet transform [28] and nonsubsampled shearlet transform [38]. Transform domain based fusion is the most popular research direction in image fusion, and such algorithms are usually applied in pixel-level methods [25]. First, multi-scale decomposition of the source images is performed; then the decomposition coefficients are fused by different rules; finally, the fused image is reconstructed by the inverse multi-scale transform.

Although fusion approaches based on nonsubsampled multi-scale transforms can achieve better fusion performance by combining all of the coefficients in different layers, they usually suffer from high computational complexity [18, 27, 28, 31]. On the other hand, the commonly used transform based fusion approaches have low computational complexity but may not achieve very good performance. Further research is therefore required to design image fusion schemes that achieve better fusion quality with modest computational complexity and satisfy the requirements of practical applications. This work aims to design an eclectic method that improves fusion quality and performance by combining methods that are easy to implement and computationally efficient. The Laplacian pyramid transform (LPT) and the pulse coupled neural network (PCNN) are employed in this work because of their proven performance in the image processing field.

LPT is a popular transform domain image processing method with multi-scale, multi-resolution and multi-level decomposition characteristics; it is also known as band-pass pyramid decomposition [9, 10]. LPT decomposes the important features of a source image (such as edges and texture) into different layers at different scales and has therefore been widely used in the image processing field [4, 20, 30]. Compared with simple transform domain algorithms, LPT achieves a better decomposition effect; compared with complex nonsubsampled multi-scale transforms, it has lower computational complexity. However, LPT alone cannot achieve good fusion quality; combining it with other methods offers the possibility of improving fusion performance [4].

PCNN is a biologically inspired neural network model developed by Eckhorn et al. [6, 7, 19]. It is a single-layer artificial neural network that simulates the information processing mechanism of neurons in the cat's visual cortex. A group of similar neurons issues synchronous pulses under the effect of mutual coupling, and these pulses effectively describe the information of the input signal and can be used as its features [16, 24]. In recent years the PCNN model has been widely used in image processing and has shown outstanding properties [17, 22, 31]. In image fusion, PCNN is a global fusion algorithm that preserves more detail information and conforms in principle to the physiological basis of the human visual system (HVS) [23]. However, many parameters need to be set in the PCNN model. These parameters are usually set manually, which is complex and time-consuming and often causes inconsistency; this shortcoming limits the application of the model. Researchers are working to make some PCNN parameters be set automatically, because there is no widely recognized and consistently effective way of setting all of them.

Spatial frequency (SF) was proposed as a measure of image quality; it represents the regional information of an image and is in accordance with the HVS [8]. In 2001, Li et al. proposed an SF based image fusion method, which showed that SF can be used in image fusion and can achieve good performance [21]. Afterwards, many image fusion methods were designed based on SF [18, 27, 36]. SF is described by a few simple formulas, and its calculation is not complicated. SF is therefore employed in this work to improve the fusion performance of the proposed scheme.

In order to improve image fusion quality and reduce computational complexity, we propose a lightweight image fusion scheme that combines the LPT and PCNN models. This combination exploits the advantages of the two models while restraining the disadvantages of conventional fusion methods. Compared with simple transform and nonsubsampled multi-scale transform based fusion methods, the proposed scheme has modest computational complexity and is easy to implement. The scheme first employs LPT to decompose the source images into their constituent sub-images; this process is easy to implement, has low computational complexity and does not require much storage space. An adaptive PCNN is then employed to obtain each sub-image's oscillation frequency graph (OFG), which contains the texture, edge and regional distribution information of the sub-image. In the adaptive PCNN model, SF is used to set the linking strength β, so β is adjusted automatically according to the sub-image; this allows the proposed method to extract image features more effectively than the general PCNN model. Within the scheme, local spatial frequency (LSF) is employed to enhance the features of the OFG and make the features of the sub-images easier to extract, as LSF effectively describes the clarity of an image according to its regional information. Finally, according to the LSF values, coefficients are chosen from the different sub-images as the coefficients of the fused image. The experimental results indicate that the new scheme is more effective than other commonly used image fusion algorithms.

This paper is organized as follows: Section 2 introduces related theories including LPT and PCNN model. Section 3 presents the detailed processes of the proposed image fusion scheme. The experimental results and discussions about gray and color image fusion are presented in Section 4. Section 5 concludes this paper.

2 Related theories

In this section, we review the LPT algorithm and the PCNN model, which are used in the rest of this paper.

2.1 Laplacian pyramid transform of the image

The Laplacian pyramid transform algorithm is based on Gaussian pyramid decomposition (GPD) and consists of two steps: the first obtains the Gaussian pyramid and the second obtains the Laplacian pyramid [9, 10].

2.1.1 Gaussian pyramid decomposition

In GPD, Gaussian low-pass filtering and down-sampling are repeatedly performed on the source image to obtain all layers of the Gaussian pyramid. Let the source image be G0, regarded as the bottom layer, and let the l-th layer be represented by Gl. The decomposition process is described as follows:

$$ {G}_l\left(i,j\right)=\sum \limits_{m=-2}^2\sum \limits_{n=-2}^2\omega \left(m,n\right){G}_{l-1}\left(2i+m,2j+n\right),\qquad \left(1\le l\le N,\ 0\le i<{R}_l,\ 0\le j<{C}_l\right) $$
(1)

where (i, j) denotes the pixel position; Gl(i, j) is the l-th layer of the GPD; N denotes the total number of GPD layers; Rl and Cl denote the numbers of rows and columns of the l-th layer of the GPD respectively; [ω(m, n)] is a 5 × 5 low-pass window matrix with m, n ∈ [−2, 2], and ω(m, n) is the value of the matrix element at position (m, n). The matrix ω is defined as follows:

$$ \omega =\frac{1}{256}\left[\begin{array}{ccccc}1 & 4 & 6 & 4 & 1\\ 4 & 16 & 24 & 16 & 4\\ 6 & 24 & 36 & 24 & 6\\ 4 & 16 & 24 & 16 & 4\\ 1 & 4 & 6 & 4 & 1\end{array}\right]. $$
(2)
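For illustration, the following is a minimal sketch of the GPD step of Eqs. (1)–(2) in Python (not the authors' original implementation); it assumes a grayscale floating-point image and uses reflective border handling, which the paper does not specify.

```python
import numpy as np
from scipy.ndimage import convolve

# 5 x 5 generating kernel of Eq. (2): the outer product of [1 4 6 4 1]/16 with itself
w1d = np.array([1.0, 4.0, 6.0, 4.0, 1.0])
W = np.outer(w1d, w1d) / 256.0

def gaussian_pyramid(image, levels):
    """Return [G0, G1, ..., G_levels] by repeated low-pass filtering and 2x downsampling (Eq. (1))."""
    G = [np.asarray(image, dtype=np.float64)]
    for _ in range(levels):
        smoothed = convolve(G[-1], W, mode='reflect')  # Gaussian low-pass filtering
        G.append(smoothed[::2, ::2])                   # keep every second row and column
    return G
```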

2.1.2 Laplacian pyramid decomposition

In GPD, each obtained sub-image is 1/4 the size of its preceding layer. A dilation process is applied to the Gaussian pyramid via interpolation so that the dilated Gl has the same size as Gl−1. The dilation and Laplacian pyramid decomposition are described by the following Eqs. (3)–(5), where (i, j) denotes the position in the pyramid:

$$ {G}_l^{\ast}\left(i,j\right)=4\sum \limits_{m=-2}^2\sum \limits_{n=-2}^2\omega \left(m,n\right){G}_l\left(\frac{i+m}{2},\frac{j+n}{2}\right),\left(0<l\le N,0\le i<{R}_l,0\le j<{C}_l\right), $$
(3)

where,

$$ {G}_l\left(\frac{i+m}{2},\frac{j+n}{2}\right)=\left\{\begin{array}{cc}{G}_l\left(\frac{i+m}{2},\frac{j+n}{2}\right), & \frac{i+m}{2},\ \frac{j+n}{2}\ \text{are integers}\\ 0, & \text{otherwise}\end{array}\right. $$
(4)

and,

$$ \left\{\begin{array}{cc}{LP}_l={G}_l-{G}_{l+1}^{\ast }, & 0\le l<N\\ {LP}_N={G}_N, & l=N\end{array}\right. $$
(5)

N denotes the top layer of the pyramid; Gl is the l-th layer of the Gaussian pyramid; \( {G}_{l+1}^{\ast } \) is the dilated version of the (l+1)-th layer of the Gaussian pyramid, which has the same size as Gl; and LPl denotes the l-th layer of the Laplacian pyramid. LP0, LP1, …, LPN make up the Laplacian pyramid.
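As a companion sketch, the dilation of Eqs. (3)–(4) and the Laplacian layers of Eq. (5) could be implemented as follows; this is again an assumption-laden sketch, and the generating kernel is repeated so the snippet runs on its own.

```python
import numpy as np
from scipy.ndimage import convolve

w1d = np.array([1.0, 4.0, 6.0, 4.0, 1.0])
W = np.outer(w1d, w1d) / 256.0   # same generating kernel as in Eq. (2)

def expand(G_next, target_shape):
    """Dilate a layer by zero insertion and filtering (Eqs. (3)-(4))."""
    up = np.zeros(target_shape, dtype=np.float64)
    up[::2, ::2] = G_next                          # non-integer positions stay zero, Eq. (4)
    return 4.0 * convolve(up, W, mode='reflect')   # the factor 4 compensates for the inserted zeros

def laplacian_pyramid(G):
    """LP_l = G_l - expand(G_{l+1}) for l < N, and LP_N = G_N (Eq. (5))."""
    LP = [G[l] - expand(G[l + 1], G[l].shape) for l in range(len(G) - 1)]
    LP.append(G[-1])
    return LP
```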

2.1.3 Inverse Laplacian pyramid transform

We can get the following equations from (5):

$$ \left\{\begin{array}{cc}{G}_N={LP}_N, & l=N\\ {G}_l={LP}_l+{G}_{l+1}^{\ast }, & 0\le l<N\end{array}\right. $$
(6)

It can be seen from (6) that the Laplacian pyramid layers are gradually enlarged by the insertion operation, so that the size of the current layer equals that of the layer below. The source image can be reconstructed by adding all the layers together one by one, starting from the top.
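A corresponding sketch of the inverse transform of Eq. (6) could look like the following; it assumes the `expand` routine sketched after Eq. (5) is in scope.

```python
def inverse_lpt(LP):
    """Reconstruct the source image from a Laplacian pyramid (Eq. (6)).

    Starting from the top layer G_N = LP_N, each lower layer is recovered as
    G_l = LP_l + expand(G_{l+1}) until G_0, the reconstructed image, is reached.
    """
    G = LP[-1]
    for l in range(len(LP) - 2, -1, -1):
        G = LP[l] + expand(G, LP[l].shape)   # expand() as sketched after Eq. (5)
    return G
```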

Figure 1 shows a three-layer LPT decomposition of two source images. It can be seen that LPT effectively extracts the important features of the source images; e.g., in Fig. 1 the edges and regional characteristics of the source images are described by these sub-images. It can also be seen that the difference between the two source images in the 3rd layer is not obvious. Therefore, a 3-layer decomposition is adopted in the proposed image fusion scheme.

Fig. 1
figure 1

An illustration of LPT. a Source image A and sub-images of the detailed coefficients of source image A. b Source image B and sub-images of the detailed coefficients of source image B

2.2 PCNN

Every neuron in the PCNN model consists of three parts: the receptive field, the modulation field and the pulse generator [16, 17], as shown in Fig. 2. The model is described by the following Eqs. (7)–(11), where the subscript ij identifies a neuron and n denotes the current iteration. In particular, "a neuron ignition" means that a PCNN neuron generates a pulse [16, 17].

Fig. 2
figure 2

The diagram of a neuron in PCNN model

In the receptive field, the input of a neuron consists of the linking channel Lij(n) and the feeding channel Fij(n), which receive the neighboring neurons' coupling input Ykl(n − 1); in addition, the Fij(n) channel also receives the external stimulus Sij, as described in (7)–(8). In the Lij(n) and Fij(n) channels, a neuron is linked to its neighborhood via the linking weight matrices Wijkl and Mijkl respectively. Note that Wijkl and Mijkl are equal and are implemented by a convolution operation in the model. The two channels accumulate the previous output with decay exponents αL and αF, and the channel amplitudes are VL and VF respectively.

$$ {F}_{ij}(n)={V}^F\sum \limits_{kl}{M}_{ij kl}{Y}_{kl}\left(n-1\right)+{e}^{-{\alpha}^F}{F}_{ij}\left(n-1\right)+{S}_{ij}, $$
(7)
$$ {L}_{ij}(n)={V}^L\sum \limits_{kl}{W}_{ij kl}{Y}_{kl}\left(n-1\right)+{L}_{ij}\left(n-1\right){e}^{-{\alpha}^L}, $$
(8)

In the modulation field, a constant positive bias of 1 is added to the linking input Lij(n), and the result is multiplied by the feeding input Fij(n); β determines how strongly Lij(n) modulates the internal activity. Accordingly, the internal activity Uij(n) is described as:

$$ {U}_{ij}(n)={F}_{ij}(n)\left[1+\beta {L}_{ij}(n)\right], $$
(9)

The pulse generator part consists of a threshold adjuster, a comparator and a pulse generator, which are described by (10)–(11). The pulse generator produces the pulse output Yij(n), and the threshold adjuster adjusts the threshold θij(n), where \( {V}_{ij}^{\theta } \) is the threshold amplification coefficient. When the internal activity Uij(n) exceeds the threshold θij(n), i.e., Uij(n) > θij(n), the neuron generates a pulse, which is called an ignition.

$$ {\theta}_{ij}(n)={e}^{-{\alpha}^{\theta }}{\theta}_{ij}\left(n-1\right)+{V}_{ij}^{\theta }{Y}_{ij}\left(n-1\right), $$
(10)
$$ {Y}_{ij}(n)=\left\{\begin{array}{l}1\kern0.5em ,{U}_{ij}(n)>{\theta}_{ij}(n)\\ {}0\kern0.5em , otherwise\end{array}\right., $$
(11)

When the PCNN model is used for image processing, the total number of neurons in the network equals the total number of pixels in the input image, with a one-to-one correspondence between pixels and neurons. The output of each neuron has two states, pulse and non-pulse, so the outputs of all neurons form a binary image. More information about PCNN can be found in [16, 17, 22, 24, 31].
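To make the iteration of Eqs. (7)–(11) concrete, the following is a minimal vectorized sketch of the PCNN over a whole image. The 3 × 3 linking kernel (inverse squared distance to the eight neighbours) and the default parameter values follow the settings quoted later in Section 3.1, while the border handling is an assumption.

```python
import numpy as np
from scipy.ndimage import convolve

def pcnn_iterations(S, beta, n_iter=300, VL=0.01, VF=1.0, Vtheta=62.6012,
                    aL=0.7260, aF=0.0164, atheta=0.0637, theta0=1.2):
    """Run the PCNN of Eqs. (7)-(11) on stimulus image S (one neuron per pixel).

    Returns the list of binary pulse images Y(1), ..., Y(n_iter).
    """
    # Assumed 3 x 3 linking kernel W = M = 1 / ((i-k)^2 + (j-l)^2) over the 8 neighbours.
    K = np.array([[0.5, 1.0, 0.5],
                  [1.0, 0.0, 1.0],
                  [0.5, 1.0, 0.5]])
    S = np.asarray(S, dtype=np.float64)
    F = np.zeros_like(S)
    L = np.zeros_like(S)
    Y = np.zeros_like(S)
    theta = np.full(S.shape, theta0)
    pulses = []
    for _ in range(n_iter):
        link = convolve(Y, K, mode='constant')        # coupling from the previous pulses
        F = np.exp(-aF) * F + VF * link + S           # feeding channel, Eq. (7)
        L = np.exp(-aL) * L + VL * link               # linking channel, Eq. (8)
        U = F * (1.0 + beta * L)                      # internal activity, Eq. (9)
        theta = np.exp(-atheta) * theta + Vtheta * Y  # threshold update, Eq. (10)
        Y = (U > theta).astype(np.float64)            # pulse output, Eq. (11)
        pulses.append(Y.copy())
    return pulses
```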

3 The proposed image fusion scheme

In this section, the proposed image fusion scheme is presented in detail; it is shown in Fig. 3. In this scheme, all input images are decomposed into their constituent sub-images by LPT. The adaptive PCNN is employed to extract the features of these sub-images, and LSF is then employed to enhance their feature regions, which makes the features easier to extract. The fusion rule determines which pixels of the sub-images are used in the fused image. Finally, the inverse LPT (ILPT) is applied to the new fused coefficients to obtain the fused image.

Fig. 3
figure 3

The diagram of the proposed image fusion scheme

The main motivation of this work is to propose a lightweight image fusion scheme that balances fusion performance and complexity. We try to extend the vitality of simple and effective methods, improving fusion performance through their organic combination, because the employed algorithms are effective and easy to implement, as many researchers have shown.

In the proposed method, LPT is used to decompose the important features of the source images into different layers at different scales, which is the foundation of the proposed scheme. The sub-images obtained by LPT effectively represent the complementary and redundant information of the input source images, as shown in Figs. 1 and 3. PCNN is therefore employed to extract the complementary and redundant features of these sub-images; the biological characteristics and effective computing mechanism of PCNN provide satisfactory performance for this stage, as shown in Fig. 4. Besides, the SF of each sub-image is calculated and used as a parameter of PCNN to reduce the number of manual parameter settings, which also improves the adaptability and accuracy of feature extraction. LSF makes it possible to accurately represent the detailed information of the sub-images; it is therefore utilized to enhance the regional features of the sub-images according to the OFG, as shown in Fig. 5. The fusion rule is applied to each coefficient to choose the high quality coefficients of the sub-images so that the source images are fused effectively.

Fig. 4
figure 4

OFG of the sub-images. a OFG of the sub-images A. b OFG of sub-images B

Fig. 5
figure 5

LSF of the OFG. a LSF of the OFG A. b LSF of the OFG B

3.1 Adaptive PCNN

The linking strength β of PCNN is the most important parameter and the key determinant of the ignition behavior of PCNN, while SF is an important image definition indicator that represents the quality of an image [8]. In conventional image processing methods, β has to be set manually on the basis of many experiments, which limits the widespread application of PCNN. The proposed scheme uses SF to set β, so the linking strength is determined automatically according to the quality of each sub-image, and the PCNN model adapts itself to the data. A factor η is used to control the magnitude of SF, because the calculated SF may be too large or too small, which could have a negative impact on the proposed method; it is set to 0.01 in the experiments. As a result, the proposed scheme can effectively extract the important features of the source images. Within the scheme, β is calculated as:

$$ \beta =\eta \cdot SF, $$
(12)

where η is an adjustment factor set manually, and

$$ SF=\sqrt{RF^2+{CF}^2}, $$
(13)
$$ RF=\sqrt{\frac{1}{M\times N}\sum \limits_{i=1}^M\sum \limits_{j=2}^N{\left[ im\left(i,j\right)- im\left(i,j-1\right)\right]}^2}, $$
(14)
$$ CF=\sqrt{\frac{1}{M\times N}\sum \limits_{i=2}^M\sum \limits_{j=1}^N{\left[ im\left(i,j\right)- im\left(i-1,j\right)\right]}^2}, $$
(15)

SF depends on the row frequency (RF) and the column frequency (CF), where M is the number of rows of the image, N is the number of columns of the image, and im(i, j) is the image gray level at pixel (i, j).
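As an illustration, the adaptive linking strength of Eqs. (12)–(15) could be computed as in the following sketch; the normalisation by M × N follows the formulas above, and η = 0.01 is the value quoted in the text.

```python
import numpy as np

def spatial_frequency(im):
    """SF = sqrt(RF^2 + CF^2) of a gray image (Eqs. (13)-(15))."""
    im = np.asarray(im, dtype=np.float64)
    M, N = im.shape
    rf2 = np.sum((im[:, 1:] - im[:, :-1]) ** 2) / (M * N)   # squared row frequency, Eq. (14)
    cf2 = np.sum((im[1:, :] - im[:-1, :]) ** 2) / (M * N)   # squared column frequency, Eq. (15)
    return np.sqrt(rf2 + cf2)

def adaptive_beta(sub_image, eta=0.01):
    """Linking strength beta = eta * SF of the sub-image (Eq. (12))."""
    return eta * spatial_frequency(sub_image)
```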

In most applications of PCNN, the parameters are set empirically through repeated experiments. In this paper, the other parameters of the PCNN model are set as follows: the simplest setting of the linking matrices is \( W_{ijkl}=M_{ijkl}=\left[{(i-k)}^2+{(j-l)}^2\right]^{-1} \); the iteration number N of PCNN is 300; VL is generally set to 0.01 and VF to a relatively small value such as 1; αF is set to 0.0164; \( {V}_{ij}^{\theta } \) is set to Vθ = 62.6012; and, because \( {e}^{-{\alpha}^{\theta }}<1 \), αθ is set to 0.0637. θ(0) is the initial threshold value; since the neurons should not ignite at the first iteration, θ(0) must be greater than 1, e.g., θ(0) = 1.2. αL is set to 0.7260. Among these parameters, Wijkl and Mijkl are computed from the positions of the neurons, while N, VL, VF and θ(0) are generally fixed.

3.2 The OFG of PCNN

PCNN has the characteristics of global coupling and pulse synchronization. In the proposed scheme, PCNN is used to extract the complementary and redundant features of the sub-images, and these features contain the detailed information of the sub-images, such as texture, edges and regional distribution. At each iteration, PCNN generates a binary image by recording whether or not each neuron ignites. These binary images can be regarded as features of the source image because they effectively express the detailed information of the sub-images. By accumulating the binary pulse images of the neurons over the iterations, an oscillation frequency graph (OFG) is obtained, as shown in (16). Comparing Fig. 4 with Fig. 1, it is obvious that the OFG effectively represents the detailed information of the source images, but it does not have obvious regional characteristics, because the detail information is carried by isolated pixels; however, the regional information of images is very important for the HVS and for image fusion.

$$ OFG\left(i,j\right)=\sum \limits_{n=1}^N{Y}_{ij}(n). $$
(16)

where N denotes the number of iterations and Yij(n) denotes the pulse output of neuron (i, j) at iteration n.
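In code, the OFG of Eq. (16) is simply the pixel-wise sum of the binary pulse images, for instance those returned by the PCNN sketch in Section 2.2; a minimal sketch:

```python
import numpy as np

def oscillation_frequency_graph(pulses):
    """OFG(i, j) = sum_n Y_ij(n) over all binary pulse images (Eq. (16))."""
    return np.sum(np.stack(pulses, axis=0), axis=0)
```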

3.3 LSF of OFG

Local spatial frequency (LSF) represents the regional detailed information of an image and is composed of the local row frequency (LRF) and local column frequency (LCF) [8, 21, 27]. In the proposed scheme, the LSF of a pixel represents the regional characteristics of the sub-image, because LSF takes the image information of the neighboring pixels within a local window into account, as shown in Eqs. (17)–(19). This makes the features easier to extract. The pixels of the fused image are then obtained by comparing LSF values, as presented in the algorithm subsection. Figure 5 shows the LSF of an OFG from Fig. 4; it can be seen that the regional detail information of the sub-image in the OFG is enhanced compared with Fig. 4. Calculating the LSF of the OFG makes the features of the source image more evident and, in turn, the regional features easier to extract. As a result, the important coefficients are more readily selected as the final coefficients of the fused image.

$$ LSF=\sqrt{LRF^2+{LCF}^2}, $$
(17)
$$ LRF=\sqrt{\frac{1}{w^2}\sum \limits_{i=1}^w\sum \limits_{j=2}^w{\left[ OFG\left(i,j\right)- OFG\left(i,j-1\right)\right]}^2}, $$
(18)
$$ LCF=\sqrt{\frac{1}{w^2}\sum \limits_{i=2}^w\sum \limits_{j=1}^w{\left[ OFG\left(i,j\right)- OFG\left(i-1,j\right)\right]}^2}, $$
(19)

where w is the size of the window, OFG(i, j) is the oscillation frequency graph at pixel (i, j).
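A windowed computation of Eqs. (17)–(19) could look like the following sketch, which uses a uniform (box) filter to obtain the w × w local means; the border handling is an assumption.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_spatial_frequency(ofg, w=9):
    """LSF = sqrt(LRF^2 + LCF^2) over a w x w sliding window (Eqs. (17)-(19))."""
    ofg = np.asarray(ofg, dtype=np.float64)
    dr2 = np.zeros_like(ofg)
    dc2 = np.zeros_like(ofg)
    dr2[:, 1:] = (ofg[:, 1:] - ofg[:, :-1]) ** 2        # squared horizontal differences
    dc2[1:, :] = (ofg[1:, :] - ofg[:-1, :]) ** 2        # squared vertical differences
    lrf2 = uniform_filter(dr2, size=w, mode='reflect')  # local mean = (1/w^2) * local sum
    lcf2 = uniform_filter(dc2, size=w, mode='reflect')
    return np.sqrt(lrf2 + lcf2)
```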

3.4 The rule of fusion

The fusion rule is very important in an image fusion method, as it determines which pixels or coefficients appear in the final result; it therefore has a significant influence on the fused image. In this paper, the fused sub-image coefficients are determined by the fusion rule described in (20), which is applied to each coefficient to select the high quality coefficients according to the LSF value of the OFG. When the LSF value corresponding to a coefficient from sub-image A is larger than that of the coefficient from sub-image B, the coefficient from sub-image A is chosen as the final coefficient of the fused image, and vice versa. When the LSF values from sub-images A and B are equal, the mean of the two coefficients is taken as the final coefficient.

$$ {FC}_{ij}=\left\{\begin{array}{cc}{C}_{Aij},& \left({LSF}_{Aij}>{LSF}_{Bij}\right)\\ {}{C}_{Bij},& \left({LSF}_{Aij}<{LSF}_{Bij}\right)\\ {}\left({C}_{Aij}+{C}_{Bij}\right)/2,& \left({LSF}_{Aij}={LSF}_{Bij}\right)\end{array}\right.. $$
(20)

where the subscript ij denotes the pixel position in the sub-image; FCij is the fused sub-image coefficient; CAij and CBij are the sub-image coefficients from images A and B respectively; and LSFAij and LSFBij are the LSF values of the corresponding OFGs.
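The selection rule of Eq. (20), applied element-wise with NumPy, could be sketched as:

```python
import numpy as np

def fuse_coefficients(CA, CB, lsfA, lsfB):
    """Pick the coefficient whose OFG has the larger LSF; average on ties (Eq. (20))."""
    FC = np.where(lsfA > lsfB, CA, CB)
    return np.where(lsfA == lsfB, (CA + CB) / 2.0, FC)
```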

3.5 Fusion algorithm

The image fusion algorithm based on the proposed scheme shown in Fig. 3 consists of the following steps (a sketch of the full pipeline follows the steps):

  • Step 0: Given source images A and B.

  • Step 1: The images A and B are decomposed by LPT to get several corresponding sub-image sets represented as CAij and CBij.

  • Step 2: Calculate the SF of each sub-image to obtain its β for PCNN.

  • Step 3: Apply PCNN to the sub-images to obtain the OFG of each sub-image.

  • Step 4: Calculate the LSFij of each coefficient in the OFG, with a window size w of 9 × 9 pixels.

  • Step 5: Determine each fused sub-image coefficient FCij by following the fusion rule described in (20).

  • Step 6: Reconstruct the fused image from the fused sub-image coefficients using ILPT.
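The following sketch assembles Steps 1–6, assuming the helper functions sketched in the earlier sections (gaussian_pyramid, laplacian_pyramid, inverse_lpt, pcnn_iterations, adaptive_beta, oscillation_frequency_graph, local_spatial_frequency and fuse_coefficients) are in scope; normalising the LPT coefficients to [0, 1] before the PCNN is an assumption not stated in the text.

```python
import numpy as np

def _normalize(x):
    # map a sub-image to [0, 1] before feeding it to the PCNN (an assumption)
    rng = x.max() - x.min()
    return (x - x.min()) / rng if rng > 0 else np.zeros_like(x)

def fuse_images(A, B, levels=3, window=9):
    """Fuse two registered gray images following Steps 0-6 of Section 3.5."""
    LPA = laplacian_pyramid(gaussian_pyramid(A, levels))   # Step 1
    LPB = laplacian_pyramid(gaussian_pyramid(B, levels))
    fused_layers = []
    for CA, CB in zip(LPA, LPB):
        ofgA = oscillation_frequency_graph(                # Steps 2-3
            pcnn_iterations(_normalize(CA), adaptive_beta(CA)))
        ofgB = oscillation_frequency_graph(
            pcnn_iterations(_normalize(CB), adaptive_beta(CB)))
        lsfA = local_spatial_frequency(ofgA, window)       # Step 4
        lsfB = local_spatial_frequency(ofgB, window)
        fused_layers.append(fuse_coefficients(CA, CB, lsfA, lsfB))  # Step 5
    return inverse_lpt(fused_layers)                       # Step 6
```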

3.6 Color image fusion

The RGB color image is the fundamental and most commonly used image data format, and it can be supplied directly to image display devices. The proposed scheme is therefore performed in RGB color space, which avoids the color-space transform as well as the associated nonlinear operations and transform errors. An RGB color image is composed of R (Red), G (Green) and B (Blue) components. In order to fuse color source images, we regard the three components of an RGB color image as three gray images and apply the proposed fusion scheme to them separately. The final fused color image is then produced from the individually fused component images. Accordingly, the color image fusion framework based on the proposed fusion scheme is shown in Fig. 6.
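A minimal sketch of this per-channel framework, reusing the grayscale fuse_images pipeline sketched in Section 3.5 and assuming 8-bit RGB inputs:

```python
import numpy as np

def fuse_color_images(A_rgb, B_rgb):
    """Fuse two registered RGB images channel by channel (R, G, B treated as gray images)."""
    fused = [fuse_images(A_rgb[..., c].astype(np.float64),
                         B_rgb[..., c].astype(np.float64))
             for c in range(3)]
    return np.clip(np.stack(fused, axis=-1), 0, 255).astype(np.uint8)
```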

Fig. 6
figure 6

The diagram of the proposed color image fusion framework

4 Experiments and analysis

We selected commonly used image fusion methods to compare with our method in order to verify the validity of the proposed scheme. The comparison methods were principal component analysis (PCA), wavelet transform (WT), stationary wavelet transform (SWT), filter-subtract-decimate pyramid (FSDP), gradient pyramid (GP), pulse coupled neural network (PCNN), nonsubsampled contourlet transform (NSCT), nonsubsampled shearlet transform (NSST), NSCT+PCNN-LSF and NSST+PCNN-LSF. The last two schemes are similar to ours. The details of the above methods are shown in Table 1.

Table 1 Fusion methods for comparison and their features (‘--’ represents null)

In order to evaluate the performance of the different image fusion methods, we adopted the commonly used image fusion quality indexes: mutual information (MI), entropy (EN), standard deviation (SD), spatial frequency (SF), edge based similarity measure (Qabf) and feature mutual information (FMI) [12, 13, 25, 28, 38]. Among them, Qabf is the most important index, as it indicates how much edge information is preserved in the fused image. Higher index values indicate a higher quality fused image.

4.1 Gray image fusion experiments and analysis

Several groups of images with different focuses were used to evaluate the performance of the proposed algorithm. These images have different resolutions, focused areas and detailed information, and are often used in image fusion research. The first group of images is Clocks, shown in Fig. 1a, b. Image A focuses on the right and image B focuses on the left.

The fused images generated by the different methods are shown in Fig. 7. As Fig. 7 shows, the proposed algorithm did well in extracting the features of the source images. The fusion effect of the PCA, SWT, FSDP and PCNN methods was poorer than that of the other methods, and the clarity of the PCA, SWT, FSDP, GP and PCNN results degraded to some extent. In contrast, WT, NSCT, NSST, NSCT+PCNN-LSF, NSST+PCNN-LSF and the proposed method achieved a good fusion effect, although there were some blurry regions on the big clock for the NSST, NSCT+PCNN-LSF and NSST+PCNN-LSF methods. Overall, the result generated by the proposed scheme was better than those of the competing image fusion methods, including the NSCT+PCNN-LSF and NSST+PCNN-LSF schemes, which are similar to ours.

Fig. 7
figure 7

Fused images from different methods. a PCA. b WT. c SWT. d FSDP. e GP. f PCNN.g NSCT. h NSST. i NSCT+PCNN-LSF. j NSST+PCNN-LSF. k The proposed

It can be seen in Fig. 7k that the fused image generated by the proposed scheme is clear. It retains most details of the source images; moreover, the edges are clearer and the definition is higher than in most of the other results. In other words, the proposed scheme performed image fusion better than the conventional transform domain methods, and it achieved a similar or even better fusion effect compared with the complicated transform domain methods.

Table 2 lists the fusion quality index values of all the compared methods for this group of images. It can be seen in Table 2 that the fused image generated by the proposed method contains more information: the MI, Qabf and FMI values of this method are much higher than those of the other methods. The SD and SF of the proposed method are close to those of the WT method and slightly better than those of the other methods, while the EN values of all methods are very close. MI and Qabf are the most important quality indexes, as they effectively indicate how much source image information is preserved in the fused image, and SF indicates the definition of an image. The higher MI, Qabf and SF values of the proposed method indicate that it outperforms the other methods in terms of fused image quality.

Table 2 Fusion quality indexes of Fig. 7

The second group of images is Boats, shown in Fig. 8a, b. Image A focuses on the left and image B focuses on the right. The fused images generated by the different methods are shown in Fig. 9. The clarity of the PCA, SWT, FSDP, GP and PCNN results degraded to some extent, while the fused images of WT, NSCT, NSST, NSCT+PCNN-LSF, NSST+PCNN-LSF and the proposed method showed a good fusion effect and were clearer than those generated by the conventional transform domain methods.

Fig. 8
figure 8

Source image. a Source image A. b Source image B

Fig. 9
figure 9

Fused images from different methods. a PCA. b WT. c SWT. d FSDP. e GP. f PCNN. g NSCT. h NSST. i NSCT+PCNN-LSF. j NSST+PCNN-LSF. k The proposed

Table 3 lists the fusion quality index values of all the compared methods for this group of images. It can be seen in Table 3 that the fused image generated by the proposed method contains more information: the MI, Qabf and FMI values of this method are much higher than those of the other methods. The SD and SF values of the proposed method and the WT method are very close and slightly better than those of the other methods, and the EN values of all methods are very close. The conclusion drawn from Table 3 is the same as that from Table 2.

Table 3 Fusion quality indexes of Fig. 9

It can be seen in Figs. 7 and 9 that the edges of the fused images generated by the proposed method are clear and that most of the textures and details of the source images are retained. Tables 2 and 3 show that the values of the evaluation indexes are similar, so the experiments on the two groups of images give consistent results, which supports the reasonableness of the experiments and the effectiveness of the proposed method. Overall, the experimental results show that the proposed scheme is an effective fusion method for obtaining a better fused image, from both the subjective and objective points of view. The method is able to extract the main features of the source images and achieves good results for gray images. It also achieves better results than the conventional transform domain fusion methods, and results that are similar to or better than those of the complicated transform domain methods.

4.2 Color image fusion experiments and analysis

Several groups of color images with different focuses were used to evaluate the performance of the proposed algorithm. The first group of color images is Cups, shown in Fig. 10. Image A focuses on the left and image B focuses on the right, and the images contain many words as details.

Fig. 10
figure 10

Source image. a Source image A. b Source image B

The fused images generated by the different methods are shown in Fig. 11. The fused image generated by the proposed algorithm is better than the others, except for a slight color distortion near the black word "Flora". The proposed algorithm did well in extracting the features of the source images, and its fused image is the closest one to the source images, with natural colors, more edges, textures and details, and fewer artifacts. In contrast, there are some ghost artifacts in the fused images generated by the WT, NSCT, NSST, NSCT+PCNN-LSF and NSST+PCNN-LSF methods, and the clarity of PCA, SWT, FSDP, GP and PCNN degraded to some extent. From the fused images in Fig. 11, we can conclude that the proposed algorithm is more effective than most of the competing fusion methods for this group of color images.

Fig. 11
figure 11

Fused images from different methods. a PCA. b WT. c SWT. d FSDP. e GP. f PCNN. g NSCT. h NSST. i NSCT+PCNN-LSF. j NSST+PCNN-LSF. k The proposed

Table 4 lists the fusion quality index values of all the compared methods for this group of color images. We can see in Table 4 that the fused image generated by the proposed method contains more information: its Qabf is higher than that of the other methods, and its MI is very close to the best value, which is achieved by PCNN. The EN of the proposed method is close to that of FSDP, and its SF, SD and FMI are also at a high level. Since Qabf and MI are two important quality indexes on which the proposed method achieves high values, we can say that the proposed method generally achieves a better effect than the simple image fusion methods and a satisfactory effect that is close to that of the complex algorithms.

Table 4 Fusion quality indexes of Fig. 11

The second group of color images is shown in Fig. 12. Image A focuses on the left and image B focuses on the right, and the images contain many words as details.

Fig. 12
figure 12

Source image. a Source image A. b Source image B

The fused images generated by the different methods are shown in Fig. 13. The clarity of the PCA, SWT, FSDP, GP and PCNN results degraded to some extent. The proposed fusion method achieves a better fusion result than the conventional transform domain methods, and a satisfactory result similar to that of the complicated transform domain methods, for this group of color images. The experiment shows that the proposed algorithm did well in extracting the features of the source images, such as edges, textures and details.

Fig. 13
figure 13

Fused images from different methods. a PCA. b WT. c SWT. d FSDP. e GP. f PCNN. g NSCT. h NSST. i NSCT+PCNN-LSF. j NSST+PCNN-LSF. k The proposed

Table 5 lists the fusion quality index values of all the compared methods for this group of color images. Table 5 shows that, overall, the results are similar to those of the first group of color images. The Qabf of the proposed method is much higher than that of the other methods, which indicates that more edge information of the source images is preserved in the fused image. Most of the evaluation indexes of the proposed method are better than those of the simple transform domain methods and very close to those of the complicated transform domain methods.

Table 5 Fusion quality indexes of Fig. 13

The experimental results on color image fusion show that the color of the images generated by the proposed method is well preserved and that there is little color distortion in the final fused image. It can be seen in Tables 4 and 5 that most evaluation indexes of the proposed method are better than those of the often-used image fusion methods and similar to those of the complicated fusion methods. The experimental results therefore show that the proposed method is generally effective for color image fusion. Moreover, the evaluation index values in Tables 4 and 5 are quite similar, which supports the reasonableness of the experiments.

It can be seen from Table 1 that the proposed method only needs to deal with 3 sub-images, while most of the other methods need to deal with many more; besides, the sub-images generated by the proposed method are smaller than those of most other methods, so the proposed method requires less memory when performing image fusion. In addition, LPT, the key process of the proposed scheme, is easy to implement and has low computational complexity. As a result, the proposed image fusion scheme has the advantages of low space requirements, easy implementation and modest computational complexity.

From Tables 2 to 5, we can see that the effect of color image fusion and gray image fusion is different, for the proposed method as well as for the others. This implies that directly applying gray image fusion methods to color image fusion may not be the best option. Generally speaking, a color image is the combination of three channels, i.e., R, G and B, and the gray level distribution and features of these channels are quite different from those of a normal gray image. Therefore, directly applying a gray image fusion method to the three color channels as if they were three gray images might not properly capture the main features of color images or produce the best results.

5 Conclusions

We have proposed an effective image fusion scheme based on LPT and an adaptive PCNN with LSF. The LPT algorithm decomposes important image features into sub-images at different scales and levels, and the proposed scheme employs it to decompose the source images into their constituent sub-images. SF makes the PCNN adaptive and effective in extracting image features, and LSF is employed to enhance the features of the sub-images. Compared with commonly used image fusion algorithms, the proposed scheme only needs to deal with a few sub-images and is easy to implement. The experimental results on gray and color images show that the proposed method can fuse source images with different focus positions and that the fused image contains more information of the source images than those produced by conventional methods. Compared with the commonly used algorithms, the proposed scheme achieves better fusion performance. The experiments also indicate that color image fusion should be treated specially, as color images have their own characteristics that differ from those of gray images.