1 Introduction

The rapid development of IR imaging technology has made IR imaging more widely used in automatic driving, night search and rescue, security, and other fields [1]. However, the signals received by IR imaging systems are easily affected by atmospheric thermal radiation, so the acquired IR images exhibit low resolution, poor contrast, and blurred details. Therefore, it is of great significance to study infrared image enhancement algorithms.

IR image enhancement mainly focuses on highlighting the areas or target objects of interest in IR images while reducing the interference of unnecessary information. Infrared image enhancement algorithms can generally be classified into frequency-domain and spatial-domain methods according to the domain in which the algorithm operates. The frequency-domain method converts the image to a transform domain and corrects it there to enhance image details and contrast. The Fourier transform is generally the tool for converting infrared images between the spatial and frequency domains. Although the Fourier transform has a good frequency-domain representation, it ignores the spatial relationship between pixels. Therefore, the frequency-domain method can only process the whole image globally and cannot analyze the frequency content of specific details. Representative algorithms are [2,3,4,5]. The spatial-domain method operates directly on image pixels, mapping the gray values of the input image to the output image according to mapping rules. Spatial-domain processing methods include histogram equalization and its extensions, gradient-domain processing, image sharpening, spatial filtering, and optimization-based algorithms, as well as combinations derived from them; representative algorithms are [6,7,8].

Guided image filter (GIF) [9], weighted least squares filtering [10], and bilateral filtering [11] are three classical edge-preserving filtering algorithms. Bilateral filtering takes into account both the spatial distance and the intensity difference between pixels, which gives it the property of preserving image edges. Weighted least squares filtering aims to make the smoothed output image as similar as possible to the original image while keeping the original shape at the edges as much as possible. The GIF is a filtering technique that has emerged only in the last decade. It is similar to bilateral and weighted least squares filtering in that it has the same edge-preserving smoothing property. The most significant advantage of the GIF is that it can be implemented with a time complexity that is independent of the window size, so it is more efficient when processing images with large windows. Therefore, this paper studies the infrared image enhancement algorithm based on the GIF.

The IR image enhancement algorithm based on the GIF is a multi-scale decomposition method that belongs to spatial-domain filtering. Although there have been many studies on infrared image enhancement based on guided filtering, every type of enhancement algorithm faces a common problem: halo artifacts and image noise readily appear when image details are enhanced. As shown in Fig. 1, the enhanced IR image shows noticeable halo artifacts at the edge of the IR target. To solve this problem, this paper proposes an IR image enhancement algorithm based on a detail enhancement guided filter (DGIF), which can effectively suppress image noise and improve image contrast while enhancing image details.

Fig. 1 Effect of halo artifacts

The rest of this paper is organized as follows. Section 2 introduces the representative works related to the GIF and the IR image enhancement algorithm based on the GIF. Section 3 introduces the algorithm proposed in this paper. Section 4 presents the ablation experiments and compares the proposed algorithm with five other infrared image enhancement algorithms. Section 5 provides a comprehensive summary of the research work done in this paper.

2 Related works

Since Francesco Branchitta proposed an image enhancement algorithm based on image dynamic range segmentation in 2009 [12], scholars have proposed similar layered image enhancement algorithms, among which the more widely used ones are based on the GIF. In 2013, He et al. proposed the GIF. The essence of the guided filter is a local linear model. In terms of filtering effect, because the GIF has edge-preserving smoothing characteristics, the image processed by the GIF contains more detailed texture than images processed by other types of filters. The most significant advantage of the GIF is that the algorithm’s time complexity is independent of the filtering radius, so it is faster when processing images with large windows. However, the GIF can produce problems such as halo artifacts near image edges, so improved versions of the GIF have been proposed. Li et al. proposed the Weighted Guided Image Filter (WGIF) [13], which adds a variance-based edge perception factor to the cost function to better preserve image edges. Since there are no explicit constraints to deal with image edges in WGIF, Kou et al. proposed the Gradient Domain Guided Image Filter (GGIF) [14]. It improves the edge perception factor in WGIF and proposes a new detail regulation factor so that the gradients of the guide image and the output image are more similar. However, neither WGIF nor GGIF can avoid the generation of halo artifacts when applied to contrast enhancement. Lu et al. proposed the Effective Guided Image Filter (EGIF) [15], which introduces a new variance-based weighting method in the cost function. Since this method suppresses areas with small variance, it causes information loss in areas with insignificant details.

In recent years, with continuous research on the GIF, researchers from various countries have proposed many IR image enhancement algorithms based on the GIF. Liu et al. proposed an IR image detail enhancement algorithm based on the GIF [16], which can effectively avoid the gradient flip phenomenon and retain more detailed textures. However, because the algorithm uses fixed parameters, it adapts poorly to different scenes. Zhou et al. proposed an adaptive IR image enhancement algorithm based on the GIF [17], which determines the adaptive thresholds for the base layer from the histogram distribution so that the base layer can better display the valid information in the image. However, because the fusion ratio between the base layer and the detail layer is set to a fixed value in this algorithm, an IR image with a large area of sky background will contain a lot of noise after processing. Wang et al. proposed an adaptive IR image enhancement algorithm based on the GIF [18] to effectively suppress image noise. On the one hand, this algorithm implements an adaptively selected threshold to adjust the mapping range of the base layer histogram. On the other hand, it applies adaptive gain control to the detail layer to highlight the detail contour information and reduce the detail layer noise. However, this algorithm does not specify the parameters of the GIF, and the designed adaptive threshold for the base layer is unreasonable, so the contrast of the enhanced image is not effectively improved. In addition, a variety of other IR image enhancement algorithms based on the GIF have been proposed [5, 19, 20]. In summary, the traditional GIF and its improved variants are prone to halo artifacts and image noise when applied to IR image enhancement, resulting in blurred image edges and poor visual effects. Therefore, this paper proposes a new method to enhance IR images to solve the above problems.

The difference between the algorithm in this paper and previous IR image enhancement algorithms based on the GIF is that this algorithm does not apply the GIF directly to decompose the image to be filtered but proposes a DGIF. The DGIF adds the constructed edge perception and detail regulation factors to the cost function of the original GIF so that the filtered image retains more image edges and weak details. In addition, in view of the visual characteristics of human eyes, this paper also applies the detail regulation factor to the detail layer enhancement to effectively suppress the noise in the detail layer and improve the visual effect.

3 Method

This section proposes an IR image enhancement algorithm based on DGIF. The block diagram of the algorithm is shown in Fig. 2. First, we propose a DGIF, which adds the constructed edge perception factor and detail regulation factor to the cost function of the GIF, so that the filtered image contains more detail information while preserving the image edges. Then, the detail regulation factor is applied to the detail layer enhancement so that the enhanced detail layer effectively suppresses image noise while enhancing the details. Finally, the enhanced detail layer is directly fused with the base layer.

Fig. 2 The algorithm block diagram of this paper

3.1 Guided image filter

The GIF is a local filter with the advantages of edge preservation and low complexity. The core idea is to describe the relationship between the guide image and the filter output within the neighborhood of any pixel by a local linear model.

The GIF decomposes the image \(p_{\text{i}}\) to be filtered into a base layer \(q_{\text{i}}\) reflecting the contour of the image and a detail layer \(n_{\text{i}}\) reflecting the detail texture and noise information.

$$ q_{\text{i}} = p_{\text{i}} - n_{\text{i}} $$
(1)

The base layer \(q_{\text{i}}\) is assumed to be a local linear transform of the guide image \(I_{\text{i}}\) in the filter window \(\omega_{\text{k}}\).

$$ q_{\text{i}} = a_{\text{k}} I_{\text{i}} + b_{\text{k}} ,\;\forall i \in \omega_{\text{k}} $$
(2)

where \(\omega_{\text{k}}\) is the filter window centered on the pixel \(k\) with size \((2r + 1) \times (2r + 1)\), and \(r\) is the filter radius. \(a_{\text{k}}\) and \(b_{\text{k}}\) are the two constants in the window whose values are obtained by minimizing the cost function. The cost function is as follows:

$$ E(a_{\text{k}} ,b_{\text{k}} ) = \sum\limits_{{\text{i} \in \omega_{\text{k}} }} {((a_{\text{k}} I_{\text{i}} + b_{\text{k}} - p_{\text{i}} )^{2} + \varepsilon a_{\text{k}}^{2} )} . $$
(3)

where \(\varepsilon\) is the regularization parameter used to prevent the coefficient \(a_{\text{k}}\) from being too large.
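For reference, minimizing Eq. (3) has the well-known closed-form solution of the original guided filter [9]: \(a_{\text{k}}\) is the local covariance of \(I\) and \(p\) divided by the local variance of \(I\) plus \(\varepsilon\), and \(b_{\text{k}}\) is the local mean of \(p\) minus \(a_{\text{k}}\) times the local mean of \(I\). The following is a minimal NumPy sketch of that baseline, not the authors' implementation. The function names `box_mean` and `guided_filter` and the edge-replicating border handling are assumptions made for illustration. The window means are computed from an integral image, which is why the cost does not depend on the radius \(r\).

```python
import numpy as np

def box_mean(img, r):
    """Mean over a (2r+1)x(2r+1) window; borders use edge replication (assumption)."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    size = 2 * r + 1
    padded = np.pad(img, r, mode="edge")
    # Integral image with a leading row and column of zeros.
    ii = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    total = (ii[size:size + h, size:size + w] - ii[:h, size:size + w]
             - ii[size:size + h, :w] + ii[:h, :w])
    return total / (size * size)

def guided_filter(I, p, r, eps):
    """Classical GIF: returns (base layer q, detail layer n = p - q)."""
    I = np.asarray(I, dtype=np.float64)
    p = np.asarray(p, dtype=np.float64)
    mean_I = box_mean(I, r)
    mean_p = box_mean(p, r)
    cov_Ip = box_mean(I * p, r) - mean_I * mean_p
    var_I = box_mean(I * I, r) - mean_I * mean_I
    a = cov_Ip / (var_I + eps)      # minimizer of Eq. (3) with respect to a_k
    b = mean_p - a * mean_I         # minimizer of Eq. (3) with respect to b_k
    # Average the coefficients of all windows that contain each pixel.
    q = box_mean(a, r) * I + box_mean(b, r)
    return q, p - q
```

With a very small \(\varepsilon\), a clean step edge passes through almost unchanged, while a larger \(\varepsilon\) pushes \(a_{\text{k}}\) toward zero and the output toward the local mean.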

3.2 Detail enhancement guided filter

Since the traditional GIF uses fixed regularization parameters for the whole image, it cannot adapt to the texture features of different image areas, resulting in a “halo” phenomenon near the edge of the image during image enhancement. To enhance the image details and avoid halo artifacts, this paper proposes a DGIF. The base layer retains more image edges and weak details, while the detail layer contains only a large amount of image noise and a small amount of filtered detail information.

In WGIF, an edge perception factor based on local variance is defined according to the variance of the pixels at different positions in the local window. It rests on the default premise that all regions with large local variance are edge regions. However, not all regions with large local variance are at image edges, and this edge perception factor is not sensitive to weak edges. It is generally accepted that gradients are essential to how humans perceive images and that human cortical cells respond preferentially to high-contrast stimuli [21]. Therefore, this paper introduces the gradient into a new edge perception factor, as shown in Eq. (4). The factor scales the regularization parameter so that the filter adapts to each image area and attenuates halo artifacts, thereby improving the perception of weak edges and the robustness of the algorithm. The value \(\Gamma_{\text{I}} (k^{\prime } )\) is usually greater than one if \(k^{\prime }\) is on an edge and less than one if \(k^{\prime }\) is in a smooth area. By using the edge perception factor \(\Gamma_{I} (k^{\prime } )\), the edge pixels are therefore assigned a greater weight than the pixels in flat areas.

$$ \Gamma_{I} (k^{\prime } ) = \frac{1}{N}\sum\limits_{k = 1}^{N} {\frac{{\varphi (k^{\prime } ) + \gamma }}{\varphi (k) + \gamma }} $$
(4)

where \(\varphi (k) = \sigma^{2}_{I,r} (k) + G(\left| {grad} \right|)\) contains rich boundary and detail information, \(\sigma^{2}_{{\text{I},\text{r}}} (k)\) represents the local variance in the \((2r + 1) \times (2r + 1)\) window, \(G\) represents Gaussian filtering, \(\left| {grad} \right|\) represents the absolute value of the gradient, \(\gamma\) is a constant, usually defined as \((0.001 \times L)^{2}\), \(L\) is the dynamic range of the image to be filtered, and \(N\) is the total number of pixels in the guide image \(I\). In addition, the edge perception factor \(\Gamma_{\text{I}} (k^{\prime } )\) measures the importance of the pixel \(k^{\prime }\) relative to the whole guide image, and the algorithmic complexity of \(\Gamma_{\text{I}} (k^{\prime } )\) is \(O(N)\) for an image with \(N\) pixels.
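One possible way to evaluate Eq. (4) is sketched below. Since the numerator does not depend on the summation index, \(\Gamma_{I} (k^{\prime } )\) equals \((\varphi (k^{\prime } ) + \gamma )\) times the image-wide mean of \(1/(\varphi (k) + \gamma )\), which gives the \(O(N)\) cost. The paper does not state the gradient operator or the Gaussian width, so the central differences of `np.gradient` and `sigma=1.0` are assumptions, as are all function names and the edge-replicating borders.

```python
import numpy as np

def box_mean(img, r):
    """Mean over a (2r+1)x(2r+1) window; borders use edge replication (assumption)."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    size = 2 * r + 1
    padded = np.pad(img, r, mode="edge")
    ii = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    total = (ii[size:size + h, size:size + w] - ii[:h, size:size + w]
             - ii[size:size + h, :w] + ii[:h, :w])
    return total / (size * size)

def gaussian_blur(img, sigma=1.0):
    """Separable Gaussian smoothing; sigma is an assumed value."""
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")
    rows = sum(kv * padded[:, i:i + w] for i, kv in enumerate(kernel))
    return sum(kv * rows[i:i + h, :] for i, kv in enumerate(kernel))

def edge_strength(I, r, sigma=1.0):
    """phi(k): local variance plus Gaussian-filtered gradient magnitude."""
    I = np.asarray(I, dtype=np.float64)
    local_var = np.maximum(box_mean(I * I, r) - box_mean(I, r) ** 2, 0.0)
    gy, gx = np.gradient(I)   # central differences (assumed operator)
    return local_var + gaussian_blur(np.hypot(gx, gy), sigma)

def edge_perception_factor(I, r, L=255.0, sigma=1.0):
    """Gamma_I(k') of Eq. (4) for every pixel k', with gamma = (0.001 L)^2."""
    gamma = (0.001 * L) ** 2
    phi = edge_strength(I, r, sigma)
    return (phi + gamma) * np.mean(1.0 / (phi + gamma))
```

On an image with a single step edge, this sketch returns values well above one on the edge and slightly below one in the flat regions, in line with the behavior described for \(\Gamma_{I} (k^{\prime } )\).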

According to the local linear model of Eq. (2), we obtain \(\nabla q_{\text{i}} = a_{\text{k}} \nabla I_{\text{i}}\), which means that the output image and the guide image have similar gradients. The degree of smoothing of the filter output image depends on the window coefficient \(a_{\text{k}}\). If \(a_{\text{k}}\) is 1, the pixel is at an edge of the image, and that edge will be preserved intact. On the contrary, if \(a_{\text{k}}\) is 0, the pixel is in a flat area of the image, and that area is well smoothed. This paper proposes a new detail regulation factor \(\gamma_{\text{k}}\) to make the image edges stand out better, as shown in Eq. (5). This factor not only preserves the image edges but also enables the weak details in the relatively flat areas of the image to be well preserved. \(\gamma_{\text{k}}\) is close to 1 when the pixel is in an edge area and close to 0 when the pixel is in a flat area.

$$ \gamma_{\text{k}} = \frac{\lg (n + x) - \lg (x)}{{\lg (1 + x) - \lg (x)}} $$
(5)

where \(x\) is 0.3 times the pixel mean of the image to be filtered, and \(n\) is defined as follows:

$$ n = \frac{{\varphi (P) - \,\text{min}\,(\varphi (P))}}{\max \,(\varphi (P)) - \,\min \,(\varphi (P))} $$
(6)
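Equations (5) and (6) can be sketched as follows. Equation (6) min-max normalizes \(\varphi\) to \([0, 1]\), and Eq. (5) maps the result through a logarithmic curve that is exactly 0 at \(n = 0\) and exactly 1 at \(n = 1\). The function name `detail_regulation_factor`, the use of a precomputed \(\varphi\) map as input, and the handling of a constant \(\varphi\) map are assumptions for illustration. The sketch also assumes the image mean is positive so that \(\lg (x)\) is defined.

```python
import numpy as np

def detail_regulation_factor(phi, p, x_scale=0.3):
    """gamma_k of Eq. (5) from an edge-strength map phi and the input image p."""
    phi = np.asarray(phi, dtype=np.float64)
    span = phi.max() - phi.min()
    # Eq. (6): min-max normalization; a constant map is treated as all-flat.
    n = (phi - phi.min()) / span if span > 0 else np.zeros_like(phi)
    x = x_scale * float(np.mean(p))   # x = 0.3 * pixel mean; must be positive
    return (np.log10(n + x) - np.log10(x)) / (np.log10(1.0 + x) - np.log10(x))
```

Because \(x\) scales with the image mean, the curve is close to linear in \(n\) when \(x\) is much larger than 1 and more strongly logarithmic when \(x\) is small, so the intensity scale of the input affects how strongly weak details are lifted.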

Therefore, from Eqs. (3), (4), and (5), we define a new cost function for the DGIF proposed in this paper as follows:

$$ E(a_{\text{k}} ,b_{\text{k}} ) = \sum\limits_{{\text{i} \in \omega_{\text{k}} }} {\left( {(a_{\text{k}} I_{\text{i}} + b_{\text{k}} - p_{\text{i}} )^{2} + \frac{\varepsilon }{{\Gamma_{\text{I}} (k^{\prime } )}}(a_{\text{k}} - \gamma_{\text{k}} )^{2} } \right)} $$
(7)

The optimal values of \(a_{\text{k}}\) and \(b_{\text{k}}\) are calculated as follows:

$$ a_{\text{k}} = \frac{{\mu_{{\text{I} \odot \text{p}}} (k) - \mu_{\text{I}} (k)\mu_{\text{p}} (k) + \frac{\varepsilon }{{\Gamma_{\text{I}} (k^{\prime } )}}\gamma_{\text{k}} }}{{\sigma_{\text{I}}^{2} (k) + \frac{\varepsilon }{{\Gamma_{\text{I}} (k^{\prime } )}}}} $$
(8)
$$ b_{\text{k}} = \mu_{\text{p}} (k) - a_{\text{k}} \mu_{\text{I}} (k) $$
(9)

where \(\odot\) represents the product of the corresponding elements of the two matrices, \(\mu_{{\text{I} \odot \text{p}}} (k)\), \(\mu_{I} (k)\) and \(\mu_{\text{p}} (k)\) represent the mean values of \(I \odot P\), \(I\) and \(P\) in the filter window \(\omega_{\text{k}}\), respectively, and \(\sigma_{I}^{2} (k)\) represents the variance.
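As a brief check, Eqs. (8) and (9) follow from setting the partial derivatives of Eq. (7) to zero. The shorthand \(\lambda = \varepsilon /\Gamma_{\text{I}} (k^{\prime } )\) is introduced here only to keep the lines short, and \(\mu_{{\text{I} \odot \text{I}}} (k)\) denotes the window mean of \(I \odot I\).

$$ \frac{\partial E}{\partial b_{\text{k}}} = 2\sum\limits_{\text{i} \in \omega_{\text{k}}} (a_{\text{k}} I_{\text{i}} + b_{\text{k}} - p_{\text{i}}) = 0 \;\Rightarrow\; b_{\text{k}} = \mu_{\text{p}} (k) - a_{\text{k}} \mu_{\text{I}} (k) $$

$$ \frac{\partial E}{\partial a_{\text{k}}} = 2\sum\limits_{\text{i} \in \omega_{\text{k}}} \left( (a_{\text{k}} I_{\text{i}} + b_{\text{k}} - p_{\text{i}}) I_{\text{i}} + \lambda (a_{\text{k}} - \gamma_{\text{k}}) \right) = 0 $$

Dividing the second condition by \(2\left| \omega \right|\) and substituting \(b_{\text{k}}\) gives

$$ a_{\text{k}} \left( \mu_{{\text{I} \odot \text{I}}} (k) - \mu_{\text{I}}^{2} (k) \right) - \left( \mu_{{\text{I} \odot \text{p}}} (k) - \mu_{\text{I}} (k)\mu_{\text{p}} (k) \right) + \lambda (a_{\text{k}} - \gamma_{\text{k}}) = 0 $$

Since \(\sigma_{\text{I}}^{2} (k) = \mu_{{\text{I} \odot \text{I}}} (k) - \mu_{\text{I}}^{2} (k)\), solving for \(a_{\text{k}}\) yields Eq. (8). Setting \(\gamma_{\text{k}} = 0\) and \(\Gamma_{\text{I}} (k^{\prime } ) = 1\) recovers the solution of the original cost function in Eq. (3).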

As the filter window slides over the image, each pixel \(i\) is contained in \(\left| \omega \right|\) filter windows, where \(\left| \omega \right|\) is the number of pixels in a window. To get the filter output for a single pixel, simply calculate the mean values of \(a_{\text{k}}\) and \(b_{\text{k}}\) over all windows that contain it.

$$ q_{\text{i}} = \frac{1}{\left| \omega \right|}\sum\limits_{{\text{k}\left| {\text{i} \in \omega_{\text{k}} } \right.}} {(a_{\text{k}} I_{\text{i}} + b_{\text{k}} )} = \overline{{a_{\text{i}} }} I_{\text{i}} + \overline{{b_{\text{i}} }} $$
(10)
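Equations (8) to (10) can be combined into the following sketch of the filter. It takes the per-pixel maps of \(\Gamma_{\text{I}}\) and \(\gamma_{\text{k}}\) as inputs, evaluated at the window centers; that per-pixel indexing, the function names `box_mean` and `dgif`, and the edge-replicating borders are assumptions made for illustration rather than details given in the paper.

```python
import numpy as np

def box_mean(img, r):
    """Mean over a (2r+1)x(2r+1) window; borders use edge replication (assumption)."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    size = 2 * r + 1
    padded = np.pad(img, r, mode="edge")
    ii = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    total = (ii[size:size + h, size:size + w] - ii[:h, size:size + w]
             - ii[size:size + h, :w] + ii[:h, :w])
    return total / (size * size)

def dgif(I, p, r, eps, Gamma, gamma):
    """Sketch of Eqs. (8)-(10): returns (base layer q, detail layer n = p - q)."""
    I = np.asarray(I, dtype=np.float64)
    p = np.asarray(p, dtype=np.float64)
    mean_I = box_mean(I, r)
    mean_p = box_mean(p, r)
    cov_Ip = box_mean(I * p, r) - mean_I * mean_p
    var_I = box_mean(I * I, r) - mean_I * mean_I
    reg = eps / Gamma                              # spatially varying regularization
    a = (cov_Ip + reg * gamma) / (var_I + reg)     # Eq. (8)
    b = mean_p - a * mean_I                        # Eq. (9)
    q = box_mean(a, r) * I + box_mean(b, r)        # Eq. (10)
    return q, p - q
```

Two limiting cases are useful sanity checks: with `gamma` equal to 0 and `Gamma` equal to 1 everywhere the sketch reduces to the classical GIF, and with `gamma` equal to 1 everywhere and `I` equal to `p` the output reproduces the input, since \(a_{\text{k}} = 1\) and \(b_{\text{k}} = 0\) in every window.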

3.3 Detail layer enhancement and image fusion

The detail layer corresponds to the high-frequency part of the image, which contains both the detailed information reflecting the image texture and the noise [22]. For IR images, directly enhancing the detail layer with a fixed gain factor will also amplify the noise in the flat areas of the image and reduce the enhancement effect.

Psychological analysis confirms that the human eye is more sensitive to noise in flat areas. In contrast, in the edge areas of the image, the human eye’s ability to perceive noise decreases as the gray-level change at the edge becomes more intense. Because the human eye is less sensitive to noise in such edge areas, the current standard method for detail layer enhancement is the adaptive gain correction method based on the noise visibility function. The gain coefficient is calculated as shown in Eq. (11).

$$ f(i,j) = \frac{1}{M(i,j) \times \theta + 1} $$
(11)

where \(\theta\) is the parameter adjustment factor, whose value is usually set to 1. \(M({\text{i,j}})\) is the noise characterization value at the pixel \(({\text{i,j}})\); a larger value means that the point is more likely to be noisy. \(f({\text{i,j}})\) is the gain value at the pixel \(\left( {\text{i,j}} \right)\). The gain value is small for noise in flat areas, which means the noise is suppressed. The gain value is large for edge areas, which means these regions are amplified. The detail regulation factor \(\gamma_{\text{k}}\) in the DGIF also approximately reflects the local grayscale change of the image, with \(\gamma_{\text{k}} \approx 1\) in edge areas and \(\gamma_{\text{k}} \approx 0\) in flat areas. The analysis shows that the detail regulation factor and the gain factor have similar characteristics: both are small when the pixel is located in a flat region and large when the pixel is in an edge region. Therefore, this paper introduces the detail regulation factor into the detail layer enhancement.
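The mapping in Eq. (11) can be illustrated with a few numbers. The sketch below takes the noise characterization map \(M\) as given, since its computation is not specified in this section; the function name `noise_visibility_gain` is hypothetical.

```python
import numpy as np

def noise_visibility_gain(M, theta=1.0):
    """Gain f = 1 / (M * theta + 1) of Eq. (11), applied element-wise."""
    M = np.asarray(M, dtype=np.float64)
    return 1.0 / (M * theta + 1.0)

# M = 0 gives gain 1, M = 1 gives 0.5, and M = 9 gives 0.1 (with theta = 1).
gains = noise_visibility_gain(np.array([0.0, 1.0, 9.0]))
```

The gain decreases monotonically from 1 toward 0 as \(M\) grows, so pixels judged more likely to be noise receive less amplification.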

The result of the enhancement processing of the detail layer is as follows:

$$ n_{\text{e}} = (k \times \gamma_{\text{k}} + b) \times n $$
(12)

The fused output image is defined as follows:

$$ p_{\text{e}} = q + n_{\text{e}} $$
(13)

where \(n_{\text{e}}\) is the enhanced detail layer, \(p_{\text{e}}\) is the final enhanced image, and \(k\) and \(b\) are constant factors. The final enhancement effect is jointly determined by the two constant factors \(k\) and \(b\). Under certain conditions, the larger the value of \(k\), the more pronounced the noise suppression effect, and the larger the value of \(b\), the better the enhancement of edge details in the detail layer. However, excessively large values of \(k\) and \(b\) will over-enhance the infrared target and cause severe loss of local detail information. Through experimental analysis, \(k\) and \(b\) are empirically set to 1 and 4.5, respectively, in this paper.
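Equations (12) and (13) amount to a per-pixel gain on the detail layer followed by an addition. A minimal sketch, with the hypothetical function name `enhance_and_fuse` and the defaults \(k = 1\) and \(b = 4.5\) taken from the text:

```python
import numpy as np

def enhance_and_fuse(q, n, gamma, k=1.0, b=4.5):
    """Eq. (12): n_e = (k * gamma + b) * n, then Eq. (13): p_e = q + n_e."""
    q = np.asarray(q, dtype=np.float64)
    n = np.asarray(n, dtype=np.float64)
    gamma = np.asarray(gamma, dtype=np.float64)
    n_e = (k * gamma + b) * n
    return q + n_e
```

With these defaults the detail gain ranges from 4.5 in flat areas (\(\gamma_{\text{k}} \approx 0\)) to 5.5 at edges (\(\gamma_{\text{k}} \approx 1\)). The result can leave the dynamic range of the input, so a display pipeline would typically clip or rescale it; that step is not described in the paper and is left out here.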

4 Experimental analysis

4.1 Ablation experiment

To verify the effectiveness of the DGIF, this section conducts ablation experiments on DATASETS CIDI for the edge perception factor and the detail regulation factor, with the guided filter as the baseline. The first structure adds the edge perception factor to the baseline. The second structure adds the detail regulation factor to the baseline. The third structure adds both the edge perception factor and the detail regulation factor to the baseline. The visualization results of the ablation experiment are shown in Fig. 3, and the detailed quantitative results are shown in Table 1.

Fig. 3 Ablation experiment visualization results. From left to right: Baseline, Baseline + edge perception, Baseline + detail regulation, Baseline + edge perception + detail regulation

Table 1 DGIF Ablation experiments

It can be seen from Fig. 3 that the overall image is smoothed significantly after GIF processing, which retains the overall edge information and some detailed information of the image but also produces certain halo artifacts at the image edges. After the edge perception factor is added to the baseline, the image contrast is improved to a certain extent, and the halo artifacts are significantly reduced. Adding the detail regulation factor to the baseline results in significantly improved image sharpness, richer detail information, and no halo artifacts. After both the edge perception and detail regulation factors are added to the baseline, the infrared target is more prominent and has richer detail texture. The data in Table 1 show that the edge intensity and image contrast are improved to some extent under the separate and joint constraints of the edge perception factor and the detail regulation factor, with the GIF as the baseline. Both the qualitative and the quantitative analysis thus support the rationality and effectiveness of the DGIF.

4.2 Guided filter comparison experiment

To verify the ability of the proposed DGIF to preserve image edges and smooth the noise in flat areas, the DGIF is compared with GIF [9], WGIF [13], EGIF [15], and GGIF [14]. All filter parameters are set to the optimal values reported in the literature; the filter radius in this paper is 16, and the regularization parameter is 0.0196.

Figure 4a shows a composite image whose edges lie at the junctions between the gray levels. Figure 4b–f shows the base layers generated from Fig. 4a by GIF, WGIF, EGIF, GGIF, and the proposed DGIF. Visually, slight halo artifacts are generated near the image edges after WGIF, GGIF, and GIF, and the GGIF generates more serious halo artifacts than the WGIF and GIF. In contrast, the images produced by EGIF and DGIF have less significant halo artifacts near the image edges. These results show that the DGIF proposed in this paper can effectively preserve image edges.

Fig. 4 Comparison of edge retention capabilities

Figure 5a comprises four noisy image blocks, with strong edges at the junctions of the blocks. Figure 5b–f shows the base layers of Fig. 5a generated by GIF, WGIF, EGIF, GGIF, and the proposed DGIF. It can be seen from Fig. 5d that the brightness of the image generated by EGIF is slightly lower than that of the original image, and the flat areas of the image are not smoothed. In contrast, GIF, WGIF, GGIF, and DGIF have stronger smoothing ability than EGIF, and GGIF’s smoothing ability is weaker than that of GIF, WGIF, and DGIF. These results show that the DGIF proposed in this paper has a certain smoothing ability for the flat areas of the image.

Fig. 5 Comparison of smoothing capabilities

Due to the visual characteristics of the human eye, slight differences between images cannot be perceived visually. To further verify the superiority of the proposed DGIF in edge preservation and smoothing, Fig. 6c and d shows a quantitative analysis of one-dimensional signals from images A and B processed by the four classical guided filters and the DGIF. The one-dimensional signals are the gray-level profiles of the 330th column of image A and the 230th row of image B. The partially enlarged view in Fig. 6c shows that the output values of DGIF are closer to the original image than those of GIF, WGIF, and GGIF. The output values of EGIF near the image edge are the same as those of the original image. These results show that the proposed DGIF has a stronger edge preservation ability than GIF, WGIF, and GGIF but a weaker one than EGIF.

Fig. 6 Edge preserving smooth 1D illustration

It can be seen from the partial enlargement of Fig. 6d that the output values of DGIF, GIF, WGIF, and GGIF are far from the values of the original image, and the output curve of DGIF is smoother than those of GIF, WGIF, and GGIF. The output of EGIF always follows a trend similar to that of the original image and stays relatively close to it. This shows that the smoothing ability of the proposed DGIF is stronger than that of GIF, WGIF, GGIF, and EGIF, and that EGIF has the worst smoothing effect on the flat areas of the image.

Combining the subjective evaluation and quantitative analysis, the DGIF proposed in this paper has superiority in image edge preservation and flat area smoothing compared with the other four classical guided filters.

4.3 Qualitative results

To verify the detail enhancement effect of the proposed algorithm on infrared images, infrared images are selected from the TNO Image Fusion Dataset and DATASETS CIDI for experiments. The proposed algorithm is compared with five infrared image enhancement algorithms, including the traditional infrared image enhancement algorithms AHPBC [23], AGCWD [24], and RSTDA [25], and the guided filter-based infrared image enhancement algorithms EGIF [15] and GGIF [14]. All the algorithms are then evaluated comprehensively in terms of subjective visual effects and evaluation indicators.

This paper selects three representative infrared images for the evaluation of subjective visual effects. The image shown in Fig. 7 has a low overall gray level and contains prominent IR targets. Figure 8 shows a building scene with more edge information and no prominent IR targets. Figure 9 is a complex scene that contains rich detail and obvious IR targets, such as ground, trees, houses, and pedestrians.

Fig. 7 The performance comparison of different enhancement algorithms on image “thermal”

Fig. 8 The performance comparison of different enhancement algorithms on image “Marne_01”

Fig. 9 The performance comparison of different enhancement algorithms on image “2_men_in_front_of_house”

Figure 7 shows the enhancement effect of all enhancement algorithms on the image “thermal.” The AHPBC algorithm yields low overall contrast, but some details are enhanced, such as road edges and trees around houses. Although the AGCWD algorithm improves the image’s overall brightness, the contrast is low, resulting in a poor overall visual effect. While the RSTDA algorithm improves overall contrast, some detailed information is masked by noise, and the visual effect is poor. Compared with the AHPBC and AGCWD algorithms, the EGIF algorithm improves overall image contrast, but the edge details are not highlighted well. The comparison shows that the algorithm in this paper and the GGIF algorithm have the best enhancement effect. Of the two, the algorithm in this paper has richer detail texture, such as the overall outline of the house and the water tower on the roof, which are easier to observe, with higher contrast and a better visual effect for human eyes.

Figure 8 shows the enhancement effects of all enhancement algorithms on the image “Marne_01.” The AHPBC algorithm has a softer overall enhancement effect, and the visual effect is not significantly improved. The RSTDA algorithm improves image clarity to some extent, but the image contrast is not high enough to improve the visual effect. The result of the AGCWD algorithm is overexposed, making it impossible to observe the image details, and the overall image is blurred with poor visual effects. The EGIF algorithm has a certain enhancement effect on image details: scene details such as the overall outline of the house, trees, and fences are highlighted. However, the overall contrast of the image is not sufficiently improved. The GGIF algorithm also has a certain enhancement effect, but although the image brightness is improved overall, the outline details and contrast are not sufficiently improved. The algorithm in this paper improves the contrast and clarity of each area of the image to different degrees; for example, the edge outline of the roof is clearer, and the contrast of the wall is improved. The details of the scene are highlighted, and the visual effect appears more realistic to the human eye.

Figure 9 shows the enhancement effect of all enhancement algorithms on the image “2_men_in_front_of_house.” The overall enhancement effect of the AHPBC algorithm is not apparent enough. The RSTDA result has low image contrast, which leads to a poor overall visual effect. The result of the AGCWD algorithm is too bright, and the detailed information of the target scene is seriously lost, which makes it hard to observe visually. Although the EGIF algorithm improves the overall brightness to a certain extent, the overall contrast of the image is not high, and the detailed information is not prominent enough, such as the details of the shrubs and background trees below. The GGIF algorithm can better extract the texture details of the infrared target and background in the target scene. However, because the gray values change slowly at strong edges after guided filtering, there is a halo artifact along the person’s outline. The algorithm in this paper significantly improves image contrast. It has a better effect on detail enhancement and preservation, especially for the outline of the house and the external pavilion, and the details of the trees are better reflected. At the same time, the outline of the infrared target is visible without halo artifacts, which gives an excellent visual effect.

4.4 Quantitative results

To comprehensively evaluate the enhancement effect, six image evaluation indexes, namely Information Entropy (IE), Average Gradient (AG), Edge Intensity (EIN) [26], Figure Definition (FD) [27], Linear Index of Fuzziness (η) [28], and Root-Mean-Square Contrast (RMSC) [29], are used to objectively evaluate the enhancement effect of different algorithms in 20 different scenes. All evaluation results are shown in Fig. 10. The average value of each evaluation parameter is given in Table 2, and the optimal value of each parameter is marked in bold.
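Three of these indexes have widely used textbook forms, sketched below for orientation. These are common definitions and may differ in detail from the formulas in the cited references: IE is taken as the Shannon entropy of the gray-level histogram, AG as the mean of \(\sqrt{(\Delta_{x}^{2} + \Delta_{y}^{2})/2}\) over forward differences, and RMSC as the standard deviation of the pixel intensities. EIN, FD, and η are omitted because their formulas are not given in this paper. All function names are hypothetical.

```python
import numpy as np

def information_entropy(img, levels=256):
    """Shannon entropy (bits) of the gray-level histogram of an integer image."""
    hist = np.bincount(np.asarray(img).astype(np.int64).ravel(), minlength=levels)
    prob = hist[hist > 0] / hist.sum()
    return float(-(prob * np.log2(prob)).sum())

def average_gradient(img):
    """Mean of sqrt((dx^2 + dy^2) / 2) using forward differences."""
    img = np.asarray(img, dtype=np.float64)
    dx = img[:-1, 1:] - img[:-1, :-1]
    dy = img[1:, :-1] - img[:-1, :-1]
    return float(np.mean(np.sqrt((dx ** 2 + dy ** 2) / 2.0)))

def rms_contrast(img):
    """Standard deviation of the pixel intensities."""
    return float(np.asarray(img, dtype=np.float64).std())
```

For example, an image whose pixels are split evenly between gray levels 0 and 255 has an entropy of 1 bit and an RMS contrast of 127.5 under these definitions.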

Fig. 10 Comparison of six evaluation indicators

Table 2 Quantitative evaluation means of 20 images

The results show that the algorithm in this paper has higher AG, EIN, and FD than the other five algorithms in seventeen of the twenty scenes. A larger IE value indicates that the enhanced image contains more image information. A larger AG value indicates that the enhanced image contains more gradient information and detailed texture. A larger EIN value indicates that the enhanced image has higher contrast and richer detail texture. A larger FD value indicates that the enhanced image has higher clarity. A larger RMSC value indicates higher contrast of the IR image. A smaller value of η means that the image contains less noise and that the enhancement effect is better. Although the remaining evaluation parameters do not reach the optimal value in every scene, it can be seen from Table 2 that the algorithm of this paper achieves the optimal mean value of every evaluation parameter over the twenty scenes. Compared with the best competing value, the proposed algorithm achieves about 0.23%, 3.4%, 4.3%, 2.1%, and 0.7% improvement in IE, AG, EIN, FD, and RMSC, respectively, further illustrating the robustness of the proposed algorithm. The evaluation results are consistent with the subjective visual evaluation, which supports the conclusion that the algorithm proposed in this paper has certain advantages.

To further illustrate the effectiveness of the algorithm proposed in this paper, we also compared it with the recently published infrared image enhancement algorithm LEAS [28]. Since the source code of that paper is not publicly available, we can only apply the algorithm in this paper to five images from the literature. We compare the original evaluation data in the literature with the evaluation results of the algorithm in this paper, and the comparison results are shown in Table 3. The table shows that the evaluation parameters of the proposed algorithm achieve the optimal values in all five scenes, which further indicates that the proposed algorithm has certain superiority.

Table 3 Comparison of five scene evaluation parameters

5 Conclusions

This paper proposes an infrared image enhancement algorithm based on a detail enhancement guided filter. The difference between this algorithm and other infrared image enhancement algorithms based on edge-preserving filters is that it does not simply apply guided filters to decompose the image. Instead, it introduces the constructed edge perception and detail regulation factors into the cost function of the guided filter. Therefore, the base layer image retains more edge information while the image is smoothed, and the generation of halo artifacts is avoided. In addition, according to the visual characteristics of human eyes, the constructed detail regulation factor is used to adaptively enhance the detail layer of the image, which enhances the detail texture in the detail layer and effectively suppresses the image noise. Finally, the base layer image is fused with the enhanced detail layer. The experimental results show that the proposed algorithm can effectively handle infrared images of many scenes. Compared with other algorithms, the mean values of the six evaluation indicators IE, AG, EIN, FD, RMSC, and η of the algorithm in this paper are optimal. This shows that the algorithm of this paper has certain superiority in effectively enhancing the visual effect of infrared images while maintaining the image details.