1 Introduction

Owing to the widespread use of social media, people are increasingly recording their daily lives with image acquisition devices such as cameras. Under low-light conditions, these devices cannot capture a sufficient number of photons, resulting in low-quality images. To increase the light sensitivity of an image capture device, the optical size of the image sensor or its light-sensitive area is usually reduced. However, these operations require a longer exposure time to gather enough photons, which may introduce new problems. For example, prolonging the exposure time increases the possibility of hand tremor and, thus, of motion blur. In addition, low-illuminance images are characterized by low definition, low contrast, and high noise, which are severe obstacles to image processing applications [1].

To enhance the quality of low-illuminance images, various aspects should be considered; for example, when addressing the saturation of a bright area, attention must also be paid to its details. A number of studies have been conducted in this regard, using methods based on histograms, Retinex theory, and transmission maps. Histogram-based methods, such as histogram equalization (HE) [2], use the cumulative distribution function of the input histogram as the transformation function to enhance global or local contrast, thereby mitigating the effects of low light. However, this approach can lead to under- or over-enhancement [3]. The Retinex theory, proposed by Land et al., estimates the lightness of an image at different wavelengths by computing ratios along paths between the target pixel and its neighbors. The theory models the color constancy of human vision, indicating that the human eye's judgment of an object's color is largely independent of the illumination surrounding it [4, 5]. In SRIE [6], a weighted variational model was used to obtain better reflectance and illumination estimates. Banić et al. [7] improved the random sprays Retinex (RSR) method to further reduce the impact of noise. Single-scale Retinex (SSR) [8] is prone to halo effects near edges, depending on the variance of its Gaussian low-pass filter. To reduce the halo effect of SSR, the multi-scale Retinex (MSR) method was proposed [9]. To further reduce color loss, a multi-scale Retinex with color restoration (MSRCR) method based on color constancy was also proposed [10]. However, the illumination component estimated by Retinex-based methods is not sufficiently accurate, and the results are prone to noise amplification and color distortion.

In [11], the JED low-light enhancement method was proposed to reduce noise while ensuring low-light enhancement. This method decomposes the Retinex model sequentially to piecewise-smooth the illumination and suppress noise in the reflectance. The multi-exposure fusion framework proposed in BIMEF [12] provides improved contrast and brightness enhancement. Further, building on dehazing methods for hazy images, the authors of [13] invert low-light images so that they resemble hazy images.

Owing to the way light reflects from object surfaces under low-light conditions, the brightness distribution of a scene becomes uneven. To the best of our knowledge [14], multiple overexposed images can be used to enhance low-light images. As shown in Fig. 1, some underexposed areas become visible as the exposure level increases, but areas that were originally well exposed or overexposed become worse. Because the exposure adjustment is performed globally, the overexposed areas in the low-light image are enhanced again, causing excessive enhancement. Such secondary enhancement of overexposed areas in low-light images is a common problem among enhancement methods.

Fig. 1

Examples of images with different exposure scales

To alleviate the problem of excessive enhancement, we propose a low-light image enhancement method based on multi-illumination estimation. The method is composed of two parts: the first obtains exposure-corrected images, and the second performs image fusion. In the first part, we obtain multiple exposure images through gamma correction and inversion and then estimate the illumination map of each image to obtain exposure-corrected images. In the second part, the exposure-corrected images are combined into the low-light enhanced image through weights. By inverting the image, we adjust the exposure of the overexposed areas in the low-light image, avoiding their secondary enhancement. During multi-exposure image fusion, we use Laplacian multi-scale fusion to avoid artifacts.

We regard low-light image enhancement as a problem of spatial contrast and saturation enhancement. The main contributions of this study include the following three aspects:

  1.

    We performed double exposure correction on the image, which effectively improves its global brightness. Gamma correction is applied first to adjust the image brightness and reveal the dark areas. Exposure correction is then performed again on the gamma-corrected image through illumination estimation, and the result is used as an input for image fusion.

  2.

    We inverted the low-light image to adjust the exposure of the overexposed areas, which further suppresses excessive enhancement of the image.

  3.

    We adjusted the underexposed and overexposed areas of the low-light image separately to improve the quality of the low-light enhanced image. Through qualitative and quantitative analysis of a large number of experimental results, we concluded that our method is effective in improving brightness, enhancing contrast, and preserving image details.

2 Related work

Low-light image enhancement is a hot topic in the image processing field. Current studies on improving the quality of low-light images mainly consider two aspects. The first is to improve the quality of the captured image through hardware, such as infrared sensors or thermal imaging equipment [15]. A camera with an infrared sensor can compensate for the limitations of human eyes in low-light conditions and capture the general outline of an object. However, the quality of images captured by a camera with an infrared sensor is not high, while its price generally is. The second approach is to shoot with a standard camera and then apply a video [16, 17] or image enhancement algorithm to the captured low-light content. Low-light image enhancement technology can effectively improve the quality of low-light images and has a wide range of applications. Low-light enhancement methods can be roughly divided into two categories: traditional methods and deep learning-based methods. Deep learning-based methods use convolutional neural networks for feature extraction [18,19,20,21]. In the following subsections, we summarize the work of previous researchers and discuss other related works in this field.

2.1 Low-illumination imaging

The light emitted from the surface of an object is reflected onto the imaging unit of a camera to form an image; this reflected light is the light that the object itself cannot absorb. The color perception of non-light-source objects depends on the spectral components of the external light and the characteristics of their absorption spectrum. For example, red flower petals absorb most of the other spectral components and reflect the red part that cannot be absorbed; therefore, the petals appear red. In the case of insufficient indoor light or cloudy conditions, the photon count and signal-to-noise ratio are relatively low, resulting in weak reflected light from the object's surface and making the captured image appear underexposed.

In the era of digital photography, Photoshop has provided a high-dynamic-range (HDR) function since 2005 to better improve the layering of photos. An ordinary camera cannot record all details of brightness and darkness in a single shot. However, the HDR function can merge images with different exposure levels to overcome this shortcoming. The stack-based HDR method [22, 23] combines multiple captured images with different exposures and/or low dynamic ranges, thereby improving image quality. Although the HDR method can combine images with different degrees of exposure, subtle differences in image content arising from scene motion between shots can produce “artifacts” in the HDR result.

In low-light conditions, cameras and related devices use the raw sensor data to produce an enhanced RGB output. However, the enhanced low-light image also exhibits some problems. For example, the enhancement process adds grainy noise to the image's channels. This noise is mainly concentrated in the high-frequency components of the image, so with this prior knowledge it can be filtered with a low-pass filter. However, because the noise generated during enhancement is relatively complex, a simple low-pass filter may not be sufficient.

2.2 Low-illumination imaging enhancement

A low-light image itself has a low signal-to-noise ratio, low dynamic range and visibility, and low contrast. Moreover, as the entire image is underexposed, the actual color of the object cannot be recorded. To improve the quality of low-light images, researchers have focused on low-light enhancement.

In this study, we analyze the histogram of a low-light image, as shown in Fig. 2. Figure 2a is the original low-light image. As shown in the histogram statistics, the dynamic range of the low-light image is relatively narrow, with low image contrast. Figure 2b shows the image processed by the method mentioned in this paper. Here, the dynamic range of the image becomes wider, and the contrast is enhanced. From an image processing perspective, we can increase the contrast and information entropy (IE) of the image by HE, stretching the gray range of the low-light image, and other methods.

Fig. 2

Histogram statistics of the low-light image and its enhanced result. a Original image. b Our proposed method

At present, researchers have proposed a variety of methods to enhance image contrast. For example, HE remaps pixels based on the probability distribution of the input image to make the histogram distribution more uniform and widen the dynamic range [24, 25]. However, HE suffers from excessive enhancement and image distortion, among other problems [26]. The author of [27] proposed a fractional fusion model (FFM) for low-light enhancement. This method uses a fractional mask to extract content from low-light areas and can effectively suppress noise while keeping the image clear and natural.

Noise is inevitably generated in the image enhancement process. Therefore, when enhancing a low-light image, noise reduction should also be performed. Malm et al. [28] proposed an adaptive spatiotemporal smoothing and contrast-limited HE enhancement method to increase the dynamic range of low-light images while denoising. However, this method has a relatively high computational cost. Horn's Retinex method estimates the reflectance component of the input image through a log transformation [29]. However, the illumination component estimated by the Retinex method is not sufficiently accurate, and noise can easily be amplified.

With the development of deep learning, low-light enhancement methods based on deep learning have been emerging continuously. The multi-branch low-light image enhancement network (MBLLEN) proposed in [30] extracts features of different levels through multiple network branches, combines the extracted features, and then merges the branches into a final low-light enhanced image. Further, the low-light net proposed in [31] provides contrast enhancement and denoising modules. KinD [32] provides an end-to-end low-light enhancement network trained on pairs of images captured under different exposure conditions. The network is divided into two parts: one adjusts the lighting, and the other improves the reflectance. Deep learning-based methods are trained in a data-driven manner and have achieved satisfactory results in various applications. The choice of dataset is therefore extremely critical for a data-driven low-light enhancement method: unevenly illuminated images in the dataset may cause local over-enhancement.

2.3 Image fusion

The dynamic range of an image is the ratio between the maximum and minimum brightness of its visible area. A camera's dynamic range is usually much narrower than that of an HDR scene. Therefore, to shoot an HDR scene using a mobile phone, multiple images with short and long exposures are typically captured, and these differently exposed images are combined to reproduce the HDR scene.

Image fusion is the fusion of information from two or more images. Different fusion methods are used depending on the image collection scene, the fusion purpose, and the intended usage. Existing fusion methods are mainly divided into three categories: pixel-based, feature-based, and decision-based fusion. These three levels range from low to high, and most existing studies focus on the first two. We can define a weight map for each exposure, form the final image as a weighted sum of the inputs, and thereby combine multiple exposure images directly into a tone-mapped result similar to an LDR image. This process is called multi-exposure fusion (MEF).

In this study, we propose a low-light image enhancement method based on multi-illumination estimation that does not rely on data-driven network training. We use gamma correction and inversion to expose the low-light image to different degrees and multi-scale fusion to enhance the image. Image fusion combines the information of multiple images to achieve a comprehensive description of the target and scene; through multi-scale fusion, the details of multiple images are merged to generate the final enhanced image.

3 Proposed method

We propose a low-light image enhancement method based on multi-illumination estimation. First, we perform gamma correction and inversion on the low-light image. Gamma correction is used to adjust the brightness of the underexposed areas, and the inverted image is used to adjust the exposure of the overexposed areas. Two gamma correction branches obtain images with different degrees of exposure: the first branch adjusts the brightness less, and the second adjusts it more. We then estimate the illumination of the gamma-corrected and inverted images to produce the under- and overexposure-corrected images, respectively. Finally, we use a Laplacian-based multi-scale fusion method to fuse the well-exposed areas of the low-light image, the overexposure-corrected image, and the underexposure-corrected image to generate the low-light enhanced image. The method's flowchart is shown in Fig. 3.

Fig. 3

Framework of the proposed method

3.1 Inversion for low-light image

First, we represent the image as a pixel-wise product of the desired enhanced image and the illumination map according to Retinex low-light enhancement theory [33], as shown in formula (1).

$$ I = I^{\prime} \times L, $$
(1)

where \(I^{\prime }\) represents the desired enhanced image, L represents the illumination map, and × represents pixel-wise multiplication.

Through formula (1), we can derive the exposure correction formula \(I^{\prime } = I \times {L^{- 1}}\). Therefore, if we estimate the illumination map L, we can derive the exposure-corrected image. For the gamma-corrected image, we use this formula to correct the underexposed areas and obtain the underexposure-corrected image. However, the resulting image may suffer from excessive enhancement: because the operation is applied to the entire image, the overexposed areas are enhanced again. Research findings [34] indicate that the overexposed image obtained by inversion can also be described by illumination map estimation. To adjust the exposure of the overexposed areas in the low-light image, we obtain the inverted image Irev through Irev = 1 − I. Inversion turns the overexposed areas of the low-light image into underexposed areas. The correction of underexposed areas in Irev is denoted as \(I^{\prime }_{rev} = {I_{rev}} \times L_{rev}^{- 1}\), and the correction of the overexposed areas is then \({I^{\prime }} = 1 - I^{\prime }_{rev}\). It is worth noting that the inverted image is usually unrealistic, but the recovered overexposure-corrected image is a realistic image, as shown in Fig. 3.

It is worth noting that in previous enhancement methods, the low-light image was inverted to produce a hazy image [35]; a dehazing method was then applied, and the result was inverted back to obtain the low-light enhanced image. In our method, inversion is instead used to adjust the exposure of the overexposed areas in the low-light image and produce an overexposure-corrected image.
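As an illustration, the following Python sketch implements this inversion-based overexposure correction. The function name and the `estimate_illumination` callback are our own placeholders (the latter standing in for the estimator of Section 3.3), not part of the original formulation:

```python
import numpy as np

def correct_overexposure(img, estimate_illumination, eps=1e-3):
    """Sketch of the inversion-based correction I' = 1 - (1 - I) / L_rev.

    img: RGB array scaled to [0, 1]. `estimate_illumination` is a
    placeholder returning a per-pixel illumination map in (0, 1].
    """
    img_rev = 1.0 - img                        # invert: overexposed -> underexposed
    L_rev = np.maximum(estimate_illumination(img_rev), eps)
    corrected_rev = img_rev / L_rev[..., np.newaxis]   # I'_rev = I_rev * L_rev^-1
    corrected_rev = np.clip(corrected_rev, 0.0, 1.0)
    return 1.0 - corrected_rev                 # invert back: I' = 1 - I'_rev
```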

3.2 Gamma correction for low-light images

Gamma correction edits the gamma curve of an image: it separates the dark and light parts of the image signal, adjusts their ratio, and improves the contrast, thereby realizing non-linear tone editing.

In photography, exposure is the amount of light that enters a camera and reaches its sensor [36]. In an actual shooting process, the aperture can be adjusted to change the exposure, but not every area will be properly exposed. Moreover, owing to the wide range of light reaching the camera, different areas of the imaged scene may require completely different exposures. We can adjust an image's exposure through gamma correction to properly expose each image area. Gamma correction globally modifies the image intensity through a power-function transformation, expressed as follows:

$$ I(x) \mapsto \alpha \cdot I{(x)^{\gamma} }, $$
(2)

where α and γ are real positive numbers.

In an image, differences in dark areas are more perceptible than in bright areas. The quantization of a gamma-corrected digital signal therefore uses wider quantization intervals in the high-brightness range, where changes are less evident, and narrower intervals in darker areas, where details are more easily perceived.

In this study, our point of interest is not the optimal coefficient of the gamma transformation but the image's exposure after the transformation. That is, we want to increase or decrease the exposure of the global image through gamma correction. To obtain images with different degrees of exposure, the parameter γ in formula (2) should be adjusted. For an intensity range normalized to [0, 1], we observed during the experiment that when γ > 1, the image brightness decreases, making the image appear underexposed; when γ < 1, the overall image brightness increases, and the closer γ is to 0, the more severe the overexposure. For the situation in this paper, we need to increase the image's exposure so that the originally dark areas become visible.
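For concreteness, a minimal sketch of formula (2), assuming the image is normalized to [0, 1]; the two example γ values follow the setting reported later in Section 4.1:

```python
import numpy as np

def gamma_correct(img, gamma, alpha=1.0):
    """Formula (2): I(x) -> alpha * I(x)^gamma, for img in [0, 1]."""
    return np.clip(alpha * np.power(img, gamma), 0.0, 1.0)

# Two brightening branches with different exposure levels (gamma < 1 brightens):
# bright_mild   = gamma_correct(img, 0.6)
# bright_strong = gamma_correct(img, 0.4)
```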

3.3 Illumination map estimation

For the gamma-corrected and inverted images, we first use the maximum RGB color channel as the illumination value of each image pixel [37] to produce the initial illumination map, as shown in formula (3):

$$ L^{\prime}_{p} = \max\limits_{c \in \{ r,g,b\}} {I_{p}^{c}}, $$
(3)

where \({I_{p}^{c}}\) represents color channel c at pixel p.

This channel-wise maximum avoids the risk of pushing the color channels of the restored image out of the color gamut. Although the initial illumination map contains rich details and textures, these have little effect on the details of the exposure-corrected image. Figure 4 shows the images recovered from the initial illumination map and the refined illumination map. As seen in the figure, the refined illumination map has almost no texture details, yet it recovers a visually pleasing underexposure-corrected image. Therefore, we can retain the prominent structure while removing excess texture details. To this end, we define the following objective function to obtain the refined illumination map L:

$$ \mathop {\arg \min }\limits_{L} \sum\limits_{p} \left( ({L_{p}} - L^{\prime}_{p})^{2} + \lambda \left( {w_{x,p}}({\partial_{x}}L)_{p}^{2} + {w_{y,p}}({\partial_{y}}L)_{p}^{2} \right) \right), $$
(4)

where ∂x and ∂y represent the spatial derivatives in the horizontal and vertical directions, respectively, and wx,p and wy,p represent the spatial smoothing weights. The first term \({({L_{p}} - L^{\prime }_{p})^{2}}\) keeps the refined illumination map as close to \(L^{\prime }_{p}\) as possible. The second term eliminates excess texture details in \(L^{\prime }_{p}\) by minimizing the partial derivatives; λ is a trade-off coefficient.

Fig. 4

Illumination estimation. a Input image. b Initial illumination. c Result recovered from (b). d Refined illumination. e Result recovered from (d)

To make the objective achieve better results, we define the smoothness weights. The weight in the x-direction is defined in formula (5) (the y-direction is analogous):

$$ {w_{x,p}} = \frac{{{T_{x,p}}}}{{\left| {({\partial_{x}}L^{\prime})_{p}} \right| + \varepsilon }}, $$
(5)

Inspired by relative total variation (RTV) [38], we express Tx,p as follows:

$$ {T_{x,p}} = \frac{{\sum\limits_{q \in {\varOmega_{p}}} {{G_{\sigma}}(p,q)} }}{{\left| {\sum\limits_{q \in {\varOmega_{p}}} {{G_{\sigma}}(p,q){{({\partial_{x}}L^{\prime})}_{q}}} } \right| + \varepsilon }}, $$
(6)

where Ωp represents a 15 × 15 square window centered on pixel p, and ε = 1e−3 in formulas (5) and (6). Gσ(p,q) is the Gaussian weight between pixels p and q based on their spatial affinity, with standard deviation σ = 3, and is defined as follows:

$$ {G_{\sigma} }(p,q) = \exp \left( - \frac{{D(p,q)}}{{2{\sigma^{2}}}}\right), $$
(7)

where D(p,q) represents the Euclidean distance between pixels p and q.
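The sketch below illustrates one possible implementation of formulas (3)-(7). For brevity, the 15 × 15 Gaussian-window sums of formula (6) are approximated with `gaussian_filter`, and the forward-difference operators use a simplified border treatment; both are assumptions rather than the exact implementation used in our experiments:

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve
from scipy.ndimage import gaussian_filter

def estimate_illumination(img, lam=0.15, sigma=3.0, eps=1e-3):
    """Illustrative illumination estimation; img is float RGB in [0, 1]."""
    h, w = img.shape[:2]
    n = h * w
    L0 = img.max(axis=2)                          # formula (3): max RGB channel

    # Forward differences of the initial illumination map.
    dx = np.diff(L0, axis=1, append=L0[:, -1:])
    dy = np.diff(L0, axis=0, append=L0[-1:, :])

    # Formulas (5)-(6), approximated: w ~ 1 / ((|G_sigma * dL'| + eps)(|dL'| + eps)).
    wx = 1.0 / ((np.abs(gaussian_filter(dx, sigma)) + eps) * (np.abs(dx) + eps))
    wy = 1.0 / ((np.abs(gaussian_filter(dy, sigma)) + eps) * (np.abs(dy) + eps))

    # Solve formula (4): (Id + lam (Dx^T Wx Dx + Dy^T Wy Dy)) l = l0.
    ones = np.ones(n)
    Dx = sparse.diags([-ones, ones[:-1]], [0, 1], shape=(n, n), format='csr')
    Dy = sparse.diags([-ones, ones[:-w]], [0, w], shape=(n, n), format='csr')
    A = (sparse.identity(n, format='csr')
         + lam * (Dx.T @ sparse.diags(wx.ravel()) @ Dx
                  + Dy.T @ sparse.diags(wy.ravel()) @ Dy))
    L = spsolve(A, L0.ravel()).reshape(h, w)
    return np.clip(L, eps, 1.0)
```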

3.4 Laplacian multi-scale image fusion

Traditional MEF methods are mostly based on pixel-level operations, requiring the weight map to have the same size as the input image. The weight expresses the importance of the corresponding pixel in the input image; therefore, finding appropriate weights is crucial. Several studies have attempted to do so. Burt [38] used Laplacian pyramid decomposition and calculated weight maps from the correlation between the effective local energy and the pyramid levels. The author of [39] uses illumination to estimate the weight map. However, a weight map is usually noisy, which affects the quality of image fusion.

Most MEF algorithms define a weight Wk for each exposure image and form the fused image as their weighted sum, as expressed in formula (8). We can therefore assign larger weights to well-exposed areas to increase their proportion.

$$ J(x) = \sum\limits_{k = 1}^{K} {{W_{k}}(x)} {E_{k}}(x), $$
(8)

where Ek(x) denotes the k-th exposure-corrected image and J(x) is the globally exposed image synthesized from them. The weights Wk are normalized such that \(\sum\nolimits_{k} {{W_{k}}(x)} = 1,\forall x\).

To optimize Wk in formula (8), a multi-resolution strategy is usually adopted to avoid blending artifacts. For example, the author of [40] uses contrast, saturation, and well-exposedness to detect well-exposed areas before multi-scale fusion. In an overexposed image, the pixel values of bright areas are close to 1 and those of dark areas are close to 0; because the pixel values of shaded and illuminated areas differ significantly, large gradients form between light and dark areas. Considering this, gradient information and the structure tensor are used in [41] for multi-scale image fusion.

In this study, we used a multi-scale image fusion method based on the classic Laplacian pyramid [42] to avoid blending artifacts. To obtain a well-exposed image J(x), we assume a set of weight maps Wk that point to the well-exposed areas in each image and merge them according to formula (8). During the experiment, we observed that this direct hard-switching approach may produce harsh transitions at the boundaries of the fused image. To obtain a better fusion result, we used Gaussian pyramids to combine the exposure images and the weight maps. The construction of the weight map's Gaussian pyramid is expressed as follows:

$$ {W_{k}^{i}} = d{s_{2}}[W_{k}^{i - 1}], $$
(9)

where ds2[⋅] represents the down-sampling operation with a Gaussian convolution kernel, which reduces the image dimensions to half the original. We iterate the above formula N times, gradually producing smaller and smoother weight maps \(\left\{ {{W_{k}^{1}}}, {W_{k}^{2}}, {\cdots}, {{W_{k}^{N}}} \right\}\).

Following the construction of the weight map's Gaussian pyramid, we construct a Gaussian pyramid \(\left\{ {{E_{k}^{1}}}, \cdots, {{E_{k}^{N}}} \right\}\) for each exposure-corrected image. We can then construct a Laplacian pyramid for each exposure-corrected image Ek through the following recursive formula:

$$ {L_{k}^{i}} = {E_{k}^{i}} - u{s_{2}}[E_{k}^{i + 1}], $$
(10)

where \(us{}_{2}\left[ \cdot \right]\) is the up-sampling operation with a Gaussian convolution kernel, which enlarges the image dimensions to twice the original, and \({L_{k}^{i}}(x)\) represents the frequency content captured at scale i. To ensure the correctness of the recursion, we define \({L_{k}^{N}} = {E_{k}^{N}}\).

A multi-scale combination of the different exposure-corrected images Ek can be achieved by combining the corresponding layers of the K image pyramids and adding the up-sampled results. If the size of the original image Ek is m × n, then the mixture of Laplacian pyramids is expressed as follows:

$$ \begin{array}{l} J(x) = u{s_{(m,n)}}[{L_{1}^{1}}(x) \cdot {W_{1}^{1}}(x) + {\cdots} + {L_{K}^{1}}(x) \cdot {W_{K}^{1}}(x)]\\ {\text{ + }}u{s_{(m,n)}}[{L_{1}^{2}}(x) \cdot {W_{1}^{2}}(x) + {\cdots} + {L_{K}^{2}}(x) \cdot {W_{K}^{2}}(x)]\\ {\text{ }} {\cdots} + u{s_{(m,n)}}[{L_{1}^{N}}(x) \cdot {W_{1}^{N}}(x) + {\cdots} + {L_{K}^{N}}(x) \cdot {W_{K}^{N}}(x)]\\ {\text{ = }}\sum\limits_{i = 1}^{N} {u{s_{(m,n)}}\left[ {\sum\limits_{k = 1}^{K} {{L_{k}^{i}}(x) \cdot {W_{k}^{i}}(x)} } \right]} , \end{array} $$
(11)

where us(m,n) is the operator that up-samples the image to size m × n. To show the Laplacian decomposition more clearly, we illustrate it in Fig. 5.

Fig. 5

Example of multi-exposure image fusion based on Laplacian pyramid
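A sketch of the pyramid construction and blending of formulas (9)-(11) follows, assuming OpenCV's pyrDown/pyrUp for the Gaussian-kernel down- and up-sampling and image dimensions divisible by 2^levels so that the pyramid sizes line up:

```python
import cv2
import numpy as np

def fuse_multiscale(images, weights, levels=5):
    """Laplacian multi-scale fusion; images are float32 RGB in [0, 1],
    weights are matching 2-D maps (one per image)."""
    # Normalize the weight maps so they sum to 1 at every pixel.
    total = np.sum(weights, axis=0) + 1e-12
    weights = [w / total for w in weights]

    fused_pyr = None
    for img, w in zip(images, weights):
        gw, gi = [w], [img]
        for _ in range(levels - 1):               # Gaussian pyramids (formula 9)
            gw.append(cv2.pyrDown(gw[-1]))
            gi.append(cv2.pyrDown(gi[-1]))
        lap = [gi[i] - cv2.pyrUp(gi[i + 1])       # Laplacian pyramid (formula 10)
               for i in range(levels - 1)]
        lap.append(gi[-1])                        # coarsest level: L^N = E^N
        weighted = [l * gw[i][..., np.newaxis] for i, l in enumerate(lap)]
        fused_pyr = weighted if fused_pyr is None else \
            [f + l for f, l in zip(fused_pyr, weighted)]

    out = fused_pyr[-1]                           # collapse pyramid (formula 11)
    for lvl in range(levels - 2, -1, -1):
        out = cv2.pyrUp(out) + fused_pyr[lvl]
    return np.clip(out, 0.0, 1.0)
```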

Our goal is to integrate the well-exposed regions of images with different exposure levels, so we require a suitable set of weights. The weights assign different proportions to the different images so that a well-exposed version of each area is selected. In the fusion process, we obtain each image's weight map by rapidly estimating its contrast and saturation, following the concepts in [43] to simplify the method in [41]. For a given source image Ek(x) with color channels \({E_{k}^{R}}(x)\), \({E_{k}^{G}}(x)\), and \({E_{k}^{B}}(x)\), the contrast Ck(x) of each pixel is calculated as the response of a simple Laplacian filter, as shown in formula (12), and the saturation Sk(x) of each pixel is estimated from the deviation of the RGB channels from their mean, as shown in formula (13).

$$ {C_{k}}(x) = \frac{{{\partial^{2}}{E_{k}}}}{{\partial {x^{2}}}}(x) + \frac{{{\partial^{2}}{E_{k}}}}{{\partial {y^{2}}}}(x), $$
(12)
$$ {S_{k}}(x) = \sum\limits_{c \in \{ R,G,B\}} {\left( {{E_{k}^{c}}(x) - \frac{{{E_{k}^{R}}(x) + {E_{k}^{G}}(x) + {E_{k}^{B}}(x)}}{3}} \right)^{2}} , $$
(13)

We calculate the contrast and saturation through formulas (12) and (13) and combine the contrast and saturation maps by simple multiplication to obtain the weight map of each exposure image:

$$ {W_{k}}(x) = {C_{k}}(x) \cdot {S_{k}}(x), $$
(14)

The obtained weight maps are introduced into formula (8), and the Laplacian multi-scale fusion described above is performed to obtain the final low-light enhanced image.
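Read directly as code, formulas (12)-(14) might look as follows; taking the absolute value of the Laplacian response as the contrast measure is our assumption:

```python
import cv2
import numpy as np

def exposure_weights(img):
    """Contrast x saturation weight map; img is float32 RGB in [0, 1]."""
    gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    contrast = np.abs(cv2.Laplacian(gray, cv2.CV_32F))   # formula (12)
    mean = img.mean(axis=2, keepdims=True)
    saturation = np.sum((img - mean) ** 2, axis=2)       # formula (13)
    return contrast * saturation + 1e-12                 # formula (14)
```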

3.5 Low-light image enhancement

We use formula (1) to perform the second exposure correction on the gamma-corrected image. In this process, the brightness of the image is increased, but some areas may become over-enhanced. To suppress excessive enhancement, we invert the low-light image to obtain an overexposed image and then use formula (1) to produce the overexposure-corrected image as another input to image fusion. Besides, low-light images also contain well-exposed areas, typically those closer to the camera. In the fusion process, we therefore take the original low-light image, the underexposure-corrected image, and the overexposure-corrected image as inputs.

The details of our method are shown in Algorithm 1:

Algorithm 1
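For concreteness, a hedged end-to-end sketch of the pipeline follows, reusing the illustrative helpers from the previous sketches (their names are ours, not the paper's); parameter values follow Section 4.1:

```python
import numpy as np

def enhance_low_light(img):
    """End-to-end sketch of Algorithm 1; img is float32 RGB in [0, 1]."""
    # 1. Gamma correction with two brightness levels (gamma = 0.4, 0.6).
    branches = [gamma_correct(img, g) for g in (0.4, 0.6)]

    # 2. Underexposure correction via illumination estimation: I' = I / L.
    under = []
    for b in branches:
        L = estimate_illumination(b)
        under.append(np.clip(b / L[..., np.newaxis], 0.0, 1.0))

    # 3. Overexposure correction via inversion (Section 3.1).
    over = correct_overexposure(img, estimate_illumination)

    # 4. Fuse the original image with the corrected versions (Section 3.4).
    inputs = [img] + under + [over]
    weights = [exposure_weights(x) for x in inputs]
    return fuse_multiscale(inputs, weights)
```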

4 Experiment analysis

In this section, we elaborate on the experimental details. By enhancing real and artificial low-light images, we analyze the experimental results of the proposed method from both qualitative and quantitative aspects. During the experiments, we enhanced images from the non-uniform illumination dataset [44], the NASA dataset [45], the ExDark dataset [46], and Google image search.

In the following sections, we first explain the parameters set during the experiments (Section 4.1). Then, we evaluate real low-light images subjectively (Section 4.2) and objectively (Section 4.3) and examine enhancement under extreme conditions (Section 4.4). Further, we use artificial low-light images for qualitative and quantitative analyses (Section 4.5). Finally, we compare the proposed method with previous fusion-based methods (Section 4.6).

4.1 Parameter settings

In this study, we propose a low-light image enhancement method based on multi-illumination estimation. In terms of parameter settings, we first perform gamma correction on the input image, setting γ = [0.4, 0.6]. In the estimation of the illumination map, we use λ to control the map's smoothness: the larger λ is, the more obvious the smoothing effect of the illumination map and the stronger the local contrast of the exposure-corrected image. However, an excessively large λ causes the brightness of the exposure-corrected image to decrease. To obtain a better visual effect, we set λ = 0.15 in the experiments. The Gaussian kernel used in the Laplacian pyramid is G = [1/16, 1/4, 3/8, 1/4, 1/16].

4.2 Subjective evaluation of real low-light images

In this section, we subjectively evaluate the different experimental results. First, we compared the proposed method with traditional low-light enhancement methods, as shown in Fig. 6. Among the traditional methods, we selected the Retinex-based methods (SSR (1997) [8], MSR (2014) [9], MSRCR (1997) [10]), the histogram-based method HE (1990), and the linear contrast method. As shown in Fig. 6, the brightness of the image enhanced by the linear contrast method (Fig. 6f) is still relatively low, and the enhancement effect is not evident. The MSRCR method in Fig. 6d has an over-enhancement problem: the enhanced image is oversaturated, and its noise is amplified. In Fig. 6b and e, the image color changed after enhancement. As shown in the second row, the image appears to have a purple filter applied, so the color distribution of the enhanced image does not match the original.

Fig. 6

Examples of comparison with traditional low-light enhancement methods. a Original image. b SSR. c MSR. d MSRCR. e HE. f Linear contrast. g Ours

The enhancement effect in Fig. 6e and g is better: the overall brightness is improved, and the basic requirements of low-light enhancement are met. However, a closer look at the details shows that the image enhanced by the HE method in Fig. 6e is excessively enhanced; for example, the river in the first row becomes dark blue after enhancement, which does not match the real scene. Overall, the method proposed in this study performs better: while maintaining the original image colors, it improves the brightness without generating significant noise, yielding a better low-light enhancement effect.

In addition to the traditional methods, we also compare the proposed method with advanced low-light enhancement methods: SRIE [6], BIMEF [12], LIME [17], JED [11], MBLLEN [30], KinD [32], and FFM [27]. Figure 7 shows the comparison results. The images in the figure are called “Balloons,” “Belgium House,” “Cadik Lamp,” “Candle,” “Chinese Garden,” “House,” “kluki,” “Lamp,” and “Landscape,” respectively.

Fig. 7

Comparison of the proposed method and state-of-the-art low-light enhancement methods. a Original image. b SRIE. c BIMEF. d LIME. e JED. f MBLLEN. g KinD. h FFM. i Ours

Figure 7a is the low-light image, and Fig. 7b is the result of the SRIE method. The figures show that the brightness of the image processed by SRIE [6] is not significantly improved: the originally underexposed areas remain poorly exposed, and the image details are not well displayed. Figure 7d and e exhibit excessive enhancement; for the candle in the fourth row, the color of the desktop becomes red after enhancement. The method in Fig. 7g has a weak effect on the indoor image in the sixth row, and the enhanced image still has evident underexposure problems, which critically affect viewing. The method in Fig. 7f smooths the image during enhancement to suppress noise; as shown in the tenth row, the white clouds in the sky lose some details after its processing, although its brightness adjustment is good. The method in Fig. 7h performs better on outdoor images, but its processing of indoor images may amplify noise. The result in Fig. 7c is relatively good, but the brightness and information accuracy of the image are reduced. Figure 7i also shows some noise amplification on specific indoor images; however, its brightness adjustment and detail restoration on outdoor images are better.

In Fig. 8, we used different low-light enhancement methods to enhance the same image. For a finer comparison of the processing results, we analyze the blue-framed area. The brightness improvement of the SRIE result in Fig. 8a is not evident, but the details of the white clouds in the sky are well preserved. The overall brightness of the BIMEF result in Fig. 8b is improved, but the outlines of the white clouds have become blurred and details are lost. The LIME result in Fig. 8c is over-enhanced, giving the white clouds the color of a sunset; the outline information of the clouds becomes even more blurred, and detail is lost. The JED result in Fig. 8d shows the same excessive enhancement: the color of the seawater also shifts toward sunset tones, and the details of the white clouds are seriously lost. The MBLLEN result in Fig. 8e has a good overall effect, but the cloud details are lost. Figure 8f and g handle the cloud details better, and the overall brightness is also improved; these two methods reveal the blue color of the sky slightly, but not very clearly. Figure 8h is the image enhanced with the proposed method. As observed in the blue frame, the outlines of the enhanced white clouds are visible and their details remain good. The proposed method restores the blue color of the sky clearly, with no excessively enhanced areas.

Fig. 8

Examples of details of different enhancement methods. a SRIE. b BIMEF. c LIME. d JED. e MBLLEN. f KinD. g FFM. h Ours

In Fig. 9, we used histogram statistics to compare the overall brightness of the enhanced images. The histogram summarizes the intensity of each pixel, and the RGB values reflect a pixel's brightness: when the RGB value is 255, the brightness of that point is the highest; when it is 0, the pixel is black and its intensity is the lowest. In histogram analysis, the wider the distribution, the stronger the brightness and contrast of the image.

Fig. 9

Examples of enhanced image histogram statistics. a SRIE. b BIMEF. c LIME. d JED. e MBLLEN. f KinD. g FFM. h Ours

In the low-light enhancement task, we hope to improve the brightness and contrast of the image. By analyzing the histograms in Fig. 9, we found that the histograms of the images processed by most methods are mainly concentrated in the left and middle parts and exhibit a bimodal characteristic, which indicates a large difference between the objects in the image and the background. In contrast, the histogram of the image enhanced by the proposed method is more evenly distributed, indicating better image contrast.

4.3 Objective evaluation of real low-light images

In the previous section, we subjectively analyzed the experimental results. Subjective evaluation depends on human vision and is not sensitive to slight differences in an image. To analyze the subtle differences more rigorously, we used the average gradient [31], information entropy (IE), BRISQUE [47], and NIQE [48] to objectively evaluate the enhanced images.

The average gradient describes the grayscale changes near edges and boundaries in an image, that is, the image's grayscale change rate. The magnitude of this change rate evaluates the clarity of an image to a certain extent: where detailed information is present, significant differences in gray levels exist near boundaries and junctions, so the higher the grayscale change rate, the finer the details in the image. The average gradient is defined as follows:

$$ AG = \frac{1}{{M \times N}}\sum\limits_{i = 1}^{M} {\sum\limits_{j = 1}^{N} {\sqrt {\frac{{{{(\partial f/\partial x)}^{2}} + {{(\partial f/\partial y)}^{2}}}}{2}} } } , $$
(15)

where M and N represent the width and height of the image, respectively, ∂f/∂x represents the horizontal gradient, and ∂f/∂y represents the vertical gradient.
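A minimal implementation of formula (15) might read:

```python
import numpy as np

def average_gradient(gray):
    """Average gradient (formula 15); gray is a 2-D grayscale array."""
    gy, gx = np.gradient(gray.astype(np.float64))
    return np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0))
```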

Image IE is a statistical feature that reflects the average amount of information in an image. The IE of an image is expressed as follows:

$$ H = - \sum\limits_{i = 1}^{L} {p({a_{i}}){{\log }_{2}}p({a_{i}})} , $$
(16)

where ai is a gray level in the image and p(ai) its probability of occurrence. According to IE theory, the more abundant the detailed information of an image, the greater its information content and, hence, its IE.
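A minimal implementation of formula (16) for an 8-bit grayscale image:

```python
import numpy as np

def information_entropy(gray_u8):
    """Information entropy (formula 16); gray_u8 is an 8-bit image."""
    hist = np.bincount(gray_u8.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    p = p[p > 0]                      # empty bins contribute 0 * log 0 := 0
    return -np.sum(p * np.log2(p))
```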

BRISQUE is a no-reference image evaluation method that scores an image by comparing it against natural-image statistics. BRISQUE uses locally normalized luminance coefficients to produce the corresponding parameter features and compares the features of the test image with those of standard natural images. The larger the difference, the higher the BRISQUE score and the worse the image quality.

NIQE is also a typical no-reference image evaluation method. NIQE extracts spatial-domain features from the test image, describes them with a Gaussian distribution, and compares them with the features of standard natural images. The larger the NIQE value, the greater the gap between the test image and standard natural images, and the poorer the enhancement effect.

Table 1 shows the average gradient of the enhanced images using different methods in Fig. 7. The table demonstrates that the image processed by KinD [32] and the method proposed in this study have a higher average gradient, which shows that the enhanced image contains detailed information. Conversely, the average gradient of the image after SRIE [6] enhancement is low, and some details may be lost during the enhancement process.

Table 1 Average gradient of the image enhanced by different algorithms

Table 2 shows the IE of the images processed using different methods in Fig. 7. According to formula (16), the larger the IE of an image, the more detailed the information in the image. From the table, BIMEF [12], LIME [17], JED [11], KinD [32], and the proposed method each achieve the maximum IE on some images, but the proposed method achieves higher IE on most images. Thus, the proposed method has a better enhancement effect on most images.

Table 2 Comparison of IE of different enhancement algorithms

Table 3 shows the BRISQUE scores of the images enhanced by different methods. As can be seen from the table, for both indoor and outdoor images, the proposed method achieves good performance on most images.

Table 3 Comparison of BRISQUE of different enhancement algorithms

Table 4 shows the NIQE scores of images enhanced by different methods. The method proposed in this paper has a relatively low NIQE score and good image enhancement quality.

Table 4 Comparison of NIQE of different enhancement algorithms

4.4 Image enhancement under extreme conditions

Further, to demonstrate the proposed method's correction effect on overexposed areas, we used different methods to perform experiments on multiple images in the ExDark dataset [46]. The experimental results are shown in Fig. 10. As seen from the figure, the proposed method not only performs well on low-light images with dim content but also performs well in low-light situations caused by light degradation. The images in the ExDark dataset were mainly taken in extremely dark conditions, where street lights, the sun, and other light sources appear overexposed or well exposed. During low-light enhancement, these light sources are enhanced again, resulting in excessive enhancement of the image. In the MBLLEN method, for example, excessive enhancement occurs around the light sources: those in rows 4, 6, 7, 8, and 12 of Fig. 10f are over-enhanced and lose detail. The images enhanced by the LIME and JED methods are also excessively enhanced, causing visual discomfort. Although the SRIE and BIMEF methods avoid excessive enhancement, the brightness of their enhanced images is not greatly improved, and the dark areas are not enhanced. Parts of the images enhanced by KinD and FFM have a fog-like layer, and some areas even exhibit artifacts. The images processed by the proposed method have moderate overall exposure without serious overexposure, and the enhanced images contain less noise.

Fig. 10

Examples of extreme darkness enhancement. a Original image. b SRIE. c BIMEF. d LIME. e JED. f MBLLEN. g KinD. h FFM. i Ours

4.5 Subjective evaluation of synthesized low-light images

In the previous section, we subjectively and objectively evaluated real low-light images. Here, we enhance synthesized low-light images and analyze the processing results of different methods. We use contrast scaling and gamma correction to synthesize low-light images artificially [31]. The formula is as follows:

$$ {I_{d}} = {C_{\lim }}I_{b}^{\gamma} , $$
(17)

where Ib is the original bright image, \({C_{\lim }}\) is the upper limit of the Id intensity, and γ represents the gamma correction value. Different combinations of \({C_{\lim }}\) and γ produce different levels of low-light images. In this article, we set \({C_{\lim }}{\text { = 100}}\) and γ = 3 to generate the low-light images shown in Fig. 11b.
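A sketch of formula (17) with the settings used for Fig. 11b:

```python
import numpy as np

def synthesize_low_light(bright_u8, c_lim=100, gamma=3.0):
    """Formula (17): I_d = C_lim * I_b^gamma, for an 8-bit RGB input."""
    ib = bright_u8.astype(np.float64) / 255.0     # normalize I_b to [0, 1]
    dark = c_lim * np.power(ib, gamma)            # intensity capped at c_lim
    return np.clip(dark, 0, 255).astype(np.uint8)
```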

Fig. 11

Examples of synthesized low-light image enhancements. a Original image. b Synthetic images. c SRIE. d BIMEF. e LIME. f JED. g MBLLEN. h KinD. i FFM. j Ours

Following low-light enhancement, the brightness of the processed images is significantly improved compared with the original low-light images. The overall brightness of the SRIE result in Fig. 11c is slightly lower than that of the other algorithms, but the image details are better retained. Compared with SRIE, the BIMEF method in Fig. 11d improves the brightness. The overall brightness of the LIME result in Fig. 11e is significantly improved, with a better visual effect. The image processed by the JED method in Fig. 11f is excessively enhanced, and its saturation is too high. The MBLLEN result in Fig. 11g also shows excessive enhancement, such as the color of the land in the third row. The overall effect of the KinD method in Fig. 11h is good. The FFM method in Fig. 11i improves the brightness less, leaving the image darker overall. The image processed by the proposed method in Fig. 11j has a better overall enhancement effect, retains details well, and its visual appearance is closer to the original image.

4.5.1 Objective evaluation of synthesized low-light images

To analyze the detailed information in Fig. 11, we used the peak signal-to-noise ratio (PSNR), structural similarity (SSIM), visual information fidelity (VIf) [49], and feature similarity (FSIM) [50] for objective evaluation. PSNR evaluates image quality by calculating the error between pixels. When the error between the enhanced image and the original image is small, the PSNR value is large, indicating a better enhancement effect. The calculation of PSNR is expressed in formula (18):

$$ PSNR = 10 \cdot {\log_{10}}\frac{{PQ}}{{\left\| {y - \hat y} \right\|}}, $$
(18)

where P and Q are the width and height of the image, y is the original reference image, and \(\hat y\) is the enhanced image.

SSIM measures the similarity between the original image and the reconstructed clear image. SSIM uses the mean to estimate brightness, the standard deviation to estimate contrast, and the covariance to measure structural similarity, as shown in formula (19). The larger the SSIM value, the less the image distortion and the better the reconstruction effect.

$$ SSIM = \frac{{(2{\mu_{y}}{\mu_{\hat y}} + {C_{1}})(2{\sigma_{y\hat y}} + {C_{2}})}}{{({\mu^{2}}_{y} + \mu_{\hat y}^{2} + {C_{1}})({\sigma_{y}^{2}} + \sigma_{\hat y}^{2} + {C_{2}})}}, $$
(19)

where μy and \({\mu _{\hat y}}\) are the mean gray values of the original image and the enhanced image, \({\sigma _{y}^{2}}\) and \(\sigma _{\hat y}^{2}\) are their variances, \({\sigma _{y\hat y}}\) is the covariance between the two images, and C1 and C2 are constants.
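In practice, these two metrics can be computed with standard library implementations rather than re-deriving formulas (18)-(19); the sketch below uses scikit-image (the `channel_axis` argument assumes scikit-image ≥ 0.19):

```python
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_pair(reference_u8, enhanced_u8):
    """PSNR and SSIM between an 8-bit reference image and an enhanced result."""
    psnr = peak_signal_noise_ratio(reference_u8, enhanced_u8, data_range=255)
    ssim = structural_similarity(reference_u8, enhanced_u8,
                                 channel_axis=-1, data_range=255)
    return psnr, ssim
```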

The similarity between the enhanced image and the original image can also be evaluated by measuring the amount of information the two images share. VIf compares the information content of the original image and the low-light enhanced image, formalizing the concept of information fidelity. The more information the enhanced image shares with the original, the larger the VIf value and the better the enhancement effect.

The main purpose of image enhancement is to increase the brightness of the image. To evaluate how different methods adjust brightness, we use FSIM, choosing the variant that considers only the luminance component. The larger the FSIM value, the more similar the enhanced image is to the original.

We labeled the images in Fig. 11 from top to bottom as Image 1–Image 6 and calculated the SSIM of the images processed by different methods; the results are shown in Table 5. The table shows that LIME, KinD, and the proposed method achieve higher SSIM values. For Image 1, the SSIM of the proposed method reaches 0.891, the best performance. For Image 2, the SSIM of the KinD method reaches 0.914, compared with only 0.791 for the proposed method, so KinD performs better there. For Image 5, the SSIM of the LIME method reaches 0.881, while that of the proposed method is 0.812, a difference of 0.069. Overall, the proposed method achieves good performance on most images.

Table 5 Comparison of SSIM with different enhancement algorithms

Table 6 presents the PSNR values of the images in Fig. 11. Image 4 has a more complex texture and contains more information; the table shows that the proposed method achieves the best performance on it, indicating that it handles detailed textures more completely. For the other images, the proposed method also has certain advantages and achieves excellent results.

Table 6 Comparison of PSNR with different enhancement algorithms

Table 7 shows the VIf results. As shown in the table, the method proposed in this paper achieves high fidelity scores on most images. This is because the multi-exposure fusion enhancement fuses images with different exposure levels, better preserving the detailed information of the image and thus improving information fidelity.

Table 7 Comparison of VIf with different enhancement algorithms

Table 8 shows the FSIM results. As can be seen from the table, the images enhanced by the proposed method are highly similar to the originals, with FSIM scores all greater than 0.90. This is because we use gamma correction to obtain images with different degrees of exposure and the classic Laplacian pyramid method for multi-scale fusion; the proposed method tries to obtain the best exposure for each area, so its FSIM scores are higher.

Table 8 Comparison of FSIM with different enhancement algorithms

4.6 Comparison with other fusion-based enhancement methods

Researchers have achieved low-light enhancement in previous studies through the fusion of different exposure scales [16, 51], and fusion of exposures at different scales is also widely used in image dehazing [43]. In [16], a low-light image is decomposed into a reflection image and an illumination image. Contrast-enhanced versions are derived from the illumination image using a sigmoid function and adaptive HE; these derived inputs are combined with designed weights in a multi-scale manner to generate an adjusted illumination map. Finally, the illumination and reflection maps are combined to obtain the final low-light enhanced image. In [51], the weight matrix for image fusion is designed using illumination estimation. Multi-exposure images are synthesized using a camera response model, and the multi-exposure images are then combined according to the weight matrix to obtain the final enhanced image.

The proposed low-light image enhancement based on multi-illumination estimation does not require estimating the illumination and reflection maps separately. We take the low-light image as input and use gamma correction and inversion to generate exposure images of different scales. Illumination estimation is then used to perform exposure correction on these images. We determine the weights by rapidly estimating the contrast and saturation of the image scene. Subsequently, we use the multi-scale image fusion method based on the classical Laplacian pyramid to combine the weight maps and the exposure images of different scales, finally generating the enhanced low-light image.

In Fig. 12, we compare the enhancement results of the proposed method with the methods in [16] and [51]. The figure shows that the image enhanced by Fu [16] is still underexposed, and the color of some areas still cannot be displayed normally. Ying [51] enhances the image brightness significantly, but the contours of the ripples in the water are somewhat blurred, and some details may be lost during enhancement. Figure 12d presents the result of the proposed method. As shown in the figure, the proposed method handles the sky better: it restores the originally light-blue areas of the sky and better retains the details of the image.

Fig. 12

Examples of comparison with other fusion-based enhancement methods. a Original image. b Fu. c Ying. d Ours

5 Conclusion

In this study, we proposed a low-light image enhancement method based on multi-illumination estimation. We performed gamma correction and inversion on the original low-light image to derive enhanced versions with better contrast and saturation. By calculating weights for the images with different exposure levels, a multi-scale image fusion method based on the Laplacian pyramid was used to combine the weight maps with the differently exposed images to generate the final enhanced image. As this method selects areas with suitable exposure levels for fusion, the enhanced image achieves satisfactory results in improving brightness, enhancing contrast, and preserving details. We conducted experiments on real and synthetic low-light images; qualitative and quantitative analyses of the results showed that the proposed method has clear advantages. In the future, we plan to combine this low-light fusion strategy with deep learning to improve the algorithm's efficiency.