1 Introduction

Computer vision is an area of artificial intelligence (AI) that enables computers and systems to extract useful information from digital images. The quality of an image depends on a number of factors, including color, contrast, brightness and illumination. Images captured in an environment with low illumination are categorized as low-light images. Sufficient light intensity is needed during image acquisition; if the light intensity is low, the captured image conveys less information than the original scene. Low-light conditions occur in many real-time applications, so enhancement methods suited to low-light images are needed to overcome this limitation. The main goal of this survey paper is to investigate the various enhancement techniques for low-light images.

Image enhancement is a technique that improves the quality of an image. The popular low-light image enhancement methods are Gamma transformation, Histogram equalization, Retinex methods, machine learning and deep learning methods. In recent years, the availability of various learning models has led to extensive exploration of low-light image enhancement methods. This survey paper divides the algorithms into two classes, traditional methods and learning-based methods. The learning-based algorithms are further classified into machine learning-based and deep learning-based methods. Section 2 describes a few existing low-light applications. Section 3 explains the classification of enhancement methods.

2 Low-Light Images

Medical image processing has been widely used in research in recent years to diagnose a variety of disorders. Across medical imaging techniques, a low-light environment can reduce the accuracy of the diagnosis. Laryngeal endoscopy is one of the most important methods for identifying abnormalities of the larynx. Due to the anatomical structure of the human body, it is difficult to obtain well-illuminated images of this region, and low-light images are obtained as a result.

Low-light enhancement is also applicable to chest x-ray images for the detailed analysis of Covid-19 cases. Figures 1 and 2 show a larynx endoscopy image and a chest x-ray image. Night traffic monitoring is a major challenge in today’s world, and enhancement algorithms of this type are useful for improving the analysis performed by monitoring systems. Other important areas where low-light conditions may exist are underwater images, foggy images, satellite images, etc. (Figs. 3, 4).

Fig. 1
2 endoscopic images of the larynx present a triangular shaped hollow structure surrounded by flesh. The shape in part A, is small while in part B, the conical part of the triangular shape is elongated.

Larynx endoscopy image

Fig. 2
An x-ray of the chest reveals hazy areas in the lungs along with other debris leading to a more solid appearance in the x-ray.

Covid-19 chest x-ray image

Fig. 3
A photograph presents flora and fauna underwater.

Underwater image

Fig. 4
A photograph presents a car on a road with dense fog and very low visibility with few trees on the side of the road.

Foggy image

3 Methodologies

This survey paper distinguishes between traditional and learning-based low-light image enhancement techniques (Fig. 5).

Fig. 5
A chart of low-light image enhancement methods divided into traditional and learning-based methods. The traditional methods comprise gamma transformation, histogram and Retinex theory. The learning-based methods comprise machine learning (PCM and SVM) and deep learning (CNN, GAN, Cycle GAN and autoencoder).

Classification of low-light image enhancement techniques

The traditional methods are Gamma transformation, Histogram equalization and Retinex-based methods. The learning methods are machine learning (ML) and deep learning (DL) methods. Methods based on machine learning have only recently become available. Machine learning is a subset of artificial intelligence in which algorithms learn from data without being explicitly programmed. The limitations of ML algorithms are that they require supervision for feature extraction and handle only thousands of data points. Commonly preferred ML algorithms are principal component analysis (PCA), regression, support vector machine (SVM), etc.

Several deep learning-based image enhancement methods have also emerged since 2016. DL is a subset of ML. DL algorithms process millions of data points and, as a result, extract a large number of features without supervision. Convolutional neural networks (CNNs) have been used as the foundation of deep learning frameworks in a variety of research papers. Deep learning-based methods can achieve excellent results in low-light image enhancement. Section 3.4 describes deep learning algorithms.

3.1 Gamma Transformation

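The gamma transformation is a nonlinear, pixel-wise transformation, and gamma correction is a technique that uses it for image enhancement. With f(x, y) the input image and g(x, y) the output image, the transformation is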

$$g(x,y) = f(x,y)^{\gamma }$$
(1)

where \(\gamma\) represents the gamma correction parameter and pixel values are normalized to [0, 1]. By varying the parameter, several different transformation curves can be obtained. When \(\gamma < 1\), the transformation broadens the dynamic range of the low-gray value areas of the image and compresses the range of the high-gray value areas. When \(\gamma > 1\), the transformation compresses the low gray values and stretches the high gray values. When \(\gamma = 1\), the output remains unchanged (Fig. 6).
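As a minimal sketch of Eq. (1), assuming the image is stored as floating-point values normalized to [0, 1]; the function name `gamma_transform` and the use of NumPy are illustrative choices, not part of any surveyed method:

```python
import numpy as np

def gamma_transform(image, gamma):
    """Apply g = f ** gamma to an image with values normalized to [0, 1]."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    f = np.clip(np.asarray(image, dtype=np.float64), 0.0, 1.0)
    return f ** gamma

# A dark pixel (0.1) and a bright pixel (0.9) under three settings.
pixels = np.array([0.1, 0.9])
brightened = gamma_transform(pixels, 0.5)   # gamma < 1 lifts dark values
unchanged = gamma_transform(pixels, 1.0)    # gamma = 1 is the identity
darkened = gamma_transform(pixels, 2.0)     # gamma > 1 pushes values down
```

With a gamma of 0.5 the dark pixel moves from 0.1 to about 0.32 while the bright pixel moves only from 0.9 to about 0.95, which is the stretching of dark regions that low-light enhancement relies on.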

Fig. 6
A graph of output grey level versus input grey level plots a diagonal for γ = 1.00 from (0, 0) to (1.0, 1.0). Above the diagonal, 5 concave-down curves are plotted for γ = 0.04, 0.10, 0.20, 0.40, and 0.67, and below it 5 concave-up curves are plotted for γ = 1.50, 2.50, 5.00, 10.00, and 25.00.

Gamma transformation

Fusion of a pair of complementary gamma functions is one of the methods used for low-light image enhancement (Li et al. 2020). The pair of complementary functions is as follows,

$$y_{1} = 1 - \left( 1 - x^{\gamma } \right)$$
(2)
$$y_{2} = \left( 1 - (1 - x) \right)^{1/\gamma }$$
(3)

where x—input pixel value, y1 and y2—transformed output pixels.

The input red, green, blue (RGB) image is transformed into a hue, saturation, value (HSV) image. The brightness of the image is determined by the value component (V), which depends on the amount of light intensity present in the environment. The value component is enhanced by the above transformation equations. Then two enhanced ‘V’ components are combined by,

$$I_{1} = c_{1} y_{1} + c_{2} y_{2}$$
(4)

where \(c_{1} = V_{i} /\mathop \sum \nolimits V_{i}\).

\(I_{1}\) is the first input for the fusion process. The same value component is subjected to sharpening and histogram equalization to produce the second input for the fusion. The second input for fusion is,

$$I_{2} = \left( {V + 2H\left( V \right) - G*H\left( V \right)} \right)/2$$
(5)

The value components I1 and I2 are fused by the image fusion process. This overall process improves the brightness of the low-light images by adjusting the dark region and compressing the bright region. The advantage of using this gamma function is that it generates even brightness.
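The first fusion input, Eqs. (2) to (4), can be sketched as follows. This is only an illustration under stated assumptions: the two functions are evaluated as printed above with the exponent read as the gamma parameter, V is taken as the per-pixel maximum of R, G and B (the HSV definition), and the weights `c1` and `c2` are hypothetical constants fixed at 0.5 because the text defines only \(c_{1}\). The second input \(I_{2}\) and the fusion step itself are omitted.

```python
import numpy as np

def complementary_gamma_pair(v, gamma):
    """Evaluate Eqs. (2) and (3) as printed, for V values in [0, 1]."""
    v = np.clip(np.asarray(v, dtype=np.float64), 0.0, 1.0)
    y1 = 1.0 - (1.0 - v ** gamma)
    y2 = (1.0 - (1.0 - v)) ** (1.0 / gamma)
    return y1, y2

def enhance_value_channel(rgb, gamma=2.0, c1=0.5, c2=0.5):
    """Blend the two transformed V channels (Eq. 4) and rescale RGB.

    c1 and c2 are hypothetical constants, not the weights of Li et al.
    """
    rgb = np.clip(np.asarray(rgb, dtype=np.float64), 0.0, 1.0)
    v = rgb.max(axis=-1)                      # HSV value component
    y1, y2 = complementary_gamma_pair(v, gamma)
    i1 = c1 * y1 + c2 * y2                    # Eq. (4)
    # Scaling all three channels by the same factor changes V only,
    # leaving hue and saturation untouched. Black pixels keep scale 1.
    scale = np.divide(i1, v, out=np.ones_like(v), where=v > 0)
    return rgb * scale[..., None]
```

Rescaling the RGB channels by the ratio of new to old V avoids an explicit round trip through HSV, which keeps the sketch free of any color-conversion dependency.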

3.2 Histogram Equalization

Histogram equalization (Narendra and Fitch 1981; Abdullah-Al-Wadud et al. 2007) is one of the traditional methods for low-light image enhancement. The pixels are the basic building blocks of an image. Each pixel holds a specific intensity value. The histogram is a plot that shows the number of pixels versus their intensity values. The histogram equalization algorithm uses the cumulative distribution function (CDF) to adjust the output gray level to have a uniform distribution (Fig. 7).

Fig. 7
2 photographs with their respective histograms. The photos have coffee-bean like shape that appears dark in the leftmost photo with its histogram with closer bars and the right photo has lighter appearance and its histogram with wider bars.

Example of histogram equalization

Let ‘I’ be the input image and ‘L’ the number of gray levels. ‘N’ is the total number of pixels in the image, ‘I(i, j)’ is the gray value at the point with coordinates (i, j), and \(n_{k}\) is the number of pixels at gray level k. The probability that a specific gray level ‘k’ occurs is,

$$P\left( k \right) = n_{k} /N;\quad {\text{where}}\;k = 0,1, \ldots ,L - 1$$
(6)

The cumulative distribution function (CDF) of the gray level of an image ‘I’ is given by,

$$C\left( k \right) = \sum\limits_{r = 0}^{k} P(r);\quad k = 0,1, \ldots ,L - 1$$
(7)

The histogram equalization algorithm maps the original image to an enhanced image with a uniform gray-level distribution based on CDF (Table 1). The enhanced output image is represented as follows:

$$f\left( k \right) = \left( {L - 1} \right)*C(k)$$
(8)

Table 1 Different histogram methods
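Eqs. (6) to (8) can be sketched in a few lines. The sketch assumes an integer-valued grayscale image with values in [0, L − 1]; rounding the result of Eq. (8) to the nearest integer level is an added assumption, and the function name `equalize_histogram` is illustrative.

```python
import numpy as np

def equalize_histogram(image, levels=256):
    """Histogram-equalize an integer grayscale image with values in [0, levels - 1]."""
    image = np.asarray(image)
    n_k = np.bincount(image.ravel(), minlength=levels)  # pixels per gray level
    p = n_k / image.size                                # Eq. (6)
    c = np.cumsum(p)                                    # Eq. (7)
    # Eq. (8), rounded to the nearest gray level (assumption).
    mapping = np.round((levels - 1) * c).astype(image.dtype)
    return mapping[image]

# A low-contrast patch whose values are clustered between 50 and 53.
patch = np.array([[50, 50, 51, 51],
                  [51, 52, 52, 53]], dtype=np.uint8)
equalized = equalize_histogram(patch)
```

For this patch the four occupied gray levels 50, 51, 52 and 53 are mapped to 64, 159, 223 and 255, so the narrow input range is spread over most of the available output range.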

3.3 Retinex Theory

Retinex theory (Land 1977) is one of the major strategies employed in low-light image enhancement. As per the Retinex theory, the observed image is represented as the product of a reflectance component and an illumination component (Fig. 8).

Fig. 8
A schematic presents a polygon shaped object labeled r of x, y on which illumination l of x, y falls from the sun and reflects with s of x, y rays on the retinex.

Retinex model

As per Retinex theory,

$$S\left( {X,Y} \right) = R\left( {X,Y} \right) \cdot L(X,Y)$$
(9)

where

S(X, Y)—Observed image,

R(X, Y)—Reflectance component,

L(X, Y)—Illumination component.

A low-light image is one captured in a region of low illuminance. Illuminance is the measure of how much incident light illuminates the surface. For images taken in dim lighting, illuminance is below the standard level. As per the Retinex theory, the reflectance component is considered as the enhanced image, \(R = S/L\). By choosing the proper illumination map, the required enhanced image is obtained, and most of the research work is based on this equation. Using the illumination component, the enhanced output is obtained by a pixel-wise division. To overcome the difficulty in this division operation, an inverse term is used, as expressed in the following equation,

$$R = S \cdot L^{ - 1}$$
(10)

Using an inverse illumination map (\(L^{-1}\)), the enhanced image (R) is obtained. Many deep learning models use Retinex theory as their basis; in these models, the illumination map is constructed by various CNN models. Current research in deep learning is also carried out without Retinex theory. Deep learning models play an important role in the enhancement of low-light images.
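Eq. (10) can be illustrated with a short sketch that assumes the illumination map is already available, for example as the output of a prediction network. How the map is estimated is outside the scope of this sketch. The floor `eps` on the illumination and the final clipping to [0, 1] are added assumptions, and the function name `retinex_enhance` is illustrative.

```python
import numpy as np

def retinex_enhance(observed, illumination, eps=1e-3):
    """Recover reflectance R = S * L^-1 (Eq. 10) from S and an illumination map L."""
    s = np.asarray(observed, dtype=np.float64)
    illum = np.asarray(illumination, dtype=np.float64)
    if illum.ndim == s.ndim - 1:
        illum = illum[..., None]                 # broadcast a single-channel map over RGB
    inverse_illum = 1.0 / np.maximum(illum, eps)  # eps guards against division by zero
    return np.clip(s * inverse_illum, 0.0, 1.0)
```

Forming the inverse map once and multiplying mirrors the formulation in which a network predicts \(L^{-1}\) directly, so no division is needed at enhancement time.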

3.4 Deep Learning-Based Methods

Deep learning has been applied to computer vision tasks such as low-light image enhancement in recent years due to its excellent representation and generalization abilities. Many deep learning models use Retinex theory for their operation. A convolutional neural network (CNN) is a deep learning network architecture that learns directly from data. CNNs are especially useful for detecting patterns in images in order to recognize objects, classes and categories.

Figure 9 shows the basic architecture of a convolutional neural network (CNN). The convolution layer extracts meaningful information by applying a sliding window to the input matrix. The pooling layer performs dimensionality reduction by reducing the height and width while maintaining the depth information. Depending on the application, different types of pooling are preferred: maximum pooling, average pooling and minimum pooling. The fully connected layer performs the classification.
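The sliding-window and pooling operations can be sketched for a single-channel input as follows. This is a didactic illustration, not an efficient implementation: the kernel is a fixed hypothetical averaging filter, whereas a CNN learns its kernel weights during training, and the window is applied without flipping the kernel (cross-correlation), which is how deep learning frameworks usually define convolution.

```python
import numpy as np

def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    h = feature_map.shape[0] // size * size
    w = feature_map.shape[1] // size * size
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

image = np.arange(25, dtype=np.float64).reshape(5, 5)
mean_kernel = np.full((2, 2), 0.25)        # hypothetical 2 x 2 averaging filter
features = conv2d(image, mean_kernel)      # shape (4, 4)
pooled = max_pool2d(features)              # shape (2, 2)
```

The 5 x 5 input shrinks to 4 x 4 after the convolution and to 2 x 2 after pooling. In a multi-channel network the same pooling is applied to each channel separately, which is why the depth is preserved.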

Fig. 9
A diagram of a convolutional neural network with interconnected layers of input, convolution, pooling, fully connected nodes, and output nodes. Input, convolution, and pooling are for feature extraction while fully connected and output are for classification.

Convolutional neural networks

A generative adversarial network (GAN) (Goodfellow et al. 2014) is an unsupervised deep learning-based model that uses unlabelled data for training. A GAN contains two neural networks, called the generator and the discriminator, which compete against one another and can evaluate, discover and follow variations within the dataset. The generator generates fake samples of images and tries to fool the discriminator, while the discriminator tries to tell the generated samples from real ones. During the training phase, the generator and discriminator run in competition with each other, and the model is trained to function more effectively during each epoch.
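The competition between the two networks can be made concrete through their loss functions. The sketch below follows the standard formulation of Goodfellow et al. (2014) and uses the non-saturating generator loss that is common in practice. The networks themselves are omitted: `d_real` and `d_fake` are hypothetical names for the probabilities the discriminator assigns to real and generated samples.

```python
import numpy as np

def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Binary cross-entropy: real samples should score 1, generated samples 0."""
    d_real = np.clip(d_real, eps, 1.0 - eps)
    d_fake = np.clip(d_fake, eps, 1.0 - eps)
    return -np.mean(np.log(d_real)) - np.mean(np.log(1.0 - d_fake))

def generator_loss(d_fake, eps=1e-12):
    """Non-saturating generator loss: low when the discriminator is fooled."""
    d_fake = np.clip(d_fake, eps, 1.0 - eps)
    return -np.mean(np.log(d_fake))
```

When the discriminator assigns high probabilities to generated samples, the generator loss falls and the discriminator loss rises, so each training step that improves one network makes the task harder for the other.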

A Retinex-based attention network (Huang et al. 2020) uses Retinex as the basic theory for the learning of deep neural networks. This technique calculates an improved image from a reflectance map. The illumination extraction block is built around an attention mechanism module, resulting in an illumination map prediction network. To gain more precise illumination information for the input image, the attention module is inserted between the convolution layer and batch normalization. For low-illumination images with either uniform or uneven lighting, this model lessens the impact of noise and of the augmented information that results.

A Multiscale Attention Retinex Network (MARN) (Zhang and Wang 2021) is designed to predict a detailed inverse illumination map of the input image. Compared with various CNN algorithms, MARN gives better feature extraction and improves the generalization capability of the network. Instead of using more image priors, an illumination attention map is used to train the model, which improves the quality of the image in various lighting conditions. Training utilizes reconstruction loss, structure similarity loss and detail loss. Once the inverse illumination is predicted, the reflectance map is calculated by using Retinex theory, and this reflectance map is taken as the enhanced image.

In a simple generative adversarial network with a Retinex model (Ma et al. 2021), a decomposition network is used to decompose the low-light image into illuminance and reflection maps. Unpaired datasets are used for training the GAN structure, which gives the model better generalization. This structure reduces training complexity and training time. The model is applied to mobile phones with small memory.

EnlightenGAN (Jiang et al. 2021) is a modified GAN structure that can be trained without image pairs. Even with unpaired datasets, this structure generalizes very well to various real-time images. The model introduces a global and local discriminator structure that handles spatially varying light conditions in the input image. The results of EnlightenGAN are compared with several state-of-the-art methods, and the reported results show the superiority of EnlightenGAN.

Various approaches have been used to improve image segmentation (Long et al. 2015). Segmentation is the process of dividing an image into its various parts, a basic operation in many computer vision tasks. Segmentation performs well during daytime or in bright light. For low-light images, it performs poorly because of the presence of noise, blur, etc. Segmentation can be divided into single-class and multi-class segmentation. In single-class segmentation (Wang and Ren 2018), only one object or one feature is considered for segmentation. In multi-class segmentation (Dai and Gool 2018), multiple features are considered. Cho et al. (2020) introduce semantic segmentation of low-light images with a modified Cycle GAN. The modified Cycle GAN is trained using a paired dataset, and an L1 loss function is added to the existing Cycle GAN to improve the performance of the segmentation.

Table 2 summarizes the low-light image enhancement techniques.

Table 2 Low-light image enhancement techniques

4 Conclusion

Various state-of-the-art methods for low-light image enhancement are discussed in this paper. Many of the deep learning structures use Retinex as the basic theory of operation, and the illumination map is modified by using various learning architectures; CNN, GAN and Cycle GAN are a few examples of such deep learning models. This survey also presents some works that are suitable for noisy environments. In many real-time applications, low-light conditions may occur due to the unavailability of environmental light, and low-light image enhancement plays a crucial role in each of these scenarios. Low-light image enhancement can also be extended to the enhancement of low-light video.