1 Introduction

Computer vision is a field of study that develops techniques to help computers gain a high-level understanding of digital images or videos. It automates various tasks and extracts useful information from images and videos with the help of artificial intelligence systems. Numerous computer vision applications, including smart transportation systems, video surveillance, object detection, and weather forecasting [1], require high-quality input images or videos to "see" and analyze their contents. Unfortunately, poor weather conditions (haze, fog, rain, etc.) diminish visibility and lead to the failure of these applications. Images captured under such circumstances suffer from various degradations, namely low contrast, faded colors, and, most importantly, reduced visibility. These degradations arise from the scattering of light by particles (aerosols, water droplets, molecules, etc.) suspended in the atmosphere.

The role of image dehazing is to improve the visual quality of a degraded image and remove the influence of the weather. An image dehazing algorithm therefore acts as a preprocessing tool for many computer vision applications, as shown in Fig. 1.

Fig. 1 An application of image dehazing

Fog, mist, and haze are atmospheric phenomena that reduce the visibility of a scene. Fog and mist both occur when the air contains wet particles or water droplets; the two terms describe essentially the same phenomenon and differ only in how far one can see. Fog generally refers to visibility of less than 1 km; if one can see farther than 1 km, the condition is considered mist. Haze is a slightly different phenomenon, in which extremely small, dry particles, for example, air pollutants, dust, smoke, and chemicals, are suspended in the air. These dry particles are invisible to the naked eye but sufficient to degrade the quality of an image in terms of visibility, contrast, and color. In the presence of haze, visibility is less than 1.25 miles. Such particles are generated by various sources, including farming, traffic, industry, and wildfires. Figure 2 shows example images of fog, mist, and haze, along with various sources of hazy image formation.

Fig. 2 Images of a fog, b mist, c haze, d source: air pollutants, e source: farming, f source: wildfires

The haze effect in a captured image is expressed by the atmospheric scattering model (ASM), also called the physical model of hazy image formation, shown in Fig. 3. When incident light is reflected from an object, the reflected light is attenuated over the distance between the observer and the scene. In addition, scattering by atmospheric particles introduces airlight into the camera. A hazy image is therefore composed of direct attenuation and airlight: direct attenuation distorts the colors, whereas airlight reduces the visibility. The physical model is given as follows [2]:

$$I_{hazy}^{c}(x) = J_{haze\text{-}free}^{c}(x)\,T_{r}(x) + A_{t}^{c}\left(1 - T_{r}(x)\right)$$
(1)

where \(c \in \{ r,g,b\}\) is the color channel, \(I_{hazy}^{c}\) is the captured hazy image, \(J_{haze\text{-}free}^{c}\) is the haze-free image, \(A_{t}^{c}\) is the atmospheric light, \(T_{r}\) is the transmission of the medium, and x is a pixel position. The transmission describes the portion of light that reaches the camera directly, without scattering; its value lies in the range [0, 1]. It is expressed as an exponential function of two parameters, the distance d and the scattering coefficient \(\beta\), as follows:

$$T_{r}(x) = e^{-\beta\, d(x)}$$
(2)
Fig. 3 Physical model of hazy image formation

The haze-free image \(J_{haze\text{-}free}^{c}\) can be obtained by inverting Eq. (1) as follows:

$$J_{haze\text{-}free}^{c}(x) = \frac{I_{hazy}^{c}(x) - A_{t}^{c}}{T_{r}(x)} + A_{t}^{c}$$
(3)

Single image dehazing (SID) is an ill-posed problem because the two key parameters \(A_{t}^{c}\) and \(T_{r}\) must be estimated from \(I_{hazy}^{c}\) alone to recover the haze-free image \(J_{haze\text{-}free}^{c}\). The performance of a dehazing method therefore depends on the estimation of these key parameters.
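
For illustration, the following minimal NumPy sketch instantiates Eqs. (1)–(3): it synthesizes a hazy image from a clean image and a depth map and then inverts the model. The scalar airlight, scattering coefficient, and transmission floor used here are illustrative assumptions, not values prescribed by the model.

```python
import numpy as np

def synthesize_haze(J, depth, A=0.9, beta=1.2):
    """Forward model, Eqs. (1)-(2). J: clean RGB image in [0, 1], shape (H, W, 3);
    depth: per-pixel distance d(x), shape (H, W). A and beta are assumed scalars."""
    t = np.exp(-beta * depth)          # Eq. (2): transmission falls off with distance
    t = t[..., None]                   # broadcast over the color channels
    return J * t + A * (1.0 - t)       # Eq. (1): direct attenuation + airlight

def recover_scene(I, t, A=0.9, t_min=0.1):
    """Inverse model, Eq. (3); t is floored at t_min to avoid noise blow-up."""
    t = np.maximum(t, t_min)[..., None]
    return np.clip((I - A) / t + A, 0.0, 1.0)
```

The inversion is only this direct when \(A_{t}^{c}\) and \(T_{r}\) are known; estimating them from the hazy image alone is precisely the ill-posed part of SID.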

In the past, many dehazing methods were proposed that utilize prior knowledge or assumptions to compute the depth information. However, the performance of these methods depends on the validity of those priors, and violations may lead to various issues, such as color distortion, incomplete haze removal, and halo artifacts. Image enhancement based dehazing methods have also been reported in the literature; these do not require estimation of the transmission or its costly refinement process. However, since they do not take the degradation mechanism into account while recovering the image, they suffer from over/under enhancement, over-saturation, and loss of information, and they are also unable to deal with dense hazy images. To overcome the problems of restoration and enhancement-based methods, many machine learning and deep learning methods have been successfully applied to compute an accurate transmission map. These methods require a vast amount of hazy images and corresponding clean images to train the model, yet it is very difficult to obtain hazy images and their ground truth (GT) counterparts in the real world. The related work section describes the recent dehazing methods of each category along with their pros and cons.

In this review article, we mainly focus on single image haze removal methods proposed in 2016 and onwards. The major contributions are as follows:

  1. This paper provides an extensive study of recent state-of-the-art dehazing methods. It classifies these methods into twelve categories: image enhancement, image restoration with priors, image fusion, superpixel, machine learning, deep learning, polarization, DCP based, airlight estimation, hardware implementation, non-homogeneous, and miscellaneous. All these methods are examined with respect to various dehazing parameters, namely key technique, dataset, dehazing issues, and evaluation metrics.

  2. It provides a comprehensive study of the datasets used in image dehazing to date, covering haze densities from thin to very dense and including both real and synthetic hazy images. These datasets are assessed on various parameters, namely haze concentration, number of images, and the performance of recent dehazing methods.

  3. This paper also explores the different metrics introduced in recent works for the evaluation of dehazing algorithms, along with their merits and demerits.

  4. Furthermore, this paper covers the latest technological advances and developments in this field from the perspectives of non-homogeneous haze removal, dense haze, hardware architectures, ensemble networks, and deep learning methods.

  5. Finally, it identifies research gaps in single image dehazing where recent state-of-the-art methods are lacking.

A few survey papers are available in the field of single image dehazing; however, they are limited to certain aspects. For instance, [3] concentrated on discussing various haze removal methods and their quantitative results. Later, Wang et al. [4] added a description of different evaluation metrics. Singh et al. [5] explained numerous categories of dehazing methods with their pros and cons and analyzed methods based on dehazing issues; however, they provided neither a qualitative and quantitative analysis of dehazing methods nor a discussion of the standard dehazing datasets available for assessment. In 2020, two survey papers [6, 7] were reported, but they consider only a few recent papers from 2017 to 2020. This article considers approximately 150 recent papers, compared to 46 in [7] and 51 in [6]. The comparison with existing survey/review papers is illustrated in Table 1, which shows the coverage of each survey across various parameters of image dehazing. In addition to previous research, this paper explores various untouched haze removal techniques for handling the most challenging problems of dehazing, such as removal of non-homogeneous haze, superpixels, dense haze, and real-time applicability (hardware implementation). This article provides an extensive review of recent and popular dehazing techniques based on qualitative and quantitative comparisons, challenges in dehazing, available datasets, and evaluation metrics. This paper aspires to serve as a guide to all aspects of image dehazing and to help researchers find a path for their work.

Table 1 Comparison with existing survey/review papers

2 Applications of Image Dehazing

Image dehazing is an important area of research. The output of dehazing algorithms acts as an input to various vision applications. Some of these motivating applications are shown in Fig. 4 and discussed as follows:

Fig. 4 Applications of image dehazing: a video surveillance, b fog-related road accidents, c road transportation, d railway transportation, e air transportation, f underwater image enhancement, g remote sensing

2.1 Video Surveillance

A video surveillance system is a key component in the field of security. The effectiveness and accuracy of a visual surveillance system depend on the quality of its visual input. However, poor weather conditions degrade this quality: the video captured by a surveillance camera deteriorates due to the scattering and absorption of light by the atmosphere. For example, video recorded in hazy weather has limited visibility, which can be problematic for police investigating a crime. Thus, these systems perform poorly in hazy weather conditions, and a robust surveillance system is required.

2.2 Intelligent Transportation System

Foggy weather conditions impair the driver's capabilities and, due to limited visibility, significantly increase both the risk of accidents and the travel time. In past years, fog-related road fatalities have increased significantly: road crashes, injuries, and deaths on account of poor weather conditions like thick fog run into the thousands every year on highways, and this number is increasing every year [8].

In addition to roads and highways, fog also affects other transportation systems such as airplanes and railways. Takeoff and landing of airplanes become very challenging in a hazy environment; as a result, many flights are delayed or even canceled. Similarly, in railway transportation, thick foggy conditions are a hazard to passengers and crew members and can easily result in loss of life, since the driver may miss signals due to impaired visibility. Therefore, we require intelligent transportation systems that can provide a clear view to the driver in these transports to save life and property.

2.3 Underwater Image Enhancement

Underwater imaging often suffers from poor visibility and color distortion. The poor visibility is produced by a haze-like effect caused by the multiple scattering of light by water particles, while color distortion is due to the attenuation of light, which makes an image bluish. Therefore, an underwater vision system requires an image dehazing algorithm as a preprocessing step so that a human can see the underwater objects.

2.4 Remote Sensing

In remote sensing, images are captured to obtain information about objects or areas. These images are usually taken from satellites or aircraft. Due to the large distance between the camera and the scene, a haze effect is introduced into the captured scene. Therefore, this application also demands image dehazing as a preprocessing tool to improve the visual quality of an image before analysis.

Besides these applications, image dehazing also plays an important role in other applications, such as astronomy, medical science, agronomy, border security, archaeology, environmental studies and many more.

Therefore, it is important for computer vision applications to improve the visual quality of the image and highlight image details. On the hardware side of camera sensors, many super-telephoto lenses incorporate scientific filtering and coatings to enhance image contrast. However, these lenses are very expensive and bulky, and thus not applicable in daily life. Therefore, the restoration of hazy images and videos has attracted increasing interest in the last few years.

3 Issues/Challenges of Image Dehazing

A dehazed image may suffer from various types of issues, such as color shift, over enhancement, structure damage, or incomplete haze removal, as shown in Fig. 5.

Fig. 5 Various issues of image dehazing [9]: a incomplete haze removal, b structure damage, c color shift, d over enhancement

3.1 Under/Over Enhancement

Restoration of hazy images often leads to two phenomena: under enhancement and over enhancement, as shown in Fig. 6. In under enhancement, haze is not completely removed from the original image, so the visibility is not improved as desired. In over enhancement, the original information is changed in haze-free regions and a color shift is introduced in hazy regions during the dehazing process [9]. This problem is generally observed in dense hazy regions, which have low contrast. Over dehazing makes the colors much darker and causes saturation of pixels.

Fig. 6 Hazy and haze-free images illustrating over/under enhancement problems: a much darker colors produced by method [15], b under enhancement by method [64], c saturation of pixels by method [51], d color distortion by method [13]

Image dehazing algorithms must keep the information in haze-free regions unchanged while remaining capable of improving the visibility in hazy regions without color distortion.

3.2 Halo Artifact and Noise Amplification

Existing image dehazing methods generally use a patch-based approach to estimate the transmission and recover the hazy image. Inaccurate estimation of the transmission may lead to distortions in the dehazed image, as shown in Fig. 7. Most methods are also based on the assumption that pixels within a local patch have similar depth; depth discontinuities or abrupt jumps in an image therefore cause halo artifacts. To mitigate halo artifacts, various refinement methods such as guided filtering, contextual regularization, and total variation have been utilized in many works. Still, the problem persists: halo artifacts are reduced but not completely removed.
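
To make the refinement step concrete, the sketch below implements a basic guided filter, one of the refinement techniques mentioned above, using NumPy/SciPy; the window radius and regularization strength are illustrative assumptions that each method tunes differently.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(guide, src, radius=20, eps=1e-3):
    """Edge-preserving smoothing of a coarse transmission map `src`,
    guided by the grayscale hazy image `guide` (both in [0, 1])."""
    win = 2 * radius + 1
    mean_I = uniform_filter(guide, win)
    mean_p = uniform_filter(src, win)
    cov_Ip = uniform_filter(guide * src, win) - mean_I * mean_p
    var_I = uniform_filter(guide * guide, win) - mean_I * mean_I
    a = cov_Ip / (var_I + eps)              # per-window linear model t ~ a*I + b
    b = mean_p - a * mean_I
    mean_a = uniform_filter(a, win)
    mean_b = uniform_filter(b, win)
    return mean_a * guide + mean_b          # refined map follows the guide's edges
```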

Fig. 7 Various distortions in dehazed images: a halo artifacts by method [63], b blurring effect by method [51], c noise amplification by method [66]

Moreover, in the presence of dense haze, noises and artifacts are not visible in the hazy image itself. Existing methods may amplify these noises and artifacts during the dehazing process, depending on the depth and concentration of the haze [10]. Some of them introduce other distortions, such as a blurring effect, in the dehazed images.

3.3 Dense Fog Removal / Different Foggy Weather Conditions

Among state-of-the-art dehazing methods, there is to date not a single method that can remove the effects of all varying and challenging weather conditions, such as haze ranging from thin to very thick, night-time haze, and non-homogeneous haze (unevenly distributed haze), as shown in Fig. 8.

Fig. 8 Examples of challenging weather conditions: a dense hazy images, b non-homogeneous hazy images, c night-time hazy images

Most methods work well on daytime scenes but fail in night-time hazy conditions due to inaccurate estimation of the airlight. Generally, the airlight is estimated from the brightest pixels. This estimator faces two challenges when applied to night-time scenes: (1) it is estimated globally over the entire image, whereas night scenes contain multiple local, non-uniform light sources; and (2) it selects white pixels as the brightest pixels in the hazy image, which does not hold for night-time scenes that exhibit strong colored lighting [11].

The majority of methods are able to remove mild or thin haze. In the presence of dense haze, they either fail to remove the haze completely or lose information through saturation of pixels. In thick fog, the scene reflection becomes very small because the transmission is small; the small transmission results from the large scattering coefficient, while the proportion of airlight increases significantly. Therefore, removing thick haze from such a minuscule reflection is a very challenging task.

3.4 Adaptive Parameter Setting

The performance of dehazing methods greatly depends on the selection of different parameters, namely patch size, dehazing control parameter, Gamma correction, filter size, regularization term, scaling factor, number of superpixels, etc. For example, if the patch size is small, the transmission may be underestimated, especially in regions with bright and white objects, which may lead to over enhancement. By contrast, if the patch size is large, halo artifacts may appear at depth discontinuities and the computation also increases [12]. Therefore, for a good recovery result, the patch size must be selected adaptively depending on the pixels. Another parameter used by most methods is the dehazing control parameter, as shown in Fig. 9.

Fig. 9 Images restored with different δ by method [30]: a original image, b δ = 1.0, c δ = 0.8, d δ = 0.6, e δ = 0.4

All these parameters are set manually according to the experimental setup, so they may not fit the different degrees of haze present in images. These parameters must be set adaptively to improve performance, because haze density and atmospheric veil vary from image to image.

3.5 Speed of Dehazing

Another drawback of existing dehazing methods is the computational complexity of the dehazing process. It is still very challenging to dehaze an image or video in real time, which would benefit various vision applications such as intelligent transportation systems and video surveillance. The time complexity can be reduced by jointly estimating the airlight and the transmission and by avoiding the costly refinement process of the transmission.

4 Related Works

In recent years, significant progress has been made in the field of image dehazing. We present recent and popular dehazing methods in this section. For convenience, we have divided these methods into the following categories: (1) image enhancement based, (2) image restoration with priors, (3) image fusion based, (4) non-homogeneous haze, (5) hardware implementation based, (6) polarization based, (7) traditional learning based, (8) deep learning based, and (9) superpixel based. Furthermore, subcategories of each category are identified, as shown in Fig. 10.

Fig. 10 Different categories of image dehazing methods

4.1 Image Enhancement based Methods

Image enhancement-based methods can be divided into two sub-categories: (1) methods that do not consider the atmospheric scattering model or degradation mechanism to enhance the visual quality of hazy images, and therefore do not estimate the transmission and atmospheric light; and (2) methods in which image enhancement operations are utilized in the estimation of the transmission or airlight; these may also fall into other categories, such as restoration or fusion based. Both sub-categories use various image enhancement techniques, including histogram equalization [13, 14], bi-histogram modification [15], weighted histograms [16], Gamma correction [13, 17,18,19,20], multi-scale retinex [21], wavelet decomposition [22,23,24,25], multi-scale gradient domain contrast enhancement [17], texture filtering [26], bilateral filtering [26, 27], white balancing [26, 28], median filtering [28, 29], linear transformation [30], morphological reconstruction [31], discrete cosine transform [14], guided filtering [32,33,34,35,36,37,38,39], anisotropic diffusion [40, 41], contrast enhancement [42,43,44], quadtree decomposition [30, 45, 46], contextual regularization [45, 47,48,49], weighted L1-norm regularization [50], and total variation [51,52,53] (Table 2).

Table 2 Comparison of existing image enhancement-based methods
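
As a flavor of the first sub-category, the sketch below applies two of the listed operations, Gamma correction and global histogram equalization, directly to a hazy image. It is a generic illustration rather than a reproduction of any cited method, and the gamma value is an arbitrary assumption.

```python
import numpy as np

def gamma_correction(I, gamma=0.7):
    """Pointwise Gamma correction of an image in [0, 1]; gamma < 1 lifts dark regions."""
    return np.power(I, gamma)

def histogram_equalization(channel, bins=256):
    """Global histogram equalization of a single channel in [0, 1]."""
    hist, edges = np.histogram(channel, bins=bins, range=(0.0, 1.0))
    cdf = hist.cumsum() / channel.size                     # normalized cumulative histogram
    return np.interp(channel.ravel(), edges[:-1], cdf).reshape(channel.shape)
```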

Wang et al. [21] proposed a multi-scale Retinex based algorithm with color restoration to compute the transmission. They estimated the atmospheric light using a dark channel image and a decision image obtained by thresholding. However, the dehazed image contains small halos and also appears dark in regions of small gradients and in bright areas. Cui et al. [50] proposed a SID method based on region segmentation, which separates the hazy image into bright and non-bright regions. This removes the DCP method's problems of overestimating the transmission in non-bright regions and underestimating it in bright regions. Weighted L1-norm regularization is used to refine the transmission. However, this method suffers from over-saturation; moreover, it underestimates the transmission for objects that resemble dense haze, leading to over enhancement. Liu et al. [53] proposed a solution to two challenging problems of existing dehazing methods: (1) halo artifacts due to the insufficiency of edges in the estimated transmission, and (2) amplification of noise and artifacts in the presence of dense haze. Their method estimates the initial transmission by a boundary constraint and refines it by non-local total variation (NLTV) regularization. However, the method fails in the presence of white objects such as clouds or dense haze, and as a result the dehazed image looks darker. Furthermore, a post-processing step is required to improve the quality of the haze-free image, and the low SSIM and CIEDE2000 values indicate that its performance on synthetic hazy images is not satisfactory. Raikwar et al. [47] estimate a lower bound on the transmission by considering the difference between the minimum channels of a hazy and a haze-free image. The lower bound is characterized by a bounding function and a quality control parameter: the bounding function is estimated by a non-linear model, and the control parameter is used to control the degree of dehazing. However, this method is unable to increase the contrast of dense hazy images. Wu et al. [54] proposed a variational model to remove artifacts caused by noise present in the hazy image. They proposed a transmission-aware non-local regularization that suppresses noise and recovers the fine details of the dehazed image without noise amplification; in addition, a semantic-guided regularization is proposed to smooth the transmission. This method provides satisfactory results without amplifying noise. However, it fails on non-homogeneous hazy images. Furthermore, when objects lie in the same plane and look similar, vanishing lines are falsely estimated and cannot update the segmentation process; in that case the method wrongly estimates the transmission, the scene radiance, and the segmentation map of a hazy image.

In summary, image enhancement-based methods do not use the physical model of haze formation and do not concentrate on overall image quality. They only highlight certain details of the image and may reduce or remove some information from the dehazed image. These methods suffer from over-saturation of pixels and over enhancement, and they are unable to remove dense haze. However, when image enhancement techniques are combined with a physical model, as in [22, 30, 45], their performance improves considerably.

4.2 Image Fusion Based Methods

Image fusion is an image processing technique that selects the best regions from multiple images and combines them into a single high-quality image. The fused image is generated in a transformed domain, such as Gaussian and Laplacian pyramids, the gradient domain, linear fusion, high-boost filtering, guided filtering, or variational approaches (Table 3).

Table 3 Comparison of existing image fusion-based methods

The authors of [52] proposed a multiple-prior based method to estimate the global atmospheric light. Three priors, color saturation, brightness, and gradient map, are combined to judge whether a pixel belongs to the atmospheric light or not. The method computes two coarse transmission maps: a pixel-level transmission map (PTM) and a block-level transmission map (BTM). A fusion procedure combines these two transmissions as follows:

$$F_{i} = P_{i} \left( {1 - \left( \frac{i}{N} \right)^{3} } \right) + B_{i} \left( \frac{i}{N} \right)^{3}$$
(4)

The transmission map is computed with a Laplacian pyramid, where N is the number of decomposition levels, Pi and Bi denote the decomposition results of the PTM and BTM, respectively, and Fi is the linear fusion of Pi and Bi. The fused transmission is then refined by total variation. This method suffers from several problems, e.g., incomplete haze removal, an inability to highlight local image details, and an inability to remove dense haze.
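
A minimal sketch of the level-wise blend in Eq. (4), assuming the Laplacian pyramids of the two coarse transmission maps have already been built (pyramid construction and collapse, as well as the total-variation refinement, are omitted):

```python
import numpy as np

def fuse_pyramid_levels(P_levels, B_levels):
    """Blend PTM and BTM pyramid levels per Eq. (4): fine levels lean on the
    pixel-level map, coarse levels on the block-level map."""
    N = len(P_levels)
    fused = []
    for i, (P_i, B_i) in enumerate(zip(P_levels, B_levels)):
        w = (i / N) ** 3                        # cubic weight grows toward coarse levels
        fused.append(P_i * (1.0 - w) + B_i * w)
    return fused
```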

Existing deep learning methods are often trained on synthetic indoor hazy images, so their performance on outdoor hazy images is not satisfactory. Park et al. [55] proposed a heterogeneous generative adversarial network (GAN), consisting of a CycleGAN and a conditional GAN, for restoring a haze-free image while preserving texture details. In Phase 1, a CycleGAN is trained on unpaired outdoor synthetic hazy images. Phase 2 utilizes several networks for atmospheric light estimation, transmission map estimation, and a fusion CNN; these three networks are trained through adversarial learning. The fusion CNN combines the outputs of Phase 1 and Phase 2 to produce the dehazed image.

Zhu et al. [56] proposed a fusion-based algorithm that solves the image dehazing problem without considering the degradation mechanism. A set of under-exposed images is generated using Gamma correction coefficients. A guided filter decomposes each under-exposed image into global (base) and local (detail) components. For the local components, the exposure quality of the image is measured by applying an average filter to the luminance component. The global components reflect the structure information of the image, and their weights are calculated from the initial global components and a quadratic function of the average luminance. Once the weights are ready, the under-exposed images are fused pixel-wise: the global components Bi and local components Di of the multiple gamma-corrected inputs are combined as follows:

$$F = \sum\limits_{i = 1}^{n} {W_{i}^{B} B_{i} } + \alpha \sum\limits_{i = 1}^{n} {W_{i}^{D} D_{i} }$$
(5)

where \(\alpha \ge 1\) controls the local details in the fused image. Finally, to improve the color quality of the dehazed image, saturation adjustment is performed. The framework of this method is shown in Fig. 11. The overall performance of this method is good, achieving satisfactory results with computational efficiency.
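
A minimal sketch of the weighted fusion in Eq. (5), assuming the per-image base components, detail components, and weight maps have already been produced by the decomposition described above:

```python
def fuse_exposures(B_list, D_list, WB_list, WD_list, alpha=1.5):
    """Pixel-wise fusion per Eq. (5): weighted base components plus
    alpha-boosted weighted detail components (alpha >= 1 controls local detail)."""
    base = sum(w * b for w, b in zip(WB_list, B_list))
    detail = sum(w * d for w, d in zip(WD_list, D_list))
    return base + alpha * detail
```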

Fig. 11 The framework of the method [56]

Yuan et al. [57] proposed a transmission fusion strategy for handling the normal and bright regions of a hazy image. They propose soft segmentation based on image matting to segment the image: means and variances of local patches are calculated and a binary classification is performed to generate a trimap, after which image matting segments the hazy image into normal and bright regions. For normal regions the transmission is calculated by DCP, while for bright regions it is calculated by an atmospheric veil correction (AVC) method. Finally, a fuzzy fusion method fuses the two transmissions obtained by DCP and AVC. The framework of the method [57] is shown in Fig. 12. This method is tested on various challenging hazy images; however, it has high computational complexity due to the estimation of two transmissions, the binary classification, and the fuzzy fusion. It also suffers from over enhancement and halo artifacts.

Fig. 12 The framework of the method [57]

Ma et al. [58] proposed a method to enhance the visibility of sea fog images. In the fusion process, the first image is obtained by a linear transformation and the second is generated by a high-boost filtering algorithm based on a guided filter. A simple fusion process combines these two images, and the dehazed image is obtained by white balancing the fused image. However, this method produces halo artifacts and is unable to remove noise in the dehazed image.

Son et al. [59] proposed a near-infrared fusion model to deal with color distortion and haze removal. This method augments the traditional haze degradation model with color and depth regularizations. The color regularization assigns colors to the haze-free image based on the colorized near-infrared image and the visible color image, while the depth regularization estimates the depth of the colorized near-infrared image. Finally, both regularizations transfer visibility and colors into the dehazed version of the captured visible image. Shibata et al. [60] focused on developing an application-adaptive importance-measure image fusion method applicable to many tasks, including night vision, temperature-perceptible fusion, depth-perceptible fusion, haze removal, and image restoration. It is a learning-based framework that extracts various features (Gabor, intensity, local contrast, gradient) from the decomposed images and learns the important areas of the image without knowing the application. Zhao et al. [61] handle two problems of dehazing: misestimation of the transmission and over-saturation. They first identify the edges, called TME, at which the transmission is misestimated; accordingly, the hazy image is divided into TME and non-TME regions. Multi-scale fusion is used to fuse the patch-wise and pixel-wise transmissions. This method greatly enhances the visibility of the hazy image, but it has a high computation time. Moreover, two post-processing steps (fast gradient-domain GIF and exposure enhancement) are applied to the fused image to obtain the final haze-free image.

Agrawal et al. [62] proposed a fusion method based on the joint cumulative distribution function (JCDF). This method dehazes long-shot hazy images without color distortion in nearby regions while enhancing the visibility in faraway regions. It generates multiple images from different modules (faraway, nearby, and CLAHE) and finally fuses them into a single high-quality, artifact-free image in the gradient domain.

The method uses the following JCDF to generate multiple images in the nearby and faraway modules:

$$F_{Z}(z) = 1 - e^{-\lambda z}\,(1 + \lambda z)$$
(6)

where \(z = x_{1} + x_{2} = x^{d_{\min}} + x^{d_{\max}}\); \(d_{\min}\) deals with the fog in nearby regions, whereas \(d_{\max}\) deals with the fog in faraway regions. The parameters \(d_{\min}\) and \(d_{\max}\) are set to 2 and 10, respectively. \(\lambda\) is the dehazing parameter used to generate the images for the fusion process: one image is generated with \(\lambda = 2\) for the faraway region, and three images with \(\lambda = 5, 8, 40\) are generated for the nearby region to avoid over-saturation and color distortion. Furthermore, to increase the contrast, CLAHE is used to generate one more image. Finally, all these images are fused into a single dehazed image in the gradient domain, as shown in Fig. 13.
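
For illustration, the sketch below evaluates Eq. (6) for the \(\lambda\) values quoted above; how each curve is then mapped to an intermediate image is specific to [62] and is not reproduced here.

```python
import numpy as np

def jcdf(z, lam):
    """Joint CDF of Eq. (6): F_Z(z) = 1 - exp(-lam * z) * (1 + lam * z)."""
    return 1.0 - np.exp(-lam * z) * (1.0 + lam * z)

z = np.linspace(0.0, 1.0, 5)
for lam in (2, 5, 8, 40):                     # lambda values used in [62]
    print(lam, np.round(jcdf(z, lam), 3))     # larger lambda saturates faster
```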

Fig. 13 Image generation in each module: a hazy image, b image generated in the faraway region, c–e images generated in the nearby region, f CLAHE image, g dehazed image

Recently, several effective fusion-based techniques have been introduced that combine multiple images generated by image enhancement or restoration based methods. These methods successfully address the problems of DCP, edge preservation, dense haze removal, and halo artifacts. However, the fusion procedure can be complex, and the dehazing speed may drop because multiple images must be generated by the enhancement operators.

4.3 Superpixel Based Dehazing

Another recently introduced category of dehazing methods is superpixel based. Superpixels are utilized in dehazing in two ways. First, they are used to segment the sky and non-sky regions, which removes DCP's problem of color distortions or color artifacts in sky regions. Second, superpixel segmentation can replace patch-based operations; this offers two advantages, good dehazing speed and reduction of halo artifacts (Table 4).

Table 4 Comparison of existing superpixel based methods

Two problems are associated with superpixel based approaches: over enhancement and time complexity. In these approaches, the number of superpixels is decided manually: too many superpixels may darken the colors, while too few may not suffice to remove the haze. Another problem is the choice of superpixel segmentation algorithm, since some algorithms have high computational complexity. It is therefore advisable to select an algorithm that can extract superpixels in real time.
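
As an illustration of replacing square patches with superpixels, the sketch below computes a dark channel per SLIC superpixel; the segment count and compactness are arbitrary assumptions, and scikit-image's slic is used purely for convenience.

```python
import numpy as np
from skimage.segmentation import slic

def superpixel_dark_channel(I, n_segments=500):
    """Dark channel taken over superpixels instead of square patches,
    which respects depth edges and thus reduces halos."""
    labels = slic(I, n_segments=n_segments, compactness=10)
    min_rgb = I.min(axis=2)                   # per-pixel minimum over RGB
    dark = np.empty_like(min_rgb)
    for lbl in np.unique(labels):
        mask = labels == lbl
        dark[mask] = min_rgb[mask].min()      # one dark value per superpixel
    return dark
```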

4.4 Prior Based Methods

Restoration-based methods use the physical model of haze formation. They compute the transmission or depth map based on priors or assumptions, such as the dark channel prior [63], color attenuation prior [64], average saturation prior [65], non-local prior [66], gradient profile prior [27], and color ellipsoid prior [67].

Berman et al. [66] proposed a non-local prior, as opposed to priors based on local patches. According to this prior, a haze-free image can be expressed by a few hundred distinct colors, which form clusters in RGB space whose pixels are spread over the entire image. In a hazy image, each cluster becomes a line in RGB space, termed a haze-line. These haze-lines are used to estimate the atmospheric light, the distance map, and the haze-free image. The failure case of this method is non-uniform lighting, which may lead to over enhancement and artifacts (Table 5).

Table 5 Comparison of existing restoration-based methods with priors

Singh et al. [40] handle the problem of preserving texture details in the presence of a complex background and a large haze gradient. They proposed a new prior, called the gradient profile prior, to evaluate the depth map. The transmission map is refined by anisotropic diffusion and an iterative learning based image filter. The image gradient gives the direction and magnitude and is calculated as follows:

$$\Delta I = \left( {\frac{\partial I}{{\partial m}},\frac{\partial I}{{\partial n}}} \right)$$
(7)

where \(\frac{\partial I}{{\partial m}}\) denotes the partial derivative of the image in the m direction and \(\frac{\partial I}{{\partial n}}\) the partial derivative in the n direction. \(\frac{\partial I}{{\partial m}}\) is computed as the central difference between the pixels before and after the current pixel:

$$\frac{\partial I}{{\partial m}} = \frac{I(m + 1,n) - I(m - 1,n)}{2}$$
(8)

and similarly, \(\frac{\partial I}{{\partial n}}\) is written as:

$$\frac{\partial I}{{\partial n}} = \frac{I(m,n + 1) - I(m,n - 1)}{2}$$
(9)

The pixel with the maximum gradient value in I is taken as the global atmospheric light, estimated as follows:

$$A = I\left(\mathop{\max}\limits_{c}\left(I_{m}^{c}\right)\right)$$
(10)

and transmission map is estimated as follows:

$$t(j) = 1 - \beta\,\mathop{\Delta}\limits_{n \in \Omega(j)}\left(\mathop{\Delta}\limits_{c}\,\frac{I_{m}^{c}(n)}{A_{l}^{c}}\right)$$
(11)

where \(\mathop{\Delta}\limits_{n \in \Omega(j)}\left(\mathop{\Delta}\limits_{c}\,\frac{I_{m}^{c}(n)}{A_{l}^{c}}\right)\) is the gradient profile prior of the normalized image. This overcomes the sky-region problem of the DCP method: in sky regions the prior tends toward 1, so t(j) tends toward 0. A small amount of haze, controlled by \(\beta\), is retained so that the image looks more natural.
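
A minimal sketch of the central differences in Eqs. (7)–(9); the prior itself (Eq. 11) involves method-specific operators from [40] and is not reproduced here.

```python
import numpy as np

def image_gradient(I):
    """Central-difference gradient per Eqs. (8)-(9); returns (dI/dm, dI/dn)."""
    dm = np.zeros_like(I)
    dn = np.zeros_like(I)
    dm[1:-1, :] = (I[2:, :] - I[:-2, :]) / 2.0    # difference along m (rows)
    dn[:, 1:-1] = (I[:, 2:] - I[:, :-2]) / 2.0    # difference along n (columns)
    return dm, dn
```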

Most prior based methods follow a physical model of haze formation that assumes single scattering under homogeneous haze. However, in a realistic environment, haze is non-homogeneous and there are multiple sources of scattering [68]. Besides, the dehazing results depend on the validity of the priors: if an assumption does not hold, various issues may result, such as incomplete haze removal, color distortion, or artifacts due to wrong estimation of the transmission.

4.5 Polarization Based Dehazing

Polarization-based methods utilize the polarization characteristics of light. They restore the depth information of the hazy scene using multiple images captured with different polarizer orientations, generally denoted I0 and I90. Some methods in this category are listed in Table 6.

Table 6 Comparison of existing polarization-based methods

Polarization-based dehazing methods have the great advantages of high efficiency and low computational complexity. They are effective in all kinds of turbid media, including haze, fog, and water, and they are capable of restoring dense hazy images with detailed information. However, they require a precise selection of image regions, such as the sky region, to estimate the key parameters, which is not always applicable in the real world. Also, photon noise, a well-known quantum-mechanical effect, is ignored by most existing polarization-based methods, resulting in noise amplification in the dehazed image.
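
For intuition, the following sketch follows the classic two-image polarization scheme that many of the listed methods build on; it assumes the degree of polarization p and the airlight at infinity A_inf have already been read off a sky region, which, as noted above, is exactly the step that limits real-world applicability.

```python
import numpy as np

def polarization_dehaze(I0, I90, p, A_inf, t_min=0.1):
    """Two-polarizer dehazing sketch (not any single method from Table 6):
    the airlight is partially polarized, so the difference of the two
    captures reveals it."""
    I = I0 + I90                               # total intensity
    A = np.abs(I0 - I90) / p                   # airlight map from polarized difference
    t = np.maximum(1.0 - A / A_inf, t_min)     # transmission implied by the airlight
    return (I - A) / t                         # recovered scene radiance
```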

4.6 DCP Based Dehazing

The dark channel prior (DCP) is a very simple and popular prior for haze removal. It is based on the observation that in haze-free images at least one color channel is significantly dark, i.e., the minimum color channel of a haze-free image is very close to 0 except in sky regions. The prior was introduced in 2010, and since then a great deal of research has aimed at improving its performance. In this section, we discuss recent DCP based methods along with which problem of DCP each of them solves (Table 7).

Table 7 Comparison of existing DCP based methods
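
To fix notation, a minimal sketch of the dark channel and the coarse DCP transmission; the 15-pixel patch and the haze-retention factor ω = 0.95 are the commonly quoted defaults from the original DCP work.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(I, patch=15):
    """Dark channel of an RGB image I in [0, 1]: local minimum over
    channels and over a patch-sized neighborhood."""
    return minimum_filter(I.min(axis=2), size=patch)

def estimate_transmission(I, A, patch=15, omega=0.95):
    """Coarse DCP transmission, t = 1 - omega * dark(I / A);
    A is the per-channel airlight, e.g. np.array([0.9, 0.9, 0.9])."""
    return 1.0 - omega * dark_channel(I / A, patch)
```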

Atmospheric particles degrade image quality through blurring, distortion, and color attenuation, and cause low visibility. The method [69] proposed an improved version of DCP to handle the artifacts of the original DCP method. It defines α as a square window of size l and updates the dark channel as follows:

$$I_{\left(x-\lfloor l/2\rfloor \ldots x+\lfloor l/2\rfloor,\; y-\lfloor l/2\rfloor \ldots y+\lfloor l/2\rfloor\right)}^{dark} = \max\left(\alpha_{(1\ldots l,\,1\ldots l)},\; I_{\left(x-\lfloor l/2\rfloor \ldots x+\lfloor l/2\rfloor,\; y-\lfloor l/2\rfloor \ldots y+\lfloor l/2\rfloor\right)}^{dark}\right)$$
(12)

where α is a square window of size l, calculated as follows:

$$\alpha = \mathrm{ones}(l,l)\cdot\mathop{\min}\limits_{z \in \Omega(x,y)}\left(\mathop{\min}\limits_{c \in \{R,G,B\}} I^{c}(z)\right)$$
(13)

This method manages to remove the artifacts, but it is not comparable to the original DCP method in quantitative evaluation.

Chen et al. [51] proposed a DCP based method for suppressing artifacts and noise using gradient residual minimization. However, due to the ambiguity between artifacts and objects, it is unable to increase the contrast of objects located at a far distance, and it also slightly blurs the details.

In summary, many researchers have addressed the problems of DCP and presented corresponding solutions. For example, the method [31] proposed an alternative for fast computation of the transmission map using morphological reconstruction. Since the performance of DCP is poor in sky regions, the method [46] proposed a solution using quadtree decomposition and a region-wise transmission map. The method [70] removes the color distortion problem for bright white objects using superpixels, and the method [71] removes DCP's halo artifact problem using energy minimization.

4.7 Airlight Based Methods

Most existing dehazing methods focus on estimating the transmission only and ignore the contribution of the airlight to the dehazing process. Such methods produce over-smoothed images without fine details; two factors, wrong estimation of the airlight and neglect of multiple scattering, contribute to this problem. Besides, an inaccurate airlight is also responsible for color distortion in the dehazed image. Therefore, some recent works on airlight estimation have been reported in the literature (Table 8).

Table 8 Comparison of existing airlight based methods

In short, the estimation of the airlight is as important as the estimation of the transmission: inaccurate estimation of the atmospheric light may make the haze-free image look unrealistic and cause color distortion in the dehazed image.

4.8 Hardware Implementation Based Methods

In recent years, significant progress has been made toward real-time dehazing applications. Real-time dehazing is in high demand in smart transportation systems and advanced driver assistance systems (ADAS). These applications demand a high frame rate, low-cost hardware, and low power consumption, and to date methods that fulfill these requirements are very rare. Image dehazing consists of many steps, estimation of the transmission and airlight, refinement of the transmission, recovery of the haze-free image, and an optional post-processing operation on the haze-free image, which together lead to high computational complexity. Many hardware platforms have been used, such as the Cortex-A8 processor, field-programmable gate arrays (FPGA), TSMC 0.13-μm and 0.18-μm processes, DSP processors, graphics processing units (GPU), and application-specific integrated circuits (ASIC). Dehazing methods therefore require hardware implementations for resource-constrained embedded systems to meet the real-time challenge. This section discusses the state-of-the-art methods from the perspective of hardware architecture (Table 9).

Table 9 Comparison of existing hardware implementation-based methods

Shiau et al. [72] proposed an extremum-approximation method to estimate the atmospheric light; it uses a 3×3 minimum filter to obtain the dark channel and contour-preserving estimation to calculate the transmission. The method is implemented on an 11-stage pipelined architecture for real-time applications. The architecture is divided into four modules: register bank, atmospheric light estimation, transmission estimation, and scene recovery, as shown in Fig. 14. It processes one pixel per clock cycle and achieves 200 MHz with 12,816 gate counts in TSMC 0.13-μm technology, with a power consumption of 11.9 mW.

Fig. 14 General framework of the hardware-based implementation [72]

The register bank module provides the 9 pixel values of the current 3×3 window as input to the atmospheric light estimation module, with line buffers storing the pixel values of 2 rows of the input hazy image. Because atmospheric light estimation (ALE) and transmission estimation (TE) are independent, clock gating can switch between them to save power.

4.9 Supervised Learning/Machine Learning Based Methods

Despite the numerous methods proposed in the literature, they are restricted to hand-crafted features, and effective, reliable restoration of a hazy image is still an open challenge. The accuracy of restoration-based methods depends on the validity of the prior; when a prior fails, various issues may arise, such as residual haze or an unrealistic dehazed image. Therefore, effort has been made toward developing machine learning methods for reliable estimation of the transmission for restoring a haze-free image. However, these techniques require a vast amount of hazy images and their ground truth counterparts, which are not available; for training, a large amount of synthetic data is generated using Eq. 1, which limits performance when the models are tested on natural or realistic hazy images. For ease of understanding, machine learning methods are further categorized into traditional (simple) learning and deep learning. This section focuses on the traditional learning techniques, which use linear and non-linear regression, support vector regression, linear models, radial basis functions, conditional random fields, etc. (Table 10).

Table 10 Comparison of existing traditional machine learning based methods

4.10 Deep Learning Based Methods

Recently, deep learning has attracted researchers and been successfully applied to dehazing. These techniques not only remove the haze from an image but also produce a fast, high-quality dehazed result. Two types of deep learning methods exist in the literature: those that utilize the physical model [73,74,75] and those that do not [76,77,78,79,80]. Furthermore, some techniques [73,74,75, 77, 79] require pairs of hazy images and their corresponding GT images to train the model, while others do not need such pairs [76, 80, 81]. Several deep learning based techniques have been reported, including the multi-scale convolutional neural network (MSCNN) [73], DehazeNet [74], the All-in-One Dehazing Network (AOD-Net) [75], Cycle-Dehaze [76], the Gated Fusion Network [77], the Generic Model-Agnostic network (GMAN) [78], the back-projected pyramid network [79], and Double-DIP [80] (Table 11).

Table 11 Comparison of existing deep learning-based methods

The authors of [82] proposed a variational and deep CNN based dehazing method that estimates the transmission, the airlight, and the dehazed image simultaneously. The deep CNN is employed to learn haze-relevant priors (fidelity terms and prior terms), and an iterative gradient-descent optimization is used to solve the variational model.

The method [83] proposed a GAN based approach that jointly learns the transmission and the haze-free image using perceptual and Euclidean-distance losses. In the first step, the transmission is estimated from the hazy image and combined with high-dimensional features. Afterward, both the features and the transmission are fed to a guided dehazing module to recover the haze-free image. This approach is shown in Fig. 15.

Fig. 15 A framework of the GAN based image dehazing method [83]

Traditional methods used hand-crafted features such as contrast maximization and the dark channel. The method [84] used an encoder-decoder structure called the gated context aggregation network (GCANet) to directly recover a haze-free image. The architecture utilizes smoothed dilated convolutions to avoid artifacts, and a subnetwork is proposed to fuse features at different levels.

Zhang et al. [85] presented a multi-scale dehazing network called the perceptual pyramid deep network. This encoder-decoder method directly learns the mapping between a hazy and a clear image without estimating the transmission map. The encoder is constructed from dense and residual blocks, while the decoder consists of a dense residual block with a pyramid pooling module to retain the contextual information of the scene, as shown in Fig. 16. The network is optimized with mean squared error and perceptual losses.

Fig. 16 Encoder-decoder framework for image dehazing [85]

Qin et al. [86] proposed FFA-Net (feature fusion attention network) to obtain a haze-free image. This method consists of three components: a feature attention module (which combines channel attention and pixel attention and focuses on thick haze removal), local residual learning (which deals with thin haze), and feature fusion attention (which adaptively learns the weights from the feature attention module). As shown in Fig. 17, the hazy image is fed to a shallow feature extraction module, then into N block structures with skip connections, whose outputs are fused in the feature fusion module. Finally, global residual learning is used to restore the haze-free image.
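
A minimal PyTorch sketch of the channel-plus-pixel attention idea described above; the layer widths and reduction ratio are illustrative assumptions rather than the exact FFA-Net configuration.

```python
import torch.nn as nn

class FeatureAttention(nn.Module):
    """Channel attention followed by pixel attention over a feature map."""
    def __init__(self, channels=64, reduction=8):
        super().__init__()
        self.ca = nn.Sequential(               # channel attention branch
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid())
        self.pa = nn.Sequential(               # pixel attention branch
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, 1, 1), nn.Sigmoid())

    def forward(self, x):
        x = x * self.ca(x)      # reweight channels: haze is uneven across channels
        return x * self.pa(x)   # reweight pixels: haze is uneven across space
```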

Fig. 17 Feature fusion attention network [86]

Prior based methods estimate the transmission on the basis of haze-relevant priors; as a result, the dehazed image may suffer from darkened or brightened artifacts.

Recently, end-to-end CNN based deep learning methods have shown great potential in image dehazing, though they fail to handle non-homogeneous haze. In addition, the popular multi-scale approaches address various dehazing issues, namely color distortion and artifacts, and some can also handle dense haze, but they are not computationally or memory efficient. Deep learning methods produce visually pleasing results for most hazy images; however, their performance relies heavily on the number of training samples and the quality of those sample images.

4.11 Miscellaneous Category

In this section, we present the miscellaneous category of dehazing methods, which includes semi-supervised, unsupervised, and ensemble networks. In semi-supervised learning, both supervised and unsupervised approaches are utilized in a deep CNN. For example, in [87], supervised learning is performed on synthetic images using supervised losses (mean squared, adversarial, and perceptual) between the clean and hazy images, while unsupervised learning is exploited on real images using the DCP and a gradient prior.

Unsupervised learning does not require hazy/haze-free image pairs for training the deep neural network, avoiding the need for the large-scale synthetic datasets that supervised models require. Recent learning-based methods use deep models to establish the relationship between hazy and clear images; however, since it is difficult to collect a vast amount of paired hazy and clear images, these models are trained on synthetic images generated from indoor images and their corresponding depth maps, and their performance degrades on outdoor hazy images. Some research works instead use unsupervised learning, which does not require hazy images and corresponding GT images during the training phase [88]: a single captured hazy image is used to learn and infer the haze-free image.

Another interesting category of image dehazing methods is the ensemble, where multiple deep CNNs are exploited. For example, in method [89], multiple neural networks were utilized to estimate the transmission in order to solve the problem of overfitting. Yu et al. [90] proposed three ensemble models: EDN-AT, EDN-EDU, and EDN-3J. One of them, EDN-EDU, is a sequential hierarchical ensemble of two different dehazing networks (an encoder-decoder and a U-net). These ensemble networks can remove non-homogeneous haze (Table 12).

Table 12 Comparison of miscellaneous category

The atmospheric model assumes a global airlight and scattering coefficient and therefore introduces unrealistic color distortions in dehazed images. The method [91] proposed a color-constrained dehazing model to produce a realistic haze-free image. It solves dehazing as an optimization problem whose cost function considers color, local smoothness of the transmission, and the airlight. Moreover, the method can be developed into a semi-supervised dehazing model: it is realized as three networks, trained on synthetic datasets, for estimating the airlight, the transmission, and the haze-free image. The proposed loss function accounts for the reconstruction loss of the hazy image, the reconstruction loss of the haze-free image, and the smoothness losses of the airlight and transmission map. Golts et al. [92] proposed a deep energy method in which an unsupervised energy function replaces the supervised loss; the deep neural network trains on real-world inputs without manually annotated labels. The method is applied to three tasks: single image dehazing, image matting, and seeded segmentation, with experiments performed on the RESIDE dataset.

Li et al. [93] proposed an unsupervised and untrained neural network for image dehazing called You Only Look Yourself (YOLY). It utilizes three subnetworks to decompose the hazy image into three latent layers: the haze-free layer, the transmission layer, and the airlight layer.

Figure 18 shows how the input hazy image x is decomposed into three layers using three joint subnetworks: x is fed simultaneously into a haze-free image estimation network (J-Net), a transmission network (T-Net), and an airlight network (A-Net). The hazy image is then reconstructed through the atmospheric scattering model, so the networks can be learned in an unsupervised manner and optimized by the loss function. For the J-Net, the loss includes a term that minimizes the difference between brightness and saturation.
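
A schematic PyTorch view of this self-supervised reconstruction, assuming j_net, t_net, and a_net stand for the three subnetworks (the full YOLY objective includes further terms beyond the reconstruction error shown here):

```python
import torch

def reconstruction_loss(x, j_net, t_net, a_net):
    """Reassemble the hazy input via the scattering model and compare;
    no ground-truth haze-free image is required."""
    J, t, A = j_net(x), t_net(x), a_net(x)   # latent haze-free, transmission, airlight
    x_rec = J * t + A * (1.0 - t)            # Eq. (1) applied to the estimates
    return torch.mean((x_rec - x) ** 2)      # self-supervised reconstruction error
```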

Fig. 18 General framework of unsupervised image dehazing [93]

4.12 Non-Homogeneous Haze

Although deep learning based methods have been successfully applied to image dehazing, one of the most challenging remaining problems is removing non-homogeneous haze. Most methods work effectively in the presence of homogeneous haze; however, in real scenarios haze is not homogeneous, i.e., not evenly distributed across the image. A dehazing method is required that enhances visibility without color distortion under non-uniform airlight (Table 13).

Table 13 Comparison of Non-homogeneous haze removal

Traditional methods, which either directly recover the haze-free image J with image enhancement or fusion based approaches, or estimate the transmission map and airlight with restoration-based models, fail in the case of non-homogeneous haze, where haze is unevenly distributed across the image, i.e., some parts are covered with denser haze and others with thin haze. The method [94] takes advantage of both approaches by estimating a weight map w that combines a directly estimated J with the J obtained from the physical model. The architecture uses one encoder and four decoders to estimate the dehazing parameters J, A, t, and w, as shown in Fig. 19. Channel attention is added to generate distinct feature maps for these decoders, and a dilation inception module is proposed to fill in missing information using non-local features.

Fig. 19 U-net structure for non-homogeneous haze removal [94]

Wu et al. [95] proposed a knowledge transfer dehazing network (KTDN) consisting of two networks, a teacher network and a dehazing network, as shown in Fig. 20. The teacher network learns knowledge about clear images and transfers it to the dehazing network. Furthermore, a feature attention module comprising channel attention and pixel attention is employed to extract important image details, and an enhancing module refines the texture details.

Fig. 20 The dual network (knowledge transfer dehazing network) for non-homogeneous haze removal [95]

5 Datasets Used for Image Dehazing

At the beginning of this field, very limited datasets were available and their sizes were very small. Researchers used only a few images to validate the performance of their proposed haze removal algorithms, often downloading hazy images from the Internet. The drawback of this approach is that such images have no ground truth counterparts. The lack of ground truth images poses a great challenge for researchers in evaluating their methods qualitatively and quantitatively. Various blind dehazing metrics were therefore introduced, but due to the lack of haze-free references these metrics were not accepted by the global community as conclusive.

Nowadays, two types of datasets are used in this field: natural hazy images without reference images, known as real images, and synthetic hazy images accompanied by depth or ground truth images. The assessment methods also differ between the two types, as discussed in the next section. We discuss all the datasets used in this field in terms of various parameters, namely the hazy image generation process, the number of images, and the types of hazy images. The performance of different dehazing methods on these datasets is explained in the experiments and results section.

5.1 Frida Dataset [96]

The foggy road image database (FRIDA) consists of 90 synthetic images of 18 urban road scenes; FRIDA2 comprises 330 synthetic images of 66 diverse road scenes. Each fog-free image has 4 foggy counterparts and a depth map, as shown in Fig. 21. The dataset considers four types of fog: uniform fog, heterogeneous fog, cloudy fog, and cloudy heterogeneous fog. Uniform fog is synthesized according to the physical model, and Perlin noise between 0 and 1 is added to simulate heterogeneous fog. This dataset is helpful for improving camera-based driver assistance systems, whose objective is to provide a clearer view of the road in the presence of fog to minimize accidents.

Fig. 21
figure 21

Images a without fog, b with uniform fog, c with inhomogeneous fog, d with fog and clouds, e with clouds and inhomogeneous fog, f depth map

5.2 Fattal’s Dataset [97]

This is one of the most popular datasets available to the research community for the assessment of dehazing capability. It provides 12 synthetic hazy images along with 31 realistic hazy images. The dataset contains various benchmark hazy images covering several challenges: night-time haze, heavily dense haze, white objects, depth discontinuities, different illumination conditions, sky regions, etc. Some sample images from this dataset are shown in Fig. 22a.

Fig. 22
figure 22

Sample images of datasets a [97], b [98], c [99]

5.3 Waterloo IVC [98]

The dataset consists of 25 realistic hazy images of diverse outdoor and indoor scenes. There are 22 outdoor real-world hazy images captured under different haze concentrations, while 3 indoor images are simulated using the physical model. This dataset is widely used in single image dehazing to evaluate performance. Some sample images from this dataset are shown in Fig. 22b.

5.4 500 Foggy Images [99]

The dataset consists of 500 natural foggy images, used in many research papers for evaluation. These images have different sizes, fog densities ranging from light to dense, and diverse contents. Some sample images from this dataset are shown in Fig. 22c.

5.5 D-Hazy [100]

This dataset contains 1400+ pairs of synthetic hazy and haze-free images of indoor scenes. It is generated from the Middlebury and NYU Depth datasets, which provide the corresponding depth maps. For each image, the transmission map is computed from the depth, the atmospheric light and the scattering coefficient. Atmospheric light is assumed to be pure white [101] and the scattering coefficient is set to 1 by default. Some sample images from this dataset are shown in Fig. 23a.

Fig. 23
figure 23

Haze images from datasets, a [100], b varying visibility scenes from foggy Cityscapes [55]

5.6 Semantic Understanding of Foggy Scenes [102]

Sakaridis et al. [102] presented two distinct datasets: Foggy Cityscapes and Foggy Driving. The Foggy Cityscapes dataset was derived from the Cityscapes dataset and contains outdoor synthetic hazy images with different scattering coefficients; it preserves the semantic annotations of the original images. Foggy Driving comprises 101 real-world foggy road scenes with annotations and a maximum resolution of 960 × 1280 pixels, as shown in Fig. 23b.

5.7 Haze RD Dataset [103]

This dataset contains 15 outdoor scenes with realistic hazy conditions. Each scene is simulated under five different weather conditions, ranging from thin to dense haze with visual ranges from 50 to 1000 m, as shown in Fig. 24. The images are of high resolution and conform to the scattering theory of the physical model. A depth map of each scene is estimated by fusing structure from motion and lidar.

Fig. 24
figure 24

HazeRD samples from left to right, a Haze-free image, b depth map, simulated hazy images with the visual range of c 50 m, d 100 m, e 200 m, and f 500 m, respectively

5.8 I-Haze Dataset [104]

The dataset contains 35 pairs of hazy and corresponding haze-free indoor images. The real haze appearance is produced by a professional haze machine and captured in a controlled environment under the same illumination for both hazy and haze-free images. Some sample images along with their GT images are shown in Fig. 25a.

Fig. 25
figure 25

Sample hazy images along with GT images from a [104], b [105], c [106]

5.9 O-Haze [105]

This is an outdoor scene dataset comprising pairs of real hazy and corresponding haze-free images. O-Haze contains 45 different outdoor scenes in which real haze is produced by a professional haze machine that simulates a hazy environment. These scenes were captured on cloudy days, in the morning or at sunset, when the wind speed was below 3 km/h. Some sample images along with GT images are shown in Fig. 25b.

5.10 Dense-Haze [106]

Ancuti et al. [106] proposed the Dense-Haze dataset containing real-world hazy images characterized by dense and homogeneous haze. It consists of 33 pairs of real hazy and corresponding haze-free images. Some sample images along with GT images are shown in Fig. 25c.

5.11 RESIDE [107]

This is a recent, large-scale dataset of hazy images containing both synthetic and realistic hazy images, called REalistic Single Image DEhazing (RESIDE). It is available as RESIDE standard and RESIDE-β. The standard RESIDE contains three subsets: the indoor training set (ITS), the synthetic objective testing set (SOTS) and the hybrid subjective testing set (HSTS). The ITS contains 13,990 synthetic hazy images generated from 1399 haze-free images of the NYU2 and Middlebury stereo indoor datasets; for each haze-free image, 10 synthetic hazy images are generated, with the atmospheric light drawn uniformly at random from [0.7, 1.0] and the scattering coefficient drawn uniformly at random from [0.6, 1.8]. The testing sets are designed for evaluation purposes: SOTS contains 500 indoor images with white scenes and dense haze synthesized from NYU2 that are not used in the training set, and HSTS selects 10 synthetic outdoor hazy images together with 10 realistic hazy images. Besides, RESIDE-β provides two more subsets: the outdoor training set (OTS) with 72,135 hazy images and the real-world task-driven testing set (RTTS) with 4322 images.
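To make the ITS generation procedure concrete, the following numpy sketch applies the physical model with the parameter ranges quoted above; the function name and the assumption of a [0, 1]-normalized clear image with a per-pixel depth map are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def synthesize_hazy(clear, depth):
    # clear: H x W x 3 haze-free image in [0, 1]; depth: H x W depth map.
    A = rng.uniform(0.7, 1.0)            # atmospheric light drawn from [0.7, 1.0]
    beta = rng.uniform(0.6, 1.8)         # scattering coefficient from [0.6, 1.8]
    t = np.exp(-beta * depth)            # exponential transmission model
    t3 = t[..., None]                    # broadcast over the color channels
    return clear * t3 + A * (1.0 - t3)   # physical model of Eq. (1)
```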

This dataset provided a new dimension to single image dehazing, allowing evaluation of various dehazing methods on a large scale in terms of full-reference metrics, no-reference metrics and human subjective ratings in visual analysis. Sample images from each part of the RESIDE dataset are shown in Fig. 26.

Fig. 26
figure 26

Sample images from different category of RESIDE dataset [107] a ITS, b SOTS, c HSTS, d OTS, e RTTS

5.12 NH-Haze [108]

In the previous datasets, haze is characterized as homogeneous over the entire image. Since haze is not distributed uniformly across real scenes, Ancuti et al. [108] proposed a non-homogeneous realistic dataset. It contains 55 real outdoor hazy images along with their corresponding haze-free images, with haze generated by a professional haze machine simulating real conditions, as shown in Fig. 27.

Fig. 27
figure 27

Non-homogenous hazy image and GT image from [108], a hazy images, b GT images

Table 14 summarizes the different datasets used in the state-of-the-art methods. Two types of datasets are available for evaluation: real hazy images and synthetic hazy images. For real images, no GT image or depth map is available. Many works have been reported on these datasets. After analyzing the datasets used by recent methods, we found that Fattal's dataset [97] and RESIDE [107] are the first choices for real and synthetic images, respectively.

Table 14 Standard datasets description

6 Evaluation Metrics

Several evaluation metrics are used for testing the capability of dehazing algorithms (DHA). The images used in assessment can be divided into two categories: those with a ground truth image available and those without. Accordingly, two categories of quantitative metrics are introduced depending on the availability of a reference: full-reference metrics and no-reference metrics, as shown in Fig. 28. Since it is difficult to obtain a haze-free image of the same scene, no-reference metrics are often used for the assessment of DHA.

Fig. 28
figure 28

Assessment criteria of real and synthetic hazy images [9]

During dehazing, various issues may remain unresolved, including residual haze, structure damage, color distortions, over-enhancement, halo artifacts, noise amplification, blurring effects, loss of edge preservation, etc. To measure these distortions, many dehazing quality assessment methods have been introduced in the literature. In this section, we explore these metrics.

6.1 No-Reference Metrics

A good DHA must ensure the following qualities in the dehazed image: improved visibility, absence of artifacts and over-enhancement, contrast enhancement, structure preservation, and edge preservation. By considering these qualities, many dehazing metrics have been introduced. Unfortunately, no single metric can test all the dehazing capabilities. In this section, we discuss some well-known dehazing metrics introduced in recent years.

6.1.1 Blind Contrast Enhancement Assessment [110]

The contrast of an image captured under adverse weather conditions is reduced significantly due to the scattering of particles. This method is widely accepted in many dehazing works where a reference image is not available. It is based on assessing contrast in terms of visible edges before and after restoration, using three descriptors: the rate of new visible edges (e), the gain of visibility level (r) and the saturated pixel ratio (σ). The value of the e metric quantifies the ability of the dehazing method to produce new visible edges in the restored image that are not seen in the original hazy image. It is calculated as follows:

$$e = \frac{{n_{hf} - n_{h} }}{{n_{h} }}$$
(14)

where \(n_{h}\) and \(n_{hf}\) represent the cardinality of visible edges in hazy and haze-free images, respectively.

The second metric r is the ratio of the visibility level of objects in the restored image and the visibility level of objects in a hazy image. This metric considers visible and invisible edges both in the hazy image as follows:

$$\overline{r} = \exp \left( {\frac{1}{{n_{hf} }}\sum\limits_{{p_{i} \in \psi_{hf} }} {\log r_{i} } } \right)$$
(15)

where \(\psi_{hf}\) represents the set of visible edges in the haze-free image and \(r_{i}\) is the ratio of the gradient at \(p_{i}\) in the restored image to that at the corresponding pixel in the hazy image.

The third metric is the saturated pixel ratio. It accounts for pixels that become saturated (black or white) after the dehazing process:

$$\sigma = \frac{{n_{s} }}{{\dim_{x} \times \dim_{y} }}$$
(16)

where ns is the number of saturated pixels and dimx and dimy represent the width and height of the image, respectively.

High values of e and r indicate good quality of the dehazed image in terms of edge preservation and contrast enhancement, while a small value of σ indicates that the dehazed image has fewer saturated pixels or color distortions than the hazy image.
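A minimal numpy sketch of the e and σ descriptors follows; visible edges are approximated here by simple gradient-magnitude thresholding, whereas [110] uses a more elaborate visibility-level criterion, so the threshold and helper names are illustrative assumptions.

```python
import numpy as np

def contrast_descriptors(hazy, dehazed, grad_thresh=0.1, sat_lo=1, sat_hi=254):
    """Sketch of the e (Eq. 14) and sigma (Eq. 16) descriptors of [110].

    hazy, dehazed: uint8 grayscale images of the same size.
    """
    def visible_edge_count(img):
        # Proxy for "visible edges": pixels whose gradient magnitude is large.
        gy, gx = np.gradient(img.astype(np.float64) / 255.0)
        return np.count_nonzero(np.hypot(gx, gy) > grad_thresh)

    n_h = visible_edge_count(hazy)
    n_hf = visible_edge_count(dehazed)
    e = (n_hf - n_h) / max(n_h, 1)                # rate of new visible edges

    saturated = (dehazed <= sat_lo) | (dehazed >= sat_hi)
    sigma = saturated.mean()                      # saturated pixel ratio
    return e, sigma
```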

6.1.2 Non-Reference Image Quality Assessment Based on Blockiness and Luminance Change (BALC) [111]

This metric is designed to measure two distortions in an image: blocking artifacts and improper luminance change. It is a no-reference metric that obtains the quality score of a dehazed image from these distortions, which are estimated based on gradients. Usually, halo artifacts appear in the image at depth discontinuities. The method divides the image into 8 × 8 non-overlapping blocks. The blockiness of a block is measured as the average of the discontinuities along the four boundaries of the block. For luminance change or blurring effects, it calculates the average of the gradients inside the block. Finally, the two measures are combined into a single metric as follows:

$$BALC = B_{hf} \cdot L_{hf}^{ - \lambda }$$
(17)

where \(B_{hf}\) and \(L_{hf}\) denote the blockiness and luminance-change (blurring) measures of the haze-free image, and \(\lambda \ge 0\) is a parameter used to adjust the relative importance of the two distortions.

A small value of BALC indicates good quality of the haze-free image in terms of artifacts and blurring effects.
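The following numpy sketch illustrates a simplified reading of the BALC computation, with blockiness taken from jumps across 8 × 8 block borders and luminance change from in-block gradients; the exact boundary and gradient definitions of [111] may differ.

```python
import numpy as np

def balc(img, lam=1.0, b=8):
    """Simplified BALC sketch for a grayscale image (Eq. 17)."""
    h, w = img.shape
    h, w = h - h % b, w - w % b                  # crop to a multiple of block size
    img = img[:h, :w].astype(np.float64)

    # Blockiness: mean absolute jump across vertical/horizontal block borders.
    v_jumps = np.abs(img[:, b::b] - img[:, b - 1:-1:b]).mean()
    h_jumps = np.abs(img[b::b, :] - img[b - 1:-1:b, :]).mean()
    B = 0.5 * (v_jumps + h_jumps)

    # Luminance change: mean absolute gradient over the image interior.
    gy, gx = np.gradient(img)
    L = np.abs(gx).mean() + np.abs(gy).mean()

    return B * L ** (-lam)                       # lam >= 0 weighs the two terms
```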

6.1.3 Blur Metric [112]

The dehazing process of some methods introduces a blurring effect in the haze-free image. To check the quality of the dehazed image in terms of blur perception, many recent works have used this metric.

The metric applies a low-pass filter to the dehazed image to obtain a blurred version of it. The comparison of intensity variations between the two images (the dehazed image and its blurred version) indicates blur annoyance: a high variation in intensity values between them signifies that the dehazed image is sharp, whereas a small difference indicates that it is blurred.

The blur metric provides a score ranging from 0 to 1, representing the best and the worst quality, respectively, in terms of blur perception.
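A hedged sketch of this idea, comparing horizontal neighbor variations before and after low-pass filtering, is shown below; the filter size and normalization are illustrative simplifications of [112].

```python
import numpy as np
from scipy.ndimage import uniform_filter

def blur_metric(img):
    """Sketch of a no-reference blur score in the spirit of [112]: 0 = sharp, 1 = blurred."""
    img = img.astype(np.float64)
    blurred = uniform_filter(img, size=9)            # strong low-pass version

    # Horizontal neighbor variations in the original and blurred images.
    d_orig = np.abs(np.diff(img, axis=1))
    d_blur = np.abs(np.diff(blurred, axis=1))

    # Variation destroyed by blurring; an already-blurred image loses little.
    lost = np.maximum(0.0, d_orig - d_blur)
    return 1.0 - lost.sum() / max(d_orig.sum(), 1e-12)
```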

6.1.4 Blind Image Quality Assessment (BLIINDS-II) [113]

BLIINDS-II is a no-reference image quality assessment metric based on a probabilistic model that predicts the quality score of an image. It uses a natural scene statistics (NSS) model that relies on discrete cosine transform (DCT) coefficients; the model is built from undistorted natural scenes and requires a small number of training examples. The estimation of the predicted score consists of four stages. In the first stage, the image is divided into n × n blocks and the DCT coefficients are computed for each block. In the second stage, a generalized Gaussian density model is fitted to each block, providing the model parameters. In the third stage, four features are extracted from the model parameters: the shape parameter, the coefficient of variation, energy subband ratio measures and orientation features. Finally, the fourth stage consists of a Bayesian model that predicts the perceptual quality of the dehazed image. The steps for computing this metric are shown in Fig. 29.

Fig. 29
figure 29

Steps of computing the BLIINDS-II [113] metric

It considers various types of distortions, such as artifacts, white noise, Gaussian blur, fast-fading channel distortions, etc., in the estimation of a quality score. The values of this metric lie in the range [0, 100], with higher values indicating poorer quality or more distortion. During dehazing, periodic patterns (checkerboard and blocking artifacts) are often generated in the haze-free image; this metric can be used to identify such distortions.

6.1.5 Blind/No-Reference Image Spatial Quality Evaluator (BRISQUE) [114]

Mittal et al. [114] proposed the blind/no-reference image spatial quality evaluator (BRISQUE), which measures the loss of naturalness of an image without calculating distortion-specific features such as blocking, blur or ringing artifacts. It computes locally normalized luminance coefficients, observing that these coefficients follow a Gaussian distribution for natural scenes. From them, 36 natural scene statistics features are extracted at 2 scales (18 features per scale) and used to identify all types of distortions. Finally, a regression module, support vector regression, is used to calculate the quality score of an image. The model was tested on the LIVE IQA database, which consists of 29 reference images and 779 distorted images spanning different types of distortions.
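The MSCN (mean subtracted contrast normalized) coefficients at the core of BRISQUE are straightforward to compute; a sketch, assuming a grayscale float image and a Gaussian weighting window, is:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def mscn(img, sigma=7/6, C=1.0):
    """MSCN coefficients on which BRISQUE's NSS features are built."""
    img = img.astype(np.float64)
    mu = gaussian_filter(img, sigma)                    # local weighted mean
    var = gaussian_filter(img * img, sigma) - mu * mu   # local weighted variance
    sd = np.sqrt(np.maximum(var, 0.0))                  # local standard deviation
    # Approximately unit Gaussian for natural, undistorted scenes.
    return (img - mu) / (sd + C)
```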

6.1.6 Fog Aware Density Evaluator (FADE) [99]

This metric is specially designed for the evaluation of DHA, judging the visibility of the restored image. The fog-aware density evaluator (FADE) does not rely on approaches used previously, such as transmission estimation, salient regions or human opinion scores. Instead, it judges visibility based on deviations of spatial-domain statistics between hazy and haze-free images. A set of fog-aware statistical features, namely MSCN (mean subtracted contrast normalized) coefficients, sharpness, contrast energy, colorfulness, color saturation, image entropy and the dark channel prior, is extracted; 500 foggy and 500 fog-free images were used to build the corpus models. A test foggy image is divided into p × p patches and average values of the statistical features are extracted per patch. A multivariate Gaussian (MVG) probability density in d dimensions is fitted to the features of the test foggy image and to those of the 500 natural fog-free images, as follows:

$$MVG(f) = \frac{1}{{(2\pi )^{d/2} \left| \sigma \right|^{1/2} }}\exp \left( { - \frac{1}{2}(f - \mu )^{t} \,\sigma^{ - 1} (f - \mu )} \right)$$
(18)

where f represents the fog-aware features while \(\mu\) and \(\sigma\) denote the mean and covariance, respectively. In the next step, a Mahalanobis-like distance is computed between the MVG fit to the features of the test foggy image and the MVG model of the 500 fog-free images, as follows:

$$D_{f} (\mu_{1} ,\mu_{2} ,\sigma_{1} ,\sigma_{2} ) = \sqrt {(\mu_{1} - \mu_{2} )^{t} \left( {\frac{{\sigma_{1} + \sigma_{2} }}{2}} \right)^{ - 1} (\mu_{1} - \mu_{2} )}$$
(19)

where \(\mu_{1} ,\sigma_{1}\) and \(\mu_{2} ,\sigma_{2}\) are the means and covariances of the MVG model of the 500 fog-free images and of the test foggy image, respectively. Similarly, \(D_{ff}\) is calculated between the MVG of the 500 foggy images and the test image. Finally, the fog density of the hazy image is calculated as follows:

$$D = \frac{{D_{f} }}{{D_{ff} + 1}}$$
(20)

The constant 1 is added to the denominator to prevent division by zero. A smaller value of D represents lower fog density, i.e., the DHA improves the visibility of the hazy image to a great extent.

Thus, a smaller FADE value indicates less residual haze in the dehazed result: residual haze raises the FADE score, while artifacts and noise may spuriously lower it. Moreover, bright scenes may be mistaken for residual haze by FADE and increase its value.
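The distance computations of Eqs. (19) and (20) are easy to reproduce once the MVG parameters are fitted; a numpy sketch, with the corpus models assumed to be precomputed, is:

```python
import numpy as np

def mvg_distance(mu1, cov1, mu2, cov2):
    """Eq. (19): Mahalanobis-like distance between two MVG models."""
    diff = mu1 - mu2
    pooled = 0.5 * (cov1 + cov2)                      # averaged covariance
    return np.sqrt(diff @ np.linalg.inv(pooled) @ diff)

def fog_density(mu_test, cov_test, mu_ff, cov_ff, mu_f, cov_f):
    """Eq. (20): ratio of distances to the fog-free and foggy corpus models."""
    d_f = mvg_distance(mu_ff, cov_ff, mu_test, cov_test)   # to fog-free corpus
    d_ff = mvg_distance(mu_f, cov_f, mu_test, cov_test)    # to foggy corpus
    return d_f / (d_ff + 1.0)          # +1 guards against division by zero
```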

6.1.7 Natural Image Quality Evaluator (NIQE) [115]

This is another no-reference metric used in DHA for measuring distortions introduced during the dehazing process. It provides a natural image quality evaluator based on quality-aware features of a natural scene statistics model, extracted from a corpus of undistorted natural images. The 36 features are extracted from the dehazed image (whose quality is to be analyzed) by dividing it into p × p patches and then comparing its MVG fit to the MVG model of the natural corpus.

6.1.8 Dehazing Quality (DHQ) [116]

Min et al. [116] proposed an objective measure for the quantitative evaluation of dehazed images. To assess overall dehazing quality, they first constructed a database of 1750 dehazed images generated from 250 real hazy images of different haze densities using 7 dehazing algorithms. A subjective quality evaluation was then conducted on this database. Finally, a regression module predicts the dehazing quality (DHQ) from several features extracted from a dehazed image. The overall dehazing quality is measured in three aspects: haze removal, structure preservation and over-enhancement, as shown in Fig. 30.

Fig. 30
figure 30

Quantitative evaluation to measure overall issues of dehazing in real hazy images using non-reference based metric DHQ [116]

Haze-removal features aim to provide haze-relevant descriptors that evaluate the haze-removing effect; five features are considered: pixel-wise DCP, image entropy, local variance, normalized local variance and contrast energy. Another important aspect is structure preservation, used to judge the quality of the dehazed image, since the dehazing process can introduce structure degradation or artifacts; it is accounted for by features such as variance similarity, normalized variance similarity and normalized image similarity. The third quality indicator is the identification of over-enhancement in dehazed images: during dehazing, details in low-contrast areas may be darkened, colors distorted, or structural artifacts introduced. Over-enhancement is measured in terms of low-contrast areas and blockiness.

6.2 Full-Reference Metrics

Full-reference metrics are used to evaluate a method when a GT image is available, and are therefore applicable to testing performance on synthetic images. Several such metrics, namely PSNR, SSIM, LPIPS, CIEDE 2000 and SHRQ, have been utilized in recent works. In this section, we explore these metrics.

6.2.1 Learned Perceptual Image Patch Similarity Metric (LPIPS) [117]

Pixel-wise metrics such as PSNR and SSIM can disagree with human judgment in assessing the perceptual quality of a dehazed image. Therefore, Zhang et al. [118] proposed the learned perceptual image patch similarity (LPIPS) metric, which establishes a perceptual similarity between two images that resembles human opinion. It is based on deep features obtained under well-known training paradigms (supervised, self-supervised, unsupervised, etc.). The metric can identify a wide range of distortions in an image, including photometric changes (color shift, contrast, saturation), noise (white noise, artifacts), blur and compression. Three network architectures, AlexNet, SqueezeNet and VGG, are considered for supervised training. The overall framework of this metric is shown in Fig. 31.

Fig. 31
figure 31

Deep learning framework to measure perceptual quality of the dehazed image

The diagram shows how the distance between two patches x (a patch of the GT image) and x0 (a patch of the dehazed image) is computed by a network F. Features are extracted from several layers, normalized along the channel dimension, scaled channel-wise by a vector w, and the l2 distance is computed; the result is then averaged spatially and across layers. G is a small network trained to predict the perceptual judgment h from a pair of distances d0 and d1.

A lower LPIPS score indicates a higher perceptual similarity between the two images.
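For reference, LPIPS is distributed as the lpips package for PyTorch; a minimal usage sketch, with random tensors standing in for an image pair scaled to [-1, 1], is:

```python
import torch
import lpips  # assumes the "lpips" PyPI package is installed

loss_fn = lpips.LPIPS(net='alex')          # AlexNet backbone variant
gt = torch.rand(1, 3, 256, 256) * 2 - 1    # stand-in GT patch in [-1, 1]
dehazed = torch.rand(1, 3, 256, 256) * 2 - 1

score = loss_fn(gt, dehazed).item()        # lower = perceptually closer
```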

6.2.2 Peak Signal to Noise Ratio (PSNR) [119]

Peak signal to noise ratio (PSNR) measures the degree of signal distortion between the haze-free image obtained by a DHA and the GT image. A high PSNR value signifies good quality of the dehazed image. It is calculated as:

$$PSNR = 10\log_{10} \left( {\frac{{255^{2} }}{MSE}} \right)$$
(21)

where MSE is the mean squared error between the dehazed image and the ground truth image; it should be minimized and is calculated as follows:

$$MSE = \frac{1}{M \times N}\sum\limits_{i = 1}^{M} {\sum\limits_{j = 1}^{N} {\left( {G(i,j) - I_{hf} (i,j)} \right)^{2} } }$$
(22)

where G and Ihf are the ground truth and dehazed images, respectively.

6.2.3 Structural Similarity Index Metric (SSIM) [120]

Since PSNR does not correlate well with human visual judgment, many researchers use the structural similarity index metric (SSIM), which evaluates dehazing performance in terms of contrast, luminance and structure between the ground truth and dehazed images. It is calculated as follows:

$$SSIM(r,i) = \left( {\frac{{2\mu_{r} \mu_{i} + c_{1} }}{{\mu_{r}^{2} + \mu_{i}^{2} + c_{1} }}} \right)\left( {\frac{{2\sigma_{ri} + c_{2} }}{{\sigma_{r}^{2} + \sigma_{i}^{2} + c_{2} }}} \right)$$
(23)

Here, \(\mu_{r}\) and \(\mu_{i}\) are the means of r (restored image) and i (GT image), respectively, \(\sigma_{r}^{2}\) and \(\sigma_{i}^{2}\) are their variances, and \(\sigma_{ri}\) is the cross-covariance between r and i. The stabilizing constants \(c_{1}\) and \(c_{2}\) are derived from the default parameters \(k_{1} = 0.01\) and \(k_{2} = 0.03\), scaled by the dynamic range of the pixel values.

SSIM yields a decimal score between 0 and 1, where 1 indicates that the two images are identical. SSIM is highly sensitive to variations of contrast and illumination; therefore, it can judge issues of dehazing such as incomplete haze removal or over-saturation of pixels.
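Both metrics are available off the shelf; a short sketch using scikit-image (assuming images normalized to [0, 1] and a recent library version) is:

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

gt = np.random.rand(256, 256, 3)           # stand-ins for GT and dehazed images
dehazed = np.clip(gt + 0.05 * np.random.randn(*gt.shape), 0, 1)

psnr = peak_signal_noise_ratio(gt, dehazed, data_range=1.0)      # Eq. (21)
ssim = structural_similarity(gt, dehazed, data_range=1.0,
                             channel_axis=-1)                     # Eq. (23)
print(f"PSNR = {psnr:.2f} dB, SSIM = {ssim:.4f}")
```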

6.2.4 CIEDE 2000 [121, 122]

During the dehazing process, color distortions may be introduced in the restored image that cannot be reliably evaluated by PSNR or SSIM. Therefore, researchers in this field also use the CIEDE 2000 color difference metric, which assesses dehazing in terms of color restoration in a way that is closer to human perception of color differences.

It yields values in the range [0, 100], with smaller values indicating better color preservation; values less than 1 correspond to differences imperceptible to the human eye, while a value of 100 indicates that the colors of the two images are opposite.
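A sketch of this evaluation using scikit-image, converting RGB stand-in images to the Lab color space before taking the per-pixel CIEDE 2000 difference, is:

```python
import numpy as np
from skimage.color import rgb2lab, deltaE_ciede2000

gt = np.random.rand(256, 256, 3)           # stand-ins, RGB in [0, 1]
dehazed = np.clip(gt + 0.02 * np.random.randn(*gt.shape), 0, 1)

# Per-pixel CIEDE 2000 color difference, averaged over the image.
dE = deltaE_ciede2000(rgb2lab(gt), rgb2lab(dehazed))
print(f"mean CIEDE2000 = {dE.mean():.3f}")
```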

6.2.5 Synthetic Haze Removing Quality (SHRQ) [9]

Min et al. [123] proposed a full-reference metric called synthetic haze removing quality (SHRQ) to evaluate the overall quality of a dehazed image. The proposed dehazing quality evaluator integrates several quality aspects arising during the dehazing process: structure recovery, color rendition and over-enhancement. The authors first created an SHRQ database consisting of two subsets: regular and aerial images. The regular subset consists of 45 haze-free images, while the aerial subset contains 30 high-quality aerial images. The ASM is utilized to generate the synthetic hazy images, which are then processed by eight state-of-the-art methods. The overall quality of a dehazed image is estimated as follows:

$$Q = \frac{1}{z}\sum\limits_{i,j} {S_{sim} (i,j) \cdot [C_{ren} (i,j)]^{\alpha } \cdot O}$$
(24)

where \(S_{sim}\) is the structure map, \(C_{ren}\) is the color rendition map and O represents over-enhancement in low-contrast areas. z is the total number of pixels and α is set empirically to adjust the importance of the color information.

7 Experimental Results

In this section, experimental results are presented in three ways. First, we evaluate the recent state-of-the-art methods based on dehazing assessment criteria. Second, we discuss the qualitative or visual analysis of dehazing methods. Finally, we discuss the performance of different methods quantitatively on different datasets.

7.1 Comparison of the State-of-the-art Methods based on Dehazing Assessment

This section presents the assessment criteria of different dehazing methods based on the parameter settings used during experimentation, the dataset(s) selected and the evaluation metrics used for assessment. For comparison, we collected these data from the respective manuscripts. The analysis is illustrated in Table 15.

Table 15 Comparison of experimental analysis of existing methods using different parameters

Table 15 lists the datasets and metrics used by each dehazing method. We can notice in the table that the recent state-of-the-art methods utilize a variety of dehazing metrics, including full-reference and no-reference metrics, for comparison purposes. The methods [43, 56, 124] utilize a particularly comprehensive set of metrics for evaluation. Besides, a good DHA must be tested on diverse datasets of different haze concentrations, including dense haze, non-homogeneous haze, sky regions, night-time hazy conditions, mild haze, etc.; the methods [40, 61] are tested on a larger number of datasets than the others. We can also notice that all DHA require some parameters to be adjusted adaptively or manually, irrespective of their category. A large number of parameters increases the overhead and reduces the efficiency of a method; hence, they should be minimized, as in the methods [40, 125].

7.2 Qualitative Evaluation

Figure 32 shows the visual analysis of prior-based restoration methods on two hazy images from the HSTS subset of the RESIDE dataset, along with the GT images. We can observe that none of the methods is able to preserve the color and contrast of the image: all the dehazed sky regions are darker than in the GT image. In addition to color distortions, DCP [63] also suffers from halo artifacts. However, the dehazed image by [47] has fewer color distortions than the other methods and most closely resembles the GT image.

Fig. 32
figure 32

Reside HSTS: Prior based restoration methods, a Hazy image, b DCP [63], c NLD [66], d CAP [64], e BCCR [49], f CEP [67], g LBF [47], h GT

Furthermore, Fig. 33 shows the qualitative results of different machine learning and deep learning methods on the HSTS subset, along with the GT image. The dehazed result of the Deep DCP method retains residual haze. The methods [73, 75] and [76] show color distortions, while [74] and [126] have fewer color distortions. The dehazed image obtained by [89] resembles the GT, with all details visible.

Fig. 33
figure 33

RESIDE HSTS: learning based methods, a hazy image, b AOD net [75], c DehazeNet [74], d deep DCP [88], e MSCNN [73], f PQC [126], g cycle-dehaze [76], h DFIDSE [89], i GT

Figure 34 shows a visual comparison of state-of-the-art methods on two hazy images taken from the O-Haze dataset. We can notice that NLD [66] and PDN [127] highly distort the colors of the image, while AOD-Net and DCPDN are unable to remove the haze completely. The method GFN [77] manages to remove haze with few color distortions, and the dehazed image achieved by [128] is the closest to the GT image.

Fig. 34
figure 34

Haze removal results by various methods on hazy images from O-HAZE a hazy image, b NLD [66], c AOD-net [75], d PDN [127], e GFN [77], f DCPDN [142], g DM2F-Net [128], h GT

The visual analysis in Fig. 35 reveals that removing dense haze is still a challenging task. The performance of most methods (deep learning and prior based) on this dataset is not satisfactory, as the details of the images are imperceptible under the dense haze. The earlier methods [73,74,75] and [85] are unable to remove the haze. The restoration-based methods DCP and NLD attempt to remove it at the cost of strong color distortions, and the method [84] produces dark images in which details are not visible. The methods [79] and [129] perform better than the others, except on the first image.

Fig. 35
figure 35

Qualitative comparison of results on images from the Dense-Haze dataset, a Hazy image, b DehazeNet [74], c MSCNN [73], d AOD-Net [75], e PPDNet [85], f HR-Dehazer [129], g DCP [63], h NLD [66], i BPPNet [79], j GCANet [84], k GT

Figure 36 presents non-homogeneous hazy images, which differ from the other dehazing datasets where the haze is homogeneous. The performance of most state-of-the-art methods drops significantly due to the non-homogeneous nature of the haze. Color distortions are noticed in the images dehazed by DCP, owing to the homogeneity assumption of the physical model. In addition to color distortions, the method [74] also introduces noise into the dehazed image. AOD-Net and GCANet are unable to remove the haze in dense hazy regions. DCPDN succeeds in removing the haze without color distortion, although some artifacts are observed. The method [95] generates pleasing results and is able to deal with non-homogeneous haze, even in the presence of dense haze, to some extent.

Fig. 36
figure 36

Qualitative comparison of the state-of-the-art dehazing methods on the NTIRE-2020 challenge: NH-HAZE. a Hazy image, b DCP [63], c DehazeNet [74], d AOD net [75], e GCAnet [84], f DCPDN [142], g KTDN [95], h GT

Figure 37 shows the qualitative analysis of different methods on a sample image taken from the HazeRD dataset. The results of DCP, CAP, PDN and DehazeNet suffer from color distortions, while the haze-free images obtained by DCPDN and GFN are over-brightened compared to the GT. MSCNN and NLD leave some haze in the dehazed result. The methods [130] and [75] perform satisfactorily; however, like the other methods, they are unable to restore the color of the sky regions.

Fig. 37
figure 37

Comparison with state-of-the-art methods on a hazy image from HazeRD dataset. a hazy image, b GT, c DCP [63], d CAP [64], e NLD [66], f MSCNN [73], g DehazeNet [74], h AOD-net [75], i GFN [77], j DCPDN [142], k PDN [127], l DHRNT [130]

In Fig. 38, hardware architecture-based methods are tested on three real images from Fattal's dataset. The methods [72] and [131] use the simple concept of DCP to remove the haze; therefore, their dehazed images suffer from color distortions and over-saturation of pixels. The dehazed images of the method [132] are over-brightened and also suffer from over-saturation of pixels. The method of [133] generates pleasing results; however, the visibility in long-range regions is not up to the mark.

Fig. 38
figure 38

Hardware based methods a Hazy image, b Shiau et al. [72], c Zhang et al. [131], d Shiau et al. [132], e Kumar et al. [133]

Finally, we present dehazing results on some sample images from the dataset [99] in Fig. 39; the corresponding quantitative results are given in Table 22. Here, we consider four popular categories of methods: image enhancement [22]; image fusion [13] and [62]; machine learning [134] and [64]; and restoration with priors [63, 66] and [51]. The fusion-based method [13] distorts the colors and also leaves haze in some parts of the images, whereas the other fusion method [62] better preserves the colors in nearby regions and enhances the visibility in faraway regions. The machine learning methods [134] and [64] do not distort the colors but fail to remove the haze completely. Among the restoration-with-prior methods, DCP produces more pleasing results than NLD, with fewer color distortions; the RASD method better handles artifacts but blurs the details of the dehazed images due to gradient residual minimization. The enhancement-based method [22], built on DCP, yields a better dehazed image than the restoration-based methods.

Fig. 39
figure 39

Hazy images with sky region and their dehazed images by different methods a Hazy image, b AMEF [13], c CAP [64], d NLD [66], e ESIDD [22], f DCP [63], g RASD [51], h MLP [134], i The JCDF method [62]

7.3 Quantitative Evaluation

This section provides a comparison of recent and popular dehazing methods on different standard datasets. Tables 16, 17, 18, 19, 20 and 21 provide the quantitative evaluation on HazeRD, RESIDE, I-Haze, O-Haze, Dense-Haze and D-Hazy, respectively. Since all these datasets have GT images, their assessment is done using the full-reference metrics PSNR and SSIM. Moreover, Table 22 provides the quantitative analysis of the real images used in Fig. 39; GT images are not available for them, so the evaluation is done with a variety of no-reference metrics, including FADE, Blur, BALC, \(\sigma\), e, r, NIQE, BRISQUE, BLIINDS-II and BIQI.

Table 16 PSNR and SSIM comparison of existing techniques on HazeRD dataset
Table 17 PSNR and SSIM comparison of existing techniques on RESIDE dataset
Table 18 PSNR and SSIM comparison of existing techniques on I-Haze dataset
Table 19 PSNR and SSIM comparison of existing techniques on O-Haze dataset
Table 20 PSNR and SSIM comparison of existing techniques on Dense-Haze dataset
Table 21 PSNR and SSIM comparison of existing techniques on D-Hazy dataset
Table 22 Quantitative Comparison of different methods using well known no reference quality assessment metrics

We have opted for different methods in the comparison tables because we consider the top performers on each dataset. We conclude from the quantitative analysis that a method ranked first on one dataset is not necessarily the best on the others. The haze density also differs from one dataset to another; keeping this in mind, the performance of the methods varies according to the level of haze.

Table 16 illustrates the performance of the most popular and recent dehazing methods on the HazeRD dataset, which contains synthetic images of different haze concentrations. For the assessment of dehazing quality, we use two metrics: PSNR and SSIM. Most of the methods have low PSNR and SSIM values, indicating that they are not able to remove the haze completely or that they introduce high color distortion. The higher PSNR and SSIM values of the LDP method [82] indicate that its dehazed images are visually closest to the GT images, and it is ranked first among all the compared methods.

Table 17 illustrates the performance of recent dehazing methods on the most popular RESIDE dataset, which contains both real and synthetic images with mild haze. The table presents the results on the SOTS indoor and SOTS outdoor parts of the RESIDE dataset, evaluated with PSNR and SSIM. We can observe that DCP suffers from color distortions due to the invalidity of the prior for bright white objects and high-depth regions. AOD-Net retains residual haze and its dehazed images have low brightness, while the dehazed images of DehazeNet are over-brightened compared to the GT. GCANet has higher PSNR and SSIM values, indicating better dehazed images than the other methods except FFA-Net [86] and DM2F-Net [128]; however, its performance degrades at high-frequency components such as edges or blue sky. The dehazed results of [86] and [129] are better than the state of the art by a large margin in PSNR and SSIM. The GMAN method [78] performs well on SOTS outdoor but only average on SOTS indoor. The performance of DM2F-Net [128] is also noticeable on SOTS indoor, where it is in second position after FFA-Net [86]. The dehazing capability of the other methods is not satisfactory. FFA-Net shows good dehazing capability on both subsets and is ranked first; it deals with many problems, including sky regions, darkening of colors, color fidelity and image details.

Tables 18 and 19 illustrate the PSNR and SSIM values of recent and popular dehazing methods on the I-Haze and O-Haze datasets, respectively. These datasets contain high-resolution images with mild haze density. The restoration-based methods [63, 66, 135,136,137] and [138] again suffer from color distortions and are unable to preserve the structure of the image due to the invalidity of their priors. The earlier, simpler machine learning and deep learning methods [64, 73,74,75,76,77], etc., are unable to remove the haze effect completely. In comparison, the overall results of [79, 85, 139] and [128] are better, with higher PSNR and SSIM values. From Tables 18 and 19, we conclude that BPPNet [79] is the top performer on the I-Haze dataset, while DM2F-Net [128] is the best among all the methods on the O-Haze dataset.

Table 20 shows the comparison results on the Dense-Haze dataset, which differs greatly from the other datasets (I-Haze and O-Haze) in its increased haze level: it contains hazy images with very dense fog. The state-of-the-art methods are generally trained on images with sparse haze; for example, the method [85] is trained on O-Haze (mild hazy images), so its performance degrades when tested on dense hazy images, as indicated by the PSNR and SSIM values of PPDNet. The PSNR and SSIM values of most methods are very low, indicating high color distortions and an inability to deal with dense haze, with only two or three exceptions. BPPNet, HR-Dehazer and the Feature Forwarding method achieve satisfactory PSNR and SSIM values because they are trained on dense hazy images. BPPNet is ranked first and is capable of removing the dense haze; however, the color restoration of its dehazed images does not resemble the GT images. The quantitative analysis on the Dense-Haze dataset confirms the qualitative analysis in Fig. 35.

Furthermore, we compared the state-of-the-art methods on the D-Hazy dataset, which is divided into two parts: the NYU Depth and Middlebury (MB) portions. This dataset contains synthetic images with medium haze. Table 21 presents the quantitative results. The analysis of this table demonstrates that the dehazing results of the learning-based methods [73,74,75, 140] and [141] are better than those of the prior-based methods [63] and [64]. In terms of PSNR and SSIM values, DFIN [140] and DPDP-Net [141] are ranked first on the NYU Depth and MB portions, respectively.

Furthermore, we analyzed the performance of the state-of-the-art methods on natural images using multiple no-reference image quality assessment metrics, in order to identify the pitfalls of these methods. A smaller FADE value indicates less residual haze in the dehazed result; BLIINDS-II and BRISQUE are indicators of perceptually pleasing results; a higher gradient ratio implies that more edge details are preserved after dehazing; and a small NIQE value indicates that the haze-free image is more natural and realistic.

Finally, Table 22 shows the quantitative results on the real images of Fig. 39 using the different metrics listed in the table. In this table, red numbers denote the first position, green the second and blue the third. Smaller values of all metrics except e and r denote good dehazing capability in terms of distortions (Blur, BALC), perceptual quality (NIQE, BLIINDS-II, etc.), visibility after haze removal (FADE score), edge preservation in restored images (e and r) and color distortions (saturated pixel ratio). Different categories of methods are involved in the comparison. We can notice that the method [62] takes first position in the overall quality of the dehazed image: visibility is improved, with no over-saturation of pixels and no artifacts. The DCP method is in second place, with good perceptual quality and edge preservation; however, it suffers from halo artifacts at depth discontinuities. The NLD method is in third position with the best FADE score (least residual haze); however, it suffers from over-saturation of pixels and lacks perceptual quality. The performance of the other methods is average.

8 Conclusions and Future Direction

Haze removal methods have drawn the attention of researchers in recent years due to their various applications in computer vision, especially in video surveillance and transportation systems. In this paper, recent haze removal methods have been investigated. First, for better understanding, these methods were grouped into different categories based on their shared characteristics, and from each group the prominent methods were selected for analysis with respect to various issues of dehazing. The survey also introduces several recent categories, including non-homogeneous haze removal, hardware architectures, superpixels, ensembles, etc. It then explores most of the evaluation metrics and datasets used in recent works. Finally, qualitative and quantitative analyses were conducted on many datasets, including RESIDE, I-Haze, O-Haze, D-Hazy and Dense-Haze. Although this field has achieved remarkable progress, many problems and open challenges remain, as follows:

(1) In most dehazing methods, a large number of parameters are selected empirically or manually. This limits the dehazing performance and may cause various issues, such as incomplete haze removal, color distortions or halo artifacts, when the methods are tested on hazy images of different haze concentrations. Adaptive selection of these parameters can cope with these issues.

(2) Very few metrics are available that are designed specifically for dehazing, so researchers in this field use many individual metrics for the assessment of their methods. In the future, a single image quality assessment metric needs to be designed that can deal with residual haze, over-enhancement, artifacts, color distortions, structure damage, perceptual quality, etc., instead of using multiple metrics.

(3) The literature study shows that no single method can handle all the different weather conditions, such as dense fog, night-time or non-homogeneous haze; most existing methods can remove only mild or homogeneous fog. Therefore, fusion-based methods and ensemble learning methods, which would integrate the advantages of restoration-based and deep learning-based methods, may be investigated to meet these challenges.

(4) Most methods focus on removing fog from a single image; only a limited number remove fog from video captured with a moving camera. Video fog removal (e.g., for video surveillance and transportation systems) requires good recovery results with real-time processing. In this direction, hardware implementation-based methods deserve more attention, to process high-resolution video with low-cost hardware and low power consumption.