1 Introduction

Outdoor scene images may be degraded for various reasons, of which the most notable is bad weather such as haze, smoke and fog [5]. The effectiveness of outdoor surveillance systems is clearly limited by fog [18]. Recently, the computer vision and graphics communities have taken up the challenge of removing fog using minimal input, i.e. a single image [13]. In outdoor environments, when images are captured by optical devices, the light reflected from an object is scattered in the atmosphere before reaching the camera [43]. Atmospheric particles or pollutants in the air, in the form of dust, smoke, fog, rain and snow, give rise to two fundamental phenomena called ‘direct distortion’ and ‘distortion contributed by atmospheric light’ [33]. Direct distortion reduces the contrast, while atmospheric light distortion adds whiteness to the scene, creating a problem of color ambiguity [21]. As a result, images taken under such conditions are characterized by poor contrast and visibility [20]. Therefore, elimination of air light and restoration of contrast are essential for outdoor vision applications used for object recognition, tracking and navigation [32].

A reliable visibility restoration method requires accurate estimation of the air light and the transmission map [13]. Over the past few decades, several methods have been proposed to remove fog using multiple images. In [21, 27], the authors removed fog by taking two or more images of the same scene (e.g. one in dense fog and another in light fog). In general scenarios, however, these strategies are not practical since the weather may remain unchanged for several minutes or even hours [26]. Another class of methods is the polarization-based technique proposed by Schechner et al. [26]. The major disadvantage of this method is the camera setting, which must capture two strictly aligned polarized images [2]. Another disadvantage is that it requires dedicated hardware for rotating the polarizer. The methods in [21, 26,27,28] cannot adapt to practical scenarios where only one degraded image is available as input; thus, more flexible approaches are preferable [26]. In summary, the main disadvantage of these multiple-image restoration methods is that several images of an identical scene under dissimilar weather conditions are required.

Single-image restoration is a difficult task [23]. In practice, only one degraded image is available on which the various algorithms can be applied [33]. In [9], an independent component analysis algorithm was suggested. This method relies on local statistics and requires color information and variance. It achieves good results when the image is affected by thin fog, but has trouble with dense fog, where the color information is weak and the variance is not reliable enough to estimate the transmission map. The method in [9] is not suitable for grayscale images and requires deep knowledge of color. In another work [16], the restored image looks over-saturated and unnatural. A different approach proposed in [11], called the deep photo system, relies heavily on a 3D model; no dehazing is performed if no model of the scene is available [1, 7, 45].

A more popular approach explored in recent years is the dark channel prior (DCP) [19], which is one of the simplest, most elegant and most effective methods for single image defogging. Yong et al. [42] suggested an enhancement of the Contrast Limited Adaptive Histogram Equalization (CLAHE) algorithm for archive monitoring and low-illumination conditions. Yadav et al. [40] showed how a digitally filtered image can be enhanced using the CLAHE algorithm in order to improve its contrast. Other works on image processing can be found in [22, 29, 30, 35,36,37,38]. To sum up, the main drawbacks of the single-image restoration methods are as follows:

  • Visibility of the restored image is very low.

  • Halo artifacts appear in the resulting images.

  • Some of these methods fail when a scene object is similar to the air light, such as vehicle headlights or snow-white ground.

  • The transmission map is not predicted correctly.

  • The air light is not estimated accurately.

  • Improving the contrast of the image is difficult.

  • The computational time is high.

It has been realized that CLAHE achieves good results compared to the other algorithms. In this article, we aim to enhance this algorithm by proposing a new method that combines the Guided Filter and CLAHE for visibility improvement in fog removal. Our main contributions in this research are as follows:

  • The proposed work uses the Guided Filter to improve the transmission map obtained from the median dark channel prior (MDCP), which improves the DCP through a new median operation. The method has only a few data-driven parameters and constants.

  • The proposed method uses Contrast-Limited Adaptive Histogram Equalization (CLAHE) for visibility improvement after fog removal. Although there are different strategies for enhancing visibility by removing fog, most of them destroy the shape and appearance of foggy images, leading to misinterpretation; thus, CLAHE is utilized in this work. The contrast, particularly in homogeneous regions, can be limited to avoid amplifying any noise that may be present in the image. CLAHE replaces each pixel by a value derived from the histogram of its surrounding region and is a widely used technique for contrast enhancement. Each tile’s contrast is enhanced so that the histogram of the output region approximately matches the histogram specified by the ‘Distribution’ parameter. CLAHE works on a small region, called a tile, rather than the whole image, and it does not harm the final result. Other strategies commonly used for this purpose are the wavelet transform, unsharp masking, and morphological operators.

  • The proposed method will be validated on datasets containing various types of weather-degraded images against the relevant works, namely Kopf et al., Fattal, Tan et al. and He et al., according to evaluation metrics such as VM, PSNR and AMBE. The dataset contains more than 100 images of outdoor scenes, and for each scene 5 distinct weather conditions are replicated.

  • The proposed method preserves small details while remaining robust against the density of fog, and recovers scene contrast simultaneously. It is simple but effective in removing fog from a single degraded image. The proposed method operates at a higher speed than existing ones and can minimize the halo artifact.

  • To the best of our knowledge, none of the existing methods work well in three major factors (Contrast, Halo effect and transmission map) of fog removal simultaneously. Thus, we believe that the proposed method will put some extra value in this research field.

This article is organized into eight sections. Section 2 shows the related works for fog removal. Section 3 presents the basic theory of the atmospheric dichromatic model used to describe the formation of fog. In Sections 4 and 5, we illustrate the proposed approach and its instantiation, respectively. The experimental environment and performance evaluation of the proposed method are discussed in Sections 6 and 7, respectively. Finally, Section 8 concludes the research article and delineates further studies.

2 Related work

In this section, we discuss some previous approaches to fog removal and their pros and cons (see Table 1).

Table 1 A brief review of the related works

3 Background

In computer vision, the atmospheric dichromatic model [21] is used to describe the formation of foggy images, as indicated in Fig. 1:

$$ \widehat{\mathrm{X}}\left(\mathrm{a},\mathrm{b}\right)=\mathrm{X}\left(\mathrm{a},\mathrm{b}\right)\mathrm{t}\left(\mathrm{a},\mathrm{b}\right)+{\mathrm{a}}_{\mathrm{ir}}\left(1-\mathrm{t}\left(\mathrm{a},\mathrm{b}\right)\right). $$
(1)
Fig. 1 The atmospheric dichromatic model

The RGB color of the fog image \( \widehat{\mathrm{X}} \) at pixel position (a, b) is expressed in Eq. 2:

$$ \widehat{\mathrm{X}}\left(\mathrm{a},\mathrm{b}\right)={\left[\ \widehat{\mathrm{X}}\left(\mathrm{a},\mathrm{b},\mathrm{r}\right),\widehat{\mathrm{X}}\left(\mathrm{a},\mathrm{b},\mathrm{g}\right),\widehat{\mathrm{X}}\left(\mathrm{a},\mathrm{b},\mathrm{b}1\right)\right]}^{\mathrm{T}}. $$
(2)

RGB color image without fog at pixel position (a, b) is shown in Eq. 3:

$$ \mathrm{X}\left(\mathrm{a},\mathrm{b}\right)={\left[\ \mathrm{X}\left(\mathrm{a},\mathrm{b},\mathrm{r}\right),\mathrm{X}\left(\mathrm{a},\mathrm{b},\mathrm{g}\right),\mathrm{X}\left(\mathrm{a},\mathrm{b},\mathrm{b}1\right)\right]}^{\mathrm{T}}. $$
(3)

In Eq. 1, the first term X(a, b) t(a, b) is called ‘direct distortion’; it produces a multiplicative distortion of the scene radiance and reduces the contrast. The latter term a_ir(1 − t(a, b)) is called the ‘local atmospheric light distortion’; it produces an additive effect and adds whiteness to the scene. Intuitively, the image received by the observer is the combination of an attenuated version of the underlying scene radiance with additive atmospheric light [23]. The atmosphere is assumed to be homogeneous [18]. This has two simplifying consequences: the atmospheric light is constant throughout the image (which means that it has to be estimated only once), and the transmission t(a, b) follows the Beer-Lambert law in Eq. 4:

$$ \mathrm{t}\left(\mathrm{a},\mathrm{b}\right)={\mathrm{e}}^{-\upbeta \mathrm{d}\left(\mathrm{a},\mathrm{b}\right)}. $$
(4)

Here, β is the extinction coefficient and d(a, b) represents the distance from the observer to the scene at pixel (a, b). Assuming a homogeneous atmosphere restricts β to be constant. The medium transmission coefficient is a scalar value with 0 ≤ t(a, b) ≤ 1 for each pixel, which attenuates the target color. Substituting t(a, b) into Eq. 1, we obtain Eq. 5:

$$ \widehat{\mathrm{X}}\left(\mathrm{a},\mathrm{b}\right)=\mathrm{X}\left(\mathrm{a},\mathrm{b}\right){\mathrm{e}}^{-\upbeta \mathrm{d}\left(\mathrm{a},\mathrm{b}\right)}+{\mathrm{a}}_{\mathrm{ir}}\left(1-{\mathrm{e}}^{-\upbeta \mathrm{d}\left(\mathrm{a},\mathrm{b}\right)}\right). $$
(5)

The main aim of fog removal is to find X(a, b), which also requires information regarding the distance d between the scene and the camera, the extinction coefficient β and the air light a_ir. Eq. 5 indicates that the scene radiance is attenuated exponentially with distance [13]. The depth, up to an unknown scale, can be recovered by recovering the transmission.
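For illustration, the forward model of Eq. 5 can be written as a minimal NumPy sketch; the function name, the default β value and the assumption of inputs scaled to [0, 1] are our own choices, used only to show how a foggy image would be synthesized from a clear image, a depth map and an air-light value.

```python
import numpy as np

def synthesize_fog(X, depth, a_ir, beta=1.0):
    """Illustrative sketch of the atmospheric dichromatic model (Eq. 5).

    X     : clear scene radiance, H x W x 3 array in [0, 1]
    depth : per-pixel distance d(a, b), H x W array
    a_ir  : global air light, length-3 RGB vector
    beta  : extinction coefficient (constant, homogeneous atmosphere)
    """
    t = np.exp(-beta * depth)                      # Beer-Lambert transmission, Eq. 4
    t = t[..., None]                               # broadcast over the RGB channels
    return X * t + np.asarray(a_ir) * (1.0 - t)    # direct distortion + air-light term
```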

The ultimate goal of ‘single image fog removal’ is to recover the true image X(a, b) from the observed fog image \( \widehat{\mathrm{X}}\left(\mathrm{a},\mathrm{b}\right) \), which requires knowledge of three unknown parameters: β, d and a_ir. He et al. [11] introduced the successful dark channel prior (DCP) algorithm. A low intensity value is caused by a lack of color in at least one channel, which may be due to shadows of buildings and trees or to dark objects; the dark channel is defined in Eq. 6.

$$ {\uptheta}_{\mathrm{D}}\left(\mathrm{a},\mathrm{b}\right)=\underset{\left(\mathrm{m},\mathrm{n}\right)\in \Omega \left(\mathrm{a},\mathrm{b}\right)}{\min}\left(\underset{\mathrm{c}\in \left(\mathrm{r},\mathrm{g},\mathrm{b}\right)}{\min}\frac{\mathrm{X}{\left(\mathrm{m},\mathrm{n}\right)}^{\mathrm{c}}}{{\mathrm{a}}_{\mathrm{ir}}\left(\mathrm{c}\right)}\right). $$
(6)

Thus, θD is used as a prior to estimate the transmission-map given by Eq. 7:

$$ \overset{\sim }{{\mathrm{t}}_{\mathrm{D}}}\left(\mathrm{a},\mathrm{b}\right)=1-\mathrm{w}{\uptheta}_{\mathrm{D}}\left(\mathrm{a},\mathrm{b}\right). $$
(7)

The parameter w (0 ≤ w ≤ 1), typically set to 0.95, is introduced to prevent the fog from being removed thoroughly and to keep a feeling of depth for distant objects. The transmission obtained by Eq. 7 is only a coarse estimation. If we try to recover X(a, b) directly from Eq. 7, the result contains severe halo artifacts. Thus, it is necessary to refine the transmission. In order to remove these artifacts, the transmission map obtained by Eq. 7 is refined using the matting Laplacian in Eq. 8:

$$ \mathrm{E}\left(\mathrm{t}\right)={\mathrm{t}}^{\mathrm{T}}\mathrm{Lt}+\uplambda {\left(\mathrm{t}-\overset{\sim }{{\mathrm{t}}_{\mathrm{D}}}\right)}^{\mathrm{T}}\left(\mathrm{t}-\overset{\sim }{{\mathrm{t}}_{\mathrm{D}}}\right). $$
(8)

The solution is found by solving for tD from the following Eq. 9:

$$ \left(\mathrm{L}+\uplambda \mathrm{U}\right){\mathrm{t}}_{\mathrm{D}}=\uplambda \overset{\sim }{{\mathrm{t}}_{\mathrm{D}}}. $$
(9)

The (l,p) element of L is defined as in Eq. 10:

$$ \mathrm{L}\left(\mathrm{l},\mathrm{p}\right)={\sum}_{\mathrm{k}\mid \left(\mathrm{l},\mathrm{p}\right)\in {\mathrm{w}}_{\mathrm{k}}}\left({\updelta}_{\mathrm{l},\mathrm{p}}-\frac{1}{\left|{\mathrm{w}}_{\mathrm{k}}\right|}\left(1+{\left({\mathrm{X}}_{\mathrm{l}}-{\upmu}_{\mathrm{k}}\right)}^{\mathrm{T}}{\left({\Sigma}_{\mathrm{k}}+\frac{\upvarepsilon}{\left|{\mathrm{w}}_{\mathrm{k}}\right|}{\mathrm{U}}_3\right)}^{-1}\left({\mathrm{X}}_{\mathrm{p}}-{\upmu}_{\mathrm{k}}\right)\right)\right), $$
(10)

where L is the (MN × MN) matting Laplacian matrix for an image of size (M × N), δ_l,p is the Kronecker delta function, μ_k and Σ_k are the mean and covariance of the pixels in the window w_k centered at k, |w_k| is the number of pixels in each window, ε is a small regularization parameter (10−3 to 10−4), U is an identity matrix of the same size as L, and λ is a small value (10−3 to 10−4) so that t_D is softly constrained by \( \overset{\sim }{{\mathrm{t}}_{\mathrm{D}}} \). Figure 2 shows the result of the recovered scene radiance using the DCP followed by the matting Laplacian. We can clearly see in Fig. 2d that the recovered de-fogged image contains some block artifacts near complicated edge structures.
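As a point of reference before introducing the proposed prior, the dark channel of Eq. 6 and the coarse transmission of Eq. 7 can be sketched in NumPy/SciPy as follows; the patch size (15) and w = 0.95 follow the values quoted in the text, while the function names and library choices are ours and purely illustrative.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(X, a_ir, patch=15):
    """Eq. 6: minimum over the RGB channels, then minimum over a local patch."""
    normalized = X / np.asarray(a_ir)                  # divide each channel by its air light
    per_pixel_min = normalized.min(axis=2)             # min over c in {r, g, b}
    return minimum_filter(per_pixel_min, size=patch)   # min over the patch Omega(a, b)

def coarse_transmission(X, a_ir, w=0.95, patch=15):
    """Eq. 7: coarse transmission estimate from the dark channel."""
    return 1.0 - w * dark_channel(X, a_ir, patch)
```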

Fig. 2 Recovering the scene radiance using the transmission map refined by Matting Laplacian: a Original Fog Image, b Estimated Transmission Map from DCP, c Refined Transmission Map by Matting Laplacian, d Recovered Scene Radiance

In the proposed method, we preserve the small details of images while remaining robust against the concentration of fog, and recover scene contrast at the same time. The new procedure is simple but effective in eliminating fog from a single degraded image. The proposed method operates at a higher speed than existing approaches and can diminish the halo effect. A guided filter improves the transmission map acquired from the median dark channel prior (MDCP), which improves the dark channel prior (DCP) by the use of a median operation. All of the parameters used in our method are data driven.

4 The proposed work

4.1 Median dark-channel prior (MDCP)

We propose a visibility restoration algorithm based on a median filtering operation. The proposed method improves the dark channel prior [11] by replacing the second minimum operator in Eq. 6 with a median operator. The median operator performs a non-linear filtering operation which can effectively suppress impulsive noise components while preserving edge information in detailed areas and permitting dehazing [3, 8, 25] in smooth areas. The flow diagram of the algorithm is shown in Fig. 3.

Fig. 3 Block diagram of the fog removal method

From a given foggy image, the atmospheric light and the transmission map are estimated. Once the air light and the transmission map are obtained, the transmission map is refined using the guided filter and the scene radiance is recovered. To improve the overall contrast of the output image, CLAHE is performed as a post-processing operation. The proposed median dark channel prior (MDCP) is given as Eq. 11:

$$ {\uptheta}_{\mathrm{M}}\left(\mathrm{a},\mathrm{b}\right)=\underset{\left(\mathrm{m},\mathrm{n}\right)\in \Omega \left(\mathrm{a},\mathrm{b}\right)}{\operatorname{med}}\left(\underset{\mathrm{c}\in \left(\mathrm{r},\mathrm{g},\mathrm{b}\right)}{\min}\frac{\mathrm{X}{\left(\mathrm{m},\mathrm{n}\right)}^{\mathrm{c}}}{{\mathrm{a}}_{\mathrm{ir}}\left(\mathrm{c}\right)}\right). $$
(11)

Here Ω represents a patch of size 15 × 15. Likewise, θM is then used to estimate the transmission-map given by Eq. 12:

$$ \overset{\sim }{{\mathrm{t}}_{\mathrm{M}}}\left(\mathrm{a},\mathrm{b}\right)=1-\mathrm{w}{\uptheta}_{\mathrm{M}}\left(\mathrm{a},\mathrm{b}\right). $$
(12)

The transmission map estimated by Eq. 12 can be refined further. Although several techniques such as soft matting [10], the bilateral filter [14] and anisotropic diffusion [31] have been proposed to enhance the transmission map, they are computationally intensive and may make the fog removal algorithm impracticable.
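A minimal sketch of Eqs. 11 and 12 is shown below; it differs from the DCP sketch in Section 3 only in that the outer patch minimum is replaced by a patch median, as the proposed MDCP prescribes. The helper names and the SciPy median filter are illustrative choices on our part.

```python
import numpy as np
from scipy.ndimage import median_filter

def median_dark_channel(X, a_ir, patch=15):
    """Eq. 11: min over RGB, then median (instead of min) over the 15 x 15 patch."""
    per_pixel_min = (X / np.asarray(a_ir)).min(axis=2)
    return median_filter(per_pixel_min, size=patch)

def mdcp_transmission(X, a_ir, w=0.95, patch=15):
    """Eq. 12: coarse transmission map from the median dark channel."""
    return 1.0 - w * median_dark_channel(X, a_ir, patch)
```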

4.2 Transmission map refinement

The matting Laplacian, which was used for refinement of the transmission map in [11], produces visually satisfying results, but its computational cost is very high since it involves solving a large linear system. Therefore, to speed up the defogging process, the transmission map obtained by Eq. 12 is refined using the guided filter [12]. In the guided filter, a kernel is built from a guidance image Ig and then applied to the target image \( \tilde{t_M} \) through a standard linear filtering process in the pixel domain. In general, the filtering process is described by Eqs. 13 and 14:

$$ {\mathrm{t}}_{\mathrm{l}}={\sum}_{\mathrm{p}}{\mathrm{W}}_{\mathrm{l},\mathrm{p}}\left({\mathrm{I}}_{\mathrm{g}}\right)\ \tilde{{\mathrm{t}}_{\mathrm{M}}}\left(\mathrm{p}\right), $$
(13)
$$ {\mathrm{W}}_{\mathrm{l},\mathrm{p}}\left({\mathrm{I}}_{\mathrm{g}}\right)=\frac{1}{{\left|\mathrm{w}\right|}^2}{\sum}_{\mathrm{k}:\left(\mathrm{l},\mathrm{p}\right)\in {\mathrm{w}}_{\mathrm{k}}}\left(1+\frac{\left({\mathrm{I}}_{\mathrm{l}}-{\upmu}_{\mathrm{k}}\right)\left({\mathrm{I}}_{\mathrm{p}}-{\upmu}_{\mathrm{k}}\right)}{\sigma_{\mathrm{k}}^2+\varepsilon}\right), $$
(14)

where l and p are pixel indexes, t_l is the filtered result, W is the weight depending on the guidance image Ig, and \( \overset{\sim }{{\mathrm{t}}_{\mathrm{M}}} \) is the input image. For a color image, the derivation of the guided filter relies on a local linear model between the guidance image Ig and the refined transmission map t, given in Eq. 15.

$$ {\mathrm{t}}_{\mathrm{l}}={\mathrm{a}}_{\mathrm{k}}^{\mathrm{T}}{\mathrm{I}}_{\mathrm{g}}+{\mathrm{b}}_{\mathrm{k}},{\forall}_{\mathrm{l}}\in {\mathrm{w}}_{\mathrm{k}}, $$
(15)

where t_l is the filter output and (a_k, b_k) are linear coefficients assumed to be constant in the window w_k. The window size is typically defined by its radius r, which is the pixel distance from the center pixel to the outer pixel. Since square windows are used, the total window size is therefore (2r + 1) × (2r + 1). The guided filter seeks the coefficients (a_k, b_k) that minimize the difference between the output and the input, using the cost function in Eq. 16:

$$ \mathrm{E}\left({\mathrm{a}}_{\mathrm{k}},{\mathrm{b}}_{\mathrm{k}}\right)={\sum}_{\mathrm{l}\in {\mathrm{w}}_{\mathrm{k}}}\left({\left({\mathrm{a}}_{\mathrm{k}}^{\mathrm{T}}{\mathrm{I}}_{\mathrm{g}}\left(\mathrm{l}\right)+{\mathrm{b}}_{\mathrm{k}}-\tilde{{\mathrm{t}}_{\mathrm{M}}}\left(\mathrm{l}\right)\right)}^2+\upvarepsilon\ {\mathrm{a}}_{\mathrm{k}}^{\mathrm{T}}{\mathrm{a}}_{\mathrm{k}}\right), $$
(16)

where ε is a regularization parameter to prevent ak from being too large. The solution of Eq. 16 is found as in Eq. 17(a & b):

$$ {\mathrm{a}}_{\mathrm{k}}={\left({\Sigma}_{\mathrm{k}}+\upvarepsilon \mathrm{U}\right)}^{-1}\left(\frac{1}{\mid \mathrm{w}\mid }{\sum}_{\mathrm{l}\in {\mathrm{w}}_{\mathrm{k}}}{\mathrm{I}}_{\mathrm{g}}\left(\mathrm{l}\right)\ \tilde{{\mathrm{t}}_{\mathrm{M}}}\left(\mathrm{l}\right)-{\upmu}_{\mathrm{k}}\overline{{\mathrm{t}}_{\mathrm{k}}}\right) $$
(17a)
$$ {\mathrm{b}}_{\mathrm{k}}=\overline{{\mathrm{t}}_{\mathrm{k}}}-{\mathrm{a}}_{\mathrm{k}}^{\mathrm{T}}{\upmu}_{\mathrm{k}} $$
(17b)

Here, μk and \( {\upsigma}_{\mathrm{k}}^2 \) are the mean and variance of Ig in a window wk, and |w| is the number of pixels in wk. \( \overline{{\mathrm{t}}_{\mathrm{k}}} \) is the mean of t in wk. ∑k is the 3 × 3 covariance matrix of Ig in wk and U is a 3 × 3 identity matrix. Since a pixel l belongs to many windows, the final output tl is averaged over all possible windows. After computing all filter coefficients (ak, bk) in the image, the final output is:

$$ {\mathrm{t}}_{\mathrm{l}}=\frac{1}{\mid \mathrm{w}\mid }{\sum}_{\mathrm{k}:\left(\mathrm{l}\right)\in {\mathrm{w}}_{\mathrm{k}}}\left({\mathrm{a}}_{\mathrm{k}}^{\mathrm{T}}\mathrm{Ig}+{\mathrm{b}}_{\mathrm{k}}\right) $$
(18)

Thus, the guided filter simply measures the normalized correlation between two pixels. Spatial distance is taken into account by the fact that when two pixels l and p are close together, they share more windows compared to when they are far apart.
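For concreteness, a grayscale-guidance version of Eqs. 15–18 is sketched below using NumPy box filters; the color-guidance case replaces the scalar variance by the 3 × 3 covariance Σ_k of Eq. 17a. The helper name and the default r and ε are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter_gray(I_g, t_coarse, r=30, eps=1e-2):
    """Refine a coarse transmission map using a grayscale guidance image I_g.

    Implements the per-window linear model t = a_k * I_g + b_k (Eq. 15),
    the closed-form coefficients (Eqs. 17a-b) and the window averaging (Eq. 18).
    """
    size = 2 * r + 1                      # square window of radius r
    mean = lambda x: uniform_filter(x, size)

    mu_I = mean(I_g)
    mu_t = mean(t_coarse)
    corr_It = mean(I_g * t_coarse)
    var_I = mean(I_g * I_g) - mu_I ** 2

    a = (corr_It - mu_I * mu_t) / (var_I + eps)   # Eq. 17a (scalar-guidance form)
    b = mu_t - a * mu_I                           # Eq. 17b

    # Eq. 18: average the coefficients over all windows covering each pixel.
    return mean(a) * I_g + mean(b)
```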

4.3 Behavior of guided filter

The input images are filtered with the guided filter, a fast, accurate, non-iterative, edge-preserving smoothing filter. The main benefit of the guided filter is that it behaves well near edges and does not suffer from the gradient-reversal artifact. It also reduces the problem of solving a large linear system of equations to a simple filtering process. It is demonstrated below.

figure f

4.4 Recovering the scene radiance

Once the refined transmission map t and the air light a_ir are known, the scene radiance can be estimated using Eq. 19:

$$ \mathrm{X}\left(\mathrm{a},\mathrm{b}\right)=\frac{\widehat{\mathrm{X}}\left(\mathrm{a},\mathrm{b}\right)-{\mathrm{a}}_{\mathrm{ir}}}{\max \left(\mathrm{t},{\mathrm{t}}_{\mathrm{o}}\right)}+{\mathrm{a}}_{\mathrm{ir}} $$
(19)

In Eq. 19, t_o is typically bounded below by a small value such as 0.1 to avoid instability. Figure 4 shows the result of a recovered scene radiance using the guided filter.
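Eq. 19 translates into a few lines of NumPy; the lower bound t_o = 0.1 follows the text, while the final clipping to [0, 1] is an addition of ours for display purposes only.

```python
import numpy as np

def recover_radiance(X_hat, t_refined, a_ir, t_o=0.1):
    """Eq. 19: invert the fog model using the refined transmission map."""
    t = np.maximum(t_refined, t_o)[..., None]        # avoid division by near-zero values
    X = (X_hat - np.asarray(a_ir)) / t + np.asarray(a_ir)
    return np.clip(X, 0.0, 1.0)                      # keep the result displayable
```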

Fig. 4 Behavior of the guided filter: after filtering, the edges of the guide image are transferred to the filtered image: a Noisy Image, b Guidance image, c Image after guided filtering

4.5 Post-Processing using CLAHE

It is found that the image after fog removal loses some contrast and appears dim, so a post-processing step can improve its visibility. The most common methods for visibility enhancement are histogram equalization and histogram stretching. Histogram equalization is the most well-known method, based on a gray-scale transformation. However, its major disadvantage is that it over-enhances the image and shifts the mean brightness, consequently creating an unnatural look. In histogram stretching, clipping must be handled carefully; otherwise visual information in bright and dark regions is eliminated. Therefore, instead of using histogram equalization or histogram stretching, which affect the whole image, we use the CLAHE algorithm as indicated in Fig. 5a and b.

Fig. 5 Recovering scene radiance using the MDCP and Guided Filter: a Enhanced Transmission Map by Guided Filter, b Restored Scene Radiance

CLAHE is a visibility enhancement algorithm which can provide optimal equalization and overcome the problems of standard histogram equalization. As proposed in [34], CLAHE divides an image into small contextual regions (8 × 8 tiles) and applies histogram equalization to each region for contrast enhancement. It then combines neighboring blocks using bilinear interpolation to eliminate artificially induced boundaries. There are two parameters in CLAHE that control image quality, namely the block size and the clip limit. The block size specifies the size of the contextual region, and the clip limit is a scalar parameter in the range [0,1] specifying the contrast enhancement limit and preventing over-saturation. The CLAHE method can be derived from the following steps, starting with Eq. 20:

$$ {\mathrm{N}}_{\mathrm{avg}}=\frac{{\mathrm{N}}_{\mathrm{x}}\times {\mathrm{N}}_{\mathrm{y}}}{{\mathrm{N}}_{\mathrm{Gray}}}. $$
(20)

Based on Eq. 20, the NCL can be calculated by Eq. 21:

$$ {\mathrm{N}}_{\mathrm{CL}}={\mathrm{N}}_{\mathrm{clip}}\times {\mathrm{N}}_{\mathrm{avg}}, $$
(21)

where

  • Navg: average number of pixels per gray level,

  • Nx: number of pixels in the X direction,

  • Ny: number of pixels in the Y direction,

  • NGray: number of gray levels,

  • NCL: actual clip limit,

  • Nclip: maximum multiple of the average pixels in each gray level.

The original and clipped histograms are indicated in Fig. 6a and b. We can observe in Fig. 6b that pixels exceeding the clip limit are clipped. N∑clip denotes the total number of clipped pixels over all gray levels, and the average number of clipped pixels per gray level is given in Eq. 22:

$$ {\mathrm{N}}_{\mathrm{acp}}=\frac{{\mathrm{N}}_{\sum \mathrm{clip}}}{{\mathrm{N}}_{\mathrm{gray}}}. $$
(22)
Fig. 6 Representation of the original and clipped histogram (CLAHE): a Original histogram, b Clipped histogram

The clipped histogram of each contextual region is then computed using the following rules:

  • Rule 1: If Nx(i) > NCL, Nx(i) = NCL

  • Rule 2: Elseif Nx(i) + Nacp ≥ NCL, Nx(i) = NCL

  • Rule 3: Else Nx(i) = Nx(i) + Nacp

The step size for distributing the remaining clipped pixels is calculated using Eq. 23:

$$ \mathrm{S}=\frac{{\mathrm{N}}_{\mathrm{Gray}}}{{\mathrm{N}}_{\mathrm{RP}}}, $$
(23)

where NRP denotes the number of remaining clipped pixels.
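The clipping and redistribution step of Eqs. 20–23 can be sketched for a single tile as follows. The variable names mirror the notation above, while the redistribution loop is an illustrative reading of the three rules, not the exact implementation used in our experiments.

```python
import numpy as np

def clip_tile_histogram(hist, n_clip, n_gray=256):
    """Clip a tile histogram at N_CL and redistribute the clipped pixels (sketch).

    hist   : histogram of one contextual region (length n_gray)
    n_clip : maximum multiple of the average pixel count per gray level
    """
    n_avg = hist.sum() / n_gray              # Eq. 20: average pixels per gray level
    n_cl = n_clip * n_avg                    # Eq. 21: actual clip limit

    clipped = np.minimum(hist, n_cl)         # Rule 1: cap every bin at N_CL
    n_sum_clip = hist.sum() - clipped.sum()  # total number of clipped pixels
    n_acp = n_sum_clip / n_gray              # Eq. 22: average clipped pixels per level

    # Rules 2 and 3: add the average back without exceeding the clip limit.
    redistributed = np.minimum(clipped + n_acp, n_cl)

    # Eq. 23: spread any still-remaining pixels with step S = N_Gray / N_RP.
    remaining = hist.sum() - redistributed.sum()
    if remaining >= 1:
        step = max(int(n_gray / remaining), 1)
        redistributed[::step] += remaining / len(redistributed[::step])
    return redistributed
```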

4.6 Description of the proposed algorithm

The algorithm is shown in Table 2, where I is the input image and IF is the output. The time complexity of the guided filter algorithm (Steps 3 and 4) is O(n), where n is the number of pixels in the image. The time complexity of the other procedures, such as MDCP, transmission map refinement, scene radiance recovery and CLAHE, is O(1). Therefore, the total time complexity is O(n) + O(1) = O(n).

Table 2 The proposed method
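To make the flow of Table 2 concrete, the sketch below chains the illustrative helpers defined in the previous subsections (mdcp_transmission, guided_filter_gray, recover_radiance). The bright-pixel air-light estimate in Step 1 is a simplifying assumption on our part, not the paper's exact procedure, and the CLAHE post-processing is indicated only by a comment.

```python
import numpy as np

def defog(X_hat, w=0.95, patch=15, r=30, eps=1e-2, t_o=0.1):
    """End-to-end sketch of the proposed pipeline (MDCP -> guided filter -> Eq. 19 -> CLAHE)."""
    # Step 1: crude air-light estimate from the brightest pixel (assumed heuristic).
    gray = X_hat.mean(axis=2)
    a_ir = X_hat[np.unravel_index(np.argmax(gray), gray.shape)]

    # Step 2: coarse transmission from the median dark channel prior (Eqs. 11-12).
    t_coarse = mdcp_transmission(X_hat, a_ir, w=w, patch=patch)

    # Steps 3-4: guided-filter refinement of the transmission map (Eqs. 13-18).
    t_refined = guided_filter_gray(gray, t_coarse, r=r, eps=eps)

    # Step 5: scene radiance recovery (Eq. 19).
    X = recover_radiance(X_hat, t_refined, a_ir, t_o=t_o)

    # Step 6: CLAHE post-processing would be applied here, tile by tile,
    # using the clipped-histogram routine sketched in Section 4.5.
    return X
```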

5 Instantiation

In any case, using a clock rate comparable to the pixel arrival rate, the mean number of required mapping RAMs is 32. In this case, when all operations are driven at the pixel arrival rate, an additional eight mapping RAMs are required to serve as a buffer for proper pipelining of all operations. The time lapse between the use of the first regional mapping and the last regional mapping in the same section, for 512 × 512 images, is at most equal to the arrival time of 7 × 64 = 448 pixels. Under the best conditions, using a higher clock rate, the minimum required number of mapping RAMs is 24. The final operation is to assemble the output image from the appropriate combination of the regional mappings. For the first 32 sections of an image, only eight regional mappings are required. The final pixel mapping is computed on the assumption that the regional mappings are available for the pixel under process. Realization of this engine depends on the way the six products are calculated. It is possible to simplify the hardware realization as follows. Let η and μ be defined as

$$ \upeta =\frac{X}{X+Y}, $$
$$ \upmu =\frac{S}{R+S}. $$

The final mapping can then be computed by the following equation:

$$ {\mathrm{P}}_{\mathrm{new}}=\upmu \left[\upeta {\mathrm{f}}_{\mathrm{i}-1,\mathrm{j}-1}\left({\mathrm{P}}_{\mathrm{old}}\right)+\left(1-\upeta \right){\mathrm{f}}_{\mathrm{i},\mathrm{j}-1}\left({\mathrm{P}}_{\mathrm{old}}\right)\right]+\left(1-\upmu \right)\left[\ \upeta {\mathrm{f}}_{\mathrm{i}-1,\mathrm{j}}\left({\mathrm{P}}_{\mathrm{old}}\right)+\left(\ 1-\upeta \right){\mathrm{f}}_{\mathrm{i},\mathrm{j}}\left({\mathrm{P}}_{\mathrm{old}}\right)\right]. $$

Since the scaling factors in the multipliers, for square regions whose sizes are powers of two, are dyadic numbers, the multipliers can have a more efficient hardware implementation. The number of clock cycles needed to complete each mapping operation depends on how the multipliers are implemented. If the multipliers are also properly pipelined, with a latency of at most 30 pixels, one pixel mapping is completed for every pixel arrival. In the case of 512 × 512 images, the regions are of size 64 × 64 and the scaling factors are integers divided by 64.
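The bilinear blending of the four regional mappings described above can be expressed compactly as follows; the mapping functions f are stored here as lookup tables, the argument names follow η, μ and P_new in the equation, and the function itself is only an illustrative reading of the hardware scheme rather than its actual realization.

```python
def blend_regional_mappings(p_old, f_im1_jm1, f_i_jm1, f_im1_j, f_i_j, eta, mu):
    """Bilinear combination of four regional mapping LUTs (the P_new equation).

    p_old : input pixel value (integer index into the 256-entry LUTs)
    f_*   : lookup tables of the four surrounding regional mappings f_{i-1,j-1}, f_{i,j-1}, f_{i-1,j}, f_{i,j}
    eta   : horizontal weight eta = X / (X + Y)
    mu    : vertical weight   mu  = S / (R + S)
    """
    top = eta * f_im1_jm1[p_old] + (1.0 - eta) * f_i_jm1[p_old]
    bottom = eta * f_im1_j[p_old] + (1.0 - eta) * f_i_j[p_old]
    return mu * top + (1.0 - mu) * bottom
```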

6 Experimental setup

This section presents an assessment of the proposed method using MATLAB 7.0.4 on a 64-bit Intel Core i3-2600 processor with 2 GB of memory. To compare the performance of our method, benchmark foggy images, namely Mountain ‘01’, Tower ‘02’ and ‘Sweden’, are taken from well-known sources in the Google database. The HazeRD dataset contains 15 actual outdoor scenes, and for each scene 5 different weather conditions are replicated. Besides, we also collected more than 50 extra images of different outdoor scenes and added them to the HazeRD dataset for the experiment. Thus, in this experiment we use more than 100 images to verify the proposed method. The objective of the evaluation is to check how well the images are restored by the algorithm. To do so, image quality metrics are necessary to assess the quality of the images. The performance of the algorithms is measured in terms of the Visibility Metric (VM), Absolute Mean Brightness Error (AMBE), Peak Signal to Noise Ratio (PSNR), and run time (trun).

6.1 Visibility metric (VM)

The visual enhancement performance is quantified by the visibility metric. Visibility is a measure of image quality that describes how well an observer can perceive the texture and color of an image. The visibility metric gives an objective measure of detail enhancement; the improvement in contrast and edge sharpness is useful for measuring the enhancement. The visibility metric is calculated with Eqs. 24, 25, 26, 27 and 28.

$$ \mathrm{n}\left(\mathrm{l},\mathrm{p}\right)=\mathrm{r}\left(\mathrm{l},\mathrm{p}\right)-\mathrm{e}\left(\mathrm{l},\mathrm{p}\right), $$
(24)
$$ {\upmu}_{\mathrm{r}}=\frac{1}{\mathrm{M}\times \mathrm{N}}{\sum}_{\mathrm{l}=0}^{\mathrm{M}-1}{\sum}_{\mathrm{p}=0}^{\mathrm{N}-1}\mathrm{r}\left(\mathrm{l},\mathrm{p}\right), $$
(25)
$$ {\upmu}_{\mathrm{n}}=\frac{1}{\mathrm{M}\times \mathrm{N}}{\sum}_{\mathrm{l}=0}^{\mathrm{M}-1}{\sum}_{\mathrm{p}=0}^{\mathrm{N}-1}\mathrm{n}\left(\mathrm{l},\mathrm{p}\right), $$
(26)
$$ {\upsigma}_{\mathrm{n}}^2=\frac{1}{\left(\mathrm{M}\times \mathrm{N}\right)-1}{\sum}_{\mathrm{l}=0}^{\mathrm{M}-1}{\sum}_{\mathrm{p}=0}^{\mathrm{N}-1}{\left(\mathrm{n}\left(\mathrm{l},\mathrm{p}\right)-{\upmu}_{\mathrm{n}}\right)}^2, $$
(27)
$$ \mathrm{VM}=\mathrm{CNR}\left(\mathrm{r},\mathrm{e}\right)=\frac{\upmu_{\mathrm{r}}-{\upmu}_{\mathrm{n}}}{\upsigma_{\mathrm{n}}}. $$
(28)

Here, r(l, p) and e(l, p) denote the reference image and the improved image, both of size (M × N); n(l, p) is the visibility (difference) image; and μ_r, μ_n and \( {\upsigma}_{\mathrm{n}}^2 \) are the corresponding means and variance. A high value of the visibility metric indicates an effective enhancement.
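Eqs. 24–28 transcribe directly into NumPy; the sketch below is offered only to show how the metric is evaluated, and the function name is ours.

```python
import numpy as np

def visibility_metric(r, e):
    """Eqs. 24-28: contrast-to-noise style visibility metric VM = (mu_r - mu_n) / sigma_n."""
    n = r.astype(np.float64) - e.astype(np.float64)   # Eq. 24: difference image
    mu_r = r.mean()                                   # Eq. 25
    mu_n = n.mean()                                   # Eq. 26
    sigma_n = n.std(ddof=1)                           # Eq. 27: sample variance over M*N - 1
    return (mu_r - mu_n) / sigma_n                    # Eq. 28
```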

6.2 Absolute mean brightness error (AMBE)

A quantitative measure of visibility enhancement is obtained in terms of the AMBE. To examine how the appearance of the image has changed after fog removal, the deviation from the reference image is computed in Eq. 29.

$$ \mathrm{AMBE}=\left|{\upmu}_{\mathrm{r}}-{\upmu}_{\mathrm{n}}\right|. $$
(29)

For better similarity between two images, the AMBE must be as small as possible.

6.3 Peak signal to noise ratio (PSNR)

There are many versions of the SNR, but the PSNR is simpler and widely used for fidelity measurement. The PSNR, expressed in decibels, measures the ratio between the peak signal power and the mean squared error (MSE, Eq. 30) between two images. It is defined in Eq. 31:

$$ \mathrm{MSE}=\frac{1}{\mathrm{M}\times \mathrm{N}}{\sum}_{\mathrm{l}=0}^{\mathrm{M}-1}{\sum}_{\mathrm{p}=0}^{\mathrm{N}-1}{\left(\mathrm{r}\left(\mathrm{l},\mathrm{p}\right)-\mathrm{e}\left(\mathrm{l},\mathrm{p}\right)\right)}^2, $$
(30)
$$ \mathrm{PSNR}=10{\log}_{10}\left(\frac{{\left(\mathrm{L}-1\right)}^2}{\mathrm{MSE}}\right), $$
(31)

where r(l,p) denotes the original foggy image and e(l,p) denotes the enhanced fog-free image. (M × N) represents the size of the image and L is the dynamic range of pixel values (256 for an 8-bit grayscale image).
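The two remaining fidelity measures also translate into a few lines of NumPy. The AMBE sketch follows the common definition as the absolute difference between the mean brightness of the reference and enhanced images, which is our reading of Eq. 29, and an 8-bit dynamic range (L = 256) is assumed for the PSNR.

```python
import numpy as np

def ambe(r, e):
    """Eq. 29 (as commonly defined): absolute difference of the two mean brightnesses."""
    return abs(r.mean() - e.mean())

def psnr(r, e, levels=256):
    """Eqs. 30-31: peak signal-to-noise ratio in decibels."""
    mse = np.mean((r.astype(np.float64) - e.astype(np.float64)) ** 2)
    return 10.0 * np.log10((levels - 1) ** 2 / mse)
```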

6.4 Run time (trun)

The run time (trun) is the period during which the computer program executes to remove fog from an image; the efficiency of the method therefore depends on how fast the algorithm is. For fog removal, trun is related to the size of the image, and for fast execution trun must be low (Table 3).

Table 3 Run time (seconds) comparison of the proposed method with the other methods

7 Results and discussions

First, the intermediate steps of the proposed method are illustrated in Fig. 7.

Fig. 7 Intermediate steps of the proposed method: a Original fog image ‘train’, b Estimated transmission map using MDCP, c Enhanced transmission map using Guided filter, d Restored image, e Final fog-free image after post-processing

To check the effectiveness of the proposed method, we carry out simulations on various foggy images, Mountain ‘01’, Tower ‘02’ and ‘Sweden’, in Figs. 8, 9 and 10.

Fig. 8 Visual comparison of defogging results with recent state-of-the-art methods: a Original foggy image ‘mountain 01’; recovered by the algorithms of b Kopf et al.; c Fattal; d Tan et al.; e He et al.; f The proposed algorithm

Fig. 9 Visual comparison of defogging results with recent state-of-the-art methods: a Original hazy image ‘tower 02’; recovered by the algorithms of b Kopf et al.; c Fattal; d Tan et al.; e He et al.; f The proposed algorithm

Fig. 10 Visual comparison of defogging results with recent state-of-the-art methods: a Original hazy image ‘Sweden’; recovered by the algorithms of b Kopf et al.; c Fattal; d Tan et al.; e He et al.; f The proposed algorithm

The visibility metric comparison with the other methods is shown in Table 4, in which the 2nd column represents the actual visibility of the fog images and the last column gives the restored visibility. We observe that the actual visibility of the fog image Mountain ‘01’ is 70.59, which is increased to 115.62 after fog removal. Similarly, in Tables 5 and 6 we compare the PSNR and AMBE with the other state-of-the-art methods. The effectiveness of the proposed method is indicated by a high value of VM and low values of AMBE and trun.

Table 4 Visibility metric comparison produced by the proposed method and others
Table 5 PSNR comparison produced by the proposed method and others
Table 6 AMBE comparison of the proposed method and others

Likewise, from Table 4, the visibility metric produced by the proposed method yields about 40% more enhanced visibility than the others. The PSNR produced by the proposed method is 12.5% higher than those of the existing works (Table 5). Especially for Mountain 01, the proposed method achieves a 36% higher PSNR than that of He et al. However, the AMBE comparison in Table 6 indicates that the proposed method is on a par with the others but does not yield clearly better results. The proposed method gives 3% to 5% more PSNR than Kopf et al. and Tan et al. for Tower ‘02’. The experimental results clearly indicate that the proposed method outperforms all the other approaches. Knowing the ground truth of degraded images has long been a difficult issue. We have therefore adopted a common and well-known approach of manually classifying image pixels into foreground and background classes. The considered degradation effects include lack of contrast, interfering patterns and color fading. We have identified these images as degraded in the following way:

  a) It is difficult to decide which pixel should be considered as foreground or background, since the pixels are mislabeled in such images. Hence, they are referred to in our work as haze images (misty, foggy, cloudy images).

  b) The confusion between foreground and background rises, which makes the separation task quite difficult (in terms of gray-scale intensity). We found these RGB (Red, Green, Blue) images to carry redundant luminance information, so these segments appear as if they were fog. This happens in most historical and old images.

We now compare all the algorithms on the HazeRD dataset [44] with respect to the average VM (Avg_VM), average PSNR (Avg_PSNR) and average AMBE (Avg_AMBE). Table 7 shows that the proposed method produces the highest average VM and PSNR values and the lowest AMBE values, confirming that the proposed method gives the best overall result.

Table 7 Comparison on the Haze RD dataset in terms of the basic comparative factors

Figures 11 and 12 demonstrate the defogging of the proposed method and the others. An illustration of refining the transmission map using the guided filter with various window sizes is shown in Fig. 13a and b. Herein, we adopt a simple strategy to adjust rdark according to the area of the image, as sketched below. When the number of pixels in the image is less than (2 × 105), the radius is fixed at 15, preventing the patch size from becoming too small. When the number of pixels is more than (2 × 105), the radius is fixed at 30, preventing the patch size from growing too large. For the guided filter, the regularization parameter ε is fixed at 10−2. The parameter to is set to 0.1, but its value needs to be increased when an image contains sky regions. The parameter λ = 10−4 is chosen corresponding to the best PSNR. The value of the parameter w depends on the application, but we fix it to 0.95 for all results reported in this paper.
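The adaptive patch-radius rule amounts to a simple threshold on the number of pixels; the helper below is an illustrative transcription of that rule, with the threshold of 2 × 10^5 pixels and the two radii taken from the text.

```python
def dark_channel_radius(height, width, threshold=2e5):
    """Adaptive r_dark: 15 for images below the pixel-count threshold, 30 otherwise."""
    return 15 if height * width < threshold else 30
```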

Fig. 11 Defogging of the proposed method vs. the He et al. method: a Input image, b He’s result, c Proposed method’s result, d Input image, e He’s result, f Proposed method’s result

Fig. 12 Defogging of heavy fog images by the proposed method: a Foggy ‘Monastery’ image, b Foggy ‘Hostel’ image, c Restored ‘Monastery’ image, d Restored ‘Hostel’ image

Fig. 13 Refining the transmission map using the guided filter with various window sizes (ε = 10−2)

An important parameter for computing the dark channel is the patch size; we denote its radius as rdark. Since the computational cost of soft matting is quite high, it cannot be used for the refinement of large images. In [12], the patch size is fixed at 15 × 15, which is relatively small. However, the complexity of the proposed method is quite low, and it is capable of processing large images with a short running time; thus, it is not appropriate to use a fixed patch size. From the local linear model in Section 4, a large radius r implies that the filtering output closely follows the linear model of the guidance image, which reduces the halo artifact in the recovered image. But if r is too large, the transmission map will capture too much detail from the guidance, making the recovered image over-saturated. A failure of transmission refinement using the guided filter is shown in Fig. 14a and b.

Fig. 14 Failure case of the proposed method: a Foggy input image and b Our result

In those images, the uppermost right corner is destroyed because of the use of the DCP. The dark channel prior (DCP) is statistically based; thus it is likely that particular spots in the image do not follow this prior. Indeed, the method fails when some objects in a particular image are inherently gray or white: it assumes that such a part of the image is part of the fog, which is why white patches appear in the image. Our method may then misjudge the thickness of the haze (underestimate the transmission), so color over-saturation occurs in the recovered image. In the future, we will try to use edge-aware filtering to overcome this limitation. Edge-aware filtering can also be made implicit, e.g. by solving a linear system. The weighted least squares (WLS) filter uses matrix operations based on the gradients of the image. There are two cases in edge-aware filtering, flat patches and edges or high-variance regions: when a pixel is in the center of a flat patch, its value is converted into the average of the adjacent pixels; if it is in the center of an edge or high-variance region, its value is kept constant. Because edge-aware filtering is time consuming, we can try to minimize its time complexity by using a fast, high-quality explicit filter.

8 Conclusion

Our survey of the state of the art in fog removal shows that the existing methods do not handle every phase or circumstance. In particular, the existing techniques neglect the use of the dark channel prior to reduce noise and uneven illumination problems. To overcome these limitations, we proposed in this paper an integrated algorithm to improve the visibility of fog-degraded images. It is a combination of the guided filter and CLAHE, where the guided filter is an edge-preserving filter used for quick refinement of the transmission map, and CLAHE improves the local contrast of the image by partitioning it into small tiles. The proposed method has been empirically validated on benchmark foggy images, namely Mountain ‘01’, Tower ‘02’ and ‘Sweden’, and on the HazeRD dataset. The performance of the algorithms was measured in terms of the Visibility Metric (VM), Absolute Mean Brightness Error (AMBE), Peak Signal to Noise Ratio (PSNR), and run time (trun). It was observed that no over-saturated regions exist in the restored image, so halo artifacts are negligible. The advantages of the proposed method are multiple: it is simple in principle and hence easy to implement; it provides good results in most cases (homogeneous fog) without introducing artifacts; and it is efficient for various types of fog images.

The proposed method outperforms the other existing methods by enhancing the details in fog-degraded images. However, guided image filtering is in fact an approximation of soft matting; the method fails when the input image contains abrupt depth changes, as in the failure case of transmission refinement discussed above. To address this problem, we will try to use edge-aware filtering with the weighted least squares (WLS) filter. The idea is intuitive: when a pixel is in the center of a flat patch, its value is converted into the average of the adjacent pixels; if it is in the center of an edge or high-variance region, its value is kept constant. Because edge-aware filtering is time consuming, we can try to minimize its time complexity by using a fast, high-quality explicit filter. All of these remain as future work.