1 Introduction

Visibility is a major problem for outdoor surveillance systems. Adverse weather conditions hinder visual recognition of a scene in systems that must operate under a wide range of weather states, such as object recognition in outdoor systems, obstacle detection, outdoor video surveillance, object detection in vehicular networks, traffic monitoring, and intelligent transportation systems. Furthermore, most computer vision algorithms perform better if the input image is the scene radiance. The degradation of images under poor weather conditions is caused by scattering and absorption of light by atmospheric particles (e.g., haze, fog, smoke). The irradiance of a scene point received by the camera is the sum of the light reflected by the scene point (direct attenuation) and the light reflected by atmospheric particles (airlight) [5].

Contrast and color of the image degrade wherever absorption or scattering occurs, and scattering of light by particles is more severe than absorption [44]. Fog removal (defogging) is therefore desirable in the computer vision field. The aim of a defogging method is to remove fog or haze and to reconstruct a clear image from the observed degraded image [5, 28].

The depth of a scene point from the camera determines the amount of scattering [5]. A defogging algorithm can also provide a cue to the scene depth, which can be utilized by many prominent vision algorithms [7, 35].

Long-established methods of image quality improvement, such as histogram equalization and contrast enhancement, generally do not produce ideal defogging results. The prime focus of these methods is to improve brightness and increase contrast; while many realistic scenes can be enhanced this way, these methods are unable to deal with foggy images [9, 10, 27].

A few methods [5, 30, 31, 46] produce prominent results under homogeneous scattering, but they are unable to restore degraded edges and even lose existing edges, due either to their filtering approach or to inaccurate scene depth estimation.

The proposed method is based on the observation that the difference between the sum of hue and brightness and the saturation increases with depth; this observation is used to formulate an improved linear depth model for accurate scene depth estimation.

The proposed method preserves existing edges and restores edges lost to adverse weather, under the assumption that pixels within a local patch lie at the same depth. Results obtained with the proposed method are verified and validated using various parameters of qualitative and quantitative analysis.

The paper is organized as follows. Related work is presented in Section 2. In Section 3, the problem is formulated using the physical model of atmospheric scattering. Section 4 introduces the theoretical foundation and verification of the proposed model. Mathematical modeling of the proposed model is given in Section 5. Section 6 introduces the properties of the proposed model and describes the process of recovering the scene radiance from the degraded image. Section 7 presents a detailed analysis and comparison of the proposed model with the existing literature. Finally, Section 8 concludes the work.

2 Related work

Defogging can be placed in the category of big data processing and analysis [39,40,41]. It is a challenging task because the haze concentration varies with the unknown scene depth, and the main aim of a defogging method is to recover that depth. On the basis of the type of input, defogging methods can be categorized as: (1) additional-information-based methods [21, 29]; (2) multiple-image-based methods [18,19,20, 25, 26]; and (3) single-image-based methods [4, 5, 28, 30, 46].

The methods proposed in [19, 20, 25] require additional information such as depth cues or multiple images. In these methods, depth cues are determined through various camera positioning operations or through altitude [21, 29]. Since these methods require additional information through user interaction, they are not suitable for realistic computer vision applications.

In [25, 26], multiple images captured at different degrees of polarization are used to enhance image visibility. The methods in [18,19,20] obtain additional constraints from several images of a scene captured in varied weather states. However, these methods are expensive since they require additional hardware or other resources in order to perform well.

The defogging problem can also be solved by imposing constraints when the input is a single image, and single-image defogging methods are currently receiving the most attention [4, 5, 28, 30, 46]. The efficiency and success of these methods depend on their assumptions. In [28] it was observed that fog-free images have higher contrast than foggy images, so a method that maximizes local contrast was proposed; however, it produces blocking artifacts near depth discontinuities due to its patch-based operation. In [4], the scene albedo is estimated and the transmission deduced by assuming that local surface shading and local transmission are uncorrelated; however, it produces wrong results under dense fog.

The statistics of outdoor images in [5] show that, in local non-sky regions, some pixels have very low intensity in at least one color channel; under fog, the intensity of these pixels is raised by airlight, so they can be used to estimate the amount of fog. However, the method produces erroneous results if an object with a color equal to or brighter than the atmospheric light is present in the scene.

A robust, learning-based method built on various depth cues is used in [30]. It uses a combination of priors to estimate the transmission and produces impressive results; however, it fails in the case of dense fog or haze.

A few methods estimate per-pixel transmission to reduce blocking artifacts [1, 12, 13]. In [1], a novel method to estimate per-pixel transmission is proposed under the assumption that an image can be represented by a few hundred distinct colors. This technique removes blocking artifacts through its per-pixel transmission; however, it does not work in the presence of objects brighter than the airlight. In [13], a noise-oriented method is proposed. It also estimates per-pixel transmission to remove noise and to handle overly bright objects in the scene; however, the noise is estimated using a weighted sum of saturation and brightness, which may produce wrong results under dense fog. In [12], a high-quality dehazing method based on perception-oriented transmission estimation is proposed. The scattering probability of each pixel is estimated within a Bayesian framework and used to gauge the level of dehazing required at that pixel. This method produces high-quality results but is computationally expensive in comparison with other available methods.

Fast defogging methods have been proposed in [37, 46]. In [37], a linear-transformation-based technique is proposed under the assumption that the minimum color channel of the foggy image is a linear transformation of the minimum color channel of the fog-free image; however, the algorithm darkens the colors of the restored image as the defogging level increases. In [46], the color attenuation prior is proposed to estimate scene depth: it is observed that the difference between the brightness and the saturation of a scene point varies with depth, so a linear depth model is formulated and its parameters are learned using machine learning techniques [2, 3, 22, 40]. The results of [46] lose existing edges and fail to recover degraded edges because of the minimum filter and inaccurate scene depth estimation, and under certain conditions (such as heterogeneous fog) it produces wrong results.

3 Problem formulation

As light travels from a scene object to the camera (observer), its key characteristics (such as intensity and color) change due to scattering by atmospheric particles. The amount of scattering depends strongly on the type, size, distribution, and orientation of the particles present in the atmosphere [17]. Table 1 summarizes the possible atmospheric conditions based on particle type (T), size (S), concentration (C), and humidity level (H) [17].

Table 1 Types of atmospheric conditions based on the particles present and their physical properties [17]

Table 1 shows that air molecules are very small, so scattering due to air molecules is minimal and adds negligible noise to image formation; an atmosphere composed only of air molecules is termed a clear day. Volcanic ash, combustion products, and sea salts are haze particles, which are larger than air molecules and are generally suspended in gas. These particles act as nuclei for small water droplets when the humidity becomes very high, and haze generally produces a bluish hue. Fog is similar to haze, except that when the humidity reaches saturation some of the nuclei grow by condensation into water droplets; in simple words, an increase in humidity turns haze into fog. A cloud is similar to fog except that clouds form at higher altitude rather than at ground level. Rain is a more complex weather condition: its presence causes random spatial and temporal variations in images, so rainy weather is termed dynamic weather [17].

Figure 1 describes the process of image formation under environmental influence. As shown in Fig. 1, a beam of sunlight incident on an object is attenuated by atmospheric particles. The light observed by the camera is the sum of the unscattered radiance of the scene point and the radiance contributed by the scattering of light by atmospheric particles [17].

Fig. 1 Optical model of image formation

The formation of an image in poor weather conditions is described by the optical model [20] and is represented by (1).

$$ I_{d}(x)=\underbrace{L_{\infty}(x){\rho(x)}{tr(x)}}_{direct~ attenuation}+\underbrace{L_{\infty}(x)(1-{tr(x)})}_{airlight} $$
(1)

where the first term represents the direct attenuation and the second term models the airlight. Here x denotes the location (coordinates) of the scene point, \(I_{d}(x)\) is the intensity at location x in the degraded image, \(L_{\infty}(x)\) is the atmospheric light, and ρ is the reflectance of the object in the image. The transmission of a scene point is represented by \(tr(x)\), which can be expressed as:

$$ tr(x)=e^{-\beta(\lambda) d(x)} $$

where \(d(x)\) is the distance of the scene point at location x from the observer and \(\beta(\lambda)\) is the atmospheric scattering coefficient.

For visible wavelengths, the scattering coefficient depends on the wavelength and is defined in (2) [17].

$$ \beta(\lambda)=\lambda^{-\gamma} $$
(2)

where \(\beta(\lambda)\) is the scattering coefficient, λ is the wavelength of the incident light, and \(0 \le \gamma \le 4\) is a constant that depends on the size of the particles in the air. Pure air molecules are smaller than the wavelengths of visible light, so the effect of wavelength is strongest (γ = 4); this is why the sky appears blue on a clear day. In foggy conditions the particle size is larger than the wavelength, so γ = 0, scattering becomes independent of wavelength, and the scattering coefficient \(\beta(\lambda)\) is constant (homogeneous scattering). In thin fog or mild haze, particle sizes vary between \(10^{-4}\) and 10 μm, producing a gamut of varying scattering effects known as heterogeneous scattering [17].
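As a small illustration of (2), the following sketch compares the relative scattering strength of blue and red light for the two extreme values of γ discussed above. The wavelengths (in micrometres) are representative values assumed for this example, not taken from the paper.

```python
# Relative scattering strength beta(lambda) = lambda^(-gamma) for blue vs. red light.
# The wavelengths below are illustrative assumptions (approx. 0.45 um and 0.65 um).
blue, red = 0.45, 0.65

for gamma in (4, 0):                      # gamma = 4: clear air, gamma = 0: fog
    ratio = blue ** -gamma / red ** -gamma
    print(f"gamma = {gamma}: blue/red scattering ratio = {ratio:.2f}")
# gamma = 4 gives a ratio of about 4.35 (blue dominates, hence the blue sky);
# gamma = 0 gives 1.0 (wavelength-independent scattering, hence gray fog).
```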

It is assumed that no absorption or attenuation takes place in clear-day conditions, so \(\beta(\lambda) = 0\) and \(tr(x) = 1\). Thus, from (1),

$$ I_{d}(x)=L_{\infty}(x) \rho (x) $$
(3)

Equation (3) represents the image captured in clear-day conditions, which is denoted by \(I_{r}(x)\).

The proposed method assumes that (1) the atmosphere is composed of small particles (such as fog or haze) that are homogeneous in shape and size, and (2) the orientation of the particles is constant. Therefore the scattering coefficient \(\beta(\lambda)\) and the atmospheric light \(L_{\infty}(x)\) can be taken as global constants β and A, respectively. Thus, from (1) and (3),

$$ I_{d}(x)=I_{r}(x)tr(x)+A(1-tr(x)) $$
(4)

where \(I_{d}(x)\) is the image degraded by foggy weather, \(I_{r}(x)\) is the expected clear-day image of the same scene, A is the global atmospheric light, and \(tr(x)\) is the transmission, which describes the fraction of light that reaches the observer directly without being scattered.
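The forward model in (4) is straightforward to simulate. The sketch below is an illustration, not the authors' implementation: it degrades a clear-day image with a given depth map, and the airlight and β values are arbitrary assumptions.

```python
import numpy as np

def synthesize_fog(clear_rgb, depth, airlight=0.9, beta=1.0):
    """Degrade a clear-day image according to Eq. (4): I_d = I_r*tr + A*(1 - tr).

    clear_rgb : H x W x 3 array in [0, 1] (the clear-day radiance I_r(x))
    depth     : H x W depth map normalized to [0, 1]
    airlight and beta are assumed global constants, as in the paper.
    """
    tr = np.exp(-beta * depth)[..., None]   # transmission tr(x) = exp(-beta * d(x))
    return clear_rgb * tr + airlight * (1.0 - tr)
```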

The objective of defogging is to estimate \(I_{r}(x)\), \(tr(x)\), and A from an input image \(I_{d}(x)\). This is an ill-posed problem since \(I_{r}(x)\), \(I_{d}(x)\), and A are coplanar vectors and their end points are collinear [5]. From (4),

$$ I_{r}(x)=\frac{I_{d}(x)-A}{tr(x)}+A $$
(5)

Thus (5) can be used to recover the clear-day image \(I_{r}(x)\) if \(tr(x)\) and A are known. The transmission \(tr(x)\) is characterized below.

$$ tr(x)= \left\{\begin{array}{lllllllll} 1 \qquad if \quad d(x)= 0\\ 0 \qquad if \quad d(x)=\infty \\ (0,1) \qquad otherwise \end{array}\right. $$
(6)

Practically, \(d(x) = \infty\) is not possible; therefore, if \(d(x)\) is normalized within the closed interval [0, 1], then \(tr(x)\) can be redefined as:

$$ tr(x)= \left\{\begin{array}{lllllllll} 1 \qquad if \quad d(x)= 0\\ e^{-\beta} \qquad if \quad d(x)= 1 \\ (0,1) \qquad otherwise \end{array}\right. $$
(7)

If A and \(I_{d}(x)\) are normalized to the closed interval [0, 1], then the scene radiance \(I_{r}(x)\) will also lie in the closed interval [0, 1].

3.1 Problem analysis for restoration of \(I_{r}(x)\)

In general, the image \(I_{d}(x)\) contains many objects, which lie at different depths from the observer. The proposed method uses the following depth clues to restore the scene radiance \(I_{r}(x)\).

  • Clue #1: If an object is very close to the camera, then

    $$ d(x)\approx0 \qquad \Rightarrow tr(x)= 1 $$
    (8)

    From (4), \(I_{d}(x) = I_{r}(x)\), which implies that objects close to the camera are not degraded by the scattering of light. Thus the visibility of nearby objects remains clear in poor weather conditions.

  • Clue #2: The radiance of an object attenuates as its distance from the camera increases. If an object is very far from the camera,

    $$ d(x)=\infty \qquad \Rightarrow tr(x)\approx0 $$
    (9)

From (4), \(I_{d}(x) = A\), which shows that the effect of bad weather is strongest for distant objects, whose observed intensity becomes almost equal to the global atmospheric light; this can be used to find the value of A. These clues show that the scene depth is strongly correlated with the transmission \(tr(x)\) and the global atmospheric light A. Therefore accurate estimation of scene depth is vital for restoring the clear-day image.

A single image carries no explicit depth information; one dimension is lost during image formation, which makes fog removal from a single image hard and challenging. As discussed earlier, if the depth information of an image can be recovered, then the clear-day image can be restored. This inspired us to propose a linear depth model for fog and haze removal.

The aim of the proposed work is to address the issues in the existing literature and to introduce an accurate depth estimation model that preserves existing edges and restores degraded edges while preserving the structure of the image. The proposed work is inspired by the existing linear depth estimation model based on the color attenuation prior [46]. The contribution of the proposed work is twofold.

  1. The color attenuation prior [46] approximates the depth of a scene point from the difference between brightness and saturation, based on the observation that brightness increases and saturation decreases with depth. However, the hue of the scene also increases with depth, showing that hue is positively correlated with depth; the estimated scene depth can therefore be improved by adding hue to the color attenuation prior. The proposed model considers hue, brightness, and saturation to approximate depth, which gives a more accurate scene depth and restores degraded edges.

  2. The white-object problem is solved in [46] using a minimum filter, which results in the loss of existing edges. The proposed work instead uses a median filter to preserve existing edges and improve visibility.

4 Theoretical foundation of the proposed linear model

4.1 Theoretical foundation

To establish a theoretical basis, a number of experiments were conducted on fog-degraded images to study their statistics. It was found that the depth information depends not only on the difference between saturation and brightness but also varies with the hue.

Figure 2 shows the statistics of a natural scene under various foggy conditions. Figure 2b illustrates the statistics of a densely fogged patch of the image in Fig. 2a: the brightness of the patch is very high, the saturation is very low, the difference between brightness and saturation is very high, and the difference between the sum of hue and brightness and the saturation is even larger.

Fig. 2 Relationship of fog density with the proposed linear model. a A foggy image; b a patch of image (a) with dense fog and its statistics; c a patch of image (a) with medium fog and its statistics; d and e patches of image (a) without fog and their statistics

Figure 2c shows the statistics of a patch with medium fog: the brightness is still good, the saturation is slightly higher, the difference between brightness and saturation is quite high, and the difference between the sum of brightness and hue and the saturation is larger still. This shows that the difference between (brightness + hue) and saturation increases sharply with fog, which fades the colors of the image.

Figure 2d and e represent the statistics of patches without fog. They show that the difference between brightness and saturation alone is not a good clue for depth estimation: according to [46], this difference should be very low for a fog-free patch, but in Fig. 2e the saturation is so low that the difference is very high.

Furthermore, in (5), let \(\varepsilon(x) = I_{d}(x) - A\). The value of \(\varepsilon(x)\) measures the amount of degradation of the scene point at location x, where the atmospheric light A is the maximum possible intensity in \(I_{d}(x)\).

The value of \(\varepsilon(x)\) approaches zero for a scene point whose \(I_{d}(x)\) is almost equal to A, which indicates that location x is at a long distance and affected by dense fog. Conversely, an increasing value of \(\varepsilon(x)\) indicates a decreasing effect of fog, which implies that location x is near the camera.

For the fog-free patch in Fig. 2e, the difference between brightness and saturation, and the difference between (brightness + hue) and saturation, are

$$ vs_{ff}= 0.8873 $$
(10)
$$ vhs_{ff}= 1.1998 $$
(11)

where \(vs_{ff}\) is the difference between brightness and saturation and \(vhs_{ff}\) is the difference between (brightness + hue) and saturation for a fog-free patch; they thus represent approximate values of \(I(x)\).

At first glance, the values of \(vs_{ff}\) and \(vhs_{ff}\) both suggest that the patch is at a long distance and affected by fog, but there is a significant difference between them. In Fig. 2b, for the densely fogged patch, the maximum difference between brightness and saturation and the maximum difference between (brightness + hue) and saturation are

$$ vs_{df}= 0.9754 $$
(12)
$$ vhs_{df}= 1.5919 $$
(13)

The subscripts ff and df in (10), (11), (12), and (13) denote fog-free and dense-fog conditions, respectively.

The values \(vs_{df} = 0.9754\) and \(vhs_{df} = 1.5919\) are the maximum possible differences, so they approximately represent the value of A. The difference \(vs_{df} - vs_{ff} = 0.0881\) is low and wrongly indicates that the fog-free patch is fog affected; here the method of [46] fails. The difference \(vhs_{df} - vhs_{ff} = 0.3921\), on the other hand, is high and correctly indicates that the patch is not fog affected. Therefore the consideration of hue is vital.

Furthermore, it can be observed from Fig. 2 that the value of (brightness + hue − saturation) decreases more sharply than the value of (brightness − saturation). Therefore the proposed work accounts for this failure mode of [46] and introduces an improved linear depth model that depends not only on the difference between brightness and saturation but also on the hue.

According to these statistics, the depth of a scene point and the concentration of haze can be correlated as:

$$ depth(x) \propto (brightness(x)+hue(x)-saturation(x)) $$
(14)

Equation (14) expresses the positive correlation between the depth of a scene point and the combination of the brightness, hue, and saturation components of that scene point.

5 Mathematical modeling of the proposed linear model

This section presents the mathematical modeling of the proposed work. As explained in Section 4, the difference between the sum of hue and brightness and the saturation gives a clue about depth and can be used to approximate it. Since this observation is based on the statistics of hue, brightness, and saturation, it needs to be cast as a statistical model. Thus, the proposed improved linear depth model can be expressed as:

$$ d(x)=(c_{1}+c_{2}v(x)+c_{3}h(x)-c_{4}s(x))/\alpha +\epsilon(x) $$
(15)

where x is the location (coordinates) of a scene point within the given image. The depth, brightness, hue, and saturation of the scene point at location x are represented by \(d(x)\), \(v(x)\), \(h(x)\), and \(s(x)\), respectively. The linear coefficients of the model are \(c_{1}\), \(c_{2}\), \(c_{3}\), and \(c_{4}\). The scene depth is normalized by a parameter α, and the random error of the model is represented by \(\epsilon(x)\), which can be modeled as a random image.

Equation (15) can be used to approximate the depth of a scene point at location x; the accuracy of the estimate depends on fine-tuning of the linear coefficients. Figure 3 shows the approximate depth of a foggy image obtained from (15) with \(c_{1} = c_{2} = c_{3} = c_{4} = \alpha = 1\) and \(\epsilon(x) = 0\) (i.e., the error is a black image).

Fig. 3 Depth approximation using (15). a Foggy image, b hue \(h(x)\), c brightness \(v(x)\), d saturation \(s(x)\), e approximated depth \(d(x)\)

Figure 3a to e show a foggy image and its \(h(x)\), \(v(x)\), \(s(x)\), and approximated \(d(x)\). In Fig. 3e it can be observed that scene points close to the camera have near-zero depth, shown in black, while scene points at long distance appear bright. The accuracy of \(d(x)\) can be increased by learning the linear coefficients and the random error.
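For illustration, the raw depth map of (15) can be computed directly from the HSV components. The sketch below uses the untrained coefficients of Fig. 3 (all ones, \(\epsilon(x) = 0\)); it is an example under these assumptions, not the trained model.

```python
import numpy as np
from matplotlib.colors import rgb_to_hsv

def raw_depth_map(foggy_rgb, c=(1.0, 1.0, 1.0, 1.0), alpha=1.0):
    """Approximate scene depth d(x) with the linear model of Eq. (15), epsilon(x) = 0.

    foggy_rgb : H x W x 3 array in [0, 1]; c = (c1, c2, c3, c4) are the linear
    coefficients and alpha the normalization parameter (all 1 here, as in Fig. 3).
    """
    hsv = rgb_to_hsv(foggy_rgb)
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    c1, c2, c3, c4 = c
    return (c1 + c2 * v + c3 * h - c4 * s) / alpha
```

Substituting the learned coefficients reported in Section 5.2 in place of the all-ones defaults gives the trained estimate.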

5.1 Computation of linear coefficients

The objective of the proposed work is to preserve existing edges and recover degraded edges, which depends on the accuracy of the depth estimated by the proposed linear depth model. That accuracy is governed by the linear coefficients \(c_{1}\), \(c_{2}\), \(c_{3}\), and \(c_{4}\); their proper evaluation using two-dimensional data analysis techniques [22,23,24, 42] is therefore essential.

The effect of restored edges is best described by the structure of the image. Therefore, the proposed work relies on the structural similarity index (ssim) [34] to guide the computation of \(c_{1}\), \(c_{2}\), \(c_{3}\), and \(c_{4}\): the linear coefficients are chosen so that ssim improves. \(ssim(j,k)\) measures the combined effect of the luminance \(lum(j,k)\), contrast \(cont(j,k)\), and structure \(struct(j,k)\) of two images j and k. Equations (16), (17), (18), and (19) are used to compute \(ssim(j,k)\), \(lum(j,k)\), \(cont(j,k)\), and \(struct(j,k)\), respectively [33, 34].

$$ ssim(j,k)=lum(j,k)*cont(j,k)*struct(j,k) $$
(16)

where,

$$ lum(j,k)=\frac{2\mu_{j}\mu_{k}+C_{1}}{{\mu_{j}}^{2}+{\mu_{k}}^{2}+C_{1}} $$
(17)
$$ cont(j,k)=\frac{2\sigma_{j}\sigma_{k}+C_{2}}{{\sigma_{j}}^{2}+{\sigma_{k}}^{2}+C_{2}} $$
(18)
$$ struct(j,k)=\frac{\sigma_{jk}+C_{3}}{\sigma_{j}\sigma_{k}+C_{3}} $$
(19)

where \(\mu_{j}\), \(\mu_{k}\) are the local means and \(\sigma_{j}\), \(\sigma_{k}\) the local standard deviations of images j and k, respectively, and \(\sigma_{jk}\) is their cross-covariance. If images j and k are identical, then \(ssim(j,k) = 1\).
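A per-pixel implementation of (16)-(19) can be sketched with simple box filters. The window size and the stabilizing constants C1, C2, C3 below are illustrative assumptions, not the values used in [34].

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_ssim(j, k, win=7, C1=1e-4, C2=9e-4, C3=4.5e-4):
    """Per-pixel ssim(j, k) following Eqs. (16)-(19) for grayscale images in [0, 1]."""
    mu_j, mu_k = uniform_filter(j, win), uniform_filter(k, win)
    sig_j = np.sqrt(np.maximum(uniform_filter(j * j, win) - mu_j ** 2, 0.0))
    sig_k = np.sqrt(np.maximum(uniform_filter(k * k, win) - mu_k ** 2, 0.0))
    sig_jk = uniform_filter(j * k, win) - mu_j * mu_k

    lum = (2 * mu_j * mu_k + C1) / (mu_j ** 2 + mu_k ** 2 + C1)        # Eq. (17)
    cont = (2 * sig_j * sig_k + C2) / (sig_j ** 2 + sig_k ** 2 + C2)   # Eq. (18)
    struct = (sig_jk + C3) / (sig_j * sig_k + C3)                      # Eq. (19)
    return lum * cont * struct                                         # Eq. (16)
```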

Figure 4 shows the local means (\(\mu_{j}\), \(\mu_{k}\)), standard deviations (\(\sigma_{j}\), \(\sigma_{k}\)), cross-covariance \(\sigma_{jk}\), \(lum(j,k)\), \(cont(j,k)\), \(struct(j,k)\), and \(ssim(j,k)\) of an image k with respect to a reference image j.

Fig. 4 Illustration of \(ssim(j,k)\) with varying k. a Image k, b \(\mu_{k}\), c \(\sigma_{k}\), d \(\sigma_{jk}\), e \(lum(j,k)\), f \(cont(j,k)\), g \(struct(j,k)\), h \(ssim(j,k)\)

The first row of Fig. 4 shows image j together with \(\mu_{j}\), \(\sigma_{j}\), \(\sigma_{jj}\), \(lum(j,j)\), \(cont(j,j)\), \(struct(j,j)\), and \(ssim(j,j)\). Here \(lum(j,j)\), \(cont(j,j)\), and \(struct(j,j)\) appear as white images (scaled down for display), and \(ssim(j,j) = 1\), confirming that the two images are identical.

The remaining rows show image k together with \(\mu_{k}\), \(\sigma_{k}\), \(\sigma_{jk}\), \(lum(j,k)\), \(cont(j,k)\), \(struct(j,k)\), and \(ssim(j,k)\), where k is obtained by contaminating j with zero-mean Gaussian noise of standard deviation σ = 0.001, 0.009, and 0.020, respectively. As σ increases, \(lum(j,k)\), \(cont(j,k)\), \(struct(j,k)\), and \(ssim(j,k)\) decrease, although the change in luminance is barely noticeable. This shows that \(ssim(j,k) = 1\) if and only if \(lum(j,k) = 1\), \(cont(j,k) = 1\), and \(struct(j,k) = 1\). Therefore, from (17), (18), and (19),

$$ \frac{2\mu_{j}\mu_{k}+C_{1}}{{\mu_{j}}^{2}+{\mu_{k}}^{2}+C_{1}}= 1 \Rightarrow (\mu_{j}-\mu_{k})^{2}= 0 $$
$$ \frac{2\sigma_{j}\sigma_{k}+C_{2}}{{\sigma_{j}}^{2}+{\sigma_{k}}^{2}+C_{2}}= 1 \Rightarrow (\sigma_{j}-\sigma_{k})^{2}= 0 $$
$$ \frac{\sigma_{jk}+C_{3}}{\sigma_{j}\sigma_{k}+C_{3}}= 1 \Rightarrow \sigma_{j}\sigma_{k}=\sigma_{jk} $$

Thus, ssim will be high if the squared difference of the means and the squared difference of the standard deviations of images j and k are low. Therefore the proposed method computes the linear coefficients such that the squared error is low, i.e.

$$(\epsilon(x))^{2}=(d(x)-c_{1}-c_{2}v(x)-c_{3}h(x)+c_{4}s(x))^{2} \simeq 0 $$

The ordinary least squares (OLS) method of regression analysis is based on the minimization of squared errors [2]. Therefore the proposed method uses OLS to compute the values of \(c_{1}\), \(c_{2}\), \(c_{3}\), and \(c_{4}\). To obtain the intermediate equations of OLS, the generalized regression model of (15) is required and can be expressed as:

$$ d_{i}(x)=(\frac{c_{1}}{\alpha}+\frac{c_{2}}{\alpha}v_{i}(x)+\frac{c_{3}}{\alpha}h_{i}(x)-\frac{c_{4}}{\alpha}s_{i}(x))+\epsilon_{i}(x) $$
(20)

where \(d_{i}(x)\) is the random depth at location x of the i-th sample and represents the dependent variable of the linear regression, and \(v_{i}(x)\), \(h_{i}(x)\), and \(s_{i}(x)\) are the brightness, hue, and saturation components of the i-th sample and represent the independent variables. The random error of the i-th sample is represented by \(\epsilon_{i}(x)\) and is modeled by a Gaussian distribution with zero mean and variance \(\sigma^{2}\). Replacing \(\frac {c_{1}}{\alpha },\frac {c_{2}}{\alpha },\frac {c_{3}}{\alpha }\) and \(\frac {c_{4}}{\alpha }\) in (20) by \(\beta_{0},\beta_{1},\beta_{2}\) and \(\beta_{3}\), respectively, where \(\beta_{i} \le 1\) for i = 0, 1, 2, 3, gives:

$$ d_{i}(x)=(\beta_{0}+\beta_{1} v_{i}(x)+\beta_{2} h_{i}(x)-\beta_{3} s_{i}(x))+\epsilon_{i}(x) $$
(21)

Equations (20) and (21) represent the generalized model of (15). Using (21), the sum of squared errors s is given as:

$$s=\sum\limits_{i = 1}^{num} (d_{i}(x)-(\beta_{0}+\beta_{1} v_{i}(x)+\beta_{2} h_{i}(x)-\beta_{3} s_{i}(x)))^{2} $$

where num represents the number of samples used for regression analysis. Partially differentiating s with respect to \(\beta_{0},\beta_{1},\beta_{2},\beta_{3}\) and equating to zero (i.e. \(\frac {\partial s}{\partial \beta _{0}}= 0, \frac {\partial s}{\partial \beta _{1}}= 0, \frac {\partial s}{\partial \beta _{2}}= 0\) and \(\frac {\partial s}{\partial \beta _{3}}= 0\)) gives the following equations.

$$\begin{array}{@{}rcl@{}} \sum\limits_{i = 1}^{num} d_{i}(x)&=&num\,\beta_{0}+\beta_{1}\sum\limits_{i = 1}^{num}{v_{i}(x)}+\beta_{2}\sum\limits_{i = 1}^{num}{h_{i}(x)}-\beta_{3}\sum\limits_{i = 1}^{num}{s_{i}(x)} \end{array} $$
(22)
$$\begin{array}{@{}rcl@{}} \sum\limits_{i = 1}^{num} d_{i}(x)*v_{i}(x)=\beta_{0}\sum\limits_{i = 1}^{num}{v_{i}(x)}+\beta_{1}\sum\limits_{i = 1}^{num}{v_{i}(x)}^{2}+\beta_{2}\sum\limits_{i = 1}^{num}{h_{i}(x)*v_{i}(x)}-\beta_{3}\sum\limits_{i = 1}^{num}{s_{i}(x)*v_{i}(x)}\\ \end{array} $$
(23)
$$\begin{array}{@{}rcl@{}} \sum\limits_{i = 1}^{num} d_{i}(x)*h_{i}(x)=\beta_{0}\sum\limits_{i = 1}^{num}{h_{i}(x)}+\beta_{1}\sum\limits_{i = 1}^{num}{v_{i}(x)*h_{i}(x)}+\beta_{2}\sum\limits_{i = 1}^{num}{h_{i}(x)}^{2}-\beta_{3}\sum\limits_{i = 1}^{num}{s_{i}(x)*h_{i}(x)}\\ \end{array} $$
(24)
$$\begin{array}{@{}rcl@{}} \sum\limits_{i = 1}^{num} d_{i}(x)*s_{i}(x)=\beta_{0}\sum\limits_{i = 1}^{num}{s_{i}(x)}+\beta_{1}\sum\limits_{i = 1}^{num}{v_{i}(x)*s_{i}(x)}+\beta_{2}\sum\limits_{i = 1}^{num}{h_{i}(x)*s_{i}(x)}-\beta_{3}\sum\limits_{i = 1}^{num}{s_{i}(x)}^{2}\\ \end{array} $$
(25)

Equations (22), (23), (24), and (25) are the intermediate (normal) equations of the regression analysis and are used to obtain the values of \(\beta_{0},\beta_{1},\beta_{2}\) and \(\beta_{3}\).

The value of α is required to obtain \(c_{1},c_{2},c_{3}\) and \(c_{4}\). As explained earlier, α normalizes the scene depth \(d(x)\) to the closed interval [0, 1]. Therefore α is defined as follows:

$$\alpha= \left\{\begin{array}{lllllllll} 1 \qquad if \quad max(d(x))<= 1\\ d_{max} \qquad otherwise \end{array}\right. $$

where \(d_{max}\) is the maximum possible value of the scene depth d. From (21), the scene depth d is taken to be maximal when all \(\beta_{i} = 1\), \(v_{i}(x) = 1\), \(h_{i}(x) = 1\) and \(s_{i}(x) = 1\); thus \(d_{max} = 2\). The values of \(c_{1},c_{2},c_{3}\) and \(c_{4}\) can then be calculated using the value of α.
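Since the normal equations (22)-(25) are exactly the OLS conditions for (21), the coefficients can be obtained with any standard least-squares solver. The following is a minimal sketch under the paper's sign convention; the function name and data layout are assumptions made for illustration.

```python
import numpy as np

def fit_linear_coefficients(v, h, s, d):
    """Solve the OLS problem of Eq. (21), i.e. Eqs. (22)-(25), for beta0..beta3.

    v, h, s, d : 1-D arrays of brightness, hue, saturation and (random) depth
    collected over all training samples. Saturation enters with a minus sign,
    as in Eq. (21), so beta3 is returned under the same convention.
    """
    X = np.column_stack([np.ones_like(v), v, h, -s])   # design matrix of Eq. (21)
    beta, *_ = np.linalg.lstsq(X, d, rcond=None)       # minimizes the squared error s
    return beta                                        # (beta0, beta1, beta2, beta3)
```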

5.2 Generation of ground truth and data collection for regression analysis

It is hard to obtain ground-truth scene depth due to environmental constraints: an object at some depth in a scene at one time may be at a different depth in the same scene at another time. Therefore a set of 200 clear-day images of outdoor scenes containing trees, mountains, animals, river sites, etc., was downloaded from the internet to prepare the sample space for computing the linear coefficients.

Figure 5 shows the process of obtaining the values of all variables (\({\sum }_{i = 1}^{num}{d_{i}(x)}\), \({\sum }_{i = 1}^{num}{v_{i}(x)}\), \({\sum }_{i = 1}^{num}{h_{i}(x)}\), \({\sum }_{i = 1}^{num}{s_{i}(x)}\), etc.) involved in (22), (23), (24), and (25) for the computation of the linear coefficients. A random depth map \(d_{i}(x)\) is generated for each fog-free image \(J_{i}(x)\) from a Gaussian distribution with zero mean and 0.5 standard deviation, which provides proper diversity in the depth map. The value of the airlight A is drawn from a standard uniform distribution. A foggy image \(I_{i}(x)\) corresponding to each \(J_{i}(x)\) is synthesized using (4), and the hue \(h_{i}(x)\), saturation \(s_{i}(x)\), and brightness \(v_{i}(x)\) components of each \(I_{i}(x)\) are extracted. This sample space is used to fit (21) by the OLS method of regression analysis through (22), (23), (24), and (25), yielding the linear coefficients \(c_{1} = 0.0122\), \(c_{2} = 0.9592\), \(c_{3} = 0.9839\), and \(c_{4} = 0.7743\). The values of the linear coefficients can, however, vary with the sample space used to compute them.

Fig. 5 Process of obtaining the values of all the variables involved in (22), (23), (24) and (25)

It is observed that on average the value of \(c_{3}\) is about 0.9, which shows that hue plays an important role in depth estimation. Table 2 presents an analysis supporting this fact and lists the ssim obtained at various values of \(c_{3}\) for different images: the higher the value of \(c_{3}\), the better the ssim.

Table 2 ssim at different values of \(c_{3}\)

6 Properties of the proposed linear model and restoration of scene radiance

This section describes the properties of the proposed model and the process of recovering the scene radiance.

6.1 Edge preserving property of the proposed linear model

The improved linear depth model preserves existing edges and restores degraded edges. From (15), the gradient of \(d(x)\) can be expressed as [16]:

$$ {\Delta} d= c_{2} {\Delta} v(x)+c_{3} {\Delta} h(x)-c_{4} {\Delta} s(x)+{\Delta} \epsilon $$
(26)

Regression analysis assumes that \({\Delta}\epsilon = 0\); therefore

$$ {\Delta} d= c_{2} {\Delta} v(x)+c_{3} {\Delta} h(x)-c_{4} {\Delta} s(x) $$
(27)

Equation (27) shows that \(d(x)\) depends on the brightness, saturation, and hue components: \(d(x)\) has an edge only if there is an edge in h, s, v, or all of them. In Fig. 6, (a) is the degraded foggy image, (b) the ground truth, (c) the edge map of the degraded image, (d) the edge map of the ground truth, (e) the depth map produced by [46], (f) the edge map of (e), (g) the depth map produced by the proposed method, and (h) the edge map of (g). It can be observed from Fig. 6f and h that the proposed depth model preserves the edges more accurately.

Fig. 6 a Degraded image, b ground truth, c edge map of degraded image, d edge map of ground truth, e depth map using [46], f edge map of (e) with psnr = 62.85, g depth map of the proposed method, h edge map of (g) with psnr = 63.71

The peak signal-to-noise ratio (psnr) of each edge map is calculated with respect to the edge map of the ground-truth image, and the psnr values show that the edge map of the proposed model is closest to the ground truth. This confirms that the proposed depth model is more accurate, owing to the dependence of its edges on \(v(x)\), \(s(x)\), and \(h(x)\).

6.2 Handling of white regions

The degraded scene may contain white regions due to (1) the presence of truly white objects or (2) the effect of airlight on distant objects, so it is difficult to distinguish real white objects from distant ones. The proposed model in (15) estimates depth accurately in general, but it fails for white objects, since their brightness is very high, their saturation very low, and their hue moderate, so they are wrongly assigned a large depth.

To solve the white-region problem, [5, 46] assume that neighboring pixels lie at the same depth and use a minimum filter to refine the estimated depth. However, the minimum filter loses existing edges, as shown in Fig. 6f. A median filter, by contrast, not only handles white regions but also preserves existing edges. The proposed method therefore uses a median filter to refine the depth map without losing edges or disturbing the estimated depth of white objects. The refined depth map can be expressed as:

$$ d_{r}(x)=med_{y \in \omega_{r}(x)}\,{d(y)} $$
(28)

where \(d_{r}(x)\) is the refined depth map, \(\omega_{r}(x)\) is a patch of size r × r centered at x, and \(d(y)\) is the estimated depth at location y. The patch-based operation introduces blocking effects, so a guided image filter is used to smooth the result [6].
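A median-based refinement of (28) can be sketched in a single call; the guided-filter smoothing step is omitted here, and the patch size is left as a parameter (15 × 15 in the paper, as discussed below).

```python
from scipy.ndimage import median_filter

def refine_depth(d, r=15):
    """Refine the raw depth map of Eq. (15) with a local median, Eq. (28).

    d : estimated depth map; r : side of the square patch omega_r(x).
    The guided image filter [6] used afterwards for smoothing is not shown.
    """
    return median_filter(d, size=r)
```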

The estimation of any property is influenced by its neighborhood [23]. In real scenes, a pixel and its local neighborhood share the same depth, so the proposed method assumes that pixels are locally at the same depth. The strength of this assumption depends on the size of the local neighborhood, represented by the patch \(\omega_{r}(x)\): a small patch size results in overestimation of depth, while a large patch size loses edges. Selection of a moderate patch size is therefore required for the assumption to be effective. A few foggy images, their depth maps estimated with varying patch sizes, and the corresponding depth profiles are shown in Fig. 7.

Fig. 7 Estimated depth maps \(d(x)\) using varying patch sizes and their profiles with respect to distance from the camera. a Original foggy image, b \(d(x)\) using patch size 3 × 3, c \(d(x)\) using patch size 7 × 7, d \(d(x)\) using patch size 15 × 15, e \(d(x)\) using patch size 30 × 30

The depth maps \(d(x)\) in Fig. 7b to e show that the estimated depth of pixels close to the camera is zero and that the estimated depth increases with distance from the camera, which validates the assumption. Increasing the patch size, however, smooths the depth maps. Figure 7b and c show the depth maps obtained with patch sizes 3 × 3 and 7 × 7, respectively, together with their profiles; these depth maps are more detailed, but the depth is overestimated at long distances, as shown in the profiles.

Increasing the patch size solves the overestimation problem, as shown by the profiles in Fig. 7d and e, but it also smooths the depth further, which may result in loss of edges. Selection of the patch size is therefore important; a patch size of 15 × 15 has proved to be effective [5].

6.3 Restoration of scene radiance \(I_{r}(x)\)

If the global atmospheric light A and the transmission \(tr(x)\) are known, then the scene radiance \(I_{r}(x)\) can be restored using (5).

The proposed method estimates the atmospheric light \(A^{c}\) of each color channel \(c \in \{R,G,B\}\) using the method of [5].

$$ A=\min_{c \in \{R,G,B\}}(A^{c}) $$
(29)

where min is the minimum function, c is the color channel, and \(A^{c}\) is the atmospheric light of channel c. The global atmospheric light A is calculated using (29).

The scene depth \(d(x)\) is recovered using (15) and refined using (28). The transmission \(tr(x)\) is then recovered as \(tr(x) = e^{-\beta d(x)}\) and bounded according to (7).

$$ {I_{r}^{c}}(x)=\frac{{I_{d}^{c}}(x)-A}{ \max{ \{ e^{-\beta},tr(x) \} }}+A $$
(30)

where \({I_{r}^{c}}(x)\) is the restored radiance of the scene point in color channel c, \({I_{d}^{c}}(x)\) is the c color channel of the input foggy image, A is the atmospheric light, and \(tr(x)\) is the transmission. Equation (30) restores the radiance of each color channel, and the channels are combined to recover the clear-day image \(I_{r}(x)\). The lower bound \(e^{-\beta}\) in the denominator of (30) ensures that the denominator never vanishes; thus the proposed model produces quality results.

The value of β plays an important role in scene estimation. The proposed method assumes homogeneous atmospheric scattering, so β is constant; the proposed model takes β = 1.
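Putting the pieces together, the restoration of (29)-(30) might look as follows. The per-channel airlight is approximated here by the channel maximum, which is a simplification of the estimator of [5]; β = 1 as stated above.

```python
import numpy as np

def restore_radiance(foggy_rgb, depth, beta=1.0):
    """Recover the scene radiance I_r(x) via Eqs. (29) and (30).

    foggy_rgb : H x W x 3 array in [0, 1]; depth : refined depth map in [0, 1].
    A^c is approximated by the per-channel maximum (an assumption; the paper
    uses the estimator of [5]); A is the minimum over channels, Eq. (29).
    """
    A = foggy_rgb.reshape(-1, 3).max(axis=0).min()        # Eq. (29)
    tr = np.exp(-beta * depth)
    tr = np.maximum(tr, np.exp(-beta))[..., None]         # lower bound in Eq. (30)
    restored = (foggy_rgb - A) / tr + A                   # Eq. (30), per channel
    return np.clip(restored, 0.0, 1.0)
```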

7 Experimental analysis

The proposed method is implemented in Matlab R2014a on an Intel Core(TM) i7-4790 @ 3.60 GHz and tested on a large data set, described below.

7.1 Data set

The data set is divided into two categories.

  • Data set A: the Waterloo IVC Dehazed Image Database [14], available online, contains 25 foggy images of outdoor scenes and static indoor objects. The 22 outdoor images are real-world images degraded by fog to different extents (heterogeneous fog), and 3 images are degraded by homogeneous fog. All 25 degraded images were restored using [5, 11, 15, 30, 31, 38], producing 6 different dehazed versions of each foggy image.

  • Data set B: because atmospheric conditions vary, it is not possible to capture real ground truth for natural foggy images; therefore the Frida2 data set [32], which consists of synthetic foggy images and their ground truth, is used.

Qualitative and quantitative analyses are performed to measure the efficiency and validity of the proposed method.

7.2 Qualitative evaluation

The qualitative analysis is performed on Data set A, and the visual results are compared with renowned existing methods [5, 15, 30, 31, 46]. The qualitative comparison is shown in Fig. 8 and in Figs. 9, 10, 11, 12, 13 and 14.

Fig. 8 Qualitative comparison of various state-of-the-art methods on Data set A. a Foggy image. b [5] c [31] d [15] e [30] f [46] g proposed method

Fig. 9 Comparison of histograms. a Foggy image. b [31] c [5] d [30] e [46] f proposed method

Fig. 10 Comparison of histograms. a Foggy image. b [31] c [5] d [30] e [46] f proposed method

Fig. 11 Comparison of histograms. a Foggy image. b [31] c [5] d [30] e [46] f proposed method

Fig. 12 Comparison of histograms. a Foggy image. b [31] c [5] d [30] e [46] f proposed method

Fig. 13 Comparison of histograms. a Foggy image. b [31] c [5] d [30] e [46] f proposed method

Fig. 14 Comparison of histograms. a Foggy image. b [31] c [5] d [30] e [46] f proposed method

7.3 Quantitative evaluation

The purpose of a fog removal algorithm is to enhance visibility and recover edges, while also preserving the structure and colors of the image. The evaluation parameters must therefore be based on the visibility of edges, texture, entropy, and the structure of the image. Quantitative comparison can generally be performed in two ways: (1) non-reference based and (2) reference based [36, 43]. The proposed method is validated using both non-reference-based and reference-based parameters.

Among the non-reference-based parameters, those dedicated to measuring the visibility of the restored image are used to evaluate the proposed method. Two parameters, e and r, are computed, which measure the ability to recover degraded edges and to preserve existing edges.

The value of e evaluates the ability of the method to restore edges that were not visible in the degraded image, while r measures the average visibility gain [43]. Higher values of e and r indicate better results.

Color distortion is another indicator of the effectiveness of a restoration method, so the color retention degree h is used to measure the color distortion introduced by the method. It is obtained by comparing the histograms of the degraded input image and the restored image: the lower the value of h, the more similar the histograms are in shape and the stronger the color retention [45].

Reference-based parameters [8] require the degraded foggy image and the corresponding real clear-weather image. In real scenarios it is impossible to obtain a clear-weather image matching a foggy image; therefore Data set B is used to evaluate the reference-based parameters.

As reference-based parameters, psnr and ssim are calculated. psnr reflects human perception of the quality of the restored image, with a higher psnr indicating better restoration quality. The psnr of two images is calculated by the following formula:

$$ psnr= 10 \log_{10}{\frac{I_{max}^{2}}{MSE}} $$
(31)

where \(I_{max}\) is the maximum possible intensity of the images and MSE is the mean squared error.
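For completeness, (31) can be computed directly; the base-10 logarithm and per-pixel mean squared error used below are the usual conventions and are assumed here.

```python
import numpy as np

def psnr(restored, reference, i_max=1.0):
    """Compute psnr of Eq. (31) for images in [0, 1] (so i_max = 1)."""
    mse = np.mean((restored - reference) ** 2)   # mean squared error over all pixels
    return 10.0 * np.log10(i_max ** 2 / mse)
```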

The ssim compares the luminance, contrast, and structure of two images and is calculated with the formula given in (16).

Figure 8 shows the subjective comparison of results. It can be observed from Fig. 8b that the method of [5] produces halo artifacts near depth discontinuities, and in images containing sky the colors of the sky region are distorted due to inaccurate transmission estimation. Figure 8c shows the results of [31]: the transmission is overestimated, which darkens the image. In the results of [15], the colors are distorted. The results of [30] are better, but the method fails in dense fog. The results of [46] are better than those of all the other methods; however, in the image of the goose the visibility is poor due to wrong depth estimation. The results produced by the proposed method are promising: it produces quality results for both sky and non-sky images, and it can be observed from Fig. 8g that the color of the sky region in the images of the dog and the mountain is more natural.

Furthermore, it can be observed from Figs. 12f and 13f that the proposed method obtains better visibility in the restored images than the other methods.

Figures 9 to 14 compare the histograms of real-world natural images captured under varied weather conditions with those of the images restored by [5, 30, 31, 46] and by the proposed method. The histograms of the restored images in Figs. 9 to 14 show that the proposed method restores the degraded colors without color distortion.

Table 3 lists the values of the h parameter for the histograms of the images shown in Figs. 9 to 14. The value of h for the proposed method is lower than for the other methods, which shows that the colors of the results produced by the proposed method are retained.

Table 3 Comparison of values of h of the images shown in Figs. 9 to 14

Objective evaluation of the proposed method on the images of Figs. 9 to 14 is performed using the e and r parameters; the details are given in Tables 4 and 5. The average values of e and r obtained by the proposed method surpass those of [5, 15, 30, 31, 46].

Table 4 Comparison of values of e of the images shown in Figs. 9 to 14
Table 5 Comparison of values of r of the images shown in Figs. 9 to 14

Furthermore, the proposed method is evaluated on the reference-based parameters psnr and ssim; the average values are shown in Table 6. The psnr of [5] is very low and [46] performs better than [5], but the proposed method surpasses both. The ssim of [5] and [46] is almost the same, whereas the proposed method achieves a much better ssim owing to its edge-preserving property and accurate depth estimation.

Table 6 Reference based Quantitative Comparison on Data set B

8 Conclusion

This paper has proposed an improved linear depth model based on the statistics of the hue, saturation, and brightness of a scene point. The proposed model is more accurate than the existing model and preserves edges well, owing to the inclusion of hue in the model and the use of a median filter. The median filter not only preserves edges but also tackles the problem of white regions. The validity of the proposed method is demonstrated through qualitative and objective analysis.

The improved values of e, r, and h show that the proposed method preserves edges more accurately, and the improved values of psnr and ssim show that the results produced by the proposed method agree well with the human visual system.

However, the proposed method assumes homogeneous scattering, so the atmospheric light A and the scattering coefficient β are treated as constants. Thin foggy conditions produce varying scattering effects in different regions of the image; region-wise estimation of the scattering coefficient and pixel-wise estimation of the transmission could therefore handle heterogeneous scattering. In future work, we plan to address these issues with a new depth model for heterogeneous scattering.