1 Introduction

Haze is caused by suspended particles or water droplets in the atmosphere. The dry particles are so small that they cannot be felt or seen individually with the naked eye, but in aggregate they reduce visibility and give the atmosphere an opalescent appearance. Haze can significantly degrade the imaging quality of outdoor visible-light sensors because light is scattered, refracted, and absorbed by these particles or water droplets as it travels through the atmosphere [1]. Image dehazing is an important issue in many scene understanding applications such as surveillance systems, intelligent vehicles and feature extraction. However, image dehazing remains a challenge because the scene depth is unknown.

Significant progress has been made on single image dehazing in recent years, even though it is an ill-posed problem [2]. Several approaches [3–6] were based on a single image, yet they required geometric information about the input scene. Tan [7] removed the haze by maximizing the local contrast of the restored image. The results were visually compelling but tended to be over-saturated and were not necessarily physically valid. Fattal [8] estimated the albedo of the scene and the medium transmission under the assumption that the transmission and surface shading are locally uncorrelated. This approach could not handle heavily hazy images well. He [9] proposed the dark channel prior (DCP) method, which estimates the transmission based on the observation that a haze-free pixel generally has at least one RGB color channel that is black or nearly black. Many improved DCP-based algorithms [10, 11] have since been proposed. However, the transmission estimated by many methods under the DCP framework is not smooth and lacks information from neighboring transmissions. He [12] refined it with soft matting, which is quite time-consuming. He [13] further proposed guided image filtering to refine the transmission.

Kernel regression model (KRM) methods have been developed in statistics to estimate the conditional expectation of a random variable without assumptions about its probability distribution function [14]. These methods are well documented and summarized in the literature [15]. KRM methods have been widely used in image processing and pattern recognition, for example in medical image processing [16], image annotation [17], image saliency detection [18] and feature extraction [19]. In this paper, we extend KRM with local neighbor information to remove the block effect in the transmissions estimated under the DCP framework.

The outline of this paper is as follows. Section 2 introduces the DCP. Section 3 describes our method to refine the transmissions. Experiments and results are given in Sect. 4. Finally, some conclusions are drawn in Sect. 5.

2 Dark channel prior model

In computer vision and computer graphics, the atmospheric light model [20] widely used to describe the formation of a hazy image is

$$\begin{aligned} I(x)=J(x)t(x)+A(1-t(x)) \end{aligned}$$
(1)

where I(x) is the hazy image, J(x) is the scene radiance, A is the global atmospheric light, and \(t(x)(0\le t(x)\le 1)\) is the scene transmission.
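As a minimal sketch of Eq. (1), the following Python snippet applies the model to a tiny image and then inverts it when A and t are known. It uses plain nested lists to stay dependency-free. The function names, the single global transmission value and the lower clamp `t_min` are assumptions made for this illustration and are not part of the method described in this paper.

```python
import math

def add_haze(J, A, t):
    """Apply Eq. (1) per pixel: I = J*t + A*(1 - t).

    J: list of rows, each pixel an (r, g, b) tuple with values in [0, 1].
    A: global atmospheric light as an (r, g, b) tuple.
    t: one transmission value shared by all pixels (assumption of this sketch).
    """
    return [[tuple(Jc * t + Ac * (1.0 - t) for Jc, Ac in zip(px, A))
             for px in row] for row in J]

def remove_haze(I, A, t, t_min=0.1):
    """Invert Eq. (1): J = (I - A) / t + A.

    t_min is a hypothetical lower clamp that avoids division by a tiny t.
    """
    t = max(t, t_min)
    return [[tuple((Ic - Ac) / t + Ac for Ic, Ac in zip(px, A))
             for px in row] for row in I]

J = [[(0.2, 0.4, 0.6), (0.0, 0.5, 1.0)]]   # toy haze-free image, 1 x 2 pixels
A = (0.8, 0.8, 0.9)
t = math.exp(-1)
I = add_haze(J, A, t)          # hazy image
J_rec = remove_haze(I, A, t)   # recovers J up to rounding error
```

In practice t(x) varies per pixel and is unknown, which is why the rest of the pipeline is needed to estimate it.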

He [9] proposed the DCP for single image dehazing, in which the prior comes from the observation that most non-sky patches in outdoor haze-free images have at least one color channel with some low-intensity pixels. For an arbitrary image J(x), its dark channel is given by

$$\begin{aligned} J^\mathrm{{dark}}(x)=\mathop {\min }\limits _{c\in \mathrm {(r,g,b)}} \left( \mathop {\min }\limits _{y\in \Omega (x)} J^{c}(y)\right) \end{aligned}$$
(2)

where \(J^{c}\) is a color channel of J(x), and \(\Omega (x)\) is a local window patch centered at pixel x. The dark channel is the outcome of two minimum operators: \(\mathop {\min }\limits _{c\in \mathrm {(r,g,b)}} \) is performed on each pixel across the RGB color channels, and \(\mathop {\min }\limits _{y\in \Omega (x)} \) is a minimum filter over the local patch. If J(x) is an outdoor haze-free image, then the intensity of its dark channel in non-sky regions is very low and tends to zero: \(J^\mathrm{{dark}}(x)\rightarrow 0\).
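The two minimum operators of Eq. (2) can be sketched as follows. This is an illustrative pure-Python version on nested lists, not the implementation used in the experiments. The function name, the `radius` parameter (half-width of the square window) and the clipping of the window at the image border are assumptions.

```python
def dark_channel(img, radius=1):
    """Eq. (2): minimum over color channels, then minimum over a square window.

    img: list of rows of (r, g, b) tuples.
    radius: half-width of the window Omega(x); the window is clipped at the
    image border (an assumption of this sketch).
    """
    h, w = len(img), len(img[0])
    # Step 1: per-pixel minimum across the three color channels.
    min_rgb = [[min(px) for px in row] for row in img]
    # Step 2: minimum filter over the (2*radius + 1)^2 neighborhood.
    dark = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            rows = range(max(0, i - radius), min(h, i + radius + 1))
            cols = range(max(0, j - radius), min(w, j + radius + 1))
            dark[i][j] = min(min_rgb[y][x] for y in rows for x in cols)
    return dark

# A bright 3 x 3 image with one pixel that has a dark green channel.
img = [[(0.9, 0.9, 0.9)] * 3 for _ in range(3)]
img[1][1] = (0.9, 0.05, 0.8)
dark = dark_channel(img, radius=1)  # every window contains the center pixel
```

With `radius=1` every window in this toy image contains the center pixel, so the whole dark channel equals 0.05. This also shows how a single dark pixel spreads over a full patch.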

He [9] assumed that the atmospheric light A is a known constant, which is estimated as follows. First, the top 0.1 percent brightest pixels in the dark channel are picked. Then, among these pixels, the one with the highest intensity in the input image I is selected as the atmospheric light.
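A possible reading of this selection rule is sketched below. The function name, the `fraction` parameter, the choice to keep at least one candidate and the use of the channel sum as "intensity" are assumptions of this sketch. The dark channel is passed in as a precomputed 2-D list.

```python
def estimate_atmospheric_light(img, dark, fraction=0.001):
    """Pick A from the brightest dark-channel pixels.

    img: list of rows of (r, g, b) tuples (the hazy input image I).
    dark: 2-D list with the dark channel of img.
    fraction: share of pixels kept as candidates (0.001 = top 0.1 percent).
    """
    h, w = len(img), len(img[0])
    coords = [(i, j) for i in range(h) for j in range(w)]
    # Keep the brightest `fraction` of dark-channel pixels (at least one).
    n_keep = max(1, int(len(coords) * fraction))
    candidates = sorted(coords, key=lambda p: dark[p[0]][p[1]], reverse=True)[:n_keep]
    # Among the candidates, take the pixel with the highest summed intensity in I.
    i, j = max(candidates, key=lambda p: sum(img[p[0]][p[1]]))
    return img[i][j]

img = [[(0.9, 0.9, 0.95), (0.2, 0.3, 0.1)],
       [(0.8, 0.85, 0.8), (0.5, 0.5, 0.5)]]
dark = [[0.9, 0.1],
        [0.8, 0.5]]
A = estimate_atmospheric_light(img, dark, fraction=0.5)  # two candidates
```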

According to Eq. (1), the hazy image can be normalized by A in each color channel c:

$$\begin{aligned} \frac{I^{c}(x)}{A^{c}}=t(x)\frac{J^{c}(x)}{A^{c}}+1-t(x) \end{aligned}$$
(3)

He [9] assumed that the transmission in a local patch \(\Omega (x)\) is constant, denoted by \(\tilde{t}(x)\). Taking the minimum over the color channels and over the patch on both sides of Eq. (3) gives

$$\begin{aligned} \mathop {\min }\limits _{y\in \Omega (x)} \left( \mathop {\min }\limits _c \frac{I^{c}(y)}{A^{c}}\right) =\tilde{t}(x)\mathop {\min }\limits _{y\in \Omega (x)} \left( \mathop {\min }\limits _c \frac{J^{c}(y)}{A^{c}}\right) +1-\tilde{t}(x) \end{aligned}$$
(4)

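Because the dark channel of the haze-free image J tends to zero and \(A^{c}\) is positive, the first term on the right-hand side of Eq. (4) vanishes, and the transmission can be estimated by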

$$\begin{aligned} \hat{{t}}(x)=1-\mathop {\min }\limits _{y\in \Omega (x)} \left( \mathop {\min }\limits _c \frac{I^{c}(y)}{A^{c}}\right) \end{aligned}$$
(5)

Because the transmission is assumed to be constant within each local patch, the estimated transmission map may exhibit block effects.
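The estimate of Eq. (5) and the resulting block effect can be illustrated with a short sketch. The function name, the `radius` parameter and the border clipping are assumptions. The input is a one-row toy image so that the blockiness is easy to see.

```python
def estimate_transmission(img, A, radius=1):
    """Eq. (5): t_hat(x) = 1 - min over Omega(x) of min over c of I^c(y) / A^c.

    img: list of rows of (r, g, b) tuples (hazy image I).
    A: atmospheric light as an (r, g, b) tuple with positive entries.
    radius: half-width of the window, clipped at the border (assumption).
    """
    h, w = len(img), len(img[0])
    # Normalize each channel by A, then take the minimum over the channels.
    norm_min = [[min(c / a for c, a in zip(px, A)) for px in row] for row in img]
    t_hat = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            rows = range(max(0, i - radius), min(h, i + radius + 1))
            cols = range(max(0, j - radius), min(w, j + radius + 1))
            t_hat[i][j] = 1.0 - min(norm_min[y][x] for y in rows for x in cols)
    return t_hat

A = (0.8, 0.8, 0.9)
black = (0.0, 0.0, 0.0)
row = [A, A, black, A, A]            # pure airlight except one haze-free black pixel
t_hat = estimate_transmission([row], A, radius=1)
print(t_hat)  # [[0.0, 1.0, 1.0, 1.0, 0.0]]
```

The single black pixel raises the estimated transmission of its whole window to 1, which is the kind of blocky artifact that a later refinement step has to smooth out.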

3 Our method

3.1 Kernel regression model

To remove the block effect in the recovered image, we use KRM to smooth the transmission at each pixel with the transmissions of its local neighbors. KRM is a nonparametric approach for estimating the conditional expectation of a random variable:

$$\begin{aligned} E(z|x)=f(x) \end{aligned}$$
(6)

where z and x are the random variables and f(.) is a non-parametric function. The objective is to find a non-linear relation between a pair of random variables z and x. Assume that the model estimation has the following form:

$$\begin{aligned} \hat{{f}}(x)=z+\varepsilon \end{aligned}$$
(7)

where \(\varepsilon \) is an independent noise with zero mean. If n pairs of input and output observations \((x_i ,z_i )\) are available, the Nadaraya-Watson kernel regression estimator [21] of \(\hat{{f}}(x)\) for a given input observation is defined as follows:

$$\begin{aligned} \hat{{f}}(x)=\frac{\mathop {\sum }\nolimits _{i=1}^n {K_h (x-x_i )z_i } }{\mathop {\sum }\nolimits _{i=1}^n {K_h (x-x_i )} } \end{aligned}$$
(8)

where h is the bandwidth (smoothing parameter) and \(K_h (\cdot )\) is the kernel function, defined by

$$\begin{aligned} K_h (s)=\frac{1}{h}K \left( \frac{s}{h}\right) \end{aligned}$$
(9)

We select the Gaussian kernel function, \(K(u)=\frac{1}{\sqrt{2\pi }}\exp (-\frac{u^{2}}{2})\). The optimal bandwidth that minimizes the mean integrated squared error (MISE) [21] can be approximated by

$$\begin{aligned} h^{*}=\sigma \left( \frac{4}{3n}\right) ^{1/5} \end{aligned}$$
(10)

where \(\sigma \) is the standard deviation of the samples and n is the number of observations.
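For a one-dimensional input, Eqs. (8) to (10) can be sketched in a few lines of Python. The function names are hypothetical, and the sketch assumes that \(\sigma \) is the sample standard deviation of the inputs \(x_i\).

```python
import math
import statistics

def gaussian_kernel(u):
    """K(u) = exp(-u^2 / 2) / sqrt(2*pi)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def rule_of_thumb_bandwidth(xs):
    """Eq. (10): h* = sigma * (4 / (3n))^(1/5).

    sigma is taken as the sample standard deviation of xs (assumption).
    """
    return statistics.stdev(xs) * (4.0 / (3.0 * len(xs))) ** 0.2

def nadaraya_watson(x, xs, zs, h):
    """Eq. (8) with the scaled kernel K_h(s) = K(s / h) / h from Eq. (9)."""
    weights = [gaussian_kernel((x - xi) / h) / h for xi in xs]
    return sum(wt * zi for wt, zi in zip(weights, zs)) / sum(weights)

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
zs = [0.1, 0.9, 2.2, 2.8, 4.1]          # noisy observations of a roughly linear trend
h = rule_of_thumb_bandwidth(xs)
estimate = nadaraya_watson(2.0, xs, zs, h)  # weighted average of zs around x = 2
```

Because the estimator is a weighted average with non-negative weights, its value always lies between the smallest and largest observed \(z_i\). The factor 1/h cancels between numerator and denominator, so it does not affect the result.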

3.2 Image dehazing with KRM

Let \(t_{i,j} \) denote the transmission of the pixel in the ith row and jth column of image I, and let \(\hat{{t}}_{i,j} \) be its estimate from Eq. (5). The kernel regression estimate \(\hat{{f}}(t_{i,j} )\), computed from the local neighbors in a window of radius r, is defined by

$$\begin{aligned}&\hat{{f}}(t_{i,j} )\nonumber \\&\quad =\frac{\mathop {\sum }\limits _{k_1=-r}^r {\mathop {\sum }\limits _{k_2 =-r}^r {K_h ((i,j),(i+k_1 ,j+k_2 ))\hat{{t}}_{i+k_1 ,j+k_2 } } } }{\mathop {\sum }\limits _{k_1=-r}^r {\mathop {\sum }\limits _{k_2 =-r}^r {K_h ((i,j),(i+k_1 ,j+k_2 ))} } }\nonumber \\ \end{aligned}$$
(11)

The denominator of Eq. (11), labeled as \(f_{1} \), is computed by

$$\begin{aligned} f_1 (i,j)= & {} \mathop {\sum }\nolimits _{k_1=-r}^r {\mathop {\sum }\nolimits _{k_2 =-r}^r {K_h ((i,j),(i+k_1 ,j+k_2 ))} }\nonumber \\= & {} \mathop {\sum }\nolimits _{k_1=-r}^r {\mathop {\sum }\nolimits _{k_2 =-r}^r {\frac{1}{h^{2}}K\left( \frac{k_1 }{h}\right) K\left( \frac{k_2 }{h}\right) } } \end{aligned}$$
(12)

As shown in Eq. (12), the window contains \((2r+1)^{2}\) terms, so \((2r+1)^{2}\) multiplications and \((2r+1)^{2}-1\) additions are carried out to compute the denominator of Eq. (11). The numerator of Eq. (11), labeled as \(f_{2} \), is computed by

$$\begin{aligned} f_{2} (i,j)= & {} \mathop {\sum }\limits _{k_1 =-r}^r {\mathop {\sum }\limits _{k_2 =-r}^r {K_h ((i,j),(i+k_1 ,j+k_2 ))\hat{{t}}_{i+k_1 ,j+k_2 } } }\nonumber \\= & {} \mathop {\sum }\limits _{k_1 =-r}^r {\mathop {\sum }\limits _{k_2 =-r}^r {\frac{1}{h^{2}}K \left( \frac{k_1 }{h}\right) K \left( \frac{k_2 }{h}\right) \hat{{t}}_{i+k_1 ,j+k_2 } } } \end{aligned}$$
(13)
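Equations (11) to (13) amount to a normalized, separable Gaussian weighting of the neighboring transmission estimates, with the refined value given by the ratio \(f_{2}/f_{1}\). The following sketch shows one way to compute it. The function names are hypothetical, h is passed in as a parameter, and neighbors that fall outside the image are skipped, which is an assumption about border handling.

```python
import math

def gaussian_kernel(u):
    """K(u) = exp(-u^2 / 2) / sqrt(2*pi)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def krm_smooth(t_hat, r, h):
    """Refine a transmission map with Eq. (11): f2(i, j) / f1(i, j).

    t_hat: 2-D list of initial transmission estimates.
    r: window radius; h: kernel bandwidth.
    """
    rows, cols = len(t_hat), len(t_hat[0])
    # 1-D weights K(k/h)/h for k = -r..r; the 2-D weight is their product.
    w1d = {k: gaussian_kernel(k / h) / h for k in range(-r, r + 1)}
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            f1 = 0.0  # denominator, Eq. (12)
            f2 = 0.0  # numerator, Eq. (13)
            for k1 in range(-r, r + 1):
                for k2 in range(-r, r + 1):
                    y, x = i + k1, j + k2
                    if 0 <= y < rows and 0 <= x < cols:  # skip out-of-image neighbors
                        wgt = w1d[k1] * w1d[k2]
                        f1 += wgt
                        f2 += wgt * t_hat[y][x]
            out[i][j] = f2 / f1
    return out

# A blocky map with a hard step between two patches.
blocky = [[0.0, 0.0, 1.0, 1.0],
          [0.0, 0.0, 1.0, 1.0]]
smooth = krm_smooth(blocky, r=1, h=1.0)  # the step becomes a gradual ramp
```

Away from the border, \(f_{1}\) does not depend on (i, j), so it could be computed once and reused. The sketch recomputes it per pixel only to keep the border case simple.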
Fig. 1 Synthetic image and hazy images: a Dolls, b hazy Dolls with \(t=e^{-1}\), c hazy Dolls with \(t=e^{-2}\)

Fig. 2 Dehazed results of Fig. 1b by different methods: a Fattal’s method, b He’s method, c Our method

Fig. 3 Transmissions for Fig. 1b estimated by: a Fattal’s method, b He’s method, c Our method

4 Experiments and results

We implemented the proposed algorithm in MATLAB R2014a on a Windows 7 PC with an Intel(R) Core(TM) i5 CPU at 2.67 GHz. We compared our algorithm with two methods: He’s method [13] and Fattal’s method [8]. To compare the results of these methods quantitatively, we computed the mean squared error (MSE) [22] and the structural similarity (SSIM) index [23]. The MSE is defined by

$$\begin{aligned} \mathrm{{MSE}}(I,J)=\frac{1}{3MN}\sum _{i=1}^M {\sum _{j=1}^N {\sum _{c=1}^3 {(I(i,j,c)-J(i,j,c))^{2}} } } \end{aligned}$$
(14)

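where I is the haze-free image, J is the dehazed image, and \(M\times N\) is the image size. A lower MSE indicates better performance. SSIM [23] is one of the most commonly used measures for assessing the visual quality of images. It is given by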

$$\begin{aligned} \mathrm{{SSIM}}=\frac{1}{MN}\sum _{i=1}^M {\sum _{j=1}^N {\frac{(2\mu _{I,ij} \mu _{J,ij} +c_1 )(2\sigma _{IJ,ij} +c_2 )}{(\mu _{I,ij}^2 +\mu _{J,ij}^2 +c_1 )(\sigma _{I,ij}^2 +\sigma _{J,ij}^2 +c_2 )}} } \end{aligned}$$
(15)

where \(\mu _{I,ij}\) and \(\sigma _{I,ij}^2\) are the local mean and variance of the haze-free image I, computed on a block centered on pixel (i, j). The block size is \(3\times 3\) in this paper. \(\mu _{J,ij} \) and \(\sigma _{J,ij}^2 \) are their counterparts for the dehazed image J. \(\sigma _{IJ,ij} \) is the covariance between the haze-free and dehazed images on the same block, and it is given by

$$\begin{aligned} \sigma _{IJ,ij} =\frac{1}{n_1 -1}\sum _{k=1}^{n_1 } {(I_k -\mu _{I,ij} )(J_k -\mu _{J,ij} )} \end{aligned}$$
(16)

where \(n_1 \) is the number of pixels in the block, and \(I_k \) and \(J_k \) are the kth pixels of the block in I and J. The constants \(c_1 =0.01\) and \(c_{2} =0.03\) are chosen as recommended by Wang [23]. A higher SSIM indicates better performance.
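Both indexes can be sketched in pure Python as follows. The function names are hypothetical. The SSIM sketch works on a single color channel, clips the block at the image border, and uses the standard form of Wang [23] with \(c_2\) in both the numerator and the denominator. These choices are assumptions, and the image must be large enough that every block holds at least two pixels.

```python
def mse(I, J):
    """Eq. (14): mean squared error over all pixels and the three color channels."""
    total = sum((a - b) ** 2
                for row_i, row_j in zip(I, J)
                for px_i, px_j in zip(row_i, row_j)
                for a, b in zip(px_i, px_j))
    return total / (3 * len(I) * len(I[0]))

def ssim_channel(I, J, radius=1, c1=0.01, c2=0.03):
    """Eq. (15) for one channel; I and J are 2-D lists of floats.

    radius=1 gives the 3 x 3 block; blocks are clipped at the border (assumption).
    """
    h, w = len(I), len(I[0])
    acc = 0.0
    for i in range(h):
        for j in range(w):
            idx = [(y, x)
                   for y in range(max(0, i - radius), min(h, i + radius + 1))
                   for x in range(max(0, j - radius), min(w, j + radius + 1))]
            n1 = len(idx)
            mu_i = sum(I[y][x] for y, x in idx) / n1
            mu_j = sum(J[y][x] for y, x in idx) / n1
            # Unbiased local variances and covariance, as in Eq. (16).
            var_i = sum((I[y][x] - mu_i) ** 2 for y, x in idx) / (n1 - 1)
            var_j = sum((J[y][x] - mu_j) ** 2 for y, x in idx) / (n1 - 1)
            cov = sum((I[y][x] - mu_i) * (J[y][x] - mu_j) for y, x in idx) / (n1 - 1)
            acc += ((2 * mu_i * mu_j + c1) * (2 * cov + c2)) / (
                (mu_i ** 2 + mu_j ** 2 + c1) * (var_i + var_j + c2))
    return acc / (h * w)

X = [[0.1, 0.5], [0.9, 0.3]]
Y = [[0.9, 0.5], [0.1, 0.7]]      # Y = 1 - X, so the local structure is inverted
same = ssim_channel(X, X)         # identical images give a value of 1 (up to rounding)
inverted = ssim_channel(X, Y)     # inverted structure gives a much lower value
```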

4.1 Synthetic images with ground-truth images

For quantitative evaluation on complete images, we synthesized hazy images from stereo images with known atmospheric light and transmission. We set the atmospheric light \(A=[0.8,0.8,0.9]\). The hazy images were generated according to Eq. (1). The Dolls image is shown in Fig. 1a. The hazy images of Dolls with transmission \(t=e^{-1}\) and \(t=e^{-2}\) are shown in Fig. 1b and c, respectively.

The dehazed results for Fig. 1b by Fattal’s method [8], He’s method [13] and our method are shown in Fig. 2a, b and c, respectively. Figure 2 shows that Fattal’s method [8] loses or alters much of the color and texture information, and that our method gives better results than both He’s method and Fattal’s method. The transmissions for Fig. 1b estimated by the different methods are shown in Fig. 3. The transmissions estimated by Fattal’s method in Fig. 3a are not smooth, whereas the transmissions estimated by our method in Fig. 3c are smoother than those of the other two methods. Therefore, our method has the best results. The average MSE and SSIM for the R (red), G (green) and B (blue) channels of 10 synthesized images by the 3 methods are shown in Table 1. According to Table 1, the SSIM values of our method for the R, G and B channels are better than those of Fattal’s method and He’s method, and the MSE of our method is the smallest.

Table 1 Average MSE and SSIM of 10 synthesized images by 3 methods

Fig. 4 Hazy images, ground-truth images and dehazed results: a hazy images, b Fattal’s method, c He’s method, d Our method, e ground-truth image

Fig. 5 Ground-truth and estimated transmissions for Fig. 4a: a ground-truth, b Fattal’s method, c He’s method, d Our method

4.2 Real images

(1) Images with ground-truth images

It is difficult to acquire pairs of hazy images and their ground-truth images. We used 5 pairs of haze-free and hazy images from [24]. Some hazy images, ground-truth images and dehazed results by the 3 methods are shown in Fig. 4. The results of Fattal’s method [8] in Fig. 4b are the worst. The results of He’s method [13] in Fig. 4c have a blue color bias, especially in the sky regions. Compared with the ground-truth images in Fig. 4e, the results of our method in Fig. 4d are the best among the 3 methods. The transmissions for Fig. 4a estimated by the 3 methods are shown in Fig. 5. Compared with the ground-truth transmission in Fig. 5a, the transmission of Fattal’s method in Fig. 5b loses much information, the transmissions of He’s method and our method are smooth, and the transmission of our method retains the most information. The average MSE and SSIM for the R, G and B channels of the 5 images by the 3 methods are shown in Table 2. According to Table 2, the MSE and SSIM of our method are better than those of Fattal’s and He’s methods.

(2) Images without ground-truth images

Some hazy images and the dehazed results of the 3 methods are shown in Fig. 6. Fattal’s method is based on local statistics and requires sufficient color information and variance. However, the colors of heavily hazy images are faint and their variances are not high enough. Therefore, the result recovered by Fattal’s method for the first hazy image in Fig. 6a is not as good as for the other two images in Fig. 6a. Since only parts of the transmissions can be recovered, some haze in distant regions cannot be removed. He’s method gives better results in Fig. 6c than Fattal’s method in Fig. 6b. However, He’s method loses some details in distant regions. In particular, the sky regions in the hazy images are recovered badly. Our approach gives better and more natural results in Fig. 6d than the other two methods in Fig. 6b and c, because the use of local neighbor information yields smooth and natural transmissions in these regions.

Table 2 Average MSE and SSIM of 5 images from [24] by 3 methods

Fig. 6 Haze removal results: a Input hazy images, b Fattal’s method, c He’s method, d Our method

5 Conclusions

We have proposed an image dehazing method based on DCP and KRM. DCP is used to obtain the initial transmission of a hazy image. However, the transmission estimated by DCP is not smooth and does not use local neighbor information, which leads to block effects. Experimental results on synthetic and real images showed that KRM can address this problem effectively by using local neighbor information, and that our method performed better than the methods of Fattal [8] and He [13] in terms of MSE and SSIM.