1 Introduction

The performance of computer vision-based systems on inference tasks such as object detection and image segmentation is sensitive to the visual quality of the observed scene. Poor visibility of the observed scene degrades the performance of algorithms developed for these applications. Consequently, vision-based systems rely on scene enhancement techniques to rectify the captured noisy observations and restore the clear scene. Physically, this noise is induced by suspended atmospheric particles such as dust and smoke; the size, material, shape, and concentration of these particles determine the intensity of the added noise [1]. During image acquisition, light rays reflected from object surfaces undergo scattering and absorption due to these suspended particles, which results in poor contrast and visibility in the acquired image. This dispersion of light by suspended atmospheric particles is called atmospheric scattering. Mathematically, the atmospheric scattering phenomenon has been formulated by Narasimhan and Nayar [2, 3] as:

$$\begin{aligned} S^{\lambda }(a,b) = J^{\lambda }(a,b) t(a,b) + A \left( 1-t(a,b) \right) \end{aligned}$$
(1)

where \(S^{\lambda }(a,b)\) is the observed scene, \(J^{\lambda }(a,b)\) is the true scene radiance, A is the global atmospheric airlight, \(\lambda \in \{ \text {R,G,B} \}\) denotes the associated color channel, \((a,b)\) is a pixel location, and \(t(a,b)\) is the scene transmission defined as:

$$\begin{aligned} t(a,b) = \exp \left( -\beta \times d(a,b) \right) \end{aligned}$$
(2)

where \(\beta \) is the scattering coefficient and \(d(a,b)\) is the scene depth at pixel location \((a,b)\).
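For concreteness, the forward model in (1) and (2) is straightforward to simulate. The following Python sketch renders a hazy observation from a clear image and a depth map; array shapes and the default values of A and \(\beta \) are illustrative assumptions, not prescribed by the model.

# Minimal sketch of the atmospheric scattering model, Eqs. (1)-(2).
# Assumes float RGB images in [0, 1]; A and beta are illustrative values.
import numpy as np

def synthesize_haze(J, depth, A=0.95, beta=1.0):
    """Render a hazy observation S = J*t + A*(1 - t) from the clear
    radiance J (H x W x 3) and a per-pixel depth map d (H x W)."""
    t = np.exp(-beta * depth)        # transmission t(a,b), Eq. (2)
    t = t[..., np.newaxis]           # broadcast over the color channels
    return J * t + A * (1.0 - t)     # observed scene, Eq. (1)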

1.1 Literature review

Based on the atmospheric scattering model, researchers have developed various dehazing methods, categorized as image enhancement-based, prior- or assumption-based, fusion-based, and deep learning-based methods. Before Narasimhan and Nayar’s haze formation model, Kopeika [4] and Yitzhaky [5] used a weather-predicted atmospheric modulation transfer function together with a prior distance estimate. Further, Oakley et al. [6] and Tan et al. [7, 8] developed physics-based models to enhance the hazy scene without any weather prediction. Earlier approaches to image dehazing required additional information; for instance, Schechner and Shwartz [9, 10] used multiple images of a scene captured through polarizing filters at various degrees of polarization to obtain the clear image. Vision-based algorithms in real-time applications cannot rely on such multi-image methods; hence the need for single-image dehazing.

Single-image dehazing is an ill-posed problem in which relevant image features are modeled from the available pixel information and then used to recover a clear image from a single hazy input. Different researchers focus on different image properties and employ different modeling techniques to address this problem. One of the associated effects of haze is the loss of contrast in the acquired images. Based on this, Tan [11] proposed a single-image dehazing framework that maximizes the local contrast of an image, and Fattal [12] proposed to enhance scene visibility by estimating its true transmission. Alternatively, for the estimation of true transmission, He et al. [13] introduced the dark channel prior (DCP)-based method for single-image dehazing. The performance of DCP was further improved by bilateral [14], median [15, 16], edge-preserving [17], and guided [18] filters. A fast hazy-region identification technique based on the semi-inverse of an image was proposed by Ancuti et al. [19], who further developed a multi-scale fusion-based approach to restore the clear image [20]. The dehazing quality was also improved by introducing boundary constraint and contextual regularization (BCCR) [21]. Since all of these methods depend on various priors and assumptions, they cannot be applied universally. Tang et al. [22] trained a random forest-based regression model on various haze-relevant features to estimate the true scene transmission. This learning-based framework motivated the use of deep learning architectures for image dehazing. Cai et al. [23] proposed the DehazeNet architecture for estimating true scene transmission using a convolutional neural network (CNN). Similarly, Ren et al. [24] developed an end-to-end multi-scale CNN for estimating scene transmission. Following the success of CNN-based models for the single-image dehazing task, a number of novel architectures have been proposed. AIPNet [25] is based on the observation that, in the YCbCr color space, haze alters the illumination channel significantly more than the other channels. Li et al. [26] modeled a single novel variable in place of the transmission map and proposed a fast and effective CNN architecture to estimate it. GCANet [27] and GMAN [28] employ encoder–decoder architectures to extract the noise component and remove it from the image.

Fig. 1

Block diagram of the proposed DCNet. \(D_p\) denotes the dark channel for a patch of size \(\sqrt{p} \times \sqrt{p}\), where \(\sqrt{p} \times \sqrt{p} \in \{1 \times 1,\ 3 \times 3,\ 5 \times 5,\ 7 \times 7,\ 10 \times 10\}\)

1.2 Motivation

As mentioned earlier, single-image dehazing is an ill-posed problem, with the clear scene radiance, the atmospheric light intensity, and the transmission map being the unknown entities. Rearranging (1), the clear scene \(J^{\lambda }(a,b)\) can be recovered from the observed scene \(S^{\lambda }(a,b)\), the transmission \(t(a,b)\), and the global atmospheric airlight A as:

$$\begin{aligned} J^{\lambda }(a,b) = \frac{S^{\lambda }(a,b) - A \left( 1-t(a,b) \right) }{t(a,b)}. \end{aligned}$$
(3)

The interpretation behind (1) is that the original clear scene radiance \(J^{\lambda }(a,b)\) is degraded by the scattering caused by particles suspended in the air. Additionally, these suspended particles scatter the atmospheric light, which introduces diffused white light into the observed scene. In (3), the transmission map is the crucial unknown entity required to retrieve the clear scene; therefore, this study focuses on modeling the transmission map. Atmospheric light varies minimally across the scene and is thus approximated as a constant.
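Given estimates of \(t(a,b)\) and A, inverting (3) is direct. The sketch below is illustrative only; the lower clamp on t is a common numerical safeguard borrowed from the dehazing literature, not part of (3) itself.

# Sketch of scene recovery via Eq. (3); clamping t away from zero is a
# common safeguard (an assumption here, not part of Eq. (3)).
import numpy as np

def recover_scene(S, t, A=0.95, t_min=0.1):
    """J = (S - A*(1 - t)) / t for an observed scene S (H x W x 3) and
    an estimated transmission map t (H x W)."""
    t = np.clip(t, t_min, 1.0)[..., np.newaxis]
    J = (S - A * (1.0 - t)) / t
    return np.clip(J, 0.0, 1.0)      # keep radiance in the valid range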

1.3 Contributions

The main objective of vision-based algorithms is to perceive visual stimuli from the environment in a similar way as humans do. The human brain can easily identify hazy regions and distinguish varying haze densities. CNN-based models utilize the local information of an image in a manner similar to the human eye. Therefore, this paper proposes a CNN-based architecture, named “dark channel network (DCNet)”, for the estimation of true scene transmission. Figure 1 shows the block diagram of the proposed model. The contributions of this paper are summarized as follows:

  • This paper proposes a novel CNN-based architecture, named “dark channel network (DCNet)”, for the estimation of the true scene transmission of a hazy scene with a minimum number of layers in comparison with the CNN-based dehazing models available in the literature. The proposed DCNet architecture comprises two layers: (1) a feature extraction layer and (2) a convolutional neural network layer.

  • The proposed network has been extensively analyzed, and its performance is compared with current state-of-the-art models over standard benchmark datasets to show the effectiveness of the model and to verify the proposed hypothesis.

The rest of the paper is organized as follows: Section 2 explains the architecture of the proposed DCNet. Sections 3 and 4 discuss the training and experimental results of the model. Section 5 concludes the paper with its limitations and future scope.

2 DCNet: proposed architecture

The DCNet architecture comprises two layers:

  1. Feature extraction layer

  2. Convolutional neural network layer

In the following subsections, these layers are discussed in detail.

2.1 Feature extraction layer

The feature extraction layer extracts pixel maps relevant for rectifying haze from an input hazy image. These features comprise the value (V) channel and multi-scale dark channels.

2.1.1 Value (V) channel

The V channel, or illumination channel, of an image carries the edge information needed for textural interpretation; among the channels of the HSV color domain, it retains the maximum edge information (see Fig. 2). Mathematically, the V or illumination channel is defined as:

$$\begin{aligned} S^\text {V}(a,b) = \max _{\lambda \in \{\text {R,G,B}\}} S^\lambda (a,b). \end{aligned}$$
(4)

As discussed in the previous section, the scene observed in a hazy environment suffers degraded radiance due to the scattering of light. This causes loss of contrast and saturation, along with significant variation in the value and saturation channels of the image. The mean square error (MSE) plot for 100 images from the NYU dataset [30] is shown in Fig. 3. The MSE is evaluated for the H, S, and V channels of clear and hazy images. Clearly, the V channel is significantly affected by the presence of haze. Another critical property of the V channel is that it provides a means to identify originally bright regions in the image and distinguish them from hazy regions.

Fig. 2

Visualization of an image in RGB and HSV color domains. Top row: clear scene image. Bottom row: hazy scene image. From left to right: RGB image, H channel, S channel, and V channel

Fig. 3

Plots of the mean square error obtained between clear and hazy images for the H, S, and V channels

Taking the per-pixel minimum over the color channels of the atmospheric scattering model (1) (note that t and A are channel-independent) gives the minimum channel:

$$\begin{aligned} S_{\text {min}}(a,b) = J_{\text {min}}(a,b) t(a,b) + A \left( 1 - t(a,b) \right) \end{aligned}$$
(5)

where

$$\begin{aligned} J_{\text {min}}(a,b) = \min \limits _{\lambda \in \{ \text {R,G,B} \} } J^{\lambda }(a,b). \end{aligned}$$
(6)

Similarly, taking the per-pixel maximum gives the maximum channel for the atmospheric scattering model (1):

$$\begin{aligned} S_{\text {max}}(a,b) = J_{\text {max}}(a,b) t(a,b) + A \left( 1 - t(a,b) \right) \end{aligned}$$
(7)

where

$$\begin{aligned} J_{\text {max}}(a,b) = \max \limits _{\lambda \in \{ \text {R,G,B} \} } J^{\lambda }(a,b). \end{aligned}$$
(8)

Subtracting (5) from (7) gives

$$\begin{aligned} \varDelta S(a,b) = S_{\text {max}}(a,b) - S_{\text {min}}(a,b) \end{aligned}$$
(9)

which simplifies to

$$\begin{aligned} \varDelta S(a,b) = \left( J_{\text {max}}(a,b) - J_{\text {min}}(a,b) \right) t(a,b). \end{aligned}$$
(10)

It is to be noted that \(\varDelta S(a,b)\) has a smaller magnitude in bright regions and a significantly larger magnitude in dark regions. Clearly, the value channel comprises image information relevant for dehazing. In the proposed architecture, the RGB space is transformed to the HSV space and the value channel is extracted by

$$\begin{aligned} \left( S^{\text {R}}, S^{\text {G}}, S^{\text {B}} \right) \rightarrow \left( S^{\text {H}}, S^{\text {S}}, S^{\text {V}} \right) . \end{aligned}$$
(11)
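The V channel of (4)/(11) and the difference channel \(\varDelta S\) of (9) reduce to per-pixel extrema over the color channels. A minimal sketch, assuming float RGB arrays of shape (H, W, 3), is:

# Sketch of the V channel, Eq. (4), and the difference channel, Eq. (9).
import numpy as np

def value_channel(S):
    """V channel: per-pixel maximum over R, G, B, as in Eqs. (4)/(11)."""
    return S.max(axis=2)

def difference_channel(S):
    """Delta S = S_max - S_min, Eq. (9): small in bright (hazy) regions."""
    return S.max(axis=2) - S.min(axis=2)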

2.1.2 Multi-scale dark channels

The dark channel of an image is the minimum intensity over a local patch across the RGB channels at the corresponding pixel location [13]. It is defined as:

$$\begin{aligned} D_p(a,b) = \min \limits _{(i,j) \in \varOmega _p(a,b)} \left( \min \limits _{\lambda \in \{ \text {R,G,B} \}} \left( S^{\lambda }(i,j) \right) \right) \end{aligned}$$
(12)

where \( S^{\lambda }(i,j) \) represents the observed image intensity at location \((i,j)\), \( \lambda \in \{ \text {R,G,B} \} \), and \( \varOmega _p(a,b) \) is a patch of size \( \sqrt{p} \times \sqrt{p} \) centered at \((a,b)\) in \( S^{\lambda } \).

The significance of the dark channel is that it provides a way of capturing the haze density in a hazy image. For a clear image, the intensity over a local patch at each pixel location, except in the sky, is close to zero for at least one of the color channels (RGB) [13]. In a hazy environment, the scattered light intensity adds to the RGB channels of the observed scene; thus, the pixel intensity of all three color channels increases in proportion to the depth of the scene [13, 22]. Figure 4 shows the dark channels of an image for different window sizes (p). He et al. [13] stated the following:

  • For smaller p, the dark channel property fails in certain regions, but the pixel map retains better edge information.

  • For larger p, the dark channel property holds better, but the dehazed image shows “halo effects”.

The feature extraction layer of DCNet extracts multi-scale dark channels, taking advantage of both small and large patch sizes to extract information from the input hazy scene and thus producing a transmission map with sharp edges. The proposed DCNet has been analyzed over various combinations of multi-scale dark channels. It has been observed that the network performance improves as the number of dark channels increases, with the improvement saturating beyond the patch size of \(p=10 \times 10\). In this paper, DCNet is designed by choosing dark channels with patch sizes \(p \in \{1\times 1,\ 3\times 3,\ 5\times 5,\ 7\times 7,\ 10\times 10\}\); a minimal sketch of this computation is given below.
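The multi-scale dark channel extraction of (12) can be sketched with a minimum filter. The use of SciPy here is an assumption of this sketch; a size-1 filter is the identity, giving the pixel-wise dark channel \(D_1\).

# Sketch of multi-scale dark channel extraction, Eq. (12).
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(S, size):
    """D_p: minimum over RGB, then minimum over a size x size patch."""
    min_rgb = S.min(axis=2)                    # inner minimum over channels
    return minimum_filter(min_rgb, size=size)  # outer minimum over patch

def multi_scale_dark_channels(S, sizes=(1, 3, 5, 7, 10)):
    """The five dark channels D_1, D_3, D_5, D_7, D_10 used in DCNet."""
    return [dark_channel(S, s) for s in sizes]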

The output of the feature extraction layer is the concatenation of the V channel and the multi-scale dark channels of the input hazy image. It can be represented as:

$$\begin{aligned} F(a,b,z) \leftarrow \left( S^V(a,b), D(a,b) \right) \end{aligned}$$
(13)

where \(D(a,b)\) represents the set of all dark channels and z represents the number of concatenated channels (\(z=6\) in the proposed architecture).
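Putting the two features together, the feature extraction layer amounts to stacking the V channel with the five dark channels. A self-contained sketch, again assuming a float RGB input, is:

# Sketch of the full feature extraction layer, Eq. (13): the V channel
# stacked with five multi-scale dark channels into a z = 6 channel map.
import numpy as np
from scipy.ndimage import minimum_filter

def feature_extraction_layer(S, sizes=(1, 3, 5, 7, 10)):
    v = S.max(axis=2)                                        # Eq. (4)
    min_rgb = S.min(axis=2)
    darks = [minimum_filter(min_rgb, size=s) for s in sizes] # Eq. (12)
    return np.stack([v] + darks, axis=2)                     # (H, W, 6)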

Fig. 4

Visualization of the dark channel \(D_p(a,b)\) obtained for different patch sizes. From left to right and top to bottom: hazy scene image, followed by \(D_p(a,b)\) for patch sizes \(1 \times 1\), \(3 \times 3\), \(5 \times 5\), \(7 \times 7\), \(9 \times 9\), \(11 \times 11\), \(13 \times 13\), \(15 \times 15\), \(17 \times 17\), \(19 \times 19\), and \(21 \times 21\)

2.2 Convolutional neural network layer (CNN layer)

The CNN layer of the DCNet architecture is the primary component involved in estimating the transmission map; the output of the feature extraction layer acts as its input. The operation of a convolutional layer is defined as:

$$\begin{aligned} C_l = f\left( W_l*C_{l-1}+b_l\right) \end{aligned}$$
(14)

where \(W_l\) and \(b_l\) represent the weights and biases, respectively, of the lth convolutional layer, f represents the activation function, and \(C_l\) represents the output of lth convolutional layer. The output of the first convolutional layer is

$$\begin{aligned} C_1 = f\left( W_1*C_{0}+b_1\right) \end{aligned}$$
(15)

where

$$\begin{aligned} C_0 = F(a,b,z). \end{aligned}$$
(16)
Table 1 Architecture of the proposed model

The convolutional network comprises five convolutional layers with filters of size \(3 \times 3\), as shown in Table 1. Zero padding is applied to maintain the output dimensions across layers. The output of the CNN layer is the estimated transmission map of the image:

$$\begin{aligned} t = C_5 = f\left( W_5*C_{4}+b_5\right) . \end{aligned}$$
(17)

Finally, the dehazed image is obtained using (3), with the global atmospheric airlight set to \(A=(0.95,0.95,0.95)\). Since the variation of atmospheric light across a scene is usually minimal, it is kept constant in the model.
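A PyTorch sketch of the CNN layer is given below. The five 3 × 3 convolutions, zero padding, and the leaky ReLU slope of 0.05 (Sect. 3.2.4) follow the text; the intermediate channel widths are illustrative assumptions, since the exact configuration is specified in Table 1.

# Sketch of the CNN layer: five 3x3 convolutions with zero padding, each
# followed by leaky ReLU. Hidden widths are assumptions, not Table 1's values.
import torch.nn as nn

class DCNetCNN(nn.Module):
    def __init__(self, in_channels=6, hidden=16):
        super().__init__()
        widths = [in_channels, hidden, hidden, hidden, hidden, 1]
        layers = []
        for c_in, c_out in zip(widths[:-1], widths[1:]):
            layers += [nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                       nn.LeakyReLU(0.05)]
        self.net = nn.Sequential(*layers)

    def forward(self, x):       # x: (N, 6, H, W) feature maps
        return self.net(x)      # estimated transmission map (N, 1, H, W)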

3 Training of DCNet

The training of DCNet has been carried out as a supervised learning problem. The model is trained on synthetically generated hazy image patches paired with their corresponding transmission maps. These patches are extracted by randomly cropping \(20 \times 20\) sections of images and then adding uniform haze to each patch using (1) with a random transmission value t. Figure 5 shows some sample patches. The motivation behind this formulation is that the variation of depth over a small patch of an image is usually insignificant; thus, the transmission map over the region is assumed to be constant.

The training of the model has been carried out using two different specifications for drawing random transmission maps. The primary model is trained by drawing the transmission uniformly at random with \(t \in \left[ 0.4,1.0\right] \); this model is more robust and better handles heavy-haze situations. The secondary model is trained by drawing \(t \in \left[ 0.7,1.0\right] \) and is thus better suited for dehazing lightly hazed images. A sketch of the patch generation is given below.
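The patch generation described above can be sketched as follows. Function and parameter names are illustrative; the haze is applied with (1) using a constant t per patch.

# Sketch of training-pair generation: a random 20x20 crop hazed with a
# constant transmission drawn per patch, per Eq. (1).
import numpy as np

def make_training_pair(image, rng, patch=20, t_range=(0.4, 1.0), A=0.95):
    H, W, _ = image.shape
    a = rng.integers(0, H - patch + 1)
    b = rng.integers(0, W - patch + 1)
    J = image[a:a + patch, b:b + patch]    # clear patch
    t = rng.uniform(*t_range)              # constant t over the small patch
    S = J * t + A * (1.0 - t)              # uniformly hazed patch, Eq. (1)
    return S, t

rng = np.random.default_rng(0)             # e.g., 50 pairs per image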

The following subsections discuss the dataset and the model analysis in detail.

Fig. 5

Visualization of training patches of size \(20 \times 20\). Top row: clear patches. Bottom row: corresponding hazy patches

3.1 Dataset, loss function and optimizer for DCNet

The training dataset is created by custom selection of a subset of relevant images from the “ImageNet 2013 validation set” [31]. The dataset comprises 200 images, from which 10,000 random patches in total, i.e., 50 patches per image, are extracted for the training of DCNet. Similarly, the Synthetic Test dataset is created by custom selection of 250 distinct and challenging images from the “ImageNet 2013 validation set”.

The training of the proposed model is carried out using the MSE loss function, defined as:

$$\begin{aligned} L(\varTheta ) = \frac{1}{2} \sum _{n=1}^N || NN(x_n) - t_n ||^2. \end{aligned}$$
(18)

where N is the number of training patches, \(NN(x_n)\) is the network output for the nth training patch \(x_n\), \(t_n\) is the corresponding ground-truth transmission, and \(\varTheta \) denotes the network parameters. The dataset is randomly shuffled at each epoch, and the RMSprop optimizer from the PyTorch library is used during training.
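A hedged sketch of one training step follows, reusing the DCNetCNN module sketched in Sect. 2.2. The learning rate and the expansion of each scalar t into a constant target map are illustrative choices, not the paper's exact settings.

# Sketch of a training step: MSE loss, Eq. (18), optimized with RMSprop.
# `DCNetCNN` is the module sketched in Sect. 2.2; lr is an assumption.
import torch
import torch.nn as nn

model = DCNetCNN()
optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

def train_step(features, t_true):
    """features: (N, 6, 20, 20) patch features; t_true: (N,) transmissions."""
    optimizer.zero_grad()
    t_pred = model(features)                               # (N, 1, 20, 20)
    t_target = t_true.view(-1, 1, 1, 1).expand_as(t_pred)  # constant map
    loss = criterion(t_pred, t_target)
    loss.backward()
    optimizer.step()
    return loss.item()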

3.2 Architecture analysis

The proposed DCNet has various hyper-parameters that significantly affect the effectiveness and speed of the model. The network has been analyzed over these hyper-parameters to obtain the final model architecture. These parameters are discussed below.

3.2.1 Haze relevant features

The model has been analyzed over a number of distinct features, i.e., the value (V) channel, multi-scale dark channels, and multi-scale saturation (S) channels. The analysis shows that adding multi-scale S channels as an input feature to the CNN layer, in addition to the V channel and multi-scale dark channels, yields negligible improvement in the performance of the model. Thus, the final architecture of the proposed model comprises only the value channel and the multi-scale dark channels.

3.2.2 Patch size of dark channel

The patch size of the dark channel is an important parameter, and the behavior of the dark channel is highly sensitive to it. A small patch retains edge information but violates the dark channel hypothesis, whereas a large patch satisfies the dark channel hypothesis but loses boundary information. To utilize the benefits of both while mitigating their limitations, a hybrid implementation of multi-scale dark channels has been proposed. A study has been conducted to analyze the performance of the proposed model over varying sets of dark channels \((D_{p} \mid p \in \mathbf{S})\), where \(\mathbf{S}\) represents a set of window sizes. The study shows that the performance of the model improves with an increasing number of dark channels; the improvement becomes minimal after \(p = 10 \times 10\), whereas the computational load keeps increasing with the number of dark channels. Table 2 shows the performance of DCNet for various combinations of dark channels. The model is optimized jointly for accuracy and computational load, and the final architecture uses a set of 5 dark channels, i.e., \((D_1, D_3, D_5, D_7, D_{10})\).

3.2.3 Number of convolutional layers

Table 2 Performance of DCNet for different combinations of dark channels \(D_p\)

It is a general trend in neural networks that performance improves with increasing model complexity, while the computational load also increases. In this work, the proposed model has been analyzed for multiple numbers of convolutional layers. The performance of the model shows an increasing trend that saturates after \(l=5\) layers. Thus, in the final architecture, 5 convolutional layers are used.

Fig. 6

Plots of sigmoid, ReLU, and leaky ReLU activation functions

3.2.4 Activation function

The proposed architecture has been analyzed with a number of different activation functions, i.e., ReLU, leaky ReLU, and the sigmoid function. Figure 6 shows the behavior of these functions. The analysis shows that the proposed architecture with the leaky ReLU function outperforms otherwise identical models using sigmoid and ReLU activations. The superior performance over ReLU is due to the small negative slope of the leaky ReLU function, which helps bypass the dying ReLU problem. In the proposed model, each convolutional layer is followed by a leaky ReLU activation with a slope of 0.05.

Table 3 Average performance comparison of DCNet with state of the art on Synthetic dataset
Table 4 Average performance comparison of DCNet with state of the art on Middlebury Stereo dataset
Fig. 7

Comparative visualization of dehazing methods on Baby and Drumsticks images from Middlebury Stereo dataset

4 Results and discussions

Extensive experiments have been conducted to assess the effectiveness of the proposed model. Its performance has been analyzed and compared with current state-of-the-art methods: DCP [13], BCCR [21], SIDMF [20], CAP [29], MSCNN [24], DehazeNet [23], GCANet [27], AOD-Net [26], and GMAN [28]. Experiments have been conducted on the Synthetic Test dataset [31], the Middlebury Stereo dataset [32,33,34], the SOTS and HSTS datasets [35], and some standard naturally hazed images. For fair comparison, the results of the state-of-the-art methods were reproduced with the authors’ support. Both quantitative and qualitative analyses have been conducted to obtain a detailed comparison among these models. The quantitative analysis uses standard statistical measures, i.e., mean square error (MSE), structural similarity index (SSIM) [36], and peak signal-to-noise ratio (PSNR), computed as sketched below.
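For reference, the three measures can be computed with standard library routines. The sketch below assumes float images scaled to [0, 1] and a recent scikit-image API; it is illustrative rather than the exact evaluation script.

# Sketch of the quantitative evaluation (MSE, PSNR, SSIM), assuming
# float RGB images in [0, 1].
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(clear, dehazed):
    mse = float(np.mean((clear - dehazed) ** 2))
    psnr = peak_signal_noise_ratio(clear, dehazed, data_range=1.0)
    ssim = structural_similarity(clear, dehazed,
                                 channel_axis=2, data_range=1.0)
    return mse, psnr, ssim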

Tables 3 and 4 show the average performance comparison of DCNet with current state-of-the-art methods over Synthetic Test dataset and Middlebury Stereo dataset, respectively. The hazy images are generated using (1) with \(t=0.6\). The results show that DCNet outperforms all other models.

Fig. 8

Comparison of DehazeNet and DCNet on “bird” image. From left to right and top to bottom: Clear image and its feather region, hazy image and its feather region, dehazed image using DehazeNet and its feather region, and dehazed image using DCNet and its feather region

Figure 7 shows a visual comparison of the models on some images from the Middlebury Stereo dataset. The different models remove haze to varying extents. The DCP and BCCR models removed haze to some extent but significantly affected the color composition. SIDMF removed the haze non-uniformly and affected the other color channels. CAP, MSCNN, DehazeNet, GCANet, AOD-Net, and GMAN removed significant haze, but the resulting images are degraded to some extent in color composition and texture information. The images generated using DCNet are fairly close to the original clear images, preserving texture, color, and edge information. Similarly, Fig. 8 compares DehazeNet and DCNet on a sample “bird” image. The zoomed views highlight the feathers in the original clear image, its corresponding hazy image, the DehazeNet output, and the DCNet output. The DehazeNet output appears over-saturated, whereas the DCNet output closely resembles the original clear image.

Table 5 Performance evaluation of DCNet on SOTS Outdoor Dataset
Table 6 Performance evaluation of DCNet on HSTS dataset
Fig. 9

Comparative visualization of dehazing methods on the Canyon image. Zoomed views are shown for the sky region in the input hazy image (yellow box) and the dehazed image using DCNet (red box) (color figure online)

Furthermore, the performance of DCNet has been analyzed on the SOTS, HSTS, and standard benchmark datasets [35]. The PSNR and SSIM metrics of the model on the SOTS outdoor dataset are provided in Table 5, and Table 6 shows the results on the synthetic HSTS dataset (the performance metrics of the other state-of-the-art models are retrieved from [35]). The DehazeNet model outperforms all other models on the HSTS dataset. It is to be noted that, for consistency of the experimental results throughout the paper, the evaluation is carried out using the originally trained DCNet model (trained on the synthetic dataset), which could account for the disparity in results.

Fig. 10

Comparative visualization of dehazing methods on the Girls image. Zoomed views are shown for a region containing both illumination and haze, in the input hazy image (yellow box) and the dehazed image using DCNet (red box) (color figure online)

Additionally, the proposed model has been analyzed on some standard natural images frequently used in the literature for comparison. These are interesting cases containing regions such as clouds and marble, which generally become bottlenecks for dehazing models, since white regions are potential areas of heavy haze; their corresponding clear images are not available. Figures 9 and 10 show the results for the “Canyon” and “Girls” images, respectively. The DCP model rectified hazy regions but raised the contrast unrealistically in certain areas, such as the sky region. BCCR shows behavior similar to DCP. The SIDMF result shows non-uniform haze removal. CAP removed significant haze but also caused loss of edge information and over-saturation. MSCNN preserved the structural integrity and naturalness of the image, but some haze remains. DehazeNet effectively removed haze while maintaining structural integrity and naturalness, but its output shows halo effects, distinctly visible in high-frequency regions of the image. GCANet and GMAN effectively removed haze from the image, but the color composition is not preserved. AOD-Net removes haze while better preserving the color composition of the image. In comparison with all the aforementioned methods, DCNet preserves edge information and structural integrity and effectively removes haze while maintaining naturalness, although slight halo effects appear in regions with sharp color changes. Figure 11 shows experimental results of DCNet on some other standard naturally hazed images. The resulting images show that closer regions have been successfully resolved, whereas light haze remains in farther regions, which maintains naturalness in accordance with human visual perception.

Fig. 11

Dehazing results on some standard images. Top row: hazy images; bottom row: dehazed images using DCNet

Table 7 Results of ablation studies on the Synthetic and Middlebury Stereo datasets

4.1 Ablation studies

The proposed model takes 5 dark channels and the value channel as input features. To assess the importance of these features, an ablation study was conducted. Table 7 compares the performance of the proposed architecture with and without the value channel as an input feature. It is evident that the performance in all three metrics degrades significantly without the value channel. Similarly, an analysis has been conducted to understand the importance of dark channels with different window sizes. Table 2 shows the performance of the model with different sets of dark channels. As stated earlier, the performance of the proposed model improves with the addition of more dark channels; the improvement becomes minimal after the window size of \(10 \times 10\), while the computational load keeps increasing. Thus, the final architecture uses a set of 5 dark channels.

5 Conclusions

In this paper, a novel single-image dehazing model, “DCNet”, has been proposed, constituting two major components, namely a feature extraction layer and a CNN layer, for estimating the true transmission map of a hazy scene. The model learns a nonlinear mapping from the input hazy image to its transmission map, which is then used to dehaze the image. The experimental results show that the proposed model outperforms current state-of-the-art methods for the single-image dehazing task over a wide range of datasets.

Still, there is room for improvement in the model’s performance. The model shows slight halo effects in high-frequency regions of images involving heavy haze, and it shows non-ideal performance on some images involving sizable bright regions.