Introduction

High-visibility images are essential for computer vision tasks. However [1], photographs taken on hazy days tend to deteriorate because light is absorbed by floating particles present in the surroundings. An efficient dehazing algorithm is therefore vital for restoring the color and detail of such degraded images [2].

Image dehazing [3] is one of the primary obstacles in computer vision research, and it remains difficult despite technological advances. Solving it benefits both computer vision and everyday life, since haze removal finds applications in many facets of daily living. The problem belongs to a larger family of image-processing tasks concerned with image denoising. Before light reflected from an object reaches the camera, it is scattered by the atmosphere; the abundance of aerosol particles is responsible for this scattering, which in turn affects how the scene is captured by the camera. Image quality suffers when elements such as dust, fumes, and fog particles are present, and photographs shot under these conditions lack vividness and detail. When such low-detail photographs are relied upon in areas like transportation or surveillance [4], they present risks that can have serious consequences. As a result, image dehazing has become increasingly important [5].

The formation of haze [6] may be attributed to the scattering or absorption of light by water droplets suspended in the air or by a very large number of tiny particles [7]. Images taken in hazy conditions have limited color fidelity and contrast, which is a problem for many optical imaging systems such as satellite remote sensing, aerial photography, outdoor monitoring, and target identification, and it introduces many difficulties that must be addressed in this research.

Hazy-image processing technologies fall into two primary categories: image-processing-based enhancement and physical-model-based restoration. Enhancement based on image processing is the more recent of the two. It starts from the image itself and does not consider the specific cause of the degradation; by increasing contrast and brightness, the visual impact of the picture can be improved to meet the goal of clarity. Such methods are typically mature and effective, and their outputs can occasionally meet the clarity criterion. However, they cannot adapt to a wide variety of pictures and scenes, and they are particularly ineffective for images with many scene-depth transitions. Because the approach is based on enhancement and ignores the physical process of haze degradation, it cannot significantly increase image definition, which is its most critical shortcoming. It cannot clear away the fog to restore the original appearance, so the resulting distortion can be even more severe. The treated picture not only has a disappointing visual appearance but also does not lend itself well to further processing [8].

The remainder of the paper is organized as follows. Section 2 discusses previous research relevant to this work. Section 3 covers the research methodology, and Section 4 presents the experimental results together with an in-depth analysis. Section 5, the study's last section, emphasizes the relevance of the experiment and identifies opportunities for further investigation.

Literature review

The following section reviews the literature on image dehazing and summarizes earlier research related to the present study, situating it among the already available approaches. Numerous studies have been conducted using image-dehazing technologies and techniques.

Yin et al. [9] propose an image-dehazing approach based on a color-transfer dehazing concept that outperforms modern techniques. A deep CNN-based framework is employed to build the dehazing model, which uses color transfer to clear away the haze and to learn the model coefficients. Quantitative and qualitative assessments on synthetic and real hazy images show that the technique outperforms currently available single-image dehazing approaches.

Golts et al. [10] describe an unsupervised training technique that minimizes the well-known Dark Channel Prior (DCP) energy function. Instead of feeding the network synthetic data, only real-world outdoor photographs are used and the loss is minimized directly on them, which provides additional regularization of the network and the learning process. Experiments show that the method performs comparably to large-scale supervised algorithms.

Min et al. [11] provide a method for rating dehazing algorithms that accounts for image-structure recovery, color rendition, and contrast enhancement in low-light areas. Both types of images can benefit from the proposed method, but it is made more suitable for aerial photographs by taking their particular qualities into consideration. Experiments on two subsets of the SHRQ database show the recommended approach to be successful.

Huang et al. [12] build a new model that removes the need for a haze/depth dataset by using unsupervised learning and a cycle generative adversarial network. Descriptive and analytical testing on both synthetic and real haze photographs indicated that the proposed method outperformed existing state-of-the-art dehazing algorithms, regardless of whether the haze was real or synthetic.

Du and Li [13] suggest feeding the dehazed picture back into the input of a Deep Residual Learning (DRL) network in a recursive manner. This recursive extension can be interpreted as a nonlinear optimization of DRL whose convergence can be analyzed using fixed-point theory. Extensive experiments on both simulated and real hazy data demonstrate the efficacy of the recursive DRL approach and show that it outperforms other competing methods.

Li et al. [8] developed a dehazing method based on residual deep CNNs. A foggy picture is first fed to the network, which uses it to estimate a transmission map; the ratio of the foggy image to the transmission map is then passed through the network, which removes the haze from the picture. This increases dehazing efficiency while eliminating the need to estimate ambient light. The method is trained on a set derived from the NYU2 depth dataset. The exploratory results indicate that the proposed method is effective and reliable in terms of the metrics SSIM, PSNR, RMSE, and MSE as well as feature similarity.

Research methodology

This section describes the research methodology employed in this study, outlining the entire process, including the individual steps, tools, and workflow.

Proposed methodology

Image dehazing is frequently ranked among the most challenging inverse problems. Deep learning methods have emerged alongside traditional model-based techniques and help define a fresh state of the art in the quality of dehazed pictures that can be obtained. This study applies a deep learning model to that problem. The work begins with the gathered dataset: the haze and dehaze datasets are collected first, and the collection consists of 55 comparisons between haze-free and hazy images. This dataset is split into training and testing portions in a 90:10 ratio, with about 30 s needed for training. The next step is preprocessing, which normalizes the images, converts BGR images to RGB, and converts the images to NumPy arrays. An exploratory data analysis follows, displaying histogram plots, and AlexNet is then implemented with a functional neural network that uses the Adam optimizer and a variety of activation functions. Because this takes a while, the number of epochs is set to five and the batch size to eight. The experimental results verify the efficacy and robustness of the suggested method, which is then evaluated using performance metrics consisting of SSIM, PSNR, RMSE, MSE, and BRISQUE. Each step is briefly described below.
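As a concrete illustration of the training configuration just described (Adam optimizer, five epochs, batch size eight), the following minimal sketch assumes a TensorFlow/Keras environment, which the paper does not specify; the tiny image-to-image network and the random placeholder arrays merely stand in for the AlexNet-based functional model and the 55 image pairs.

```python
# Minimal sketch of the training setup: Adam optimizer, 5 epochs, batch size 8.
# TensorFlow/Keras is an assumption; the toy model below is NOT the paper's network.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

def build_toy_dehazer(input_shape=(224, 224, 3)):
    # Small image-to-image functional model as a placeholder for the real one.
    inputs = layers.Input(shape=input_shape)
    x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
    x = layers.Conv2D(32, 3, padding="same", activation="relu")(x)
    outputs = layers.Conv2D(3, 3, padding="same", activation="sigmoid")(x)
    return models.Model(inputs, outputs)

# Placeholder arrays standing in for the hazy / haze-free image pairs.
hazy = np.random.rand(50, 224, 224, 3).astype("float32")
clear = np.random.rand(50, 224, 224, 3).astype("float32")

model = build_toy_dehazer()
model.compile(optimizer=tf.keras.optimizers.Adam(), loss="mse")
model.fit(hazy, clear, epochs=5, batch_size=8, validation_split=0.1)
```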

Data collection

The haze and dehaze datasets are assembled first. The collection contains 55 comparisons between images with and without haze, and the dataset is divided into training and testing portions.

Image pre-processing

Data pre-processing is a common and useful step in the deep learning workflow, because it can both expand the original database and enhance the information hidden within the dataset. The efficiency of the subsequent procedures is therefore strongly influenced by how well the pre-processing is done. The main objective of image pre-processing is to improve the picture data by removing distorted noise and enhancing image pixels, and numerous techniques are used to achieve this. In this project, the unprocessed images are gathered and converted from BGR to RGB. The images are then normalized, which rescales the pixel intensity range. Next, a NumPy array is created from the images, each with its own height, width, and color channels; before merging the channels of an image, all images are brought to the same size.
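A minimal sketch of this preprocessing, assuming OpenCV and NumPy; the directory layout, file pattern, and target size are illustrative assumptions rather than details taken from the paper.

```python
# Read images, convert OpenCV's default BGR ordering to RGB, normalize to [0, 1],
# and stack everything into a single NumPy array of shape (N, H, W, 3).
import glob
import cv2
import numpy as np

def load_and_preprocess(pattern, size=(224, 224)):
    images = []
    for path in sorted(glob.glob(pattern)):
        bgr = cv2.imread(path)                        # OpenCV loads images as BGR
        rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)    # convert to RGB
        rgb = cv2.resize(rgb, size)                   # same height/width for all images
        images.append(rgb.astype("float32") / 255.0)  # normalize the intensity range
    return np.stack(images)

# Hypothetical paths for the hazy and haze-free halves of the dataset.
hazy_images = load_and_preprocess("dataset/haze/*.png")
clear_images = load_and_preprocess("dataset/dehaze/*.png")
```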

Proposed model (AlexNet with functional neural network)

A neural network [14, 15] is a collection of methods that mimic the way the brain processes information in an effort to uncover hidden patterns; the term refers to any system, artificial or biological, made up of neurons. Because neural networks are adaptable, they can deliver superior outcomes even when the output requirements are essentially unchanged, and this idea derived from AI is used more and more often, for example when creating new trading systems. AlexNet was the first significant convolutional neural network to successfully classify pictures on ImageNet; when it entered the competition, the earlier models it surpassed were not based on deep learning.

The architecture resembles the LeNet network in many ways, with simply more layers: a convolutional layer followed by normalization and pooling, a second convolution-pool-norm block, a few additional convolutional layers, a max-pooling layer, and finally a number of fully connected layers, the last of which connects to the output classes. In total there are five convolutional layers and three fully connected layers.

AlexNet is a very reliable model that can deliver high accuracy even on exceedingly difficult datasets, and its performance would suffer significantly if any one of its convolutional layers were removed. It is an established architecture with great potential for computer vision tasks such as object detection, and in the near future it may become an even more common choice among CNNs [16] for image tasks.

AlexNet architecture

AlexNet is a fundamental, straightforward, and effective CNN architecture, initially proposed by Alex Krizhevsky et al. for the ImageNet Large Scale Visual Recognition Challenge 2012 (ILSVRC-2012) [17]. It is built largely from stages stacked on top of one another: convolution, pooling, rectified linear unit (ReLU), and fully connected layers. The first five layers of AlexNet are convolutional; after the fifth convolutional layer and its pooling layer, there are three fully connected layers. The ReLU, a form of half-wave rectifier shown in Eq. (1), is used to accelerate training and reduce overfitting, and the dropout approach paired with the fully connected layers of the AlexNet design can be viewed as a form of regularization.

$$f(x) = \max (x,0)$$
(1)

Figure 1 depicts the pre-trained AlexNet network model.

Fig. 1
figure 1

The AlexNet architecture
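For reference, the following sketch expresses the standard AlexNet layout of Fig. 1 with the Keras functional API, again assuming a TensorFlow/Keras environment. It shows the classification form of the network (five convolutional layers, three fully connected layers, ReLU activations, dropout); how the paper adapts this backbone to emit dehazed images is not fully specified, so the 1000-way softmax head here is only illustrative.

```python
# Standard AlexNet-style layout with the Keras functional API (sketch only).
from tensorflow.keras import layers, models

def alexnet(input_shape=(227, 227, 3), num_classes=1000):
    inputs = layers.Input(shape=input_shape)
    x = layers.Conv2D(96, 11, strides=4, activation="relu")(inputs)   # conv1
    x = layers.BatchNormalization()(x)                                # stand-in for the original LRN
    x = layers.MaxPooling2D(3, strides=2)(x)
    x = layers.Conv2D(256, 5, padding="same", activation="relu")(x)   # conv2
    x = layers.BatchNormalization()(x)
    x = layers.MaxPooling2D(3, strides=2)(x)
    x = layers.Conv2D(384, 3, padding="same", activation="relu")(x)   # conv3
    x = layers.Conv2D(384, 3, padding="same", activation="relu")(x)   # conv4
    x = layers.Conv2D(256, 3, padding="same", activation="relu")(x)   # conv5
    x = layers.MaxPooling2D(3, strides=2)(x)
    x = layers.Flatten()(x)
    x = layers.Dense(4096, activation="relu")(x)                      # fc6
    x = layers.Dropout(0.5)(x)                                        # dropout regularization
    x = layers.Dense(4096, activation="relu")(x)                      # fc7
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)      # fc8 (illustrative head)
    return models.Model(inputs, outputs)

model = alexnet()
model.summary()
```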

Data splitting

The data are split in a 90:10 ratio: 90% is used for training and 10% for testing. Splitting data is a standard machine learning (ML) practice that helps avoid overfitting, the situation in which a model fits the training data so well that it cannot reliably fit new data. Before the initial data are fed into an ML model, they are frequently divided into three to four subsets, the most common being the training and testing datasets.
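A minimal sketch of this 90:10 split, assuming scikit-learn is available; the array names continue from the preprocessing sketch above and are otherwise illustrative.

```python
# Hold out 10% of the image pairs for testing and keep 90% for training.
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    hazy_images, clear_images, test_size=0.10, random_state=42
)
print(X_train.shape[0], "training pairs,", X_test.shape[0], "test pairs")
```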

Proposed algorithm

Input: Haze and Dehaze Dataset

Output: Predicted Results

Step1—Dataset gathering and information

Gather the haze and dehaze datasets first. The collection contains 55 comparisons between images with and without haze.

Step2—Data preprocessing

Convert the data from BGR to RGB and normalize the images, which rescales their intensity values. Create a NumPy array from the images, where each element has a height, width, and color channel, and combine each image's adjusted channels into a single unit.

Step3—Exploratory data analysis (EDA)

Histogram maps of a predicted image and the raw image are plotted to show the differences and to visualize the data.

Step4—Neural network model to dehaze images

Prepare training and test samples of ground-truth and dehazed images, with 90% of the data used for training and 10% for testing. Set the neural network parameters, the activation functions (ReLU, sigmoid), and the training hyperparameters, then apply the AlexNet-based functional neural network that generates the dehazed images.

Step5—Performance evaluation metrics

  • SSIM

  • PSNR

  • RMSE

  • MSE

  • BRISQUE

Step6—Predicted outcome

Proposed flowchart

The process flow of our work is shown in Fig. 2 below. Upon closer inspection, the figure reveals a structure made up of fundamental steps, each containing sub-steps. The flowchart shows the steps as data collection, preprocessing, data splitting into testing and training sets, implementation of the suggested deep-learning model, and calculation of the proposed model's performance evaluation.

Fig. 2
figure 2

Proposed flowchart


Results and discussion

This section presents the implementation details followed by the model's outcomes. It discusses the dataset used for image dehazing and its visualization, and presents the experimental analysis of the current study. The proposed methods were implemented in Python 3 using the "dehaze" dataset. Evaluation procedures are carried out sequentially to verify that the selection of training and test data is completely random: 90% of the data is used for the training phase and 10% for the testing phase. Several different assessment metrics are used to illustrate how well the recommended procedures work.

Exploratory data analysis (EDA)

Exploratory data analysis (EDA) is a method of examining datasets to understand how the data are organized. EDA usually refers to a way of thinking and a set of tools for flexible data analysis that does not presuppose anything about how the data were originally generated. The volume and complexity of data created by enterprises continue to increase, and EDA is a strategy for performing statistical analysis of such data.

Histogram maps of the raw image and the predicted image are plotted to clarify the difference.
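A short sketch of this comparison, assuming Matplotlib and uint8 RGB images in the 0–255 range; the function and variable names are illustrative.

```python
# Plot the pixel-intensity histogram of a hazy image next to its dehazed counterpart.
import matplotlib.pyplot as plt

def plot_histograms(hazy_img, dehazed_img):
    fig, axes = plt.subplots(1, 2, figsize=(10, 4))
    axes[0].hist(hazy_img.ravel(), bins=256, range=(0, 255))
    axes[0].set_title("Hazy image")
    axes[1].hist(dehazed_img.ravel(), bins=256, range=(0, 255))
    axes[1].set_title("Dehazed image")
    for ax in axes:
        ax.set_xlabel("Pixel intensity")
        ax.set_ylabel("Frequency")
    plt.tight_layout()
    plt.show()
```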

Figure 3 shows the column and row averages of the pixels of the haze and dehaze images. In the figure, the x-axis shows the rows and the y-axis shows the columns. Each picture consists of a grid of pixels with its own width and height: the number of columns determines the width, while the number of rows determines the height.

Fig. 3
figure 3

Average columns and rows of every pixels of haze image and dehaze image

Figure 4 shows the frequency of pixel values in the haze and dehaze images. Graphs (a) and (b) show the haze and dehaze image frequencies, respectively, plotted over the pixel-value range 0–255. Values closer to zero indicate darker shades, while values closer to 255 indicate lighter or whiter shades.

Fig. 4
figure 4

Frequency of pixels in range 0–255 of haze image and dehaze Image

Figure 5 shows the color intensity in the haze and dehaze images. Graph (a) shows the color intensity of the haze image and graph (b) that of the dehaze image. The x-axis and y-axis show the pixel-value range and the corresponding frequencies for both types of data, and the graphs illustrate the performance of the RGB color channels.

Fig. 5
figure 5

Intensity of every color channel in haze image and dehaze Image (color figure online)

Performance evaluation measures

Measuring the performance of trained DL models [18] requires performance assessment measures, which help determine how well a DL model can perform on a dataset it has never seen before. This part introduces some of the most useful performance assessment measures used in DL [7, 19, 20].

MSE (mean square error)

The most common way to measure the quality of an image is the MSE. It is a full-reference measure, and values closer to zero are better.

The MSE between two images \(g\left(x,y\right)\) and \(\widehat{g}(x,y)\) is defined as:

$${\text{MSE}} = \frac{1}{MN}\sum\limits_{n = 1}^{M} {\sum\limits_{m = 1}^{N} {[\hat{g}(n,m) - g(n,m)]^{2} } }$$
(2)

Equation (2) shows that the MSE measures the average squared error between the restored image and the reference.
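A direct NumPy rendering of Eq. (2), assuming both images are arrays of identical shape; the function name is illustrative.

```python
# Mean of the squared pixel-wise differences between reference g and restored g_hat.
import numpy as np

def mse(g, g_hat):
    g = np.asarray(g, dtype=np.float64)
    g_hat = np.asarray(g_hat, dtype=np.float64)
    return np.mean((g_hat - g) ** 2)
```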

RMSE (root mean square error)

The root-mean-squared error (RMSE) is another error measure commonly used to evaluate the gap between an estimator's predictions and the actual values. It quantifies the magnitude of the error and is a standard way of measuring the precision with which an estimator forecasts a given variable.

For an estimator \(\hat{\theta }\) of a given parameter, the RMSE is defined as the square root of the MSE:

$${\text{RMSE}}(\hat{\theta }) = \sqrt {{\text{MSE}}(\hat{\theta })}$$
(3)
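Following Eq. (3), the RMSE is simply the square root of the mean squared error; a minimal NumPy sketch, with an illustrative function name:

```python
# RMSE computed directly as the square root of the mean squared error.
import numpy as np

def rmse(g, g_hat):
    diff = np.asarray(g_hat, dtype=np.float64) - np.asarray(g, dtype=np.float64)
    return np.sqrt(np.mean(diff ** 2))
```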

PSNR (peak signal to noise ratio)

To assess the quality of a signal's representation, the PSNR computes the ratio between the highest possible signal power and the power of the distorting noise. When comparing two photographs, the difference is expressed in decibels. The PSNR is usually computed on the logarithmic decibel scale because of the wide dynamic range of the signals being measured, which spans the greatest and lowest possible values determined by their quality. The PSNR is defined as:

$${\text{PSNR}} = 10\log_{10} \left( {{\text{peakval}}^{2} /{\text{MSE}}} \right)$$
(4)
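A sketch of Eq. (4) in NumPy, with the equivalent scikit-image call noted for comparison; peakval is 255 for 8-bit images (or 1.0 for images normalized to [0, 1]).

```python
# PSNR in decibels, following Eq. (4).
import numpy as np
from skimage.metrics import peak_signal_noise_ratio

def psnr(g, g_hat, peakval=255.0):
    mse_value = np.mean((np.asarray(g_hat, np.float64) - np.asarray(g, np.float64)) ** 2)
    return 10.0 * np.log10(peakval ** 2 / mse_value)

# Equivalent library call for uint8 inputs:
# peak_signal_noise_ratio(g, g_hat, data_range=255)
```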

Structure similarity index method (SSIM)

SSIM is a technique that relies on people's subjective perception of similarity: an image is considered degraded when its structural information is altered. Other key perception-based effects, such as luminance masking and contrast masking, are also involved in this process. "Structural information" refers to pixels that are strongly interdependent or located in close proximity to one another; these intertwined pixels carry more detail about the visual objects in the picture. Luminance masking describes distortion being less visible at image edges, while contrast masking reduces the visibility of texture distortions in a picture. SSIM is used to assess image and video quality by comparing two images: the original and the one that was recovered.
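Since a faithful from-scratch SSIM implementation is lengthy, the sketch below relies on scikit-image's structural_similarity, assuming version 0.19 or later (older releases use multichannel=True instead of channel_axis).

```python
# SSIM between the original and the recovered image via scikit-image.
from skimage.metrics import structural_similarity

def ssim(reference, restored):
    # channel_axis=-1 handles RGB inputs; data_range=255 assumes uint8 images.
    return structural_similarity(reference, restored, channel_axis=-1, data_range=255)
```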

Blind/reference less image spatial quality evaluator (BRISQUE)

BRISQUE fits the mean subtracted contrast normalized (MSCN) coefficients and their neighborhood coefficients using generalized Gaussian distribution (GGD) and asymmetric generalized Gaussian distribution (AGGD) models; the resulting model parameters are used to evaluate image quality.

Table 1 and Fig. 6 show the performance of the base and proposed models. The proposed model achieves an RMSE of 0.012, SSIM of 0.99, PSNR of 66.5, BRISQUE of 15.22, and MSE of 3.21, while the base model's SSIM, PSNR, and BRISQUE are 0.99, 27.81, and 22.32, respectively. The proposed model therefore performs better than the existing model.

Table 1 Model performance between base and proposed model
Fig. 6
figure 6

Comparison graph of base and proposed model performance

Figure 7 shows the before-and-after images of the predicted results. The primary goal of image dehazing is to make hazy pictures more clearly visible; the hazy images are shown on the left and the dehazed images on the right. First, a hazy picture is fed into the network model, which estimates the transmission map from this image; next, the ratio of the foggy image to the transmission map is fed into the network, which removes the haze from the image. This improves dehazing performance by avoiding the estimation of ambient light (Fig. 7).

Fig. 7
figure 7

Output images of before and after haze and dehaze

Conclusion and future work

Image dehazing refers to the process of visually enhancing images that have deteriorated as a result of atmospheric conditions. Its primary purpose is to remove the haze or fog present in the image entirely without causing any degradation. The method has a wide range of potential applications, including video surveillance, underwater imaging, image compositing, image editing, interactive photomontage, and many more. Recent studies have found deep learning to be an excellent approach to image dehazing, and the application of deep learning techniques to the problem continues to develop. This research presents an image-dehazing technique that uses AlexNet in conjunction with a functional neural network model. The findings demonstrate that the suggested model not only performs dehazing successfully across a variety of scenarios, but also does not exhibit any evident color distortion, blur, or similar issues, and its output is closer to the expected outcome. The performance of the suggested method is assessed on the dataset of hazy and haze-free images. We obtain good SSIM (0.99), PSNR (66.5), RMSE (0.012), MSE (3.21), and BRISQUE (15.42) scores and demonstrate that our technique produces superior visual results compared with previous learning-based approaches. In the near future, we aim to improve the structure of the network and find other applications for it. In addition, we plan to expand the data collection and make it more accurate, and to train the network more intensively to further boost performance.