1 Introduction

Image enhancement techniques are widely used in image processing applications in which the subjective quality of images is important for human interpretation. Contrast, the difference in visual properties that makes an object distinguishable from other objects and from the background, is an important factor in any subjective evaluation of image quality [1, 2, 7, 17, 20, 22, 23].

Night vision refers to the ability to see in the dark. This ability is naturally possessed by animals such as owls and cats, but advances in science and technology have produced devices that enable human beings to see in the dark and in adverse atmospheric conditions such as fog, rain, and dust [3, 5, 19]. Night vision technology was originally developed for military use, to locate enemies at night. Today it is used extensively not only for military purposes, but also for navigation, surveillance, targeting, and security [4, 8, 18, 19, 26].

A few thermal IR datasets have been published in the past, such as the OTCBVS Benchmark [24, 27] and the LITIV Thermal-Visible Registration Dataset [6, 21, 25]. These datasets can be used to evaluate any image processing algorithm intended for better night vision. The proposed approach is based on trilateral contrast enhancement of IR night vision images. The paper is organized as follows. Section 2 gives the motivations and related work. Section 3 explains histogram equalization. Section 4 presents bilateral histogram equalization, referred to as bi-histogram equalization. Section 5 discusses the segmentation stage of the proposed approach. Section 6 discusses plateau histogram equalization. Section 7 covers an IR image enhancement approach based on the AWT with homomorphic processing. Section 8 presents the proposed trilateral contrast enhancement approach. Section 9 gives the performance evaluation quality metrics. Section 10 discusses the experimental results. Finally, Section 11 gives the conclusions and future work.

2 Motivations and related work

This paper deals with a vital topic derived from the problems addressed for IR images [1, 2, 3, 7, 17, 20, 22, 23]. The objective is the development of image processing technologies to enhance IR night vision images. The proposed approach is based on a hybrid implementation of three stages: segmentation, enhancement, and sharpening [3, 4, 5, 8, 18, 19, 24, 26, 27]. Compared to the most relevant work [2, 6], this work relies on performance evaluation with spectral entropy, average gradient, and Sobel edge magnitude [16, 21, 25]. The proposed approach depends on trilateral contrast enhancement, in which the IR night vision images pass through three stages: segmentation, enhancement, and sharpening. The results obtained in this paper are better than those of previous works, as shown in Tables 1, 2, 3, 4, 5 and 6 for six cases. Enhancement of night vision images and videos is very important for many computer vision tasks, such as visual tracking at night [11, 13]. The use of multiple features for tracking from IR videos can benefit from the proposed approach, since the effects of different types of variations, such as illumination, occlusion, and pose, can be mitigated [9, 10].

Table 1 Numerical results of the first experiment before the last sharpening stage
Table 2 Numerical results of the second experiment before the last sharpening stage
Table 3 Numerical results of the third experiment before the last sharpening stage
Table 4 Numerical results of the fourth experiment before the last sharpening stage
Table 5 Numerical results of the fifth experiment before the last sharpening stage
Table 6 Numerical results of the sixth experiment before the last sharpening stage

To intelligently analyze and understand video content, a key step is to accurately perceive the motion of the objects of interest. The task of object tracking aims to determine the position and status of the objects of interest in consecutive video frames. This field is very important, and has received great research interest in the last decade. Although numerous algorithms have been proposed for object tracking in RGB videos, the task remains much less explored for IR videos [12, 14, 15].

3 Histogram equalization

Histogram equalization (HE) is a specific case of the more general class of histogram remapping methods. These methods adjust the image to make it easier to analyze or to improve its visual quality. HE can also be applied to color images by processing the Red, Green, and Blue components of the RGB representation separately [7].

It should be noted, however, that applying the same method to the Red, Green, and Blue components of an RGB image may change the color balance dramatically, since the relative distributions of the color channels are altered by the algorithm. If the image is first converted to another color space, in particular the Lab or HSL/HSV color space, the algorithm can instead be applied to the luminance channel without changing the hue and saturation of the image. The HE operation can be represented as follows [22].

$$ b\left(x,y\right)=f\left[c\left(x,y\right)\right] $$
(1)

where c(x,y) is an image with a poor histogram, and f is the function that transforms the image c(x,y) into an image b(x,y). The Probability Density Function (PDF) of a pixel value a in the image c is given by:

$$ {p}_c(a)=\frac{1}{Area}{H}_c(a) $$
(2)

In fact, pc(a) is the probability of finding a pixel with the value a in the image c. Area is the area or number of pixels in the image, and Hc(a) is the histogram value of the image c for gray level a. The cumulative distribution function (CDF) for gray level a in image c is therefore given by:

$$ {P}_c(a)=\sum \limits_{i=0}^a{p}_c(i)=\frac{1}{Area}\sum \limits_{i=0}^a{H}_c(i) $$
(3)

The CDF is the sum of all PDF values up to the value a. Ideally, the image b has a flat histogram, such that Hb(0) = Hb(1) = ... = Hb(a) = ... = Hb(255); the probabilities of all pixel values are then equal, i.e., all values occur approximately the same number of times. Hence, the desired HE function f(a) is simply the CDF of the image c scaled by the reciprocal of the uniform PDF of the output image:

$$ f(a)={D}_m\frac{1}{Area}\sum \limits_{i=0}^a{H}_c(i) $$
(4)

Dm is the number of gray levels in the new image b. Assuming histogram uniformity in the image b, we can conclude that Dm = 1/pb(a) for all pixel values a in the image b. It is important to realize that HE may reduce the number of distinct gray levels in the image, because the equalization is a nonlinear mapping that may merge several gray levels of the poorly-distributed histogram into a single gray level in the equalized image.
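For illustration, the HE mapping of Eqs. (2)-(4) can be realized in a few lines of NumPy, as sketched below for an 8-bit grayscale image. The function name is illustrative, and the choice of Dm = 255 (i.e., mapping onto the full 8-bit range) is an assumption rather than part of the original formulation.

```python
import numpy as np

def histogram_equalize(img, levels=256):
    """Minimal HE sketch for a 2-D uint8 image, following Eqs. (2)-(4)."""
    hist, _ = np.histogram(img.ravel(), bins=levels, range=(0, levels))
    pdf = hist / img.size                   # Eq. (2): p_c(a) = H_c(a) / Area
    cdf = np.cumsum(pdf)                    # Eq. (3): P_c(a)
    mapping = np.round((levels - 1) * cdf)  # Eq. (4), with Dm taken as levels - 1
    return mapping[img].astype(np.uint8)

# Usage: equalized = histogram_equalize(ir_image)   # ir_image: 2-D uint8 array
```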

4 Bi-histogram equalization

Bi-histogram equalization (BHE) divides the original image histogram into two histograms, with the mean value of the original image as the separation point. The two sub-histograms are then equalized separately by histogram equalization. BHE is performed in the following steps; a code sketch of the complete procedure is given after the list.

  1. Mean computation: The mean value xm of the input image is computed.

  2. Bi-histogram formation: Based on the mean value, the input image histogram is divided into two sub-images xa and xb, generated as [22]:

$$ {x}_a=\left\{x\left(i,j\right)\ |\ x\left(i,j\right)\le {x}_m\right\} $$
(5)
$$ {x}_b=\left\{x\left(i,j\right)\ |\ x\left(i,j\right)>{x}_m\right\} $$
(6)
$$ x={x}_a\cup {x}_b $$
(7)

where x is the input image, and xa and xb are the two sub-images.

  3. Histogram equalization of sub-images: Histogram equalization is applied to each sub-image in the same way as for a traditional (full) image.
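The three BHE steps can be combined as in the following sketch for an 8-bit grayscale image. Equalizing each sub-image over its own gray-level range ([0, xm] and [xm + 1, 255]) is one common reading of step 3 and is stated here as an assumption; the function name is illustrative.

```python
import numpy as np

def bi_histogram_equalize(img):
    """BHE sketch: split at the mean x_m (Eqs. 5-7) and equalize each
    sub-image over its own gray-level range."""
    img = img.astype(np.uint8)
    x_m = int(img.mean())
    out = np.empty_like(img)
    for mask, lo, hi in [(img <= x_m, 0, x_m), (img > x_m, x_m + 1, 255)]:
        vals = img[mask]
        if vals.size == 0:
            continue
        hist, _ = np.histogram(vals, bins=hi - lo + 1, range=(lo, hi + 1))
        cdf = np.cumsum(hist) / vals.size
        out[mask] = lo + np.round((hi - lo) * cdf[vals - lo]).astype(np.uint8)
    return out
```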

5 Segmentation stage

This stage is based on Otsu's thresholding method, which is an optimum global thresholding method. It is a non-parametric and unsupervised method of automatic threshold selection for image segmentation. It is a simple procedure that utilizes only the zeroth- and first-order cumulative moments of the gray-level histogram, and it is optimum in the sense that it maximizes the between-class variance, a well-known measure used in statistical discriminant analysis [16]. For an image of size M × N with L gray levels, the total number of pixels can be written as

$$ MN={n}_0+{n}_1+{n}_2+\dots +{n}_{L-1} $$
(8)

where M × N is the size of the image, ni is the total number of pixels in the image with level i. Suppose we select a threshold k, and use it to threshold the image into two classes, C1 and C2. Class C1 consists of pixels with intensity values in the range [0, k]. Class C2 consists of the pixels with intensity values in the range [k + 1, L-1]. Using this threshold, the probability, P1(k), that a pixel is assigned to class C1 is given by the cumulative sum as follows:

$$ {P}_1(k)=\sum \limits_{i=0}^k{p}_i\kern0.5em $$
(9)

The pixels of the input image are represented in L gray levels, pi = ni/(MN) is the probability of occurrence of gray level i, and k is a selected threshold in the range 0 < k < L − 1.

Similarly, the probability of pixels in Class C2 is,

$$ {P}_2(k)=\sum \limits_{i=k+1}^{L-1}{p}_i=1-{P}_1(k) $$
(10)

where P1(k) is the probability of pixels in Class C1.

The mean intensity value of the pixels assigned to class C1 is

$$ {m}_1(k)=\frac{1}{P_1(k)}\sum \limits_{i=0}^k\ i\ {p}_i\kern0.5em $$
(11)

Similarly, the mean intensity value of the pixels assigned to class C2 is

$$ {m}_2(k)=\frac{1}{P_2(k)}\sum \limits_{i=k+1}^{L-1}\ i\ {p}_i $$
(12)

The global mean is given by,

$$ {m}_G(k)=\sum \limits_{i=0}^{L-1}\ i\ {p}_i $$
(13)

The problem is to find an optimum value for k, which maximizes the criterion defined by this equation:

$$ y(k)=\frac{{\sigma_B}^2(k)}{{\sigma_G}^2(k)} $$
(14)

where  σB2(k) is the between-class variance defined as

$$ {\sigma_B}^2(k)={P}_1{\left({m}_1-{m}_G\right)}^2+{P}_2{\left({m}_2-{m}_G\right)}^2 $$
(15)

and σG2(k) is the global variance defined as,

$$ {\sigma_G}^2(k)=\sum \limits_{i=0}^{L-1}{\left(i-{m}_G\right)}^2\ {p}_i $$
(16)

where the optimum threshold is the value k* that maximizes σB2(k).
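For illustration, the threshold search can be written compactly in NumPy as below. The sketch uses the standard one-pass form σB²(k) = (mG P1(k) − m(k))² / (P1(k)(1 − P1(k))), where m(k) is the cumulative mean up to level k; this form is algebraically equivalent to Eq. (15), and the function name is illustrative.

```python
import numpy as np

def otsu_threshold(img, levels=256):
    """Otsu sketch (Eqs. 8-16): return the k* that maximizes sigma_B^2(k)."""
    hist, _ = np.histogram(img.ravel(), bins=levels, range=(0, levels))
    p = hist / hist.sum()                  # p_i
    i = np.arange(levels)
    P1 = np.cumsum(p)                      # Eq. (9)
    m = np.cumsum(i * p)                   # cumulative mean up to level k
    mG = m[-1]                             # Eq. (13)
    with np.errstate(divide='ignore', invalid='ignore'):
        sigma_b2 = (mG * P1 - m) ** 2 / (P1 * (1.0 - P1))   # Eq. (15), one-pass form
    return int(np.argmax(np.nan_to_num(sigma_b2)))          # optimum threshold k*

# Usage: k_star = otsu_threshold(ir_image); foreground = ir_image > k_star
```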

6 Plateau histogram equalization

Plateau histogram equalization (PHE) modifies the shape of the input histogram by limiting the values of the histogram bins with a threshold before the equalization takes place. First, an appropriate threshold value T is selected. If the value of P(Xk) is greater than T, it is forced to be equal to T; otherwise, it is left unchanged, as shown below [17]:

$$ P\left({X}_k\right)=\frac{n_k}{n} $$
(17)

where nk represents the number of times that the level Xk appears in the input image and n is the total number of samples in the input image, for k = 0, 1, ...., L − 1.

$$ {P}_T\left({X}_k\right)=\begin{cases}P\left({X}_k\right), & P\left({X}_k\right)\le T\\ T, & P\left({X}_k\right)>T\end{cases} $$
(18)

where PT(Xk) is the modified probability density function, P(Xk) is the original one, and T is the selected threshold value.

Then, histogram equalization is carried out using this modified probability density function. There is one main problem associated with plateau histogram equalization: most methods require the user to set the plateau threshold manually, which makes them unsuitable for automatic systems. Although some methods can set the plateau threshold automatically, the process of deciding on a threshold is often complicated.

The selection of the plateau threshold value is very important for IR image enhancement, as it affects the contrast of the images. An appropriate plateau threshold value greatly enhances the contrast of the image. In addition, some plateau values are appropriate for some IR images, but not for others. As a result, the plateau threshold value should be selected adaptively according to the IR image.

The steps of this algorithm are performed as follows (a code sketch is given after the list):

  1. The IR image of an object is obtained through the optical lens of a thermal imager.

  2. The image is considered in matrix form with different pixel values.

  3. All pixel values of the image are arranged in ascending order.

  4. The histogram is estimated.

  5. The median of the image levels is estimated and used as a threshold.

  6. A comparison with the estimated threshold is performed to determine the required processing.

  7. Histogram equalization is performed for every pixel.
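A minimal sketch of these steps is given below. The paper's step 5 ("the median of the image levels is estimated and used as a threshold") is read here as the median of the nonzero histogram counts, which is one common adaptive choice; this reading and the function name are assumptions.

```python
import numpy as np

def plateau_histogram_equalize(img, levels=256):
    """Adaptive plateau HE sketch for a 2-D uint8 image, following Eqs. (17)-(18)."""
    hist, _ = np.histogram(img.ravel(), bins=levels, range=(0, levels))
    T = np.median(hist[hist > 0])       # assumed adaptive plateau threshold (step 5)
    clipped = np.minimum(hist, T)       # Eq. (18): bins above T are forced to T
    cdf = np.cumsum(clipped) / clipped.sum()
    return np.round((levels - 1) * cdf)[img].astype(np.uint8)   # equalize (step 7)
```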

7 AWT with homomorphic enhancement

In this approach, we merge the benefits of the AWT and homomorphic enhancement. First, the IR image is decomposed into sub-bands using the AWT. After that, each sub-band is processed separately using homomorphic enhancement to reinforce the image details.

A visual image can be represented as a product of two components as follows:

$$ f\left({n}_1,{n}_2\right)=\kern0.5em i\left({n}_1,{n}_2\right)r\left({n}_1,{n}_2\right) $$
(19)

where f(n1, n2) is the obtained image pixel value, i(n1, n2) is the light illumination incident on the object to be imaged and r(n1, n2) is the reflectance of that object.

It is known that illumination is approximately constant, since the light falling on all objects is approximately the same. The only change between objects is in the reflectance component.

If we apply a logarithmic process on Eq. (19), we can change the multiplication process into an addition process as follows:

$$ \log \left(f\left({n}_1,{n}_2\right)\right)=\log \left(i\left({n}_1,{n}_2\right)\right)+\log \left(r\left({n}_1,{n}_2\right)\right) $$
(20)

The first term in the above equation has small variations, but the second term has large variations, as it corresponds to the reflectivity of the object to be imaged. By attenuating the first term and reinforcing the second term of Eq. (20), we can reinforce the image details. This idea can be extended to IR image enhancement by treating the image pixels as values only, without considering how pixel values are formed in IR imaging.

The steps of the AWTH approach can be summarized as follows (a code sketch is given after the list):

  1. Decompose the IR image into four sub-bands p3, w1, w2, and w3 using the additive wavelet transform and the low-pass filter mask given by [2]:

$$ H=\frac{1}{256}\left(\begin{array}{ccccc}1& 4& 6& 4& 1\\ 4& 16& 24& 16& 4\\ 6& 24& 36& 24& 6\\ 4& 16& 24& 16& 4\\ 1& 4& 6& 4& 1\end{array}\right) $$
(21)

  2. Apply a logarithmic operation on each of the sub-bands w1, w2, and w3, since they contain the details, to obtain their illumination and reflectance components.

  3. Perform a reinforcement operation on the reflectance component of each sub-band and an attenuation operation on the illumination component.

  4. Reconstruct each sub-band from its illumination and reflectance components using addition and exponentiation processes.

  5. Apply adaptive plateau histogram equalization on p3.

  6. Perform an inverse additive wavelet transform by adding p3, w1, w2, and w3 after the homomorphic processing to obtain the enhanced image.
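The following sketch illustrates these steps under several stated assumptions: the à trous decomposition uses dilated versions of the mask in Eq. (21); the homomorphic step splits the logarithm of each (offset-shifted) detail plane into a smooth illumination-like part and a residual reflectance-like part by Gaussian filtering; the gains (0.5 and 2.0) and the smoothing scale are illustrative; and plateau_histogram_equalize refers to the sketch given in Section 6.

```python
import numpy as np
from scipy.ndimage import convolve, gaussian_filter

# 5x5 low-pass mask of Eq. (21)
H = np.array([[1, 4, 6, 4, 1],
              [4, 16, 24, 16, 4],
              [6, 24, 36, 24, 6],
              [4, 16, 24, 16, 4],
              [1, 4, 6, 4, 1]], float) / 256.0

def atrous_kernel(j):
    """Dilate the base mask by inserting 2**j - 1 zeros between coefficients."""
    step = 2 ** j
    k = np.zeros((4 * step + 1, 4 * step + 1))
    k[::step, ::step] = H
    return k

def awt_decompose(img, scales=3):
    """Additive (a trous) wavelet transform: img = p_scales + sum of detail planes."""
    planes, approx = [], img.astype(float)
    for j in range(scales):
        smooth = convolve(approx, atrous_kernel(j), mode='reflect')
        planes.append(approx - smooth)      # detail planes w1, w2, w3
        approx = smooth
    return approx, planes                   # p3 and [w1, w2, w3]

def homomorphic_boost(band, low_gain=0.5, high_gain=2.0, sigma=5):
    """Assumed homomorphic step: attenuate the smooth (illumination-like)
    log component and reinforce the residual (reflectance-like) one."""
    shifted = band - band.min() + 1.0       # keep the logarithm defined
    logb = np.log(shifted)
    illum = gaussian_filter(logb, sigma)
    refl = logb - illum
    return np.exp(low_gain * illum + high_gain * refl) + band.min() - 1.0

def awth_enhance(img):
    """Steps 1-6: decompose, process details, equalize p3, recombine."""
    p3, details = awt_decompose(img, scales=3)
    details = [homomorphic_boost(w) for w in details]
    # plateau_histogram_equalize is the Section 6 sketch (step 5).
    base = plateau_histogram_equalize(np.clip(p3, 0, 255).astype(np.uint8))
    out = base.astype(float) + sum(details)       # step 6: inverse AWT by addition
    return np.clip(out, 0, 255).astype(np.uint8)
```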

In image processing, it is often desirable to emphasize the high-frequency components representing the image details without eliminating the low-frequency components. The high-boost filter can be used to amplify such high-frequency components. The amplification is achieved by subtracting a smoothed version of the image from the original one [1].

$$ {W}_{hb}=A{W}_{allpass}+{W}_{hp} $$
(22)

where Whp is a high-pass filter, Wallpass is the all-pass (identity) filter, A is a constant, and Whb is the resulting high-boost filter:

$$ \kern2em {W}_{hb}=\kern1em \left[\begin{array}{ccc}0& -1& 0\\ {}-1& A+8& -1\kern0.75em \\ {}0& -1& 0\end{array}\right] $$
(23)
$$ {W}_{allpass}=\left[\begin{array}{ccc}0& 0& 0\\ {}0& 1& 0\kern0.5em \\ {}0& 0& 0\end{array}\right] $$
(24)
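For illustration, high-boost sharpening in the sense of Eq. (22) can be applied as sketched below. The sketch assumes the 4-neighbour high-pass mask [[0, −1, 0], [−1, 4, −1], [0, −1, 0]], so that the centre coefficient becomes A + 4; this mask choice and the value of A are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import convolve

def high_boost(img, A=1.2):
    """Sketch of Eq. (22): W_hb = A * W_allpass + W_hp, with an assumed
    4-neighbour high-pass mask (centre coefficient A + 4)."""
    W_hb = np.array([[0, -1, 0],
                     [-1, A + 4, -1],
                     [0, -1, 0]], float)
    out = convolve(img.astype(float), W_hb, mode='reflect')
    return np.clip(out, 0, 255).astype(np.uint8)

# Usage: sharpened = high_boost(enhanced_ir, A=1.2)
```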

8 The proposed trilateral contrast enhancement approach

The proposed approach is concerned with the enhancement of IR night images based on trilateral contrast enhancement. The word trilateral refers to three stages: the IR night images pass through segmentation, enhancement, and sharpening stages (Fig. 1).

Fig. 1
figure 1

Steps of the proposed approach

The steps of the proposed approach can be summarized as follows (a schematic code sketch is given after the list):

  1. Acquire an IR night vision image from an IR camera.

  2. Divide the IR image into overlapping sub-images through a segmentation stage.

  3. Apply AWPH equalization on the resultant image.

  4. Apply the high-boost filter on the enhanced resultant image.
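The sketch below is only a schematic composition of the three stages, reusing the helpers sketched in the previous sections (otsu_threshold, awth_enhance, and high_boost). It assumes that the segmentation splits the frame into two intensity classes that are enhanced separately and then recombined; the paper's overlapping sub-image handling is not reproduced here.

```python
import numpy as np

def trilateral_enhance(ir_image):
    """Schematic three-stage flow: segmentation -> enhancement -> sharpening."""
    k_star = otsu_threshold(ir_image)                       # 1) segmentation stage
    enhanced = np.zeros_like(ir_image)
    for mask in (ir_image <= k_star, ir_image > k_star):    # 2) enhance each region
        part = np.where(mask, ir_image, 0).astype(np.uint8)
        enhanced = np.where(mask, awth_enhance(part), enhanced)
    return high_boost(enhanced, A=1.2)                      # 3) high-boost sharpening
```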

9 Performance evaluation metrics

This section presents the quality metrics used for the evaluation of the enhancement results. These metrics include the average gradient (AG), the spectral entropy (Ef), and the Sobel edge magnitude (∇f). They are evaluated as follows [8]:

$$ AG=\frac{1}{mn}\sum \limits_{x=1}^m\sum \limits_{y=1}^n\sqrt{\frac{\left({\left(\frac{\partial f}{\partial x}\ \right)}^2+{\left(\frac{\partial f}{\partial y}\ \right)}^2\right)}{2}\kern0.5em } $$
(25)

where AG is the average gradient of the IR image f, and m × n is the size of the IR image.

The spectral entropy is computed in the discrete cosine transform (DCT) domain on a block-by-block basis as illustrated in Fig. 2. It is a function of the probability distribution of the local DCT coefficient values. This probability distribution function (PDF) is given as follows [15]:

$$ p\left(i,j\right)=\frac{c^2\left(i,j\right)}{\sum \limits_i\sum \limits_j{c}^2\left(i,j\right)} $$
(26)
Fig. 2
figure 2

Estimation of spectral and spatial entropies for an image

where 1 ≤ i ≤ 8, 1 ≤ j ≤ 8, (i, j) ≠ (1, 1), and c(i, j) represents the DCT coefficients.

The local spectral entropy is defined as [27]:

$$ {E}_f=-\sum \limits_i\sum \limits_jp\left(i,j\right){\log}_2p\left(i,j\right)\kern0.75em $$
(27)
$$ \nabla f=\sqrt{{f_x}^2+{f_y}^2\kern0.75em } $$
(28)

where ∇f is the Sobel edge magnitude, and fx and fy are two images containing the horizontal and vertical derivative approximations, respectively.
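The three metrics can be computed with NumPy and SciPy as sketched below. How the per-block entropies of Eq. (27) and the per-pixel Sobel magnitudes of Eq. (28) are aggregated into single scores is not stated here, so the sketch reports their means; this aggregation and the function names are assumptions.

```python
import numpy as np
from scipy.fft import dctn
from scipy.ndimage import sobel

def average_gradient(f):
    """Eq. (25): mean RMS of the horizontal and vertical gradients."""
    fy, fx = np.gradient(f.astype(float))
    return float(np.mean(np.sqrt((fx ** 2 + fy ** 2) / 2.0)))

def block_spectral_entropy(f, block=8):
    """Eqs. (26)-(27): DCT-domain entropy per 8x8 block (DC term excluded),
    averaged over all complete blocks (assumed aggregation)."""
    f = f.astype(float)
    h, w = (np.array(f.shape[:2]) // block) * block
    entropies = []
    for r in range(0, h, block):
        for c in range(0, w, block):
            C = dctn(f[r:r + block, c:c + block], norm='ortho')
            C[0, 0] = 0.0                        # drop the DC coefficient of Eq. (26)
            energy = np.sum(C ** 2)
            if energy == 0:
                entropies.append(0.0)
                continue
            p = (C ** 2 / energy).ravel()
            p = p[p > 0]
            entropies.append(float(-np.sum(p * np.log2(p))))   # Eq. (27)
    return float(np.mean(entropies))

def sobel_edge_magnitude(f):
    """Eq. (28): mean Sobel gradient magnitude (assumed aggregation)."""
    fx = sobel(f.astype(float), axis=1)
    fy = sobel(f.astype(float), axis=0)
    return float(np.mean(np.sqrt(fx ** 2 + fy ** 2)))
```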

10 Simulation results

This section presents several simulation experiments executed on IR night vision images. The results adopt a strategy of presenting the original IR images together with their enhanced versions obtained using different enhancement methods. The results of the first experiment are shown in Fig. 3. Part (a) gives the original IR night vision image. Part (b) gives the IR image after AWPH equalization. Part (c) gives the IR image after adaptive plateau histogram equalization. Part (d) gives the result of AWT with homomorphic enhancement on three sub-bands. Part (e) gives the IR image after bi-histogram equalization. Part (f) gives the enhanced IR image using the proposed algorithm. Comparing the results in Parts (b) through (f), it is clear that the proposed enhancement approach gives the best visual quality of the processed image. The performance metric results are given in Table 1. Similar experiments have been carried out on other IR images, and the results are given in Figs. 4 and 5. The higher the values of the average gradient and Sobel edge magnitude, the better the image quality. The proposed algorithm has succeeded in improving the visual quality of the IR images with rich details. From these results, it is clear that the proposed approach obtains the best results in the improvement of IR night vision images from both the visual quality and performance metrics perspectives, as illustrated in Tables 2 and 3.

Fig. 3
figure 3

Visual results of the first experiment

Fig. 4
figure 4

Visual results of the second experiment

Fig. 5
figure 5

Visual results of the third experiment

To further confirm the effectiveness of the proposed approach, experiments on images from other datasets are presented. The Dune and Otcbvs images, each of size 300 × 300 pixels, and the Car image of size 301 × 149 pixels were provided by Shao et al. [6, 21, 24, 25]. The proposed approach has been tested on these images, and the results are shown in Figs. 6, 7 and 8. The results illustrate that the proposed approach is superior compared with the other methods. The numerical results are given in Tables 4, 5 and 6. The distributions of block spectral entropy for all experiments are shown in Figs. 9, 10, 11, 12, 13 and 14. These results also confirm that the proposed approach is superior compared with the other methods.

Fig. 6
figure 6

Visual results of the fourth experiment

Fig. 7
figure 7

Visual results of the fifth experiment

Fig. 8
figure 8

Visual results of the sixth experiment

Fig. 9
figure 9

Distributions of block spectral entropies for the first experiment (a) after the AWPH (b) after the adaptive plateau histogram equalization (c) after the bi-histogram equalization (d) after the HE (e) after the proposed approach

Fig. 10
figure 10

Distributions of block spectral entropies for the second experiment (a) after the AWPH (b) after the adaptive plateau histogram equalization (c) after the bi-histogram equalization (d) after the HE (e) after the proposed approach

Fig. 11
figure 11

Distributions of block spectral entropies for the third experiment (a) after the AWPH (b) after the adaptive plateau histogram equalization (c) after the bi-histogram equalization (d) after the HE (e) after the proposed approach

Fig. 12
figure 12

Distributions of block spectral entropies for the fourth experiment (a) after the AWPH (b) after the adaptive plateau histogram equalization (c) after the bi-histogram equalization (d) after the HE (e) after the proposed approach

Fig. 13
figure 13

Distributions of the block spectral entropies for the fifth experiment (a) after the AWPH (b) after the adaptive plateau histogram equalization (c) after the bi-histogram equalization (d) after the HE (e) after the proposed approach

Fig. 14
figure 14

Distributions of the block spectral entropy for the sixth experiment (a) after the AWPH (b) after the adaptive plateau histogram equalization (c) after the bi-histogram equalization (d) after the HE (e) after the proposed approach

11 Conclusions and future work

This paper presented an approach for the enhancement of IR night vision images. It is a trilateral contrast enhancement approach that depends on three stages: segmentation, enhancement, and sharpening. The proposed approach comprises an enhancement stage using the AWTH. Simulation results revealed that the proposed approach gives results superior to those of the other methods from the quality metrics perspective. For future work, deep learning models for object detection from IR images will be considered in conjunction with IR image pre-processing.