1 Introduction

A large body of research addresses face recognition in the visible spectrum. However, visible-spectrum face recognition suffers from variations in lighting [1]. To address this problem, 3D face recognition [2] or a combination of the visible and infrared (IR) spectra [3] has been suggested. There is a continuing need for more robust security systems that are not affected by variations in lighting. Because the IR spectrum is not affected by such variations, interest has grown in developing face recognition systems based only on the infrared spectrum.

It has been reported that the IR spectrum could offer a promising alternative to the visible spectrum for face recognition, specifically when the face appearance varies because of illumination changes [4, 5]. In particular, Jain et al. [6] reported that the IR spectrum enables human identification under different lighting conditions, even in total darkness. Wolff et al. [7] concluded that the IR spectrum is nearly invariant to changes in ambient illumination. Thus, IR-based human recognition systems have the potential to offer simpler yet robust solutions that perform well in uncontrolled environments.

Segmentation is an important step in recognition systems. The recognition rates of most approaches can be improved by a good segmentation technique, as it enables the face shape to be used in the recognition process [8, 9]. Few studies address the segmentation of thermal face images. Here, we give an overview of these studies.

Aglika et al. [10] proposed a segmentation approach in which an elliptical mask is placed over the face image to remove the background and to align and scale the faces. However, this approach is applicable only to frontal, centered faces. Pavlidis et al. [11] suggested a face segmentation method based on a Bayesian approach. This method relies on models of both the skin and the background pixel intensities. As a result, clothing pixels were included as skin pixels, while some skin pixels were ignored and treated as background. In another study, Cho et al. [12] proposed a segmentation method for IR face images using contours and morphological operations. The Sobel edge detector was used for edge detection, and morphological operations were then applied to the contour to connect open contours and remove small areas. Recently, Filipe et al. [13] proposed two segmentation methods that make use of active contour approaches and statistical modeling of pixel intensities. The two methods are robust against face pose, expression, and rotation. In addition, they addressed the problem of treating the clothes as part of the face, thus enabling the segmented face shape to be used in recognition methods.

Superpixels can improve the computational efficiency of algorithms, as they reduce hundreds of thousands of pixels to at most a few thousand superpixels. Algorithms for generating superpixels can be categorized as either graph based [14–17] or gradient-ascent based [18–22]. Quick-shift is a common gradient-ascent-based image segmentation method [18]. Its superpixels are not fixed in size or number and preserve most of the boundaries in the original image. The quick-shift parameters are usually determined by segmenting a few training images. Superpixel generation by quick-shift is controlled by three parameters: ratio, kernel size, and maximum distance.

In this paper, a face extraction model based on the superpixel technique is proposed for thermal IR human face images. Forming superpixels with quick-shift helps to obtain more accurate face extractions. Suitable quick-shift parameter values, combined with automatic thresholding by the simple Otsu's method, help to produce good face extraction results from thermal images. The Terravic Facial IR Database is used to evaluate our approach. The experimental results showed that the proposed model was robust against image illumination, face rotations, and different artifacts. Compared with the most closely related work, our model was found to be better in many cases.

2 Theoretical Background

2.1 Quick-Shift Method

The quick-shift method [18] is used to extract superpixels from the thermal face image. In this method, the superpixels depend on three parameters: ratio, kernel size, and maximum distance. Choosing the quick-shift parameters well makes the resulting image more meaningful and easier to use for extracting the thermal face superpixels. In this paper, the three parameter values are determined by segmenting a few training images by hand until we find a set that gives a good segmentation result for nearly all of the face boundaries and has the largest possible average segment size. In practice, the quick-shift algorithm is not very sensitive to the choice of parameters, so a quick manual tuning is sufficient for thermal face extraction.

In summary, the superpixels extracted by quick-shift depend on the following parameters:

  • Ratio, Ratio: It is a tradeoff between spatial and intensity consistency.

  • Kernel size, KernelSize: It is the parameter that controls the scale at which the density is estimated.

  • Max-distance, MaxDist: It is the maximum distance between two pixels that the method considers when building the tree.
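To make the role of the three parameters concrete, the following is a minimal pure-Python sketch of quick-shift on a small grayscale image. It is an illustration under stated assumptions, not the implementation used in this paper: the function name `quick_shift`, the default parameter values, the Gaussian Parzen window, and the choice to scale the intensity difference by `ratio` in the joint feature space are all assumptions made for this example.

```python
import math

def quick_shift(image, ratio=0.5, kernel_size=2.0, max_dist=4.0):
    """Toy quick-shift for a 2-D grayscale image (list of rows, values in [0, 1]).

    Returns a label map with one integer label per superpixel.
    All names and defaults here are illustrative assumptions.
    """
    h, w = len(image), len(image[0])
    radius = int(math.ceil(3 * kernel_size))   # window used for the density estimate
    inv = -0.5 / (kernel_size ** 2)

    def dist2(y0, x0, y1, x1):
        # Squared distance in the joint (scaled intensity, y, x) feature space.
        # Assumption: `ratio` weights intensity against spatial proximity.
        di = ratio * (image[y0][x0] - image[y1][x1])
        return di * di + (y0 - y1) ** 2 + (x0 - x1) ** 2

    def window(y, x, r):
        for yy in range(max(0, y - r), min(h, y + r + 1)):
            for xx in range(max(0, x - r), min(w, x + r + 1)):
                yield yy, xx

    # 1) Density estimate at the scale set by kernel_size.
    density = [[sum(math.exp(inv * dist2(y, x, yy, xx))
                    for yy, xx in window(y, x, radius))
                for x in range(w)] for y in range(h)]

    # 2) Link each pixel to its nearest neighbour of strictly higher density,
    #    if that neighbour is closer than max_dist; otherwise the pixel is a root.
    parent = {}
    search = int(math.ceil(max_dist))
    for y in range(h):
        for x in range(w):
            best, best_d2 = (y, x), max_dist ** 2
            for yy, xx in window(y, x, search):
                d2 = dist2(y, x, yy, xx)
                if density[yy][xx] > density[y][x] and d2 < best_d2:
                    best, best_d2 = (yy, xx), d2
            parent[(y, x)] = best

    # 3) Follow the links to each tree root; one tree is one superpixel.
    labels, root_ids = [[0] * w for _ in range(h)], {}
    for y in range(h):
        for x in range(w):
            node = (y, x)
            while parent[node] != node:
                node = parent[node]
            labels[y][x] = root_ids.setdefault(node, len(root_ids))
    return labels
```

In this sketch, a larger `ratio` makes intensity differences dominate the distance, so links rarely cross a strong intensity edge. A larger `max_dist` allows longer links and therefore tends to give fewer, larger superpixels.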

2.2 Otsu’s Thresholding Method

Converting a grayscale image to a binary image is a common task in image processing. Otsu’s method [23] is widely used to perform clustering-based image thresholding automatically [24]. The algorithm assumes that the image contains two classes of pixels (foreground and background). It evaluates every candidate threshold that separates the pixels into these two classes and selects the one that minimizes the weighted sum of the foreground and background spreads, i.e. the within-class variance.
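The exhaustive search described above can be sketched in a few lines. The function name `otsu_threshold` and the 256-level histogram are assumptions for this illustration. The sketch maximizes the between-class variance, which is equivalent to minimizing the weighted within-class variance.

```python
def otsu_threshold(pixels, levels=256):
    """Return the Otsu threshold t for integer gray levels in [0, levels).

    Pixels <= t form one class and pixels > t form the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    total_sum = sum(i * c for i, c in enumerate(hist))

    best_t, best_between = 0, -1.0
    w0 = 0      # number of pixels in the class <= t
    sum0 = 0    # sum of gray levels in that class
    for t in range(levels - 1):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so this threshold separates nothing
        mu0, mu1 = sum0 / w0, (total_sum - sum0) / w1
        # Between-class variance (up to a constant factor); maximizing it
        # minimizes the weighted sum of the two within-class variances.
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_between:
            best_t, best_between = t, between
    return best_t

# Two well-separated intensity groups: the first separating threshold wins.
print(otsu_threshold([10] * 50 + [200] * 50))  # -> 10
```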

3 Proposed Thermal Face Extraction Model

A model is proposed to extract human faces from thermal images. The model uses the Quick-Shift algorithm to produce superpixels and Otsu’s method for automatic thresholding. The steps of the proposed model are shown in Algorithm 1, and the model works as follows.

First, a thermal face image \(I_i\) is selected, where \(I_i\) is the ith of the N input images in the group, for \(i=1,2,3, \ldots ,N\). The Terravic Facial IR Database is used for the proposed model. The Quick-Shift method is applied with its initial parameters (ratio, kernel size, and maximum distance) to produce superpixels. Otsu’s thresholding method is then applied to the resulting superpixel image, so that each superpixel image is converted to a binary image \(B_i\) based on the optimum threshold. Finally, the relevant pixel values are extracted from the original thermal image. With suitable Quick-Shift parameter values and automatic thresholding, the model obtains its best face extraction results from the thermal images.

Algorithm 1
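The steps of the model can be sketched end to end as follows. This is a hedged illustration, not the authors' code. It assumes that a superpixel label map is already available (for example from quick-shift), that the superpixel image is formed by replacing each pixel with the mean intensity of its superpixel, and that the face is the brighter class after thresholding. The function names `otsu` and `extract_face` are hypothetical.

```python
def otsu(values):
    """Otsu threshold for a list of integer gray levels in 0..255."""
    hist = [0] * 256
    for v in values:
        hist[v] += 1
    total, total_sum = len(values), sum(values)
    best_t, best_score, w0, s0 = 0, -1.0, 0, 0
    for t in range(255):
        w0 += hist[t]
        s0 += t * hist[t]
        w1 = total - w0
        if w0 and w1:
            score = w0 * w1 * (s0 / w0 - (total_sum - s0) / w1) ** 2
            if score > best_score:
                best_t, best_score = t, score
    return best_t

def extract_face(image, labels):
    """Return (binary mask B, extracted face) for one thermal image I.

    image  : 2-D list of gray levels in 0..255
    labels : 2-D list of superpixel labels with the same shape (assumed given)
    """
    h, w = len(image), len(image[0])
    # Superpixel image: mean intensity per superpixel (an assumption).
    sums, counts = {}, {}
    for y in range(h):
        for x in range(w):
            k = labels[y][x]
            sums[k] = sums.get(k, 0) + image[y][x]
            counts[k] = counts.get(k, 0) + 1
    means = {k: round(sums[k] / counts[k]) for k in sums}
    superpixel_image = [[means[labels[y][x]] for x in range(w)] for y in range(h)]

    # Automatic threshold on the superpixel image gives the binary image B.
    t = otsu([v for row in superpixel_image for v in row])
    mask = [[1 if v > t else 0 for v in row] for row in superpixel_image]

    # Keep the original thermal values only where the mask is set.
    face = [[image[y][x] * mask[y][x] for x in range(w)] for y in range(h)]
    return mask, face

# Toy example: the right half is a warm region containing one darker pixel (90).
image = [[40, 40, 200, 200],
         [40, 40, 90, 200],
         [40, 40, 200, 200],
         [40, 40, 200, 200]]
labels = [[0, 0, 1, 1] for _ in range(4)]
mask, face = extract_face(image, labels)
```

In the toy example, the darker pixel inside the warm superpixel stays in the mask because the threshold is applied to superpixel means rather than to individual pixels.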

4 Experimental Results and Discussion

To evaluate our approach, the Terravic Facial IR Database [25] was used. This database contains 20 persons, each with a different number of images covering several variations (front, left, right; indoor/outdoor; glasses, hat). The images are 8-bit grayscale JPEGs of \(320 \times 240\) pixels. In our experiments, 22,784 thermal images of 18 persons from this database were used. Table 1 shows the distribution of the images for each class.

Table 1 Terravic facial IR database images distribution

Fig. 1 Extracting faces for class ‘face16’: a original image, b Otsu’s threshold [23], c Quick-Shift based threshold, d face extraction

Two main scenarios were designed to evaluate our proposed model:

  • The first scenario was to check the accuracy of extracting thermal faces using only Otsu’s automatic threshold.

  • The second scenario was to evaluate the accuracy of extracting thermal faces using Quick-Shift based automatic threshold.

Fig. 2 Extracting faces for class ‘face01’: a original image, b Otsu’s threshold [23], c Quick-Shift based threshold, d face extraction

To illustrate the two scenarios, we use classes (01), (12), and (16), which represent the various poses and variations required for face recognition. All of these classes contain different poses (front, left, right). Class (01) contains indoor images, while classes (12) and (16) contain outdoor images. As for accessories, class (01) contains glasses, whereas classes (12) and (16) contain both glasses and hat.

Figure 1 shows the results for class ‘face16’ for the two scenarios. The thermal images of this class were captured outdoors from the front, left, and right directions with glasses, hat, and both. From this figure, it is clear that both methods (Otsu’s and Quick-Shift) achieve good results for this class because there is a clear difference in brightness between the face area and the clothes, hat, glasses, and other surroundings.

In contrast, Fig. 2 shows the results for class ‘face01’, where the two methods (i.e. scenarios) give totally different results. The thermal images of class ‘face01’ were captured indoors from the front, left, and right directions with glasses, and Otsu’s method failed to extract the face area because of the brightness of the clothes.

For images of class ‘face12’, which were captured outdoors from the front, left, and right directions with glasses, hat, and both, the Quick-Shift based method achieved good results in some cases and poor results in others, as shown in Fig. 3. Whether a result was good or poor was observed to depend on the face direction.

Fig. 3 Extracting faces for class ‘face12’: a original image, b Otsu’s threshold [23], c Quick-Shift based threshold, d face extraction

To show the effectiveness of the proposed method, the face segmentation approach based on active contours [26] suggested by Filipe et al. [13] was implemented, and its results were compared with those of our model. The comparison was conducted on the same database (the Terravic Facial IR Database), and the results are illustrated in Fig. 4. From this figure, it can be seen that the proposed method was more robust than Filipe’s approach. Although both methods could effectively extract human faces when the intensity of the clothes was high, our method still gave better results than Filipe’s.

Fig. 4 Comparing the proposed model: a original image, b active contours [13, 26], c proposed method

From the above results, it can be concluded that the proposed method can successfully extract faces from thermal images taken indoors and outdoors under several variations, e.g. different directions (left, right, front), glasses, clothes, and hat. On the other hand, the model needs refinement for images that contain high-intensity clothes.

5 Conclusion

In this paper, we proposed a model for extracting faces from IR human face images. The model makes use of Otsu’s and Quick-Shift methods. Based on extensive experiments using 22,784 thermal images of 18 persons from the Terravic Facial IR Database, it was concluded that Quick-Shift can improve the face extraction results. Our model achieved excellent results for extracting faces from thermal images taken under several variations, e.g. different directions, glasses, clothes, and hat. Compared with the related work, our model’s results were better in all of these variations. As future work, we plan to (a) refine the model for the case where the clothes in the images have high intensity and (b) explore the effectiveness of the proposed model for object detection and extraction in other thermal databases such as the Terravic Motion IR Database, the Terravic Weapon IR Database, and the Thermal Infrared Video Benchmark for Visual Analysis.