Introduction

In recent decades, face recognition and facial feature detection have received much attention due to their applications in fields such as Human–Computer Interfaces (HCI), Brain–Computer Interfaces (BCI), and biometrics [1]. One interesting application is eye detection used as an "eye mouse" in electronic devices such as computers and smartphones, allowing users to read, write, and browse on these devices. Another important application of such a system is monitoring driver fatigue [2]. Eye detection in face images is a primary and important step in face detection, face recognition, pupil estimation, and human–computer interfaces. Visual perception involves cognitive processes including illumination [3], shape and geometric expression [4], dimension [5], and color [6].

Several techniques have been put forth for face detection, which can be classified into template-based, appearance-based, and feature-based methods. Template-based methods perform face detection using a general definition of face templates, both as a whole and as separate near-eye components [7, 8]. Appearance-based methods build a model based on the appearance of the eyes [5,6,7,8,9,10]. Feature-based methods obtain features such as eye corners and edges, as well as points selected according to specific filter responses. Eye detection is a difficult task due to eyelid occlusion, open or closed eyes, and different head positions [5]. In the proposed method, after the face detection step, the image is converted to grayscale and its lighting conditions are improved. Using the Sobel and Prewitt masks separately, the horizontal edges are revealed; the output images of the Sobel and Prewitt masks are then combined into a single image.

The suggested approach provides several clear benefits compared to current strategies in the field of real-time eye recognition. The combination of skin detection and geometric facial analysis in the first face identification process provides adaptation to different lighting situations and human racial traits, a quality that is often missing in traditional approaches. In addition, the method's dependence on Sobel and Prewitt edge detection algorithms improves the accuracy of eye recognition, especially in situations with intricate backdrops or obstructions. The use of edge detection not only enhances precision but also adds to the method's low computing cost, matching the need for real-time performance. Incorporating Euclidean distance measurements for verification purposes enhances the reliability of the detection process by utilizing innate face proportions, hence adding a layer of resilience. These combined advancements result in a technique that not only performs better than current methods in terms of how quickly it can be executed but also exceeds them in terms of accuracy, making it a strong competitor in the field of real-time eye detection.

The face image is divided into the top-right quarter, top-left quarter, and lower half. To connect nearby edges and to separate "almost connected" eyebrows, morphological operations such as dilation and erosion are applied to each part of the image separately. Labeling is performed on all parts of the image and the largest binary object in each part is found: the top-right part contains the most edges for the right eye, the top-left part for the left eye, and the lower part for the mouth. Mathematical relationships are then used to confirm the detected eyes and mouth in the image. The rest of the paper is organized as follows: Sect. "Related work" discusses the related works; Sects. "Detecting the face region" and "The proposed method for eye detection" describe the details of the proposed method; Sect. "Experiments" provides experimental results; and the final section presents conclusions.

Related work

Zhu and Ji [11] introduced a real-time eye detection method that works under different lighting conditions and face poses, using tracking and object detection based on appearance-based methods. Park et al. [12] introduced an eye detection method using an eye filter and reduced error recovery based on non-negative matrix factorization. Soetedjo's method [13] is based on shape and color features: to localize the eyes, it uses projection techniques (a white-threshold method that eliminates the white area of the sclera) and an ellipse detection method exploiting the oval shape of the eye. Sirohey and Rosenfeld [14] used linear and non-linear filters to detect eyes. Song et al. [15] introduced a method based on a combination of binary edge and intensity information, consisting of three steps: extracting binary edges from grayscale images using a multi-resolution wavelet transform, extracting the eye region from the binary edge image, and detecting eyes based on bright spots (highlights) and intensity information. Nasiri et al. [3] presented an illumination-based technique; although quite precise, its computational cost is very high and it only works on color images. Soylemez and Ergen [4] presented a method based on the hue circle, combined with cropping the top of the image. Chiang et al. [6] introduced a color-based method; it is fast but fails on face photos of people with dark skin, because the eyes are not darker than the skin area. Tian et al. [7] introduced a method that distinguishes closed and open eyes; it treats a half-open eye as a closed eye and does not work in poor lighting conditions or with eyes blocked by glasses. A technique based on blink patterns was presented by Grauman et al. [9] to represent speech; it is used to type text in Microsoft Word. There are also methods that use special equipment [16, 17]. Although these methods are very fast and accurate, they have limitations, including the need for additional equipment, working only on video, and distance limitations due to the loss of pupil reflections.

In 2022, Beltrán et al. [18] observed that the computing frameworks requiring dedicated computers in today's commercial and research systems are power-hungry, bulky, and expensive. They presented an integrated hardware-based solution for real-time face detection in which, from an algorithmic point of view, the widely used Viola–Jones approach was redesigned to allow single-step parallel image processing. Tests of the proposed machine's eye detection accuracy were performed on the CASIA-Iris-Distance V4 database; in practice, face detection accuracy reached 100% [19].

In 2020, Robin et al. [20] presented an improvement in the performance of face and eye detection utilizing multi-task cascaded convolutional networks. Face and eye detection in unconstrained conditions has long been problematic because of the variety of expressions, color fringing, and illumination. Using their own dataset, they presented an enhanced face and eye identification technique built on a deep cascaded multi-task network that exploits the inherent correlation between the two tasks to improve overall performance. Their technique achieved a 98% accuracy rate on their dataset, superior to the other techniques used to detect the face and eyes in an image, and their experiments suggest that the method also produces better face and eye detection output from videos [20].

In 2020, Knapik and Cyganek provided fast eye detection in thermal images. In recent years, many strategies have been proposed for eye detection; in some cases, however, such as driver drowsiness detection, light conditions are so difficult that thermal imaging is a strong alternative to visible-light sensors. They proposed an efficient approach for eye detection based on thermal image processing that can be effectively used in harsh environments, and compared their method with the YOLOv3 deep learning model. Their approach attains high accuracy and fast response in real conditions, without the computational complexity and large-dataset requirement associated with deep neural networks [21].

Padliya et al. [22] proposed a novel approach for extracting facial features and recognizing facial expressions from images. The study focused on detecting and measuring features such as eyes, nose, mouth, and eyebrows. These features are then used for expression recognition through matching techniques using Supervised Machine Learning. The researchers utilized the JAFFE (Japanese Female Facial Expression) database and classified each image into seven categories of facial expressions: (1) Angry, (2) Disgust, (3) Fear, (4) Happy, (5) Neutral, (6) Sad, and (7) Surprise. The goal was to improve the accuracy and efficiency of facial expression detection, which has applications in various fields such as marketing, teaching strategies, VR applications, and inclusive healthcare services.

Qadir et al. [23] proposed a novel approach that combines the Black Hole Algorithm (BHA) with the Canny edge detector and the circular Hough transform. Using this integrated method, they identified the circular parameters of the iris and pupil, achieving a segmentation accuracy of 98.71%. The system was tested on the CASIA-V3 database, and the segmentation-based BHA proved effective for iris identification in future access control applications. Overall, the technique enhanced iris segmentation and has potential applications in image analysis and biometric identification.

Liu et al. [24] focused on improving optical sensing technology for moving-image visual communication. By combining an optical-sensor image edge detection technique with AI recognition, the study enhanced edge detection accuracy. The process involved identifying objects in moving images using AI, locating target objects in the images via optical sensing, and extracting edge information. Experimental results demonstrated the effectiveness of this approach, outperforming conventional methods in optical image edge detection.

Chen et al. [25] proposed an eye detector for video-based eye-tracking systems that adopts a coarse-to-fine strategy. The detector consisted of three classifiers: an ATLBP-THACs feature-based cascade classifier, a branch CNN, and a multi-task CNN. Additionally, the study introduced a method for coarse pupil localization, which provides initial pupil coordinates for fine pupil localization. The proposed approach aimed to address challenges such as face rotation, glasses, eye-shape variation, and illumination changes in accurately detecting eyes and localizing pupils. The collected neepuEYE dataset, containing 5500 NIR eye images from 109 people, supports the evaluation of the proposed methods, demonstrating their effectiveness in achieving high detection rates and localization speeds.

Bonteanu et al. [26] proposed a system that uses a classifier implemented with slim-type neural networks to detect the position of the pupil within eye images. To reduce complexity, a parallel architecture with two independent classifiers was used to deliver pupil center coordinates. The system was trained, tested, and validated using almost 40,000 eye images from 20 different databases. The experimental results demonstrated a high detection rate (96.29% at five pixels) and fast processing speed (100 frames/s), making it suitable for real-time applications in various fields, including assistive technology for neuromotor-disabled patients, computer gaming, and automotive safety by monitoring driver cognitive state.

In 2020, Vijayalaxmi et al. [27] performed a conceptual review of face detection methods based on image processing. Recently, many traffic accidents have occurred due to distracted driving; in fact, about 32% of the drivers involved in these accidents were found to show signs of fatigue to varying degrees before the accident. The aim of that research paper was to re-examine the various interventions designed to help drivers prevent serious road accidents, and it summarizes the work that has been done in this direction. There are many ways to detect driver fatigue, notably biological or physiological measures, driving behavior, and, most importantly, visual analysis of the face and its attributes [28].

In 2022, Ahmed et al. presented eye detection using Faster R-CNN. Eye detection is important in many computer vision applications, such as driver drowsiness detection, human behavior analysis, liveness detection, and gaze estimation, and it remains a challenging task. Pre-processing is applied to make the data more suitable for training, and further optimization makes the detector more accurate and robust. The effectiveness of the detection model was analyzed on the publicly available AR and GI4E databases, where it achieved 98.32% and 98.11% accuracy, respectively, at 0.52 ms per image. Several experiments show that the proposed model performs better than modern eye detection methods [29].

In 2022, Raju et al. presented iris print attack detection using eye movement signals. Iris-based biometric authentication has become widely used due to its accuracy and other advantages, and improving the resistance of iris biometrics to spoofing attacks is an important research topic. Eye trackers use the same hardware as iris recognition systems, namely an infrared light source and an image sensor, which allows eye tracking to be integrated into iris biometric systems. Their work advances the state of the art in detecting iris print attacks, in which fraudsters present a printout of the user's iris to the biometric system; the attack is detected by analyzing the captured eye movement signals with a deep learning model. The results show that the selected method is superior to previous methods [30].

Detecting the face region

Face region detection is based on a skin color classification method. The flowchart of the method is shown in Fig. 1:

Fig. 1. Proposed method's flowchart of face detection

Skin detection

The proposed skin detection technique combines several skin detection methods. The RGB image is converted into the YCbCr and HSV color spaces, and skin detection is performed independently in each of the three color spaces. Skin detection in the YCbCr color space uses the method of Kukharev and Nowosielski [31]. Kovac et al. [28] derived rules for skin detection in the RGB color space by applying them to face shape models [32, 33]. The method presented in [18, 19] is then used to obtain the skin area in the HSV color space.

Skin detection in the YCbCr color space applies the conditions of Rule 1. Skin regions in the RGB color space are detected using Rules 2 and 3, and Rule 4 is applied in the HSV color space. Each result is a binary image in which white regions represent skin and black regions represent non-skin.

$$\text{Y}>80\text{ AND }85<\text{Cb}<135\text{ AND }135<\text{Cr}<180$$
(1)
$$R>95\; AND\; G>40\; AND\; B>20\; AND \left|R-G\right|>15\; AND\; R>G\; AND\; R>B\; AND\; \text{max}\left\{R,G,B\right\}-\text{min}\left\{R,G,B\right\}>15$$
(2)

Skin color in lamp light:

$$R>220\; AND\; G>210\; AND\; B>170\; AND\; \left|R-G\right|\le 15\; AND\; R>B\; AND\; G>B$$
(3)
$$0\le H<0.25\; AND\; 0.15\le S\le 0.9$$
(4)

The proposed skin detection method obtains the intersection of the binary images resulting from the YCbCr and RGB color spaces, and likewise the intersection of the binary images resulting from the HSV and RGB color spaces. The outputs are two binary images that specify the skin regions. A flowchart of the suggested skin detection method, followed by a code sketch of the rules, is shown in Fig. 2.

Fig. 2. The proposed method's flow diagram
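As a sketch under stated assumptions, Rules 1–4 and the two intersections can be written with OpenCV and NumPy. The channel scalings (OpenCV's 8-bit hue range and YCrCb channel order) and the union of Rules 2 and 3 for the RGB mask are our interpretation, not code from the paper:

```python
import cv2
import numpy as np

def detect_skin(bgr):
    """Rules 1-4 and the two intersections; returns two binary skin masks."""
    b, g, r = [c.astype(np.int32) for c in cv2.split(bgr)]
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb).astype(np.int32)
    y, cr, cb = ycrcb[..., 0], ycrcb[..., 1], ycrcb[..., 2]
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV).astype(np.float32)
    h, s = hsv[..., 0] / 180.0, hsv[..., 1] / 255.0   # normalize H to [0,1), S to [0,1]

    rule1 = (y > 80) & (cb > 85) & (cb < 135) & (cr > 135) & (cr < 180)   # YCbCr
    rule2 = ((r > 95) & (g > 40) & (b > 20) & (np.abs(r - g) > 15) &      # RGB, daylight
             (r > g) & (r > b) &
             (np.max(np.stack([r, g, b]), 0) - np.min(np.stack([r, g, b]), 0) > 15))
    rule3 = ((r > 220) & (g > 210) & (b > 170) &                          # RGB, lamp light
             (np.abs(r - g) <= 15) & (r > b) & (g > b))
    rule4 = (h >= 0) & (h < 0.25) & (s >= 0.15) & (s <= 0.9)              # HSV

    rgb_mask = rule2 | rule3                       # daylight OR lamp-light rule
    mask_ycbcr_rgb = (rule1 & rgb_mask).astype(np.uint8) * 255
    mask_hsv_rgb = (rule4 & rgb_mask).astype(np.uint8) * 255
    return mask_ycbcr_rgb, mask_hsv_rgb
```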


Morphology operation (at face detection phase)

Morphological image processing is a set of image processing operations based on object shapes [34]. Morphological operations include dilation, erosion, closing, and opening. These operations smooth object boundaries without significantly changing the regions involved. They are applied to the binary images obtained from skin detection, and their use improves face detection.
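For illustration, this cleanup can be done with OpenCV's morphological operators; the elliptical 3×3 structuring element and the input filename are our assumptions:

```python
import cv2

# Clean up a binary skin mask: closing fills small gaps, opening removes specks.
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
mask = cv2.imread("skin_mask.png", cv2.IMREAD_GRAYSCALE)    # hypothetical input file
closed = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)    # dilation then erosion
opened = cv2.morphologyEx(closed, cv2.MORPH_OPEN, kernel)   # erosion then dilation
```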

Labelling connected components

Binary images have two values, zero and one (black and white, respectively), and binary objects are the white regions. The labeling operation marks each group of connected pixels as a connected component; even a single isolated pixel is considered a connected component [34]. Labeling assigns each set of connected, same-valued pixels a unique integer label based on its neighboring pixels.
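A one-call sketch with OpenCV (the filename is hypothetical); the stats array additionally provides each component's bounding box and area, which the next subsection uses:

```python
import cv2

mask = cv2.imread("skin_mask.png", cv2.IMREAD_GRAYSCALE)   # binary image, hypothetical file
num_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(mask, connectivity=8)
# labels: image where each 8-connected component has a unique integer label
# (label 0 is the background); even a single isolated pixel gets its own label.
```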

Face detection

Three features of each connected component (height, width, and number of holes) are tested to detect and verify a face. The proposed method examines the number of holes in a connected component to find the eyes, nose, and mouth, and then obtains the height and width of the component. Denoting the height by H, the width by W, and the number of holes by L, a connected component must satisfy the conditions of Rule 5 to be considered a face region.

$$\text{W} \le \frac{3}{2}\text{H}\; AND\; \text{H} \le \frac{5}{2}\text{W}\; AND\; \text{L} > 0$$
(5)
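As a minimal illustration, Rule 5 can be checked directly on the measured width, height, and hole count of a candidate component (the function name is ours):

```python
def is_face_candidate(w, h, holes):
    """Rule 5 (Eq. 5): a face-like component is not too wide, not too tall,
    and contains at least one hole (eyes, nose, or mouth)."""
    return w <= 1.5 * h and h <= 2.5 * w and holes > 0
```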

Obtaining the number of holes in each connected component

  1. Extract each binary object (connected component) from the total binary image and perform the following operations on each of them separately.

  2. Compute the complement of the binary object's values (0's become 1 and 1's become 0).

  3. Consider the complemented object as a new binary image and label it again in order to detect the holes in it.

  4. Extract the unique labels appearing in the first and last rows and columns, and count the number of unique labels among them (N).

  5. Count all the unique labels in the new binary image (M).

  6. Subtract the number obtained in step 4 (N) from the total number obtained in step 5 (M). The resulting value is the number of holes in the binary object (number of holes = M − N). A minimal code sketch of this procedure follows the list.
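A compact sketch of steps 1–6, assuming SciPy's connected-component labeling and its default connectivity for the complemented image (both implementation choices are ours, not the paper's):

```python
import numpy as np
from scipy import ndimage

def count_holes(obj):
    """Steps 1-6: holes = all components of the complement (M)
    minus those touching the image border (N)."""
    comp = ~obj.astype(bool)                    # step 2: complement the object
    labels, m = ndimage.label(comp)             # steps 3 and 5: label complement, M components
    border = np.concatenate([labels[0, :], labels[-1, :],
                             labels[:, 0], labels[:, -1]])
    n = np.unique(border[border > 0]).size      # step 4: labels touching the border (N)
    return m - n                                # step 6: number of holes = M - N
```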

Finding the height and width of connected components

A binary image is a black-and-white image which is actually a matrix containing only 0’s and 1’s for black and white areas respectively. To calculate the height and width of a connected component the following approach is used:

  1. Find the indices of the rows and columns containing the connected component's unique label. The result is a matrix named ones_matrix containing that label's coordinates in the original labeled matrix; the row indices form the X vector and the column indices form the Y vector (ones_matrix = [X, Y]).

  2. Find the averages of the X and Y vectors, mx and my respectively:

    $$mx=\frac{1}{n}\sum_{i=1}^{n}{X}_{i}$$
    (6)
    $$my=\frac{1}{n}\sum_{i=1}^{n}{Y}_{i}$$
    (7)

Normalize the X and Y vectors by subtracting the averages (mx and my) from each row index (X vector) and each column index (Y vector), respectively.

After this operation, the center of the connected component overlaps the center of the coordinate system (0, 0).

$$NX=X-mx$$
(8)
$$NY=Y-my$$
(9)

Calculate the covariance matrix of the resulting normalized NX and NY vectors. The result would be a 2 × 2 matrix.

$$\gamma =cov\left(NX,NY\right)=E\left[NX\times {NY}^{T}\right]-E\left[NX\right]\times E{[NY]}^{T}$$
(10)

Calculate the eigenvalues (characteristic value) and eigenvectors (characteristic vector) of the covariance matrix.

$$\alpha =\left[\begin{array}{cc}\lambda & 0\\ 0& \lambda \end{array}\right]$$
(11)
$$\left|\gamma -\alpha \right|=0$$
(12)

In Eq. 12, the eigenvalues λ1 and λ2 are obtained by setting the determinant of (γ − α) to zero.

$$\gamma \times {\beta }_{1}={\lambda }_{1}\times {\beta }_{1}$$
(13)

Substituting λ1 into Eq. (13), the eigenvector β1 of the covariance matrix is found.

$$\gamma \times {\beta }_{2}={\lambda }_{2}\times {\beta }_{2}$$
(14)

In the same way, substituting λ2 into Eq. (14), the eigenvector β2 of the covariance matrix is found.

Calculate the height and width of the connected component by following equations:

$${\beta }_{1}=\left[\begin{array}{c}k1\\ k3\end{array}\right] , {\beta }_{2}=\left[\begin{array}{c}k2\\ k4\end{array}\right]$$
(15)
$$d=\left[\begin{array}{cc}{\lambda }_{1}& 0\\ 0& {\lambda }_{2}\end{array}\right]$$
(16)
$$cw=2*\left(\sqrt{{\lambda }_{1}}\right)*k1$$
(17)
$$ww=2*\left(\sqrt{{\lambda }_{1}}\right)*k3$$
(18)
$$ch=2*\left(\sqrt{{\lambda }_{2}}\right)*k2$$
(19)
$$hh=2*\left(\sqrt{{\lambda }_{2}}\right)*k4$$
(20)
$$Width=2\times \left|ww-cw\right|$$
(21)
$$Height=2\times \left|hh-ch\right|$$
(22)

In the process of determining the height and width of a connected component within a binary image, several variables are utilized. The binary image is represented as a matrix containing only 0s and 1s, denoting black and white areas, respectively. The coordinates of the connected component are extracted into X and Y vectors, with corresponding means mx and my calculated. These coordinates are then normalized into NX and NY vectors by subtracting the mean values. The covariance matrix γ is computed from the normalized vectors, yielding eigenvalues (λ1, λ2) and eigenvectors (β1, β2). Parameters k1, k2, k3, and k4 are derived from the eigenvectors to form the diagonal matrix d. Subsequently, the width (cw, ww) and height (ch, hh) of the connected component are determined using the eigenvalues and parameters. Finally, the actual width and height are calculated from the differences between the computed values.
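The following sketch condenses Eqs. 6–22 into a NumPy routine. The eigenvalue ordering and the assignment of k1–k4 to eigenvector entries follow our reading of the equations and are assumptions rather than the authors' implementation:

```python
import numpy as np

def component_extent(mask):
    """Eqs. 6-22: width/height of a component from the eigen-decomposition
    of its pixel-coordinate covariance matrix (a sketch, see caveats above)."""
    x, y = np.nonzero(mask)                     # row (X) and column (Y) indices
    nx, ny = x - x.mean(), y - y.mean()         # Eqs. 6-9: centered coordinates
    gamma = np.cov(nx, ny)                      # Eq. 10: 2x2 covariance matrix
    lam, beta = np.linalg.eigh(gamma)           # Eqs. 11-14: eigenvalues ascending
    k1, k3 = beta[:, 1]                         # beta1 entries (larger eigenvalue lam[1])
    k2, k4 = beta[:, 0]                         # beta2 entries (smaller eigenvalue lam[0])
    cw, ww = 2 * np.sqrt(lam[1]) * k1, 2 * np.sqrt(lam[1]) * k3   # Eqs. 17-18
    ch, hh = 2 * np.sqrt(lam[0]) * k2, 2 * np.sqrt(lam[0]) * k4   # Eqs. 19-20
    return 2 * abs(ww - cw), 2 * abs(hh - ch)   # Eqs. 21-22: (width, height)
```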

The proposed method for eye detection

The detected face region (Fig. 4) is cropped (Fig. 5) from the whole image (Fig. 3). The cropped image is then converted to grayscale (Fig. 6). To improve the performance of the method, the lighting conditions of the image are adjusted. The horizontal edges of the face are detected using the Sobel and Prewitt masks separately; combining the results of both masks yields more accurate edge detection. To further improve the accuracy and speed of the suggested method, a morphology operation is applied before the eyes are detected. The flowchart of the proposed method is shown in Fig. 7.

Fig. 3. The whole image

Fig. 4. Face region detection

Fig. 5. The cropped face region

Fig. 6. The grayscale transformation of the face region

Fig. 7. The flowchart of the proposed method

Horizontal edge detection using Sobel and Prewitt masks

The difference between the highest and lowest light intensity in a grayscale image is called its contrast. The image may suffer from poor contrast or bad lighting conditions; to resolve this, the lighting condition of the image is improved (Fig. 8). The Sobel edge detection mask (Eq. 23 and Fig. 9) is applied to the improved grayscale image (Fig. 11). The sharp edge regions in the face image are the eyes, mouth, and nose. Using the Prewitt edge detection masks (Eq. 24 and Fig. 10) on the grayscale image, the horizontal edges are detected (Fig. 12). When the outputs of the two methods are combined, an edge found in both outputs is considered a true edge (Fig. 13). To do this, the element-wise product of the two output matrices is computed pixel by pixel [35].

Fig. 8. Image neighborhood

Fig. 9. Sobel edge detection masks

Fig. 10. Prewitt edge detection masks

Fig. 11. Output of the Sobel mask

Fig. 12. Output of the Prewitt mask

Fig. 13. The combination of both masks' outputs

The suggested approach utilizes the Prewitt and Sobel masks to convolve the picture, therefore calculating the gradient magnitudes and orientations. This process successfully enhances the visibility of edges. These masks are specifically intended to highlight abrupt changes in intensity. They effectively estimate the gradients of a picture, collecting information about edges in both the horizontal and vertical directions. Due to their simplicity, computational efficiency, and ability to detect edges in several directions, they are highly suitable for real-time applications. In terms of edge detection approaches, the Sobel and Prewitt masks achieve a good combination of accuracy and efficiency when compared to other methods like the Canny edge detector or Roberts cross-operator. This contributes to the overall accuracy of the system in recognizing eye areas. Their localized operations facilitate immediate processing, which is essential for the optimal performance of the proposed system. Additionally, their directional sensitivity improves the system's ability to handle changes in lighting conditions and face characteristics. The significance of Sobel and Prewitt masks in providing precise and efficient eye detection within the proposed system is emphasized by these characteristics.

$$\nabla f={\left[{g}_{x}^{2}+{g}_{y}^{2}\right]}^\frac{1}{2}={\{{\left[\left({z}_{7}+2{z}_{8}+{z}_{9}\right)-\left({z}_{1}+2{z}_{2}+{z}_{3}\right)\right]}^{2}+{\left[\left({z}_{3}+2{z}_{6}+{z}_{9}\right)-\left({z}_{1}+2{z}_{4}+{z}_{7}\right)\right]}^{2}\}}^\frac{1}{2}$$
(23)
$$\nabla f={\left[{g}_{x}^{2}+{g}_{y}^{2}\right]}^\frac{1}{2}={\{{\left[\left({z}_{7}+{z}_{8}+{z}_{9}\right)-\left({z}_{1}+{z}_{2}+{z}_{3}\right)\right]}^{2}+{\left[\left({z}_{3}+{z}_{6}+{z}_{9}\right)-\left({z}_{1}+{z}_{4}+{z}_{7}\right)\right]}^{2}\}}^\frac{1}{2}$$
(24)
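A sketch of this pipeline with OpenCV; the histogram-equalization step, the Prewitt kernel orientation, and the binarization thresholds are illustrative assumptions, not values from the paper:

```python
import cv2
import numpy as np

gray = cv2.imread("face_crop.png", cv2.IMREAD_GRAYSCALE)   # hypothetical cropped face
gray = cv2.equalizeHist(gray)                              # assumed lighting adjustment

# Horizontal edges: Sobel (Eq. 23) via cv2.Sobel with dy=1; Prewitt (Eq. 24)
# via an explicit kernel, since OpenCV has no built-in Prewitt operator.
sobel = np.abs(cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3))
prewitt_kernel = np.array([[-1, -1, -1], [0, 0, 0], [1, 1, 1]], np.float32)
prewitt = np.abs(cv2.filter2D(gray.astype(np.float32), -1, prewitt_kernel))

# Keep only edges present in BOTH outputs via a pixel-wise product
# of the thresholded masks (thresholds are illustrative).
true_edges = ((sobel > 100).astype(np.uint8) * (prewitt > 100).astype(np.uint8)) * 255
```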

Morphology operation

Dilation is an operation that makes binary objects thicker or bigger; the structuring element determines the degree of thickening. A structuring element of radius 3 is used in the dilation operation to connect adjacent edges and fill the gaps between them. Erosion results in a binary object's thinning or shrinking; as with dilation, the degree of thinning depends on the structuring element. The same radius-3 structuring element is used in the erosion operation to avoid the connection of eyes and eyebrows that occurs in some face images. The small holes inside binary objects are filled. To reduce the number of objects to be processed, small objects in the binary image are deleted, which improves the speed of the proposed method. Hair appearing in the face image may mistakenly be considered an eye, so the white pixels connected to the image boundary are deleted to avoid such mistakes (Fig. 14).

Fig. 14. The output of the morphology operation

Eye detection

To detect the right eye, the top-right quarter of the face image is processed; to detect the left eye, the top-left quarter is processed. The mouth lies in the lower half of the face image. The largest binary object in each segment is the desired object in that segment, because most interior edges in each region belong to the desired object. To find the right eye, the largest binary object in the top-right quarter of the image is found and its center is taken as the center of the right eye. The centers of the left eye and the mouth are computed in the same way. A minimal sketch of this per-region search is given below.
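The sketch assumes the combined edge mask from the previous step is available as an image file; the returned coordinates are relative to each crop, so the crop offsets must be added back for full-image positions:

```python
import cv2
import numpy as np

def largest_centroid(region):
    """Centroid of the largest binary object in a region, or None if empty."""
    region = np.ascontiguousarray(region)
    n, _, stats, cents = cv2.connectedComponentsWithStats(region, connectivity=8)
    if n < 2:                                                 # only background present
        return None
    biggest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])      # skip background label 0
    return tuple(cents[biggest])

edges = cv2.imread("edge_mask.png", cv2.IMREAD_GRAYSCALE)     # hypothetical edge mask
h, w = edges.shape
right_eye = largest_centroid(edges[: h // 2, w // 2:])        # top-right quarter
left_eye = largest_centroid(edges[: h // 2, : w // 2])        # top-left quarter
mouth = largest_centroid(edges[h // 2:, :])                   # lower half
```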

The validation step of eye detection

As a verification step to confirm the detected positions of the eyes and mouth, the distances from the mouth's center to the centers of the right and left eyes are calculated by Eq. 25, and the two distances are compared. If their difference is below a predefined threshold, the eyes and mouth are considered correctly detected; the threshold works because in a normal face the difference between the two distances is small. Euclidean distance is used to measure the distance between the eyes and the mouth. In Eq. 25, D1 is the distance between the mouth and the right eye, D2 is the distance between the mouth and the left eye, and Th is the threshold. Figure 15 shows the results of the proposed method.

Fig. 15. The result of the proposed method

$$\begin{array}{l} center\; of\; right\; eye=\left(X1,Y1\right)\\ center\; of\; left\; eye=\left(X2,Y2\right)\\ center\; of\; mouth=\left(X3,Y3\right)\\ D1=\sqrt{{\left(X3-X1\right)}^{2}+{\left(Y3-Y1\right)}^{2}}\\ D2=\sqrt{{\left(X3-X2\right)}^{2}+{\left(Y3-Y2\right)}^{2}}\\ Th=\left|D2-D1\right| \end{array}$$
(25)
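A direct transcription of Eq. 25, with an assumed pixel threshold since the paper does not state a numeric value:

```python
import math

def verify(right_eye, left_eye, mouth, max_diff=15.0):
    """Eq. 25: the mouth center should be roughly equidistant from both
    eye centers; max_diff (pixels) is an assumed threshold value."""
    d1 = math.dist(mouth, right_eye)   # D1: mouth to right eye
    d2 = math.dist(mouth, left_eye)    # D2: mouth to left eye
    return abs(d2 - d1) <= max_diff    # difference below threshold => accepted
```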

Experiments

The proposed method was implemented and tested on the Aberdeen database [27] (Table 1). This database contains images with different lighting conditions and challenging face poses. The proposed method is compared with the method of Sirohey et al. [14]. A total of 665 images were evaluated. Some sample images are shown in Figs. 16 and 17.

Table 1. Comparing the proposed method with the Sirohey et al. method using accuracy

Fig. 16. Sample images of the Aberdeen database

Fig. 17. The results of implementing the proposed method on Aberdeen sample images

The experimental setup involved the evaluation of the proposed eye detection method using the Aberdeen database, chosen for its established reputation and suitability for benchmarking facial recognition algorithms. The Aberdeen database comprises a diverse set of facial images captured under various lighting conditions, facial expressions, and ethnicities, making it representative of real-world scenarios encountered in eye detection applications. This diversity ensures that the method's performance is thoroughly tested across a wide range of conditions, enhancing its generalizability and applicability. The dataset was partitioned into training and testing subsets to facilitate rigorous evaluation, with metrics such as accuracy, precision, recall, and F1-score computed to assess the method's performance comprehensively. Cross-validation techniques can additionally be employed to mitigate potential biases and ensure the robustness of the results. Overall, the choice of the Aberdeen database was driven by its richness in diverse facial imagery, aligning with the goal of assessing the method's effectiveness across the different lighting conditions and human races encountered in real-world scenarios.

TP: True Positive; TN: True Negative; FP: False Positive; FN: False Negative

$$Accuracy=\frac{TP+TN}{TP+TN+FP+FN}$$
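For completeness, the metric above as a one-line helper:

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + tn + fp + fn)
```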

The proposed eye detection method was rigorously compared with existing methods in the literature, considering various quantitative metrics to assess performance comprehensively. In terms of speed, the proposed method outperformed all existing approaches, exhibiting significantly lower computational overhead owing to its efficient grayscale transformation and judicious selection of edge detection techniques. This advantage was particularly evident in real-time applications, where the proposed method demonstrated superior processing speed without compromising accuracy. Regarding accuracy, the proposed method surpassed existing techniques, achieving a commendable 92% accuracy on the Aberdeen database, indicative of its robustness and reliability in detecting eye regions across diverse lighting conditions and human races. Additionally, resource consumption was meticulously evaluated, with the proposed method exhibiting optimal utilization of computational resources, making it appropriate for implementation in situations with limited resources, like embedded systems or mobile devices. Overall, the comprehensive comparison reaffirmed the superiority of the proposed method in terms of speed, accuracy, and resource efficiency, positioning it as a state-of-the-art solution for real-time eye detection tasks.

Although the suggested eye detection approach shows promising outcomes, its implementation in real-world applications must take into account several constraints and obstacles. An important obstacle is its ability to function consistently under different lighting situations: although attempts were made to improve the robustness of the system by using skin detection and edge detection methods, it may still face challenges under severe changes in illumination or harsh environmental conditions. Further research is also needed to examine the efficacy of the approach across ethnic groups; although the Aberdeen database offers a varied collection of face photos, supplementary datasets covering a wider spectrum of ethnicities would help the method generalize to different populations. Moreover, the method's dependence on facial characteristics and geometric analysis for detecting faces may cause difficulties in the presence of occlusions or non-frontal faces, which would require further adjustments to maintain performance. Additionally, the method's suitability for real-time applications in HCI and driver fatigue monitoring depends on its capacity to attain both high precision and computational efficiency; although the suggested technique exhibits remarkable speed and accuracy, further optimization may be required to overcome resource limitations in real-world deployment. In general, while the suggested technique displays potential for many applications, it is essential to address these constraints in order to successfully incorporate it into real-world systems.

Conclusion

In conclusion, the developed real-time eye detection method presents a significant advancement in the domain, showcasing notable efficacy and versatility across diverse lighting conditions and human racial characteristics. By leveraging a combination of skin detection, geometric facial features, and edge detection techniques such as Sobel and Prewitt masks, the proposed method demonstrates robustness in detecting both face and eye regions with remarkable accuracy. The integration of morphological features further refines the detection process, ensuring high precision in identifying horizontal edges crucial for eye detection. Moreover, the utilization of Euclidean distance for verification enhances the method's reliability by exploiting inherent facial proportions. The achieved accuracy of 92% on the Aberdeen database underscores the effectiveness of the suggested approach. However, future research endeavors should focus on refining the method's accuracy and exploring its applicability in light of recent advancements in the field, thereby consolidating its position as a state-of-the-art solution for real-time eye detection tasks. For future enhancements, exploring robustness under extreme lighting conditions and diverse ethnicities, refining performance in scenarios with occlusions or non-frontal poses, and optimizing computational efficiency for real-time deployment in HCI and driver fatigue monitoring applications could be valuable directions.