Abstract
Robust and accurate detection of the pupil position is a key building block for head-mounted eye tracking and a prerequisite for applications built on top, such as gaze-based human–computer interaction or attention analysis. Despite a large body of work, detecting the pupil in images recorded under real-world conditions is challenging given significant variability in eye appearance (e.g., illumination, reflections, occlusions), individual differences in eye physiology, as well as other sources of noise, such as contact lenses or make-up. In this paper we review six state-of-the-art pupil detection methods, namely ElSe (Fuhl et al. in Proceedings of the ninth biennial ACM symposium on eye tracking research & applications, ACM, New York, NY, USA, pp 123–130, 2016), ExCuSe (Fuhl et al. in Computer analysis of images and patterns. Springer, New York, pp 39–51, 2015), Pupil Labs (Kassner et al. in Adjunct proceedings of the 2014 ACM international joint conference on pervasive and ubiquitous computing (UbiComp), pp 1151–1160, 2014. doi:10.1145/2638728.2641695), SET (Javadi et al. in Front Neuroeng 8, 2015), Starburst (Li et al. in Computer vision and pattern recognition-workshops, 2005. IEEE Computer society conference on CVPR workshops. IEEE, pp 79–79, 2005), and Świrski (Świrski et al. in Proceedings of the symposium on eye tracking research and applications (ETRA). ACM, pp 173–176, 2012. doi:10.1145/2168556.2168585). We compare their performance on a large-scale data set consisting of 225,569 annotated eye images taken from four publicly available data sets. Our experimental results show that the algorithm ElSe (Fuhl et al. 2016) outperforms the other pupil detection methods by a large margin, thus offering robust and accurate pupil positions on challenging everyday eye images.
1 Introduction
Understanding the processes underlying visual perception has been a focus of various research fields, including medicine, psychology, advertising, autonomous driving, and application control. Recent developments in head-mounted eye tracking have enabled researchers to study human visual perception and attention allocation in natural environments. Such systems generally fall into two categories: remote tracking systems, where the subject is recorded with an external camera, and head-mounted eye trackers. The main challenges in remote eye tracking are the robust detection of the subject’s face and eyes. Several techniques have been proposed for robust face recognition (e.g., [9, 10]) as well as eye recognition and face hallucination (e.g., [11]) in low-resolution images.
In head-mounted eye trackers, the measurement of eye movements is based on one camera that records the viewed scene and at least one additional camera that is directed towards the subject’s eye to record the eye movements (e.g., Dikablis Mobile eye tracker, Pupil Labs eye tracker [15], SMI Glasses, Tobii Glasses). The gaze point is then mapped to the viewed scene based on the center of the pupil and a user-specific calibration routine. A crucial prerequisite for robust tracking is an accurate detection of the pupil center in the eye images. While eye tracking can be accomplished successfully under laboratory conditions, many studies report difficulties when eye trackers are employed in natural environments, such as driving [1, 12, 16, 20], shopping [14, 26], or simply walking around [28].
The main source of error in such settings is a non-robust pupil signal, which is mainly related to challenges in the image-based detection of the pupil. Schnipke and Todd summarized in [25] a variety of difficulties occurring when using eye trackers, such as changing illumination, motion blur, recording errors, and eyelashes covering the pupil. Rapidly changing illumination conditions arise primarily in tasks where the subject moves fast (e.g., while driving) or rotates relative to unequally distributed light sources. Moreover, if the subject is wearing eyeglasses or contact lenses, additional reflections may occur (see Fig. 7, Data sets III and XVIII). A further issue arises due to the off-axial position of the eye camera in head-mounted eye trackers, e.g., when the pupil is surrounded by a dark region (see Fig. 7, Data set VII). Other difficulties are often posed by physiological eye characteristics that may interfere with detection algorithms, such as additional dark spots on the iris. Consequently, studies based on eye tracking in uncontrolled environments consistently report low pupil detection rates [13, 20, 33]. The data collected in such studies thus have to be manually post-processed, which is a laborious and time-consuming procedure. Furthermore, such post-processing is impossible for real-time applications that rely on pupil monitoring (e.g., driving assistance based on eye tracking input [31], gaze-based interaction [22, 27, 34], eye-based activity and context recognition [1–3], and many more). In light of these challenges, real-time and accurate pupil detection is an essential prerequisite for pervasive video-based eye tracking.
2 State-of-the-art methods for pupil detection
Over the past years, there has been a large body of work on pupil detection. However, most approaches address pupil detection under laboratory conditions, e.g., [7, 17], both employing a histogram-based threshold. Such algorithms can be applied to eye images captured under infrared light as in [19, 21, 24]. Another threshold-based approach was presented in [38], where the pupil is detected based on the calculation of the curvature of the threshold border. An isophotes curvature-based approach is presented in [35], using the maximum isocenter as the pupil center estimate. Probably the most popular algorithm in this realm is Starburst, introduced by Li et al. [18] in 2005. In 2012, Świrski et al. [30] proposed an algorithm that starts with a coarse positioning using Haar-like features and then refines the pupil center position.
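As a rough illustration of this family of threshold-based detectors, the following numpy sketch (an illustration of the shared idea only, not the implementation of any cited method) locates the pupil as the centroid of the darkest pixels in a synthetic eye image:

```python
import numpy as np

def histogram_threshold_pupil(image):
    """Centroid of the darkest pixels as a pupil estimate.

    Minimal illustration of the threshold-based idea: threshold midway
    between the darkest and brightest intensity and take the center of
    mass of the resulting dark mask. Works only when the pupil is the
    darkest compact region, as under controlled infrared illumination.
    """
    threshold = 0.5 * (float(image.min()) + float(image.max()))
    ys, xs = np.nonzero(image <= threshold)
    if len(xs) == 0:
        return None
    return float(xs.mean()), float(ys.mean())    # (x, y)

# Synthetic eye image: bright background with a dark disk as "pupil".
img = np.full((100, 100), 200, dtype=np.uint8)
yy, xx = np.mgrid[:100, :100]
img[(yy - 40) ** 2 + (xx - 60) ** 2 <= 10 ** 2] = 20
cx, cy = histogram_threshold_pupil(img)
```

Such a scheme breaks down as soon as reflections or shadows introduce darker structures than the pupil, which is precisely the situation in the real-world data sets discussed below.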
The open source head-mounted eye tracker from Pupil Labs also comes with a pupil detection algorithm designed for unconstrained everyday settings [15]. Wood et al. [37] presented a model-based gaze estimation system for unmodified tablet computers. Their gaze estimation pipeline also includes a Haar-like feature based eye detector as well as a RANSAC-based limbus ellipse fitting approach. Three recent methods, SET [8], ExCuSe [5], and ElSe [6], explicitly address the aforementioned challenges associated with pupil detection in natural environments. SET is based on thresholding and ellipse fitting. ExCuSe [5] first analyses the input images with regard to reflections based on intensity histograms. Several processing steps based on edge detectors, morphologic operations, and the Angular Integral Projection Function are then applied to extract the pupil contour. Similar to ExCuSe, ElSe is also based on edge filtering, ellipse evaluation, and pupil contour validation [6].
The algorithms Starburst [18], Świrski [30], Pupil Labs [15], SET [8], ExCuSe [5], and ElSe [6] will be presented and discussed in detail in the following subsections. We compared these algorithms on a large corpus of hand-labeled eye images (Sect. 3) and will present the results in Sect. 4.
2.1 Starburst
In the first step of Starburst [18], the image is denoised using a Gaussian filter. The algorithm then uses adaptive thresholding on a square region of interest in each video frame to localize the corneal reflection. The location of the corneal reflection is given by the geometric center of the largest region in the image under the adaptively determined threshold. Radial interpolation is then used to remove the corneal reflection from the eye image. The central step of the algorithm, which also gave the method its name, is to estimate the pupil contour by detecting edges along a limited number of rays that extend from a central best guess of the pupil center (see Fig. 1b, c, d). The rays are independently evaluated pixel by pixel until a threshold is exceeded, indicating the edge of the pupil. A feature point is defined at that location as a contour edge candidate and the processing along the ray is stopped. For each contour candidate point, another set of rays is generated, creating a second set of pupil contour candidates. This process is iterated until convergence. Model fitting is finally performed following the Random Sample Consensus (RANSAC) paradigm to find the best-fitting ellipse describing the pupil boundary. This result is further improved through a model-based optimization that does not rely on feature detection [7, 18].
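The ray-based contour search at the core of Starburst can be sketched as follows (a simplified, single-pass illustration on a synthetic image; the ray count and gradient threshold are our illustrative choices, and the published algorithm additionally removes the corneal reflection, iterates, and fits a RANSAC ellipse):

```python
import numpy as np

def starburst_features(image, seed, n_rays=18, grad_thresh=50, max_len=60):
    """Collect candidate pupil-contour points along rays cast from `seed`.

    Sketch of Starburst's central idea: walk outward along each ray and
    mark the first pixel where the intensity rises sharply (dark pupil
    to brighter iris).
    """
    h, w = image.shape
    cx, cy = seed
    points = []
    for angle in np.linspace(0.0, 2.0 * np.pi, n_rays, endpoint=False):
        dx, dy = np.cos(angle), np.sin(angle)
        prev = int(image[int(cy), int(cx)])
        for r in range(1, max_len):
            x, y = int(cx + r * dx), int(cy + r * dy)
            if not (0 <= x < w and 0 <= y < h):
                break
            cur = int(image[y, x])
            if cur - prev > grad_thresh:          # sharp dark-to-bright step
                points.append((x, y))             # candidate contour point
                break
            prev = cur
    return points

# Synthetic dark pupil (radius 15) centered at (50, 50); the seed is
# off-center, as the initial best guess usually is.
img = np.full((100, 100), 200, dtype=np.uint8)
yy, xx = np.mgrid[:100, :100]
img[(yy - 50) ** 2 + (xx - 50) ** 2 <= 15 ** 2] = 20
pts = starburst_features(img, seed=(45, 48))
center = np.mean(pts, axis=0)   # crude center estimate; Starburst iterates this
```

Averaging the feature points already pulls the estimate toward the true center, which is why the iterated ray casting converges quickly on clean images.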
2.2 Świrski
The algorithm by Świrski et al. [30] works in three main stages (see Fig. 2). To find the pupil, the Świrski detector first calculates the integral image and convolves it with a Haar-like feature, similar to the features used in cascade classifiers [36]. The algorithm repeats this convolution for a set of possible radii between a user-specified minimum and maximum.
The position of the strongest response is the estimated center of the pupil, with the size of the region determined by the corresponding radius. The initial region estimate is unlikely to be accurately centered on the pupil. Therefore, in the second step, the algorithm approximates the pupil position within this region based on k-means clustering: the intensity histogram is segmented into a dark cluster (pupil pixels) and a light cluster. The algorithm then creates a segmented binary image of the pupil region by thresholding at the maximum intensity of the dark cluster. Afterward, connected components in the segmented image are found and the largest among them is selected as the pupil.
The center of mass of this component approximates the pupil position. The final stage of the algorithm refines the pupil position estimate by fitting an ellipse to the boundary between the pupil and the iris. To do this, the image is preprocessed to create an edge image, to which an ellipse is then robustly fit while ignoring outliers. To remove features such as eyelashes and glints, a morphological “open” operation is performed to close small bright gaps in the otherwise dark pupil region without significantly affecting the pupil’s contour. The algorithm then finds the boundary between pupil and iris using a Canny edge detector. Finally, the algorithm fits an ellipse to the edge pixels using RANSAC and an image-aware support function. The support function ensures that the ellipse lies on a boundary from dark pixels to light pixels and that its points lie along strong image edges.
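The second-stage segmentation described above can be sketched in a few lines (an illustrative reimplementation on a synthetic image, not the authors' code; the real detector runs on the Haar-selected region and is followed by the ellipse refinement):

```python
import numpy as np
from collections import deque

def kmeans_pupil_segment(image, iters=10):
    """Sketch of the Swirski-style second stage: 1-D 2-means clustering of
    the pixel intensities, thresholding at the dark cluster's maximum,
    and taking the center of mass of the largest connected dark component.
    """
    vals = image.astype(float).ravel()
    c_dark, c_light = vals.min(), vals.max()            # initial cluster centers
    for _ in range(iters):                              # 1-D k-means, k = 2
        dark = np.abs(vals - c_dark) < np.abs(vals - c_light)
        c_dark, c_light = vals[dark].mean(), vals[~dark].mean()
    threshold = vals[np.abs(vals - c_dark) < np.abs(vals - c_light)].max()
    mask = image <= threshold

    # Largest 4-connected component of the dark mask (BFS flood fill).
    h, w = mask.shape
    seen = np.zeros_like(mask, dtype=bool)
    best = []
    for sy, sx in zip(*np.nonzero(mask)):
        if seen[sy, sx]:
            continue
        comp, queue = [], deque([(sy, sx)])
        seen[sy, sx] = True
        while queue:
            y, x = queue.popleft()
            comp.append((y, x))
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                    seen[ny, nx] = True
                    queue.append((ny, nx))
        if len(comp) > len(best):
            best = comp
    ys, xs = zip(*best)
    return float(np.mean(xs)), float(np.mean(ys))       # (x, y) center of mass

# Synthetic image: pupil disk plus a smaller dark speck (e.g., an eyelash).
img = np.full((100, 100), 200, dtype=np.uint8)
yy, xx = np.mgrid[:100, :100]
img[(yy - 30) ** 2 + (xx - 70) ** 2 <= 8 ** 2] = 20    # pupil at (x=70, y=30)
img[(yy - 80) ** 2 + (xx - 20) ** 2 <= 3 ** 2] = 25    # distractor speck
px, py = kmeans_pupil_segment(img)
```

Selecting the largest component makes the estimate robust against small dark distractors such as eyelash fragments, which the threshold alone would also pick up.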
2.3 Pupil Labs
The Pupil Labs detector is integrated into the open source head-mounted eye tracking platform Pupil [15]. Figure 3 visualizes the different processing steps of the algorithm. In a first step, the eye image is converted to grayscale. The initial region estimate of the pupil is found via the strongest response to a center-surround feature as proposed in [30]. The algorithm then uses a Canny filter to detect edges in the eye image and filters these edges based on neighboring pixel intensities. It then looks for darker areas (blue region), where “dark” is specified by a user-set offset from the lowest peak in the intensity histogram of the eye image. Remaining edges are filtered to exclude those stemming from spectral reflections (yellow region) and extracted into contours using connected components [29]. The contours are simplified using the Ramer–Douglas–Peucker algorithm [4] and then filtered and split into sub-contours based on criteria of curvature continuity. Candidate pupil ellipses are found by fitting ellipses onto a subset of the contours. Good fits are defined in a least-squares sense, with major radii within a user-defined range and the ellipse center within a “blue region” (see Fig. 3d). An augmented combinatorial search then looks for contours that can be added as support to the candidate ellipses. The results are evaluated based on the ellipse fit of the supporting edges and the ratio of supporting edge length to ellipse circumference. If the ratio is above a threshold, the algorithm uses the raw Canny edge pixels to refine the ellipse fit and reports this final ellipse as the one defining the pupil contour. Otherwise, the algorithm reports that no pupil was found.
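The contour simplification step relies on the classic Ramer–Douglas–Peucker scheme [4], which can be sketched as follows (a generic recursive implementation for illustration; in practice a library routine such as OpenCV's approxPolyDP would be used):

```python
import numpy as np

def rdp(points, epsilon):
    """Ramer-Douglas-Peucker polyline simplification: keep the two
    endpoints and recursively keep the point farthest from the chord
    whenever it deviates by more than `epsilon`."""
    pts = np.asarray(points, dtype=float)
    a, b = pts[0], pts[-1]
    d = b - a
    norm = np.hypot(d[0], d[1])
    if norm == 0:
        dists = np.hypot(pts[:, 0] - a[0], pts[:, 1] - a[1])
    else:
        # Perpendicular distance of every point to the chord a-b.
        dists = np.abs(d[0] * (a[1] - pts[:, 1]) - d[1] * (a[0] - pts[:, 0])) / norm
    idx = int(np.argmax(dists))
    if dists[idx] > epsilon:
        left = rdp(pts[: idx + 1], epsilon)      # simplify both halves
        right = rdp(pts[idx:], epsilon)
        return np.vstack([left[:-1], right])     # drop duplicated split point
    return np.vstack([a, b])

# A jagged, nearly straight polyline collapses to its two endpoints...
line = [(x, 0.1 * (x % 2)) for x in range(10)]
assert len(rdp(line, epsilon=0.5)) == 2
# ...while a right-angle corner keeps the corner point.
corner = [(0, 0), (5, 0), (10, 0), (10, 5), (10, 10)]
assert len(rdp(corner, epsilon=0.5)) == 3
```

Simplifying the contours in this way removes pixel-level jitter before the curvature-continuity split, so sub-contours correspond to smooth arcs rather than noise.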
2.4 SET
The SET approach combines manual and automatic steps to estimate the pupil center [8]. Prior to pupil detection, two parameters are set manually: a threshold to convert the input to a binary image (see Fig. 4b) and the size of the segments considered for pupil detection (see Fig. 4c) [8]. The thresholded image is first segmented and pixels are then grouped into maximal connected regions to find pupil candidates [8]. For each segment larger than a threshold value, the convex hull method is applied to compute the segment border. In the last step, an ellipse is fitted to each extracted segment (see Fig. 4d). The ellipse that is closest to a circle is considered to be the pupil (see Fig. 4e).
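The final selection criterion, choosing the candidate closest to a circle, can be illustrated with a moment-based circularity measure (our simplification: the axis ratio is taken from the region's second moments rather than from SET's fitted ellipse):

```python
import numpy as np

def circularity(mask):
    """Axis ratio (minor/major) of a binary region, estimated from the
    eigenvalues of its second moments; 1.0 means perfectly circular.
    A simplified stand-in for SET's fitted-ellipse circularity test."""
    ys, xs = np.nonzero(mask)
    cov = np.cov(np.vstack([xs, ys]))
    eigvals = np.linalg.eigvalsh(cov)            # ascending order
    return float(np.sqrt(eigvals[0] / eigvals[1]))

def pick_pupil(segments):
    """Return the index of the most circular candidate segment."""
    return int(np.argmax([circularity(m) for m in segments]))

yy, xx = np.mgrid[:100, :100]
disk = (yy - 50) ** 2 + (xx - 50) ** 2 <= 12 ** 2            # circular pupil
streak = (np.abs(yy - 20) <= 2) & (np.abs(xx - 50) <= 30)    # elongated shadow
```

A circularity prior of this kind rejects elongated dark regions such as shadows or eyelid borders, but it also explains SET's weakness on off-axial images, where the true pupil itself appears as a strongly eccentric ellipse.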
2.5 ExCuSe
ExCuSe is a recently introduced algorithm that builds on edge detection and morphologic operations [5]. The algorithmic work-flow is presented in Fig. 5. In the first step, the input image is normalized and its intensity histogram is calculated. If a peak in the bright area of the histogram is found, the pupil can be detected based on edge analysis (first row in Fig. 5). To this end, a Canny edge detector is applied to the input image (see Fig. 5b). The resulting edges are then refined by removing thin lines and thinning thick edges using a morphologic operator. All remaining edges are smoothed and orthogonal edges are removed using morphologic patterns (see Fig. 5c). For each connected line, the mean position is calculated. Based on this information, straight lines are excluded from further processing. All remaining curved lines are kept, as shown in Fig. 5d, and further processed. For each remaining curve, the enclosed mean intensity value is calculated and the curve with the lowest value is chosen as the pupil curve. Afterward, an ellipse is fitted to this curve (see Fig. 5e).
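The straight-line exclusion step can be illustrated with a simple deviation-from-chord test (a hypothetical stand-in for ExCuSe's mean-position criterion, not the published implementation):

```python
import numpy as np

def is_curved(points, min_deviation=2.0):
    """Keep a connected edge as a pupil-contour candidate only if its
    points deviate from the endpoint chord by more than `min_deviation`
    pixels. A hypothetical stand-in for ExCuSe's straight-line exclusion."""
    pts = np.asarray(points, dtype=float)
    a, b = pts[0], pts[-1]
    d = b - a
    norm = np.hypot(d[0], d[1])
    if norm == 0:
        return False
    # Perpendicular distance of every point to the chord a-b.
    dev = np.abs(d[0] * (a[1] - pts[:, 1]) - d[1] * (a[0] - pts[:, 0])) / norm
    return float(dev.max()) > min_deviation

arc = [(50 + 20 * np.cos(t), 50 + 20 * np.sin(t)) for t in np.linspace(0, np.pi, 30)]
line = [(x, 2 * x + 1) for x in range(30)]
```

Discarding straight edge chains early removes eyelid borders and eyeglass frames from the candidate set before the more expensive intensity evaluation.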
In case no bright peak in the intensity histogram is detected, a threshold based on the standard deviation of the image is calculated. The corresponding steps are visualized in the second row of Fig. 5. The algorithm first determines a coarse pupil position and then refines it stepwise to approach the pupil center. The coarse pupil position is estimated based on the Angular Integral Projection Function (AIPF) [23] applied to the thresholded image. More specifically, the input image is thresholded and pixels over the threshold are summed along the rows. This summation is done four times, rotating the projection axis in steps of \(45^\circ \) (Step 7 in Fig. 5). Once a coarse pupil center has been estimated, a refinement based on features of the surrounding neighborhood follows (Step 8 in Fig. 5). The assumption here is that pixels belonging to the pupil are surrounded by brighter or equally bright pixels. Finally, a thresholded image is used in ExCuSe to improve the edge image and refine the pupil edges. The result for the input image from Fig. 5f is shown in Fig. 5j.
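The projection-based coarse estimate can be sketched as follows (simplified to the two axis-aligned orientations on a dark mask; ExCuSe evaluates four orientations in \(45^\circ \) steps):

```python
import numpy as np

def coarse_center_projection(image, threshold):
    """Coarse pupil estimate from integral projections of the dark mask.

    Simplified illustration of the AIPF idea: sum the dark pixels along
    rows and columns and take the peak of each profile as the coarse
    center. Only the axis-aligned projection pair is shown here.
    """
    dark = (image <= threshold).astype(int)
    row_profile = dark.sum(axis=1)               # one value per image row
    col_profile = dark.sum(axis=0)               # one value per image column
    return int(np.argmax(col_profile)), int(np.argmax(row_profile))  # (x, y)

img = np.full((100, 100), 180, dtype=np.uint8)
yy, xx = np.mgrid[:100, :100]
img[(yy - 35) ** 2 + (xx - 55) ** 2 <= 10 ** 2] = 30   # dark pupil at (55, 35)
x, y = coarse_center_projection(img, threshold=60)
```

Because the projections aggregate over whole rows and columns, the estimate tolerates scattered noise far better than a per-pixel minimum, at the cost of only coarse localization.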
2.6 ElSe
Similar to ExCuSe, the ElSe algorithm operates on Canny edge filtered eye images. The pupil center is found in a decision-based approach as described in [6]. Based on the edge filtered image, edge connections that could impair the surrounding edge of the pupil are removed by means of morphologic operators (see Fig. 6b). Afterward, connected edges are collected and evaluated based on their straightness, inner intensity value, elliptic properties, the possibility to fit an ellipse to them, and a pupil plausibility check (see Fig. 6c). If a valid ellipse describing the pupil is found, it is returned as the result (see Fig. 6d).
In case no ellipse is found (e.g., when the edge filtering does not result in suitable edges), a second analysis is conducted. More specifically, as in ExCuSe, ElSe first estimates a likely location candidate and then refines this position. Since a computationally demanding convolution operation is required, the image is rescaled to keep the run-time tractable. This rescaling contains a low-pass procedure to preserve dark regions (see Fig. 6g) and to reduce the effect of blurring or noise caused by eyelashes in the eye image. Afterward, the image is convolved with two different filters separately: (1) a surface difference filter, which calculates the area difference between an inner circle and a surrounding box, and (2) a mean filter. The results of both convolutions are multiplied (see Fig. 6h), and the position of the maximum value is set as the starting point of the refinement step. Since choosing a pixel position in the downscaled image leads to a distance error of the pupil center in the full-scale image, the position has to be optimized on the full-scale image based on an intensity threshold derived from the pixels surrounding the chosen position (see Fig. 6i). The center of mass of the pixels below this threshold is used as the new pupil position (see Fig. 6j). Finally, this position is validated as a plausible pupil by analyzing the surface difference response of a validity pattern.
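The blob-detection pathway can be sketched as a brute-force filter response (an illustration only: a square inner window replaces ElSe's circular surface-difference filter, and the downscaling and validity check are omitted):

```python
import numpy as np

def blob_response(image, r=4):
    """Position of the strongest combined blob response, sketching ElSe's
    second pathway: a surface-difference response (dark inner window on a
    brighter surround) multiplied with a mean response on the inverted
    image, evaluated by brute force at every valid position.
    """
    inv = 255.0 - image.astype(float)            # dark regions -> high values
    h, w = inv.shape
    best, best_pos = -np.inf, None
    for y in range(2 * r, h - 2 * r):
        for x in range(2 * r, w - 2 * r):
            inner = inv[y - r:y + r + 1, x - r:x + r + 1].mean()
            outer = inv[y - 2 * r:y + 2 * r + 1, x - 2 * r:x + 2 * r + 1].mean()
            score = (inner - outer) * inner      # surface difference x mean
            if score > best:
                best, best_pos = score, (x, y)
    return best_pos

img = np.full((60, 60), 190, dtype=np.uint8)
yy, xx = np.mgrid[:60, :60]
img[(yy - 25) ** 2 + (xx - 40) ** 2 <= 5 ** 2] = 25    # dark pupil at (40, 25)
x, y = blob_response(img)
```

Multiplying the two responses rewards regions that are both darker than their surround and dark in absolute terms; the cost of the full convolution is what motivates ElSe's downscaling step.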
3 Data sets
3.1 Świrski
The data set introduced by Świrski et al. [30] in 2012 provides 600 manually labeled, high-resolution (640 \(\times \) 480 pixels) eye images. The data was collected during indoor experiments with 2 subjects and 4 different camera angles. The main challenges in pupil detection arise due to the highly off-axial camera positions and occlusions of the pupil by the eyelid.
3.2 ExCuSe
This data set was recently provided with the ExCuSe [5] algorithm and includes 38,401 high-quality, manually labeled eye images (384 \(\times \) 288 pixels) from 17 different subjects. Exemplary images are shown in Fig. 7. The first nine data sets in ExCuSe were recorded during an on-road driving experiment [13] using a head-mounted Dikablis mobile eye tracker. The remaining eight data sets were recorded during a supermarket search task [26]. These data sets are highly challenging, since illumination conditions change frequently. Furthermore, reflections on eyeglasses and contact lenses often occur (Table 1). The experiments were not specifically designed to pose challenges to pupil detection, but reflect typical data collected outside the laboratory.
3.3 ElSe
The Data sets XVIII–XXIV (Table 1; Fig. 7) were recently published with the ElSe algorithm in [6]. This data set contains a total of 55,712 eye images (384 \(\times \) 288 pixels) collected from seven subjects wearing a Dikablis eye tracker during various tasks. Data sets XVIII–XXII were derived from eye tracking recordings during an on-road driving experiment [13]. The remaining Data sets XXIII and XXIV were recorded during indoor experiments with two Asian subjects. Here, the challenge in pupil detection arises from eyelids and eyelashes occluding the pupil or casting shadows onto it (and, in one case, from glasses reflections). Further challenges associated with Data set XXIV are related to reflections on eyeglasses. The challenges in the eye images of Data sets XVIII, XIX, XX, XXI, and XXII are related to motion blur, reflections, and low pupil contrast in comparison with the surrounding area.
3.4 Labeled pupils in the wild (LPW)
The recent Labeled Pupils in the Wild (LPW) data set [32] contains 66 high-quality eye region videos that were recorded from 22 participants using a dark-pupil head-mounted eye tracker from Pupil Labs [15]. Each video in the data set consists of about 2000 frames with a resolution of 640 \(\times \) 480 pixels and was recorded at about 95 FPS, resulting in a total of 130,856 video frames. The data set is one order of magnitude larger than other data sets, covers a wide range of realistic indoor and outdoor illumination conditions, and includes participants wearing glasses and eye make-up as well as different ethnicities with variable skin tones, eye colors, and face shapes (see Fig. 8). All videos were manually annotated with accurate ground-truth pupil center positions.
4 Experimental results
We compared the algorithms Starburst [18], SET [8], Świrski et al. [30], Pupil Labs [15], ExCuSe [5], and ElSe [6] on the data sets from Table 1. All algorithms were employed with their default parameter settings. We report the performance of these algorithms in terms of the detection rate at different pixel errors, where the pixel error is the Euclidean distance between the manually labeled pupil center and the pupil center reported by the algorithm. Note that we do not report performance measures related to the gaze position in the scene, since these also depend on the calibration. We focus on the pupil center position in the eye images, where the first source of noise occurs.
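Under this metric, the detection rate at a given pixel error can be computed as follows (a straightforward sketch; the variable names and the convention of counting non-detections as misses are ours):

```python
import numpy as np

def detection_rate(predicted, ground_truth, max_pixel_error=5.0):
    """Fraction of frames whose predicted pupil center lies within
    `max_pixel_error` (Euclidean distance) of the hand-labeled center.
    Frames where the detector reported no pupil count as misses."""
    hits = 0
    for pred, gt in zip(predicted, ground_truth):
        if pred is None:
            continue                              # no detection -> miss
        if np.hypot(pred[0] - gt[0], pred[1] - gt[1]) <= max_pixel_error:
            hits += 1
    return hits / len(ground_truth)

gt = [(50, 50), (60, 40), (30, 30), (10, 80)]
pred = [(52, 51), (60, 40), None, (25, 90)]
rate = detection_rate(pred, gt, max_pixel_error=5)   # 2 of 4 frames are hits
```

Sweeping `max_pixel_error` over a range of values yields exactly the kind of detection-rate curves plotted in the following figures.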
Table 2 summarizes the performance of the evaluated algorithms on each data set. On 42 out of 47 data sets, ElSe [6] clearly outperformed the other state-of-the-art algorithms, making it the most promising approach toward robust pupil detection in heavily noisy eye images. The average detection rates of the evaluated algorithms on the whole image corpus (i.e., 225,569 ground-truth eye images from Table 1) are presented in Fig. 9. Note that the results are weighted by the number of images in each data set. As shown in the figure, ElSe shows superior performance, reaching an average detection rate of more than 60 % at a pixel distance error of 5.
A detailed performance analysis on each data set is visualized in Fig. 10. The highest detection rates are achieved on the Świrski et al. [30] data set. Since this data set was collected in a laboratory setting, it is the least challenging one, although most of the contained eye images are highly off-axial. On this data set, the algorithms ExCuSe, ElSe, and Świrski reach detection rates well above 70 % at a pixel distance of 5. With a detection rate of 86.17 % (Table 2), ExCuSe is the best performing algorithm among the state of the art.
The data sets ExCuSe, ElSe, and LPW provide a large corpus of eye images collected in outdoor scenarios and represent the various challenges that have to be faced when head-mounted eye trackers are employed in such settings. Figure 10b shows the evaluation results on the ExCuSe data set. On this data set, ElSe is the best performing algorithm with a detection rate of 70 % at a pixel error of 5. The ExCuSe algorithm also achieves good detection rates of about 55 %, whereas the remaining algorithms show detection rates below 30 %.
Due to the many sources of noise summarized in Table 1, the ElSe data set contains the most challenging eye images. The best detection rates (for a pixel error of 5) are achieved by the algorithms ElSe (50 %) and ExCuSe (35 %), while the remaining algorithms show detection rates of at most 10 %.
According to the evaluation results on the LPW data set (Fig. 10d), ElSe proves to be the most robust algorithm when employed in outdoor scenarios. At a pixel error of 5, ElSe shows a detection rate of 70 %. Good detection rates (50 %) are also achieved by the algorithms ExCuSe and Świrski, whereas the remaining approaches have detection rates below 40 %.
Figure 11 shows evaluation results for the most challenging data sets. Data set XIX in Fig. 11a is characterized by scattered reflections, which lead to edges on the pupil but not at its boundary. Since most of the state-of-the-art approaches are based on edge filtering, they are likely to fail in detecting the pupil boundary. In consequence, the detection rates achieved here are quite poor.
Data set XXI (Fig. 11b) poses challenges related to poor illumination conditions, leading to an iris with low intensity values. This makes it very difficult to separate the pupil from the iris (e.g., the responses during Canny edge detection are discarded because they are too low). Additionally, this data set contains reflections, which have a negative impact on the edge filter response. While the algorithms ElSe and ExCuSe achieve detection rates of approximately 45 %, the remaining approaches detect the pupil center in only 10 % of the eye images. Figure 12 presents examples of successfully found pupils in eye images from Data set XXI. The top row shows two input images, the middle row presents the filtered edges, and the bottom row shows the pupils detected by the ElSe algorithm. Among the evaluated algorithms, only ElSe [6] and ExCuSe [5] find the pupil in these eye images. The remaining algorithms fail due to the low contrast in the pupil area. More specifically, SET [8] fails since, in the thresholding step, large parts of the iris are extracted and identified as pupil area. Świrski et al. [30] fails in the coarse positioning step, whereas Starburst fails when selecting the correct edge candidates that represent the pupil border.
The eye images contained in Data set XXVIII (Fig. 11c) are recorded from a highly off-axial camera position. In addition, poor illumination makes it difficult to separate the pupil from the dark regions at the eyelid areas. Both conditions lead to overall poor detection rates. Figure 13 shows failure cases of ElSe [6] on eye images from Data sets XIX and XXVIII. To demonstrate the challenges associated with automated pupil detection in these images, we have chosen ElSe because it was the best performing algorithm. The left column presents the input images to the algorithms, the second column shows the filtered edges, the third column shows the blob responses, and the last column shows the results. The first two rows contain images from Data set XIX and show the high impact of scattered reflections (first row) and of curved edges induced by reflections (second row). In the second row, the third image is not present since ElSe [6] did not use blob detection. The last row shows an input image from Data set XXVIII, where the wrong blob response is due to eyelashes. Since the image is recorded highly off-axis, the pupil is only marginally visible.
The last challenging Data set XXIX (Fig. 11d) is also characterized by highly off-axial images. In addition, the frame of the subject’s glasses covers the pupil and most of the images are heavily blurred. This leads to unsatisfactory responses from the Canny edge detector. In consequence, the detection rates are very poor, e.g., ElSe (the best performing algorithm) detects the pupil in only 25 % of the eye images. Figure 14 shows failure cases of ElSe [6] on eye images from Data sets XXI and XXIX. The left column presents the input images, the second column shows the filtered edges, the third column shows the blob responses, and the last column shows the results obtained with the ElSe algorithm. The blob response of ElSe [6] in the top row is distracted by the high surface difference between the light shadow at the lower eyelid and the bright skin below it. For Data set XXIX, the main pupil recognition problems arise due to the bright eyeglasses frame, which distracts the blob response. In such failure cases, further improvements in automated pupil detection could come from explicitly considering additional eye-related features such as eyelids and eye corners.
5 Conclusions
We presented a review of state-of-the-art pupil detection algorithms for application in outdoor settings. The focus was primarily on the robustness of these algorithms with respect to frequently and rapidly changing illumination conditions, off-axial camera positions, and other sources of noise. Six state-of-the-art approaches were evaluated on over 200,000 ground-truth annotated images collected with different eye tracking devices in a range of different everyday settings. Our extensive evaluation shows that, despite good average performance of these algorithms on these challenging data sets, there are still problems in obtaining robust pupil centers in the case of reflections or poor illumination conditions.
References
Braunagel, C., Kasneci, E., Stolzmann, W., Rosenstiel, W.: Driver-activity recognition in the context of conditionally autonomous driving. In: 2015 IEEE 18th International Conference on Intelligent Transportation Systems, pp 1652–1657 (2015). doi:10.1109/ITSC.2015.268
Bulling, A., Ward, J.A., Gellersen, H., Tröster, G.: Eye movement analysis for activity recognition using electrooculography. IEEE Trans. Pattern Anal. Mach. Intell. 33(4), 741–753 (2011). doi:10.1109/TPAMI.2010.86
Bulling, A., Weichel, C., Gellersen, H.: Eyecontext: recognition of high-level contextual cues from human visual behaviour. In: Proceedings of the 31st SIGCHI International Conference on Human Factors in Computing Systems (CHI), pp. 305–308 (2013). doi:10.1145/2470654.2470697
Douglas, D.H., Peucker, T.K.: Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. Cartogr. Int. J. Geogr. Inf. Geovisualization 10(2), 112–122 (1973)
Fuhl, W., Kübler, T., Sippel, K., Rosenstiel, W., Kasneci, E.: ExCuSe: robust pupil detection in real-world scenarios. In: Azzopardi, G., Petkov, N. (eds.) Computer Analysis of Images and Patterns, Springer, New York, pp. 39–51 (2015)
Fuhl, W., Santini, T.C., Kübler, T., Kasneci, E.: ElSe: ellipse selection for robust pupil detection in real-world environments. In: Proceedings of the Ninth Biennial ACM Symposium on Eye Tracking Research & Applications, ACM, New York, NY, USA, ETRA ’16, pp. 123–130 (2016)
Goni, S., Echeto, J., Villanueva, A., Cabeza, R.: Robust algorithm for pupil-glint vector detection in a video-oculography eyetracking system. In: Pattern Recognition, 2004. Proceedings of the 17th International Conference on ICPR 2004. IEEE (2004)
Javadi, A.H., Hakimi, Z., Barati, M., Walsh, V., Tcheang, L.: Set: a pupil detection method using sinusoidal approximation. Front. Neuroeng. 8, 4 (2015)
Jian, M., Lam, K.M.: Simultaneous hallucination and recognition of low-resolution faces based on singular value decomposition. Circuits Syst. Video Technol. IEEE Trans. 25(11), 1761–1772 (2015)
Jian, M., Lam, K.M., Dong, J.: A novel face-hallucination scheme based on singular value decomposition. Pattern Recognit. 46(11), 3091–3102 (2013)
Jian, M., Lam, K.M., Dong, J.: Facial-feature detection and localization based on a hierarchical scheme. Inf. Sci. 262, 1–14 (2014)
Kasneci, E.: Towards the automated recognition of assistance need for drivers with impaired visual field. PhD thesis, University of Tübingen, Tübingen (2013). http://tobias-lib.uni-tuebingen.de/volltexte/2013/7033
Kasneci, E., Sippel, K., Aehling, K., Heister, M., Rosenstiel, W., Schiefer, U., Papageorgiou, E.: Driving with binocular visual field loss? A study on a supervised on-road parcours with simultaneous eye and head tracking. PLoS One 9(2), e87470 (2014)
Kasneci, E., Sippel, K., Heister, M., Aehling, K., Rosenstiel, W., Schiefer, U., Papageorgiou, E.: Homonymous visual field loss and its impact on visual exploration: A supermarket study. TVST 3(6), 2 (2014)
Kassner, M., Patera, W., Bulling, A.: Pupil: an open source platform for pervasive eye tracking and mobile gaze-based interaction. In: Adjunct Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp), pp. 1151–1160 (2014). doi:10.1145/2638728.2641695
Kasneci, E., Kasneci, G., Kübler, T.C., Rosenstiel, W.: Online recognition of fixations, saccades, and smooth pursuits for automated analysis of traffic hazard perception. In: Artificial Neural Networks: Methods and Applications in Bio-/Neuroinformatics, pp. 411–434. Springer International Publishing (2015). doi:10.1007/978-3-319-09903-320
Keil, A., Albuquerque, G., Berger, K., Magnor, M.A.: Real-time gaze tracking with a consumer-grade video camera In: Vaclav, S. (ed.) WSCG’2010, February 1–4, 2010, UNION Agency–Science Press, Plzen (2010)
Li, D., Winfield, D., Parkhurst, D.J.: Starburst: A hybrid algorithm for video-based eye tracking combining feature-based and model-based approaches. In: Computer Vision and Pattern Recognition-Workshops, 2005. IEEE Computer Society Conference on CVPR Workshops. IEEE, pp. 79–79 (2005)
Lin, L., Pan, L., Wei, L., Yu, L.: A robust and accurate detection of pupil images. In: 3rd International Conference on Biomedical Engineering and Informatics (BMEI), 2010, IEEE, vol. 1, pp. 70–74 (2010)
Liu, X., Xu, F., Fujimura, K.: Real-time eye detection and tracking for driver observation under various light conditions. In: Intelligent Vehicle Symposium, 2002, IEEE, vol. 2, pp. 344–351. IEEE (2002)
Long, X., Tonguz, O.K., Kiderman, A.: A high speed eye tracking system with robust pupil center estimation algorithm. In: Engineering in Medicine and Biology Society, 2007. EMBS 2007. 29th Annual International Conference of the IEEE. IEEE (2007)
Majaranta, P., Bulling, A.: Eye tracking and eye-based human–computer interaction. In: Advances in Physiological Computing. Springer, London (2014). doi:10.1007/978-1-4471-6392-3_3
Mohammed, G.J., Hong, B.R., Jarjes, A.A.: Accurate pupil features extraction based on new projection function. Comput. Inf. 29(4), 663–680 (2012)
Pérez, A., Cordoba, M.L., Garcia, A., Méndez, R., Munoz, M.L., Pedraza, J.L., Sanchez, F.: A precise eye-gaze detection and tracking system. In: Skala, V. (ed.) WSCG'2003: Posters: The 11th International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision 2003, pp. 105–108. UNION Agency, Plzen (2003)
Schnipke, S.K., Todd, M.W.: Trials and tribulations of using an eye-tracking system. In: CHI’00 Extended Abstracts on Human Factors in Computing Systems. ACM (2000)
Sippel, K., Kasneci, E., Aehling, K., Heister, M., Rosenstiel, W., Schiefer, U., Papageorgiou, E.: Binocular glaucomatous visual field loss and its impact on visual exploration—a supermarket study. PLoS One 9(8), e106089 (2014). doi:10.1371/journal.pone.0106089
Stellmach, S., Dachselt, R.: Look & touch: gaze-supported target acquisition. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 2981–2990. ACM (2012)
Sugano, Y., Bulling, A.: Self-calibrating head-mounted eye trackers using egocentric visual saliency. In: Proceedings of the 28th ACM Symposium on User Interface Software and Technology (UIST), pp. 363–372 (2015). doi:10.1145/2807442.2807445
Suzuki, S., et al.: Topological structural analysis of digitized binary images by border following. Comput. Vis. Graphics Image Process. 30(1), 32–46 (1985)
Świrski, L., Bulling, A., Dodgson, N.: Robust real-time pupil tracking in highly off-axis images. In: Proceedings of the Symposium on Eye Tracking Research & Applications (ETRA), pp. 173–176. ACM (2012). doi:10.1145/2168556.2168585
Tafaj, E., Kübler, T., Kasneci, G., Rosenstiel, W., Bogdan, M.: Online classification of eye tracking data for automated analysis of traffic hazard perception. In: Artificial Neural Networks and Machine Learning, ICANN 2013, vol. 8131, pp. 442–450. Springer, Berlin, Heidelberg (2013)
Tonsen, M., Zhang, X., Sugano, Y., Bulling, A.: Labelled pupils in the wild: a dataset for studying pupil detection in unconstrained environments. In: Proceedings of the ACM International Symposium on Eye Tracking Research & Applications (ETRA), pp. 139–142 (2016). doi:10.1145/2857491.2857520
Trösterer, S., Meschtscherjakov, A., Wilfinger, D., Tscheligi, M.: Eye tracking in the car: challenges in a dual-task scenario on a test track. In: Proceedings of the 6th AutomotiveUI. ACM (2014)
Turner, J., Bulling, A., Alexander, J., Gellersen, H.: Cross-device gaze-supported point-to-point content transfer. In: Proceedings of the ACM International Symposium on Eye Tracking Research & Applications (ETRA), pp. 19–26 (2014). doi:10.1145/2578153.2578155
Valenti, R., Gevers, T.: Accurate eye center location through invariant isocentric patterns. IEEE Trans. Pattern Anal. Mach. Intell. 34(9), 1785–1798 (2012)
Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple features. In: Computer Vision and Pattern Recognition, 2001. Proceedings of the 2001 IEEE Computer Society Conference on CVPR 2001, vol. 1, pp. I–511. IEEE (2001)
Wood, E., Bulling, A.: EyeTab: model-based gaze estimation on unmodified tablet computers. In: Proceedings of the 8th Symposium on Eye Tracking Research & Applications (ETRA), pp. 207–210 (2014). doi:10.1145/2578153.2578185
Zhu, D., Moore, S.T., Raphan, T.: Robust pupil center detection using a curvature algorithm. Comput. Methods Programs Biomed. 59(3), 145–157 (1999)
Acknowledgments
This work was funded, in part, by the Cluster of Excellence on Multimodal Computing and Interaction (MMCI) at Saarland University as well as a JST CREST research grant.
Fuhl, W., Tonsen, M., Bulling, A. et al. Pupil detection for head-mounted eye tracking in the wild: an evaluation of the state of the art. Machine Vision and Applications 27, 1275–1288 (2016). https://doi.org/10.1007/s00138-016-0776-4