Abstract
Image-recognition Human Interaction Proof (HIP) schemes are widely used security defense mechanisms that service providers deploy to determine whether a human user, and not malicious software, is interacting with their system. Inspired by recent research, which underpins the necessity of designing user-centered HIPs, this paper examines, within the framework of an accredited cognitive style theory (Field Dependence-Independence – FD-I), whether human cognitive differences in visual information processing affect users’ visual behavior when interacting with an image-recognition HIP challenge. To do so, we conducted an eye tracking study (n = 46) in which users solved an image-recognition HIP challenge. Analysis of users’ interactions and eye gaze data revealed differences in visual behavior and interactions between Holistic and Analytic users within image-recognition HIP tasks. The findings underpin the added value of considering users’ cognitive processing differences in the design of adaptive and adaptable HIP security schemes.
Keywords
- Image-recognition CAPTCHA
- Human interaction proof schemes
- Human cognitive differences
- Eye tracking study
1 Introduction
Human Interaction Proof (HIP) schemes (or Completely Automated Public Turing Tests to tell Computers and Humans Apart - CAPTCHA) are common and widely used security defense mechanisms in online services [1]. HIP schemes require users to prove, through a challenge-response test, that a human and not malicious software is interacting with the system, aiming to keep online services protected from malicious automated software agents [2]. The design of an efficient and effective HIP scheme involves an inevitable tradeoff between usability and security: increasing the difficulty of the HIP challenge improves the security of the mechanism but significantly decreases usability [3, 4, 24, 30]. Therefore, numerous works have focused on providing a better tradeoff between the usability and security of these mechanisms [24,25,26,27,28,29,30,31,32,33,34,35,36]. Current HIP implementations can be broadly categorized as text-recognition HIP schemes, which require users to recognize a set of distorted textual characters, and image-recognition HIP schemes, which require users to solve image puzzle problems (e.g., identify a set of images among a larger set) [5].
Nowadays, one of the most commonly used HIP schemes is Google’s reCAPTCHA (Fig. 1) [6], which aims to minimize users’ cognitive burden through implicit collection of user interaction data. In particular, the mechanism uses intelligent techniques to analyze users’ interaction data on a given Website to implicitly infer that a human is interacting with the service, without asking the user to solve a challenge-response test. Nonetheless, when the mechanism is not sufficiently confident in this inference, a fallback image-recognition task must be solved by the user. This fallback task typically splits an image into a 3 × 3 grid and asks the user to select the segments of the grid that contain the requested information (e.g., cars, traffic lights, cats).
Research Motivation.
From a cognitive processing perspective, the image-recognition HIP task requires visual information processing, and research indicates individual differences in such processing, suggesting that individuals have an inherent and preferred mode of processing information either holistically (globally) or analytically (locally) [9, 10]. Among the plethora of theories on cognitive processing differences, this work focuses on the field dependence-independence cognitive style theory [10], an accredited and widely applied model [11,12,13,14,15] that classifies individuals as Field Dependent (or Holistic) and Field Independent (or Analytic). Evidence suggests that Holistic and Analytic individuals differ in visual perception and visual information processing [7, 10,11,12,13,14,15]. While Holistic individuals view the perceptual field as a whole and are not attentive to detail, Analytic individuals view the information presented in their visual field as a collection of parts and tend to experience items as separate from their backgrounds.
Given that such human cognitive differences exist, we believe that current “one-size-fits-all” approaches employed in image-recognition fallback HIP schemes might favor a certain cognitive style group (Holistic vs. Analytic). Hence, in this paper, we investigate whether human cognitive differences in visual information processing influence users’ visual behavior when interacting with an image-recognition HIP task. To do so, we conducted an eye tracking study (n = 46) in which users solved an image-recognition HIP task. Analysis of the results revealed several main effects of human cognitive differences on user interaction and visual behavior in image-recognition HIP schemes.
2 User Study
2.1 Research Questions
- RQ1. Are there differences in time to solve the image-recognition HIP challenge between Holistic and Analytic users?
- RQ2. Are there differences in time to explore the image-recognition HIP between Holistic and Analytic users?
- RQ3. Are there differences in users’ visual behavior while exploring and solving the image-recognition HIP challenge between Holistic and Analytic users?
2.2 Study Instruments
Image-Recognition HIP Mechanism.
We developed a Web-based image-recognition HIP mechanism (Fig. 2), in which an image is segmented into a 3 × 3 grid of smaller parts. The task instructions are displayed above the grid and the submit button below it. To solve the challenge, users are asked to select all squares that contain the requested information (e.g., a window) and then click the submit button to validate their solution. If the provided solution is incorrect, an error message instructs users to retry.
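To make the mechanism’s validation step concrete, the following is a minimal, hypothetical sketch of the server-side check. The paper does not specify implementation details; the cell indexing, function names, and messages below are our own assumptions.

```python
def validate_hip_response(selected_cells, target_cells):
    """Hypothetical check for the 3x3 image-recognition HIP: the
    challenge is solved iff the user selected exactly the grid cells
    (indexed 0-8, row-major) that contain the requested information."""
    return set(selected_cells) == set(target_cells)


def submit_solution(selected_cells, target_cells):
    """Mimic the described feedback loop: accept a correct solution,
    otherwise return an error message instructing the user to retry."""
    if validate_hip_response(selected_cells, target_cells):
        return "Solution accepted"
    return "Incorrect solution, please retry"
```

Selection order does not matter, so the comparison is set-based; a real deployment would additionally bind the check to a server-side session to prevent replay.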
Apparatus.
The study was conducted using an All-in-One HP personal computer with a 24” monitor at a screen resolution of 1920 × 1080 pixels. To capture the eye gaze metrics, we used the Gazepoint GP3 video-based eye tracker [16]. No equipment was attached to the participants.
Eye Gaze Metrics.
Following common practices, we selected fixation count as suggested in [8, 17], which is the total number of fixations during which the eyes of a user focus on a certain item within the surroundings.
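For readers unfamiliar with the metric, fixation count can be derived from raw gaze samples with a dispersion-threshold (I-DT) pass. The sketch below is illustrative only: the sample format and the dispersion and duration thresholds are our assumptions, not those of the eye tracking software used in the study.

```python
def count_fixations(samples, max_dispersion=35.0, min_duration=0.1):
    """Count fixations in a gaze stream with a dispersion-threshold pass.

    `samples` is a list of (t, x, y) tuples (seconds, pixels). A fixation
    is a maximal window whose combined x/y dispersion stays below
    `max_dispersion` pixels for at least `min_duration` seconds.
    Threshold values are illustrative defaults."""
    fixations = 0
    i, n = 0, len(samples)
    while i < n:
        j = i
        # Grow the window while its spatial dispersion stays small.
        while j + 1 < n:
            window = samples[i:j + 2]
            xs = [x for _, x, _ in window]
            ys = [y for _, _, y in window]
            if (max(xs) - min(xs)) + (max(ys) - min(ys)) > max_dispersion:
                break
            j += 1
        # Count the window as a fixation only if it lasted long enough;
        # shorter windows are treated as saccade samples and skipped.
        if samples[j][0] - samples[i][0] >= min_duration:
            fixations += 1
            i = j + 1
        else:
            i += 1
    return fixations
```

Each maximal low-dispersion window that lasts long enough counts as one fixation; a sudden jump in gaze position (a saccade) terminates the window and starts a new search.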
Human Cognitive Factor Elicitation.
Users’ holistic and analytic characteristics were measured through the Group Embedded Figures Test (GEFT) [18], which is a widely accredited and validated paper-and-pencil test [11,12,13,14,15]. The test measures the user’s ability to find common geometric shapes in a larger design. The GEFT consists of 25 items. In each item, a simple geometric figure is embedded within a complex pattern, and participants are required to identify the simple figure by drawing it with a pencil over the complex figure. Based on a widely applied cut-off score, participants that solve fewer than 12 items are considered to have a holistic cognitive style, while participants that solve 12 or more items are considered to have an analytic cognitive style.
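The resulting classification rule is a simple threshold on the GEFT score; a one-line sketch (the cut-off value of 12 comes from the text above, the function name is our own):

```python
def classify_cognitive_style(geft_score, cutoff=12):
    """Classify a participant from their GEFT score: a score below the
    cut-off indicates a Holistic (Field Dependent) style, a score at or
    above it an Analytic (Field Independent) style."""
    return "Analytic" if geft_score >= cutoff else "Holistic"
```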
2.3 Sampling and Procedure
Participants.
We recruited 46 participants, all undergraduate university students. Two users were outliers without sufficient eye tracking measures and were thus excluded from the analysis, resulting in a final dataset of 44 users. To increase the internal validity of the study, we recruited participants that had no prior experience with image-recognition HIP schemes, as assessed in a post-study interview.
Experimental Design and Procedure.
We adopted the University’s human research protocol, which takes into consideration users’ privacy, confidentiality and anonymity. All participants performed the task in a quiet lab room with only the researcher present. To avoid any experimental bias effects, no details regarding the research objective were revealed to the participants until the end of the study. The user study involved the following steps: i) participants were informed that the data collected during interaction with the HIP mechanism would be stored anonymously and used only for research purposes; ii) users signed a consent form and completed a demographics questionnaire; iii) an eye-calibration process followed; and iv) participants were then requested to solve an image-recognition HIP challenge in order to access an online service. Aiming to increase the ecological validity of the user study, we framed the HIP challenge as a secondary task of user interaction. Finally, a post-study interview was conducted to gain further insights into the users’ interactions and experiences with the HIP scheme.
3 Analysis of Results
Data are presented as mean ± standard deviation, unless otherwise stated. Two significant outliers, identified by inspection of a boxplot, were excluded from the analysis. Figures 3, 4 and 5 summarize the results: the time to solve the image-based challenge, the time to explore the image-based challenge, and the number of fixations during user interaction with the image-based challenge, respectively.
3.1 Differences in Time to Solve the Image-Recognition HIP Challenge Between Holistic and Analytic Users
To investigate RQ1, an independent-samples t-test was run to determine if there were differences in time to solve the HIP task between Holistic and Analytic users (Fig. 3). There was homogeneity of variances, as assessed by Levene’s test for equality of variances (p = .058). Results revealed that Analytic users needed more time to solve the HIP task (9.78 ± 5.45 s) than Holistic users (8.30 ± 3.60 s); however, this difference of 1.47 s was not statistically significant (95% CI, −4.25 to 1.29), t(42) = −1.077, p = .28.
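The statistic underlying this comparison can be recomputed from group summaries alone. Below is a stdlib-only sketch of the pooled-variance (Student’s) independent-samples t-statistic, which is appropriate here because Levene’s test indicated homogeneity of variances; note that the per-group sample sizes are not reported in the paper, so we do not attempt to reproduce t(42) exactly.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's independent-samples t-statistic with pooled variance.
    Assumes homogeneity of variances (as checked with Levene's test).
    Returns the t value and its degrees of freedom (n1 + n2 - 2)."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    t = (mean1 - mean2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, df
```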
3.2 Differences in Time to Visually Explore the Image During Solving the Image-Recognition HIP Challenge Between Holistic and Analytic Users
To investigate RQ2, an independent-samples t-test was run to determine if there were differences in time to visually explore the image between Holistic and Analytic users (Fig. 4). There was homogeneity of variances, as assessed by Levene’s test for equality of variances (p = .246). In line with time to solve, results revealed that Analytic users spent more time exploring the image (7.33 ± 4.09 s) than Holistic users (5.19 ± 2.95 s), a marginally significant difference of 2.13 s (95% CI, −4.28 to 10.75), t(42) = −2.008, p = .051.
3.3 Differences in Eye Gaze Behavior During Solving the Image-Recognition HIP Challenge Between Holistic and Analytic Users
To investigate RQ3, we conducted two analyses, with the total number of fixations and the number of revisits on fixations as the dependent variables. We first investigated whether there were differences in the total number of fixations between Holistic and Analytic users (Fig. 5). A Welch t-test was run because the assumption of homogeneity of variances was violated (p = .001). Results revealed that Analytic users generated more fixations while exploring the image (29.45 ± 14.31) than Holistic users (20.2 ± 6.48), a statistically significant difference of 9.24 (95% CI, −15.81 to −2.66), t(25.448) = −2.669, p = .013. We further ran a Welch t-test to determine if there were differences in the number of AOI (Area of Interest) revisits between Holistic and Analytic users, again because the assumption of homogeneity of variances was violated (p < .001). Results revealed that Analytic users had more AOI revisits while exploring the image (17.65 ± 10.84) than Holistic users (10.12 ± 4.36), a statistically significant difference of 7.52 (95% CI, −12.4 to −2.64), t(24.114) = −2.911, p = .008.
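When homogeneity of variances is violated, as here, the Welch variant replaces the pooled variance with per-group squared standard errors and adjusts the degrees of freedom via the Welch–Satterthwaite approximation, which is why the reported df are fractional (e.g., 25.448). A stdlib-only sketch:

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t-statistic for two independent samples with unequal
    variances, plus Welch-Satterthwaite degrees of freedom."""
    se1 = sd1 ** 2 / n1  # squared standard error, group 1
    se2 = sd2 ** 2 / n2  # squared standard error, group 2
    t = (mean1 - mean2) / math.sqrt(se1 + se2)
    # Welch-Satterthwaite approximation: generally non-integer df.
    df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, df
```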
4 Main Findings
The analysis of results revealed several main effects of human cognitive differences (holistic vs. analytic) on user interaction and visual behavior in image-recognition HIP schemes. Next, we summarize the main findings of the study.
Finding A.
Analytic users required more time to solve the image-recognition HIP challenge compared to Holistic users (95% CI, −4.25 to 1.29; t(42) = −1.077, p = .28), although this difference was not statistically significant. The tendency can be attributed to their analytical approach to information processing, since Analytic users visually explored and processed more attention points than Holistic users.
Finding B.
Analytic users spent marginally significantly more time visually exploring the image-recognition HIP challenge compared to Holistic users (95% CI, −4.28 to 10.75; t(42) = −2.008, p = .051). This finding is in line with [11], which suggested similar effects in image-recognition graphical authentication schemes.
Finding C.
Analytic users fixated cumulatively on more attention points (95% CI, −15.81 to −2.66; t(25.448) = −2.669, p = .013) and had significantly more revisits of attention points than Holistic users (95% CI, −12.4 to −2.64; t(24.114) = −2.911, p = .008). This can be explained by their analytical approach to visual information processing, which generated more fixations than that of Holistic users, who followed a more global approach when viewing the image grid.
5 Conclusions and Future Work
This paper presents the results of a cognitive-centered research endeavor, which investigated human cognitive differences in information processing and their effects on users’ visual behavior and interaction in image-recognition HIP schemes. For this purpose, an eye tracking study was designed, which entailed a psychometric-based survey for eliciting users’ cognitive processing characteristics, and an ecologically valid interaction scenario with an image-recognition HIP task.
The findings underpin the value of considering human cognitive differences as an important human factor, at both design and run-time, to implement more effective HIP mechanisms and to avoid deploying image-recognition HIP schemes that unintentionally favor a specific group of users based on the designer’s decisions. Specifically, results revealed that Analytic users spent more time interacting with and exploring the image-recognition HIPs, and generated significantly more fixations during interaction, compared to Holistic users. This can be explained by the Analytic users’ inherent way of processing information through local information processing streams and paying more attention to detail.
Despite our efforts to preserve the validity of the study, some design aspects of the experiment introduce limitations. First, we used a specific background image. Although users’ choices may be affected by the content and complexity of the image [22, 23], we provided images from the most widely used image categories (depicting specific scenery and people [19,20,21]). Future work will consider a greater variety of image categories in order to increase the validity of the study. Moreover, given the controlled in-lab nature of the eye tracking study, the users’ visual behavior and performance might have been influenced; however, no such comment was received from our participants in the informal discussions that followed task completion.
References
von Ahn, L., Blum, M., Langford, J.: Telling humans and computers apart automatically. Commun. ACM 47, 56–60 (2004)
Chellapilla, K., Larson, K., Simard, P., Czerwinski, M.: Designing human friendly human interaction proofs (HIPs). In: ACM CHI 2005, pp. 711–720. ACM (2005)
Golle, P.: Machine learning attacks against the Asirra CAPTCHA. In: ACM Conference on Computer and Communications Security (CCS 2008), pp. 535–542. ACM (2008)
Bursztein, E., Martin, M., Mitchell, J.: Text-based CAPTCHA strengths and weaknesses. In: ACM Computer and Communications Security (CCS 2011), pp. 125–138. ACM (2011)
Belk, M., Fidas, C., Germanakos, P., Samaras, G.: Do human cognitive differences in information processing affect preference and performance of captcha? J. Hum.-Comput. Stud. 84, 1–18 (2015)
reCAPTCHA. Online: https://www.google.com/recaptcha/about
Constantinides, A., Pietron, A., Belk, M., Fidas, C., Han, T., Pitsillides, A.: A cross-cultural perspective for personalizing picture passwords. In: ACM User Modeling, Adaptation and Personalization (UMAP 2020), pp. 43–52. ACM (2020)
Constantinides, A., Fidas, C., Belk, M., Pietron, A.M., Han, T., Pitsillides, A.: From hot-spots towards experience-spots: leveraging on users’ sociocultural experiences to enhance security in cued-recall graphical authentication. Int. J. Hum.-Comput. Stud., 149 (2021). https://doi.org/10.1016/j.ijhcs.2021.102602
Davidoff, J., Fonteneau, E., Fagot, J.: Local and global processing: observations from a remote culture. Cognition 108(3), 702–709 (2008)
Witkin, H.A., Moore, C.A., Goodenough, D.R., Cox, P.W.: Field–dependent and field–independent cognitive styles and their educational implications. ETS Res. Bull. Series 2, 1–64 (1975)
Belk, M., Fidas, C., Katsini, C., Avouris, N., Samaras, G.: Effects of human cognitive differences on interaction and visual behavior in graphical user authentication. In: Bernhaupt, R., Dalvi, G., Joshi, A., K. Balkrishan, D., O’Neill, J., Winckler, M. (eds.) INTERACT 2017. LNCS, vol. 10515, pp. 287–296. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-67687-6_19
Hong, J., Hwang, M., Tam, K., Lai, Y., Liu, L.: Effects of cognitive style on digital jigsaw puzzle performance: a GridWare analysis. Comput. Hum. Behav. 28(3), 920–928 (2012)
Rittschof, K.A.: Field dependence-independence as visuospatial and executive functioning in working memory: implications for instructional systems design and research. Educ. Technol. Res. Dev. 58(1), 99–114 (2010)
Angeli, C., Valanides, N., Kirschner, P.: Field dependence-independence and instructional-design effects on learners’ performance with a computer-modeling tool. Comput. Hum. Behav. 25(6), 1355–1366 (2009)
Belk, M., Fidas, C., Germanakos, P., Samaras, G.: The interplay between humans, technology and user authentication: a cognitive processing perspective. Comput. Hum. Behav. 76, 184–200 (2017)
GP3 Eye Tracker. Online. https://www.gazept.com
Constantinides, A., Belk, M., Fidas, C., Pitsillides, A.: On the accuracy of eye gaze-driven classifiers for predicting image content familiarity in graphical passwords. In: ACM UMAP 2019, pp. 201–205. ACM (2019)
Witkin, H.A., Oltman, P., Raskin, E., Karp, S.: A Manual for the Embedded Figures Test. Consulting Psychologists Press, Palo Alto, CA (1971)
Alt, F., Schneegass, S., Shirazi, A.S., Hassib, M., Bulling, A.: Graphical passwords in the wild: understanding how users choose pictures and passwords in image-based authentication schemes. In: ACM MobileHCI 2015, pp. 316–322 (2015)
Dunphy, P., Yan, J.: Do background images improve draw a secret graphical passwords? In: ACM Computer and Communications Security, pp. 36–47. ACM (2007)
Zhao, Z., Ahn, G., Hu, H.: Picture gesture authentication: empirical analysis, automated attacks, and scheme evaluation. ACM Trans. Inf. Syst. Secur. (TISSEC) 17(4), Article 14 (2015)
Wiedenbeck, S., Waters, J., Birget, J.C., Brodskiy, A., Memon, N.: Authentication using graphical passwords: Effects of tolerance and image choice. In: ACM Symposium on Usable privacy and security, pp. 1–12. ACM (2005)
Katsini, C., Fidas, C., Raptis, G. E., Belk, M., Samaras, G., Avouris, N.: Influences of human cognition and visual behavior on password strength during picture password composition. In: CHI 2018, p. 87. ACM (2018)
Fidas, C., Voyiatzis, A., Avouris, N.: On the necessity of user-friendly CAPTCHA. In: ACM CHI 2011, pp. 2623–2626. ACM (2011)
Belk, M., Germanakos, P., Fidas, C., Holzinger, A., Samaras, G.: Towards the personalization of CAPTCHA mechanisms based on individual differences in cognitive processing. In: Human Factors in Computing and Informatics (SouthCHI 2013), pp. 409–426. Springer (2013)
Belk, M., Fidas, C., Germanakos, P., Samaras, G.: Do cognitive styles of users affect preference and performance related to CAPTCHA challenges? In: CHI 2012 Extended Abstracts on Human Factors in Computing Systems (CHI EA 2012), pp. 1487–1492. ACM (2012)
Elson, J., Douceur, J., Howell, J., Saul, J.: Asirra: a CAPTCHA that Exploits interest-aligned manual image categorization. In: Proceedings of the International Conference on Computer and Communications Security (CCS 2007), pp. 366–374. ACM (2007)
Belk, M., Germanakos, P., Fidas, C., Spanoudis, G., Samaras, G.: Studying the Effect of Human Cognition on Text and Image Recognition CAPTCHA Mechanisms. HCI 27, 71–79 (2013)
Vikram, S., Fan, Y., Gu, G.: SEMAGE: a new image-based two-factor CAPTCHA. In: ACM Conference on Computer Security Applications (CCS 2011), pp. 237–246. ACM (2011)
Fidas, C., Hussmann, H., Belk, M., Samaras, G.: IHIP: towards a user centric individual human interaction proof framework. In: CHI ‘15 Extended Abstracts on Human Factors in Computing Systems (CHI EA ‘15), pp. 2235–2240. ACM (2015)
Gossweiler, R., Kamvar, M., Baluja, S.: What’s up CAPTCHA?: a CAPTCHA based on image orientation. In: ACM World Wide Web (WWW 2009), pp. 841–850. ACM (2009)
Tanthavech, N., Nimkoompai, A.: CAPTCHA: Impact of website security on user experience. In: Proceedings of the 2019 4th International Conference on Intelligent Information Technology (ICIIT 2019), pp. 37–41. ACM (2019)
Sim, T., Nejati, H., Chua, J.: Face recognition CAPTCHA made difficult. In: Proceedings of the 23rd International Conference on World Wide Web (WWW 2014 Companion), pp. 379–380. ACM (2014)
Shishkin, A., Bezzubtseva, A., Fedorova, V., Drutsa, A., Gusev, G.: Text recognition using anonymous CAPTCHA answers. In: ACM Web Search and Data Mining (WSDM 2020), pp. 537–545. ACM (2020)
Lazar, J., et al.: The SoundsRight CAPTCHA: an improved approach to audio human interaction proofs for blind users. In: ACM Conference on Human Factors in Computing Systems (CHI 2012), pp. 2267–2276. ACM (2012)
Jiang, N., Tian, F.: A novel gesture-based CAPTCHA design for smart devices. In: BCS Human Computer Interaction Conference (BCS-HCI ‘13). BCS Learning & Development Ltd., Swindon, GBR, Article 49, pp. 1–5 (2013)
Acknowledgements
This research has been partially funded by the EU Horizon 2020 Grant 826278 “Securing Medical Data in Smart Patient-Centric Healthcare Systems” (Serums), and the Research and Innovation Foundation (Project DiversePass: COMPLEMENTARY/0916/0182).
© 2021 Springer Nature Switzerland AG
Leonidou, P., Constantinides, A., Belk, M., Fidas, C., Pitsillides, A. (2021). Eye Gaze and Interaction Differences of Holistic Versus Analytic Users in Image-Recognition Human Interaction Proof Schemes. In: Moallem, A. (eds) HCI for Cybersecurity, Privacy and Trust. HCII 2021. Lecture Notes in Computer Science(), vol 12788. Springer, Cham. https://doi.org/10.1007/978-3-030-77392-2_5
DOI: https://doi.org/10.1007/978-3-030-77392-2_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-77391-5
Online ISBN: 978-3-030-77392-2