1 Introduction

Computer security systems encompass concepts and methods for the protection of sensitive information. In this context, user authentication is an essential security task performed daily by millions of users. Traditional solutions employ text-based passwords, which require users to memorize a sequence of alphanumeric characters. However, memorizing strong text-based passwords results in increased cognitive load and often leads to poor usability and limited security [1]. To offer a better tradeoff between security and usability, prior works proposed various picture password schemes [3], which require users to complete a picture-based task to authenticate.

An important interface design factor that affects both the security [4, 5, 7] and usability [9, 10, 11, 13] of picture password schemes is the background picture(s) used [4, 5, 15]. Several studies have investigated various picture content types, which can be broadly categorized as generic (i.e., not familiar to the users, e.g., stock, landscape, or abstract pictures) and personal (i.e., highly familiar to the users, e.g., depicting scenes, people, or objects they know well), and reported their effects on the security and memorability of user-chosen picture passwords. In particular, generic pictures negatively impact both the security and memorability of picture passwords [4, 15], while personal pictures negatively impact security but increase memorability [15, 16]. In an attempt to achieve a better tradeoff between security and memorability, more recent works investigated and proposed the use of personalized familiar pictures, which are bootstrapped to the users’ prior sociocultural activities, experiences and explicit memories [18, 19, 34, 35], revealing a positive impact on security without hampering the memorability of picture passwords [18, 19]. Nevertheless, such personalized picture delivery approaches might be susceptible to attacks performed by insiders [21, 22] (i.e., people close to the user, such as family members or acquaintances) who share with the user the experiences depicted in the familiar pictures.

Given that picture password authentication is a visual search task, eye-tracking technology could be used to shed light on how a legitimate user’s gaze path relates to an insider attacker’s gaze path, and eventually to infer whether the person attempting to log in is the legitimate user or an insider attacker close to the legitimate user. While attempts have been made towards improving and estimating the security of authentication schemes using eye-tracking technology [15, 23, 24, 25, 27, 28], to the best of the authors’ knowledge, no research has attempted to estimate the legitimacy of the user authenticating in a personalized picture password scheme that leverages users’ prior sociocultural activities, experiences and explicit memories. This work presents the initial findings of applying an eye gaze-driven metric for unobtrusively estimating the legitimacy of the person authenticating in a personalized picture password scheme by analyzing the users’ eye gaze behavior during login.

2 Related Work

2.1 Picture Content in Picture Passwords

Prior works investigated the use of picture semantics and their effects on the security and memorability of user-chosen picture passwords. Pictures can be broadly categorized as generic (i.e., not directly relevant nor familiar to the users, e.g., abstract, nature, landscapes, etc.) or personal (i.e., directly relevant and highly familiar to the users, e.g., depicting people, objects, or scenes highly personal to users). The use of generic picture content has a negative impact on both the security and memorability of the user-chosen passwords. Studies in [4, 15] revealed that various generic pictures are susceptible to hotspots (i.e., certain points on a picture that are more likely to be selected by users), which leads to the creation of predictable passwords that are prone to automated attacks [30]. From the memorability perspective, generic picture content leads to decreased memorability since users experience difficulties in creating strong connections between their episodic memories and the depicted content [16, 31]. The use of personal picture content also impacts the security and memorability of picture passwords. From the security perspective, the use of pictures that are familiar to the user increases the likelihood of certain areas on the picture to be chosen as part of the password [15]. However, from the memorability perspective, the use of personal pictures leads to increased memorability possibly due to familiarity of users with the depicted picture content [16]. More recent works investigated the use of personalized familiar pictures, which are bootstrapped to the users’ prior sociocultural activities, experiences and explicit memories, revealing a positive impact on the security without hampering the memorability of picture passwords [18, 19].

2.2 Eye Gaze in User Authentication

Eye-tracking technology has been widely used in the context of user authentication. Best and Duchowski [23] proposed a rotary interface for gaze-based PIN code entry during user authentication, while Bulling et al. [15] proposed to hide potential picture hotspots using saliency maps. A study conducted by Sluganovic et al. [27] revealed that the reflexive physiological behavior of human eyes can be used to build fast and reliable biometric authentication systems. More recent works employed eye gaze data for predicting image content familiarity in picture password schemes [36], as well as for understanding how individuals make their picture password selections [26]. Moreover, works in [24, 28] proposed eye gaze-driven security metrics for estimating the strength of picture passwords.

3 Eye-Tracking Study

Because password selections in the personalized picture password approach are based on the users’ existing sociocultural experiences, such approaches might be susceptible to attacks performed by insiders [21, 22] (i.e., people close to the user, such as family members or acquaintances) with whom the user shares common experiences. To shed light on this aspect, we conducted an in-lab eye-tracking human attack study focusing on attacks performed by insiders among people sharing common sociocultural experiences. Each session of the study involved a pair of participants who were closely related (e.g., friends, couples, relatives, etc.) and who shared common experiences. In each session, both participants were first requested to create a picture password, and then each participant was requested to guess the password selections of the other participant.

3.1 Research Question

RQ. Is there a significant difference in users’ visual behavior between legitimate users and insider attackers when authenticating in a picture password scheme that employs personalized picture content?

Fig. 1. A subset of pictures used in the human attack study illustrating sceneries in which participants share common experiences.

3.2 Study Instruments and Metrics

Picture Password Authentication Scheme. We implemented a Web-based picture password scheme, similar to Windows 10™ PGA [32], in which users create picture passwords consisting of three gestures (any combination of taps, lines, and circles). The picture is divided into a grid containing 100 segments on the longest side, scaled accordingly on the shortest side. The mechanism allows for a tolerance distance in terms of grid coordinates: 36 segments around each initially selected segment are acceptable [13], thus forming a circle with a radius of 3 segments. This tolerance accommodates small inaccuracies in users’ selections during login. However, there is no tolerance regarding the ordering, type, and directionality of the gestures.
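The grid mapping and the distance tolerance described above can be sketched in Python as follows. This is a minimal illustration rather than the scheme's actual implementation: the function names, the use of square cells, and the Euclidean distance test are assumptions, and the check covers positions only (ordering, type, and directionality of gestures would have to match exactly and are not modeled here).

```python
import math

def to_segment(x_px, y_px, width_px, height_px, segments_longest=100):
    """Map a pixel coordinate to a (column, row) grid segment.

    Assumption: cells are square, sized so that the longest side of the
    picture holds `segments_longest` segments.
    """
    cell = max(width_px, height_px) / segments_longest
    return int(x_px // cell), int(y_px // cell)

def within_tolerance(selected, attempt, radius=3):
    """True if `attempt` lies within `radius` segments of `selected`.

    Assumption: the tolerance circle is tested with Euclidean distance
    between segment coordinates.
    """
    return math.dist(selected, attempt) <= radius

# Hypothetical enrollment and login selections on a 1920 x 1080 picture.
enrolled = to_segment(1000, 560, 1920, 1080)
login = to_segment(1030, 590, 1920, 1080)
print(enrolled, login, within_tolerance(enrolled, login))
```

A larger `radius` makes login more forgiving but also enlarges the area an attacker can hit, which is why the tolerance is a security-relevant parameter.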

Picture Content.

To control participants’ sociocultural familiarity with the picture semantics and thus investigate the research question, we adjusted the picture semantics to reflect participants’ shared, individual and common sociocultural experiences from their daily life context (i.e., working places in the case of colleagues, cafés/bars in which couples or close friends usually hang out, etc.), as depicted in Fig. 1. To do so, prior to the study, we asked each pair of participants to provide a set of pictures of places in which they share common experiences. To avoid bias effects, we did not tell the participants why we had asked for the pictures until the end of the study. The choice of picture sets was informed by existing research showing that users tend to select pictures illustrating sceneries [5, 8, 33].

Considering that the number of hotspots and the picture complexity affect the password strength [6, 13], we chose pictures with a similar number of hotspots and similar complexity. To do so, we followed a semi-automated approach to detect the hotspot regions through a combination of computer vision techniques for object detection and saliency filters [12]. Furthermore, we assessed the equivalence of the two picture sets by calculating the picture complexity using entropy estimators [29].
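To give a sense of what an entropy-based complexity estimate looks like, the sketch below computes the Shannon entropy of a grayscale intensity histogram. This is one common estimator and not necessarily the one used in [29]; the function name and the flat-list pixel format are assumptions made for illustration.

```python
import math
from collections import Counter

def intensity_entropy(pixels):
    """Shannon entropy (in bits) of a grayscale intensity histogram.

    `pixels` is a flat, non-empty sequence of intensity values
    (hypothetical format chosen for this sketch).
    """
    counts = Counter(pixels)
    total = len(pixels)
    probs = (c / total for c in counts.values())
    return sum(p * math.log2(1 / p) for p in probs)

# Hypothetical toy "pictures": a flat one and a two-tone one.
flat_picture = [128] * 16
two_tone_picture = [0] * 8 + [255] * 8
print(intensity_entropy(flat_picture), intensity_entropy(two_tone_picture))
```

Pictures whose estimates are close can then be treated as comparable in complexity, subject to whatever equivalence threshold the analyst chooses.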

Equipment and Eye Gaze Metrics.

An All-in-One HP computer with a 24″ monitor was used (1920 × 1080 pixels, 16:9 aspect ratio). To capture eye movements, we used the Gazepoint GP3 eye tracker, which captures data at 60 Hz and was calibrated following the manufacturer’s guidelines. No equipment was attached to the participants. Following existing approaches for capturing the variability of users’ eye movement characteristics within picture password schemes [24, 28], we relied on the gaze transition entropy proposed by Krejtz et al. [14]. In particular, we estimated the stationary entropy Hs, which captures the distribution of fixations over the stimulus (i.e., over the areas of interest (AOIs) to which the eye-tracking metrics are applied). Greater values of Hs occur when visual attention is distributed more equally among AOIs, while lower values of Hs indicate that fixations tend to be concentrated on certain AOIs. Stationary entropy Hs was computed using Shannon’s entropy equation:

$$ H_{s}(X) = \sum_{i=1}^{N} p_{i} \log_{2}\left(\frac{1}{p_{i}}\right) \tag{1} $$

where X is the set of fixations for each user, N is the number of available AOIs, and p_i is the probability of a user fixating on AOI i. Considering that fixation duration correlates with cognitive processing [17, 20], and that users who exhibit longer fixations on AOIs tend to select them [2], the probability p_i is computed as follows:

$$ p_{i} = \frac{d_{i}}{N}, \quad \sum_{i=1}^{N} p_{i} = 1 \tag{2} $$

where d_i represents the total fixation duration on AOI i. By applying Eq. (2) to Eq. (1), the entropy of fixations is computed as follows:

$$ H_{s}(X) = \sum_{i=1}^{N} \frac{d_{i}}{N} \log_{2}(N) \tag{3} $$

In this study, N = 3, i.e., the picture is divided into three vertical AOIs [14].
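As an illustration of the Shannon entropy in Eq. (1), the following Python sketch computes a stationary entropy from a list of fixations. It is not the study's implementation: the function name, the (x, duration) fixation format, the equal-width vertical strips used as AOIs, and the normalization of each AOI's total fixation duration by the overall fixation duration (so that the probabilities sum to 1) are all assumptions made for illustration.

```python
import math

def stationary_entropy(fixations, stimulus_width, n_aois=3):
    """Shannon entropy (bits) of duration-weighted fixations over vertical AOIs.

    fixations: iterable of (x_px, duration) pairs (hypothetical format).
    Assumption: AOIs are equal-width vertical strips, and p_i is the share
    of total fixation duration that falls on AOI i.
    """
    durations = [0.0] * n_aois
    strip = stimulus_width / n_aois
    for x, duration in fixations:
        aoi = min(int(x // strip), n_aois - 1)  # clamp the right edge
        durations[aoi] += duration
    total = sum(durations)
    if total == 0:
        return 0.0
    probs = [d / total for d in durations]
    # Terms with p_i = 0 contribute nothing and are skipped.
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

# Hypothetical gaze samples: one concentrated, one spread across AOIs.
concentrated = [(100, 0.4), (150, 0.6), (200, 0.5)]
spread = [(100, 0.5), (900, 0.5), (1800, 0.5)]
print(stationary_entropy(concentrated, 1920), stationary_entropy(spread, 1920))
```

Under these assumptions, gaze concentrated on a single strip yields a low value, while gaze spread evenly across the strips yields a higher one, matching the qualitative interpretation of Hs given in the text.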

3.3 Sampling and Procedure

Participants. A total of 18 individuals (9 females) participated in the study, ranging in age from 25 to 60 years (m = 41.43, sd = 11.88). Since the purpose of this study was to understand whether there are differences between legitimate users’ and insiders’ visual behavior, we intentionally recruited pairs of participants who are close to each other (3 couples, 3 pairs of close friends, 3 pairs of colleagues). To increase the internal validity of the study, we recruited participants with no prior experience with picture password authentication mechanisms; a post-study interview was used to identify and exclude any participants with prior knowledge of picture passwords.

Experimental Design and Procedure.

Participation in the study was anonymized to ensure privacy compliance with the EU General Data Protection Regulation. Participants were informed that the collected data would be analyzed for research purposes only. We also took all necessary measures against Covid-19 to ensure the participants’ safety. The study was conducted in a quiet lab room with only the researcher present and was split into two phases as follows: i) Phase A – Password Creation: Each pair of closely related participants (e.g., friends, couples, colleagues, etc.) visited the laboratory at a pre-scheduled time in line with the Covid-19 safety regulations. First, the eye tracker was calibrated, and then each participant was independently requested to create a picture password by drawing 3 gestures on the picture (any combination of taps, lines, circles) in order to access an online service. To avoid bias effects during Phase B (Human Guessing Attack), each participant created a password on a different picture that depicted a place in which the pair shares common experiences; ii) Phase B – Human Guessing Attack: We swapped the pictures within each pair, and each participant was requested to guess the other participant’s secrets by indicating 3 areas (i.e., 3 (x, y) segments on the grid) on the picture around which they believed the other participant had made their selections. We also adopted the think-aloud protocol to elicit whether the rationale behind the attacker’s selections was related to the memories and experiences shared with the other participant of the pair. Finally, both participants completed a questionnaire on demographics.

3.4 Analysis of Results

Visual Behavior Differences Between Legitimate Users and Insider Attackers During Login. To investigate our RQ, we ran a paired-samples t-test with the entropy from Eq. (3) as the dependent variable tested under two different conditions (i.e., during legitimate user login and during insider attacker login). The analysis revealed that insider attackers exhibited higher stationary entropy Hs (8.70 ± 2.02 bits) than legitimate users (1.55 ± 0.78 bits), a statistically significant difference of 7.15 ± 1.24 bits (95% CI, 3.35 to 10.94 bits), t(8) = 4.04, p = .001. Figure 2 shows the stationary entropy Hs of both legitimate users and insider attackers.
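For readers unfamiliar with the test, the paired-samples t statistic can be sketched in a few lines of Python. The function name and the entropy values below are hypothetical and are not the study's data; the sketch only shows how the statistic and its degrees of freedom are obtained from per-pair differences.

```python
import math
import statistics

def paired_t(condition_a, condition_b):
    """Return (t statistic, degrees of freedom) for paired samples."""
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample standard deviation).
    std_error = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / std_error, n - 1

# Hypothetical entropy values (bits), for illustration only.
attacker_entropy = [6.1, 9.4, 7.8, 10.2, 8.0]
legitimate_entropy = [1.2, 2.0, 1.1, 2.4, 1.5]
t_value, dof = paired_t(attacker_entropy, legitimate_entropy)
print(round(t_value, 2), dof)
```

The p-value then follows from the t distribution with the returned degrees of freedom; in practice, a library routine such as SciPy's `ttest_rel` performs both steps.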

Fig. 2. Stationary entropy Hs of both legitimate users and insider attackers.

Revealing the Insider Attacker’s Strategy When Guessing a Picture Password.

To get further insights about the approach followed by the insider attackers, at the end of Phase B (Human Guessing Attack) we asked each participant to show us the picture password selections they made on the screen, and we labelled them as either H (Hotspot), E (Experience spot; provided by the user), or O (Other; non-hotspot, non-experience spot). In order to understand the similarities in terms of areas correctly matched on the picture grid between legitimate users’ password selections and insider attackers’ guessing selections, we disregarded the order and the type of the gestures and rather focused on the positions of the password selections as follows: For circles, we disregarded the radius and the directionality, and kept only the center of the circle as a (x, y) segment, while for lines, we considered only the (x, y) segment of the starting point of the line. Table 1 summarizes the approach followed by the legitimate users and the areas correctly matched by the insider attackers.
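The position-only comparison described above can be sketched as follows. The dictionary-based gesture format and the function names are hypothetical, and the 3-segment matching radius is an assumption borrowed from the login tolerance; the text does not state which tolerance was applied when counting matched areas.

```python
import math

def gesture_position(gesture):
    """Reduce a gesture to a single (x, y) grid segment."""
    kind = gesture["type"]
    if kind == "tap":
        return gesture["point"]
    if kind == "circle":
        return gesture["center"]  # radius and directionality are ignored
    if kind == "line":
        return gesture["start"]   # only the starting point is kept
    raise ValueError(f"unknown gesture type: {kind}")

def matched_areas(password, guesses, radius=3):
    """Count password positions hit by at least one guess, ignoring order.

    Assumption: a guess matches when it lies within `radius` segments
    (Euclidean distance) of the password position.
    """
    positions = [gesture_position(g) for g in password]
    return sum(
        any(math.dist(pos, guess) <= radius for guess in guesses)
        for pos in positions
    )

# Hypothetical password and insider guesses, in grid segments.
password = [
    {"type": "tap", "point": (10, 10)},
    {"type": "circle", "center": (50, 20), "radius": 4, "clockwise": True},
    {"type": "line", "start": (80, 40), "end": (90, 45)},
]
guesses = [(11, 11), (52, 21), (0, 0)]
print(matched_areas(password, guesses))
```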

Table 1. Summary of the approach followed by the legitimate users and the areas correctly matched by the insider attackers. H denotes hotspot selection; E denotes experience spot selection; and O denotes other (non-hotspot, non-experience spot) selection. The areas matched by the insider attackers are highlighted in gray.

4 Conclusions and Future Work

In this work, we conducted a controlled in-lab eye-tracking user study focusing on human attack vulnerabilities among people sharing common sociocultural experiences within personalized picture password schemes. Results revealed that insider attackers who share common experiences with the legitimate users can easily identify regions of their selected secrets, as shown in Table 1. The extra knowledge possessed by people who are close to the legitimate user was also reflected in their visual behavior during the human guessing attack phase. In particular, we found that the insider attackers exhibited higher stationary entropy Hs than the legitimate users. As stated previously, greater values of Hs occur when visual attention is distributed more equally among AOIs, which might occur when insider attackers use extra knowledge to guess the user’s picture password, while lower values of Hs indicate that fixations tend to be concentrated on certain AOIs, which might occur when legitimate users, who know their passwords, fixate on certain AOIs.

Such findings can be used to estimate the legitimacy of the user authenticating in a personalized picture password scheme that leverages users’ prior sociocultural activities, experiences and explicit memories, and to drive the design of assistive security mechanisms. We envision that such visual behavior differences in personalized picture password schemes can be used to create multi-class classifiers for predicting the legitimacy of the individual during authentication (i.e., legitimate user, insider attacker, other attacker). Such a classifier could notify legitimate users about the type of attacker attempting to log in to their account, as well as adjust the account lockout threshold accordingly (e.g., apply a stricter policy in cases of insider attackers). Future work will consider the feasibility of building such a multi-class classifier for predicting the legitimacy of the user authenticating, and will include additional user studies to triangulate findings with diverse user communities and sociocultural experiences.