
1 Introduction

Humans are often in complex, attention-demanding situations that require them to process information from multiple sources at once. In an interface such as an airplane cockpit, many different information sources are present in the form of instrument displays spatially distributed in front of the pilot. Many other such examples exist, from air traffic control to remote monitoring of autonomous vehicles in case of a required emergency intervention [16]. Humans are limited in their attentional capacity and thus sample parts of their environment sequentially over time [11]. When humans ‘fail to notice’, it is because of sub-optimal sampling. High information flow due to the number of displays, rapid information change within displays, and dependencies between the information shown on different displays challenge human attention limits [43]. This may lead to poor decisions with serious consequences.

Incorporating attention guidance into a complex multi-task interface is not straightforward, from both a conceptual and an implementation perspective. Existing approaches for building such systems often focus on the conceptual aspects of attention guidance, e.g. [15, 45], but little attention has been paid to conceptual frameworks that also have a systematic implementation. Other approaches, e.g. [50], use agents as cognitive assistants that perform autonomous situation assessment and take into account the limitations of human information processing. Still, an important aspect remains open: how can we build an agent system that considers how to convey information to users about ongoing operations and environmental parameters within their attentional limits?

The aim of this work is to show how to rethink attention guidance in multi-task interfaces using cognitive agents [7] that perceive where a user looks, and to formulate interaction with display objects as events happening in an agent environment [8]. By observing the state of the various interface objects and incoming user input data (including eye tracking data) and aggregating it to form beliefs, we conjecture that cognitive agents will be able to provide useful guidance to a user while taking into account important considerations relating to their attention. The system thus both measures the current location of attention (based on eye tracking data) and alters attention by guiding the gaze to tasks requiring input. Our specific objective is to exemplify the framework by developing the methods for attention guidance in the MATBII cockpit task simulator [59], showing how to organise guidance for a concrete application. We also wish to demonstrate how the modularity of an agent-based approach eases the process of experimentation and provides some unique benefits for creating a system that is extensible and reproducible.

The contribution of our work is to provide a practical system that makes use of gaze location (a proxy for spatial attention), allowing agents to use this information to help users allocate their limited cognitive resources. To this end we reproduce and improve some aspects of MATBII [59], producing our own simple simulation of a cockpit-based task space. The resulting system, which we refer to as ICU, allows display changes and eye movements to be monitored externally via an event-based API, making it suitable for experimental settings beyond this work. We have also embedded ICU in a resource-light agent environment, which re-implements a single GOLEM container [8] in full. This supports real-time attention guidance mechanisms that use cognitive agents to monitor where a user looks, and it can support attention guidance in other domains, provided they offer an ICU-like API.

The work is structured as follows. We begin by outlining related work on describing and measuring attention, its limitations, cognitive workload, and the use of agents as assistants, and by assessing to what extent MATBII has proved useful as an example task space. In Sect. 3 we present ICU, an open source Python implementation of the MATBII task space [59], which functions independently of our agent system. We then describe the agent system ICUa (ICU with agents) that we have developed as an experimental platform for bottom-up attention guidance. In Sect. 4 we test the system by simulating and exploring some simple potential human behaviours. Finally, in Sect. 5 we discuss the potential of our system, the ability of agents to monitor the environment and user (via eye tracking) to provide useful attention guidance, and its suitability for future human testing and use in further applications.

2 Related Work

2.1 Workload, Eye-Tracking and Attention Guidance

The concept of workload and the demand on the limited attention of the human operator is important in human factors. Mental workload describes the demands on attention made by a cognitive task [42]. Behavioural and physiological measures are often used to try to classify situations as eliciting low or high mental workload, e.g. [5, 17, 32, 66]. A high mental workload has the effect of decreasing performance and increasing stress [42]. Often the aim of classifying high/low mental workload is to arrive at a solution aimed at alleviating conditions when high workload is detected, in the form of automation that can be introduced to aid the human operator. However, since the earliest introduction of automation, it has been suggested that in many situations it is important to keep the human in the loop even when a task has been devolved to an agent. The evidence suggests that at most levels of automation it is important that the human operator is kept engaged whenever a user response might be required, e.g. in the case of automation failure [21]. Thus even highly automated systems may need to consider how to convey information in such a way that the user is able to react - the basis of this work.

Multiple ongoing tasks lead to divided attention, which is particularly detrimental to performance [42]. There is a general trade-off between the need for selective attention to solve a given task and the need to detect other tasks that may require attention. In divided attention conditions with complex tasks, the phenomenon of cognitive tunnelling is often observed [41]. In this case if a user is focused on solving a particular task, even salient cues can be missed.

Warning lights and alerts are used in interfaces to capture the attention of the user, but this may lead to a situation where several alerts are activated at once leading to ‘misplaced saliency’ [6]. In this case the attempt to make an area stand out more has in fact the opposite effect by highlighting several areas and thus further overloading the human, as they have to decide which to attend to first. Additionally, overuse of alerting can lead to ‘automation disuse’ where the user comes to ignore the help that is being offered, seeing it as a nuisance [68]. These aspects of attention are key to understanding how to improve situation awareness (SA). SA describes a person’s awareness of relevant aspects of their environment, the comprehension of these aspects, and predictions of what these will mean in the future [20]. Lacking situation awareness is one of the main causes of accidents attributed to human error [61].

In the context of describing how human attention is allocated over multiple displays, it is important to note that the spatial layout of these displays plays a part in how the human user represents them [68]. Spatial memory is a key component in monitoring the work space; spatially reorganising parts of the display has been found to be detrimental to performance [25]. Hence, it is generally preferable for an attention guidance system to maintain the layout of the interface to allow a spatial representation to form.

Eye tracking has proved an invaluable tool in attempting to measure workload, through indicators such as changes in pupil size or the duration of each fixation [40]. The spatial specificity of eye tracking has also led to it being developed as a tool for interaction [39]. A recent example uses eye tracking information to ascertain which screen the user is currently looking at in order to guide them to another screen in multi-monitor displays [63]. We propose that the spatial specificity of eye tracking could be used for more localised guidance.

2.2 Gaze Contingent Attention Guidance

There have been many proposals over the years on how to design ‘attention aware systems’ [54]. Concepts such as gaze-based notifications have been introduced and evaluated according to their ‘noticeability’ vs ‘distractiveness’ [33]. A great deal of work has been done on gaze contingent attention guidance in the field of education and training, where the learner’s gaze is directed in an attempt to ensure optimal learning [56]. This is done by using online eye tracking to detect when the learner is focusing on the wrong information and using changes in the display to guide their attention - the same principles we intend to use in this work. A recent system for air traffic guidance makes use of online eye tracking to monitor the user’s attention and direct it according to a simple logic that decides where the user should be looking [48]. This very specific implementation, with a control system tailored to air traffic control, uses peripheral and central cues to guide attention to the necessary parts of the scene. Initial tests with five users suggested some improvement in perceived workload, although clear performance metrics relative to a baseline were not presented in this preliminary work. Earlier work [52] directs the user’s attention to target locations using a moving dot. That work does not consider rules for guiding attention, and the eye tracking and performance results are again not compared to a baseline. However, users reported positively on their interaction with the system, suggesting that this type of display has potential.

2.3 Agents

Human-computer environments where software agents act on behalf of a user are not a new idea e.g. [38], nor is automating tasks to reduce demands on human attention, e.g. [39]. Often agent capabilities have also been developed to predict intention or task state from behaviour i.e. overt responses and interactions, to provide assistance e.g. [50, 57], and although eye-tracking agent assistants have been introduced, they still remain to be fully tested [65]. Adaptive interfaces have also been developed to use human physiological markers, such as heart rate and eye blinks to dynamically distribute tasks between agents and humans [28], but access to their corresponding test-beds is not available.

Cognitive assistants often use agent models that internalise perceptions as beliefs about the environment’s state and perform actions to produce results (e.g. [37]), or use the BDI model (e.g. see [60]) based on intentions for goals the agent can plan for. Goal reasoning [2] allows goals to be achieved or maintained, including external goals specified by user guidelines and norms [58]. Agent decisions are modelled with preferences over planned goals, using logic if there is certainty (e.g. [31]) or probabilities if there is uncertainty (e.g. [22]). Agent decisions may be explained (e.g. [44]) to build trust with the user - key to successfully working with a human [23]. However, many cognitive agent models and their implementation platforms (see [10, 36]) are resource-heavy for real-time applications as demanding as eye-tracking. Although light-weight versions exist, they are still at a prototypical stage [3]. In addition, the benefit of cognitive assistants for human performance has yet to be thoroughly evaluated experimentally in terms of assessing objective measures of performance compared to a baseline - most evaluations rely on user questionnaire data reflecting subjective experiences [50].

To address some of the above limitations, our work is intended as a resource-light test-bed that combines agent environments and a teleo-reactive (TR) agent model [47] to support experiments for attention guidance applications where eye-tracking is a key requirement. TR agent models (e.g. [34]) and implementations (e.g. [13]) exist, and their link with models such as BDI have been studied (e.g. [14]). However, our work is the first to apply a resource-light TR model for attention guidance applications developed as agent environments.

2.4 MATBII as a Use-Case

MATBII [59] is widely used in the human factors literature as a multi-tasking space. It comprises clearly defined, spatially separated sub-tasks often requiring rapid switching of attention. Difficulty is understood in terms of how often each sub-task needs attention; thus MATBII is often used to investigate low and high workload by changing the level of task difficulty, e.g. [24]. As shown in Fig. 1, the sub-tasks consist of a system monitoring task, checking for changes in the colours of lights or positions of scales that require a mouse click response to return them to the correct state; a tracking task that requires keeping a target within a set of crosshairs; and a resource management task that requires manipulating pumps to keep fuel tanks at the right level. The pumps in the resource management task can be set to fail for a set amount of time. Pump failure is shown by a change in colour, and the fuel level going out of range is also indicated by a change in colour. MATBII is set up in such a way that under high frequency conditions the probability of ‘misplaced salience’ is high. A further important observation from response patterns on MATBII is the presence of ‘cognitive tunnelling’ as described above, manifesting itself as the inability to switch from one sub-task to another [24]. This provides us with multitasking situations where it is objectively clear at any point what the user needs to look at.

Fig. 1. The MATBII system with sub-tasks labelled. (Color figure online)

There is not a great deal of literature on eye tracking users in MATBII [59]. Nelson et al. [46] report the percentage of time spent fixating on each task, Kim et al. [32] report changes in pupil size with increasing workload, and Berthelot et al. [5] extract a property called ‘self affinity’ from eye movement statistics. Much remains to be explored in the spatial pattern of eye movements whilst completing the task. For instance, the effects of misplaced salience and cognitive tunnelling have only been inferred from behaviour; it would be useful to see these effects in more detail by measuring the spatial allocation of attention, which our proposed system allows for while at the same time using this information to guide attention.

3 Integrated Cognitive User Assistance

3.1 ICU

Although an open source Python implementation of MATBII with eye-tracking (and further) options has recently been released [12], we found it better, for our purpose of combining the interface with an agent architecture, to develop our own version of MATBII. We have opted to implement a stripped-down version of MATBII, essentially the same in functionality, using just a subset of the tasks but with some functional improvements that we feel are essential for experimentation. We call this system the Integrated Cognitive User (ICU), which forms the interface part of the complete ICUa - ICU with agents. Our system brings new scope for experiments in human factors research owing to more flexible manipulation of the task space and the ability to collect eye tracking data easily and to interface with the system in real time, and it also enables the work presented here.

ICU has a bi-directional event API that may be used to interface with external programs and can be used in a number of ways, including monitoring the system in real time; for us its main purpose is to facilitate interaction with our agent system. We have also tried to provide an improved configuration format, which can be used to quickly configure experiments by specifying event schedules concisely, and to change aspects of the interface and task behaviours. Moreover, the system has built-in support for various kinds of user input, from standard input (e.g. keyboard/mouse) to eye tracking devices, and could easily be extended to incorporate devices providing further physiological measures such as EEG or galvanic skin response. Devices are treated as part of the event system; device input is therefore exposed through the event API.
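
As a rough illustration of how an external program might consume and emit events through such an API, consider the sketch below. The class and method names (Event, EventBus, subscribe, publish, the ‘EyeTracker’ and ‘WarningLight’ sources) are hypothetical and are not taken from the actual ICU code; this is a minimal sketch of the general pattern, not the ICU API itself.

```python
# Hypothetical sketch of an external program consuming ICU-style events.
# Names are illustrative only and do not reflect the real ICU API.
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Event:
    source: str   # e.g. "WarningLight:0", "EyeTracker", "Pump:AB" (illustrative)
    name: str     # e.g. "switch", "gaze", "fail"
    data: dict    # event payload (coordinates, new state, ...)


class EventBus:
    """Minimal bi-directional hub: external code can both subscribe to
    interface events and publish actions back to the interface."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[Event], None]]] = {}

    def subscribe(self, source: str, callback: Callable[[Event], None]) -> None:
        self._subscribers.setdefault(source, []).append(callback)

    def publish(self, event: Event) -> None:
        for callback in self._subscribers.get(event.source, []):
            callback(event)


# Example: log every gaze sample and every warning-light change.
bus = EventBus()
bus.subscribe("EyeTracker", lambda e: print("gaze at", e.data.get("x"), e.data.get("y")))
bus.subscribe("WarningLight:0", lambda e: print("light event:", e.name))

bus.publish(Event("EyeTracker", "gaze", {"x": 512, "y": 300}))
```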

In terms of functionality, ICU reproduces the ‘system monitoring’, ‘tracking’, and ‘resource management’ tasks from MATBII [59] using Python 3, see Fig. 1. These tasks function similarly to those described in detail in [59]. Briefly, the system monitoring task involves responding when a green light switches off or a red light switches on; lights switch on/off according to a schedule, and a mouse click is required to reset them to the correct state. It also includes a set of scales that change over time, need to be kept as close to the mid-point as possible, and can be reset to the mid-point by clicking on the scale level. The tracking task uses a joystick or keyboard presses to keep a randomly drifting target centred; the extent of the drift is configurable. The resource management task requires the user to switch pumps on and off to maintain the top two fuel tanks at the correct level; the pumps fail at certain times, making them unusable. Pump transfer rates, tank capacity, burn rate, and the frequency and duration of each pump failure, among other things, can be configured.

To support eye-tracking, ICU provides a wrapper around the PsychoPy library [51], which enables any eye-tracker supported by the library to be used with ICU (we assume that the eye-tracker is already calibrated). The system was tested using a USB screen-based Tobii X2-30 eye-tracker, sampling at 30 Hz on average. Raw gaze coordinates are filtered using an I-VT filter with a standard moving average as specified in [49]; coordinates are classed as fixations (eyes are stationary) or saccades (eyes are moving and thus unable to take in information).
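
The sketch below illustrates the general velocity-threshold idea behind such filtering, assuming gaze samples are given as (time, x, y) tuples. It works in pixels and omits the visual-angle computation, gap fill-in and fixation merging of the full I-VT specification in [49], so it should be read as a simplified illustration rather than our actual filter.

```python
# Simplified velocity-threshold (I-VT style) classification of gaze samples.
# Illustrative sketch only: the filter in [49] operates on visual angle and
# includes further steps; here we smooth with a moving average and threshold
# the pixel velocity.
import math
from typing import List, Tuple

def moving_average(samples: List[Tuple[float, float, float]], window: int = 3):
    """samples: list of (t, x, y); returns smoothed (t, x, y) tuples."""
    smoothed = []
    for i in range(len(samples)):
        lo = max(0, i - window + 1)
        xs = [s[1] for s in samples[lo:i + 1]]
        ys = [s[2] for s in samples[lo:i + 1]]
        smoothed.append((samples[i][0], sum(xs) / len(xs), sum(ys) / len(ys)))
    return smoothed

def classify_ivt(samples, velocity_threshold_px: float = 500.0):
    """Label each sample 'fixation' (slow) or 'saccade' (fast)."""
    smoothed = moving_average(samples)
    labels = ["fixation"]  # first sample has no velocity estimate
    for (t0, x0, y0), (t1, x1, y1) in zip(smoothed, smoothed[1:]):
        dt = max(t1 - t0, 1e-6)
        velocity = math.hypot(x1 - x0, y1 - y0) / dt   # px per second
        labels.append("saccade" if velocity > velocity_threshold_px else "fixation")
    return labels

# 30 Hz samples: a stationary fixation followed by a rapid gaze shift.
gaze = [(i / 30, 100.0, 100.0) for i in range(5)] + [(5 / 30, 400.0, 350.0)]
print(classify_ivt(gaze))   # ['fixation', ..., 'saccade']
```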

3.2 ICUa: ICU with Agents

Previous work demonstrates the effectiveness of software agents for monitoring practical applications, e.g. see [9, 35, 55, 67]. Here we extend these works conceptually, by introducing an agent environment that contains ICU as an internal object, where different agents can monitor the state of ICU (including information provided by an eye tracker) and perform actions on it to highlight parts of the screen for the user’s benefit. Although our framework is demonstrated with ICU it is not specific to it, as ICU is used here more as an example to integrate any suitable multi-task interface.

An agent environments approach has some significant benefits from a software engineering perspective, especially modularity, which allows us to develop and swap out different objects and agent behaviours easily for experiments. Additionally, an agent-based approach leaves room for expanding the scope for more complex environments by relying on multi-agent communication and coordination models, and as a way of integrating complex cognitive capabilities for guidance e.g. reinforcement learning [4].

Fig. 2. ICUa reference architecture in PyStarWorlds showing the four agents deployed. We assign one agent to each of the first three application simulator tasks: system monitoring, resource monitoring and tracking. These agents subscribe to task-specific events, enabling them to perceive relevant information about the simulator’s current state, including eye-tracking information about saccades or fixations, and communications from other agents in the system. The agents’ actions have the effect of modifying the application interface, i.e. drawing an overlay. We consider actions with two kinds of feedback: (a) highlighting a particular sub-task and (b) drawing an arrow at the current gaze location that points in the direction of a component that needs urgent attention. The fourth agent, the evaluator, monitors the user’s performance using specific performance metrics.

ICUa is ICU extended with agents implemented in PyStarWorlds [1], an agent environment library that supports Python agent applications. The reference architecture of ICUa, shown in Fig. 2, is based on a specialised single-container version of the GOLEM framework described in [7], which is implemented as an event-processing system under a publish/subscribe model [8]. ICU is internalised as an environment object via the API it exposes, so that its state can be perceived and acted upon by agents. Agents have a mind and a body [62]: the mind controls the agent’s behaviour, while the body relies on sensors and actuators to situate the agent in the application environment. Agents perceive events with their sensors, make decisions with their mind and attempt actions with their actuators. A type-based publish/subscribe mechanism, which is known to be scalable, routes events to/from the sensors/actuators [8]. The environment provides a Physics module containing action execution rules where the semantics of each action are defined. We assume that agents are aware a priori of action preconditions/effects and so are able to decide which actions should be taken. ICUa is agnostic as to which agent model is to be used; different models can be adopted depending on the application domain.
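
The following sketch illustrates the general idea of type-based routing between environment events and an agent’s sensors and actuators. All class names are hypothetical and do not correspond to the PyStarWorlds API; it is only a minimal sketch of the publish/subscribe pattern described above.

```python
# Illustrative sketch of type-based publish/subscribe routing between the
# environment and an agent's sensors/actuators. Hypothetical names only;
# this is not the PyStarWorlds API.

class GazeEvent: ...
class PumpFailEvent: ...
class HighlightAction: ...

class Sensor:
    """Receives only the event types it subscribes to."""
    subscribe_to = (GazeEvent, PumpFailEvent)
    def __init__(self):
        self.buffer = []
    def notify(self, event):
        self.buffer.append(event)

class Physics:
    """Holds action execution rules defining the semantics of each action."""
    def execute(self, action):
        if isinstance(action, HighlightAction):
            pass  # e.g. forward a 'highlight' event to the interface overlay

class Actuator:
    """Publishes action attempts back into the environment."""
    def __init__(self, physics: Physics):
        self.physics = physics
    def attempt(self, action):
        self.physics.execute(action)

def route(event, sensors):
    """Type-based routing: deliver the event only to subscribed sensors."""
    for sensor in sensors:
        if isinstance(event, sensor.subscribe_to):
            sensor.notify(event)
```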

For this domain, agents and their behaviours are specified in Python using condition-action rules following the teleo-reactive (TR) execution model [47] for goal-directed behaviours (e.g. [18]). We assume a fixed perceive-revise-decide-attempt control cycle [30] that allows an agent to perceive the latest environment changes via its sensors, revise its internal state modelling the environment (or belief store), then decide what action(s) to take, and finally attempt these actions using the agent’s actuators. In this setting, the TR model helps us structure the behaviours of the agent within the decide part of the agent’s cycle, according to the goals the agent seeks to achieve. These behaviours are specified as a set of condition/action rules of the form:

$$\begin{aligned} G : \{ C_1 \rightarrow A_1; C_2 \rightarrow A_2;\ldots ; C_i \rightarrow A_i;\ldots ; C_n \rightarrow A_n \} \end{aligned}$$

where G is a goal, \(C_i\) a condition over internal variables (beliefs), and \(A_i\) is either a primitive action, or a sub-goal (giving rise to a sub-behaviour) that can itself be a TR program of the form:

$$\begin{aligned} A_i : \{ C_{i,1} \rightarrow A_{i,1}; C_{i,2} \rightarrow A_{i,2};\ldots ; C_{i,m} \rightarrow A_{i,m} \}. \end{aligned}$$

This gives rise to a significant simplification of a BDI-style planning layer that manipulates a plan library in which plans are comprised of hierarchical, suspendable and recoverable teleo-reactive programs [14]. The top-level goal G for the agent is triggered inside the decide part of the agent’s cycle. The list of rules is scanned top-down for the first rule whose condition is satisfied, to select an intention, and the corresponding action is attempted. It is important to note that the conditions are continuously re-evaluated at each cycle step, so that when the first true condition changes due to a belief update, the intention changes accordingly. In other words, an action/sub-goal is revised only when its condition in the agent’s internal state ceases to be true.

It is straightforward to create a subset of the TR paradigm for developing agent behaviours using Python, or a similar programming language. Assuming round-robin execution of each agent’s control cycle, there is a natural correspondence between TR programs and most programming languages, as shown in Fig. 3. An example of a simple monitoring behaviour that follows this model is given in Fig. 4(a). This kind of programming is quite flexible and can support more complex behaviours. For example, in principle an agent may monitor multiple parts of a screen (e.g. multiple pumps for the resource management task) and may attempt multiple actions in a single cycle (e.g. to highlight multiple pumps). As a result, the top-level goal in such cases needs to operate on sets of actions, simulating parallel execution of independent monitoring behaviours, each with the form of a TR program, as for example in the interpretation used by [13], but in our case using PyStarWorlds. An example is given in Fig. 4(b).

Fig. 3. Mapping of a simple version of TR rules in Python, which in PyStarWorlds are evaluated continuously. Sub-goals are method calls. C_n = True forces the last rule to always succeed if all other rules above fail to trigger.
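
As an illustration of the pattern described in this caption, the following is our own sketch of such a mapping, with hypothetical condition and action names; it shows the top-down scan with a final always-true default, not the actual code in the figure.

```python
# Minimal sketch of a TR-to-Python mapping (illustrative names only).
# Conditions are evaluated top-down on every cycle; the first true
# condition fires; the final condition is the constant True so that
# some rule always applies.

class TRBehaviour:
    def __init__(self, beliefs: dict):
        self.beliefs = beliefs          # agent's internal state (belief store)

    # Top-level goal G: { C1 -> A1; C2 -> A2; True -> A_default }
    def decide(self):
        if self.c1():
            return self.a1()
        if self.c2():
            return self.a2()            # a sub-goal is just another method call
        return self.a_default()         # C_n = True: always succeeds

    # Conditions over beliefs
    def c1(self):
        return self.beliefs.get("task_ignored", False)
    def c2(self):
        return self.beliefs.get("gaze_on_task", False)

    # Actions / sub-goals
    def a1(self):
        return "highlight_task"
    def a2(self):
        return "clear_highlight"
    def a_default(self):
        return None                     # no action this cycle
```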

Fig. 4. Agent monitoring examples in Python TR style. In (a) we show a simple single-action monitoring behaviour that highlights a component (part of a task) if needed. In (b), we operate over sets of actions. This enables the agent to highlight many components if necessary. In practice we limit agents to highlighting a single component (to avoid overloading the user); however, the parallel execution of behaviours is useful for our simulated human users outlined in Sect. 4.

Using the above architecture we implement a few simple rules for our agents to adhere to. The agents’ goal is to shift the attention of the user to a sub-task that requires action if the user appears to have ignored it. Each agent has a built-in grace period, which determines when the sub-task is deemed to have been ignored. If in this time the required action has not taken place and, importantly, the gaze has not moved to the sub-task, then the agent responsible for the sub-task displays a relevant highlight. A highlight can be configured as an outline of a sub-task, a transparent overlay, an arrow at the current fixation point, or a combination of these. This involves agents checking the current gaze fixation position and whether it is in the required sub-task’s region of interest. Thus, we ensure that guidance is not displayed unnecessarily if attention has been transferred but an action not yet produced. The agent also checks that no other guidance is being displayed at the time, as the aim is not to introduce a divided attention condition, so only one agent will be displaying guidance at any given time. If the requirements are met, the agent will display guidance and this will remain on display until the gaze position moves to the required sub-task or the task is resolved. Again, once gaze has moved we take this as an indicator that the task will be responded to as required. However, if the user moves their gaze away whilst the task still requires attention, it will again become highlighted after a second grace period if the gaze has not returned. These simple rules are designed to move the user’s attention on from a cognitive tunnelling situation with minimal unnecessary competing visual additions to the display. We do not assign differential importance to any of the sub-tasks, but such a hierarchy could easily be implemented in future.
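
A hedged sketch of this decision logic is given below; the field and function names are ours, introduced for illustration, and are not taken from the ICUa source. The decision would run once per agent cycle.

```python
# Illustrative sketch of the guidance rule described above (hypothetical names).

GRACE_PERIOD = 2.0   # seconds before an unattended sub-task is deemed ignored

def gaze_in(task, gaze):
    """True if the fixation point lies inside the sub-task region of interest."""
    x0, y0, x1, y1 = task.region
    return x0 <= gaze[0] <= x1 and y0 <= gaze[1] <= y1

def decide_guidance(task, gaze, other_guidance_active, now):
    """
    task: object with fields .needs_action, .ignored_since, .highlighted, .region
    gaze: current fixation point (x, y)
    """
    if task.highlighted:
        # Remove guidance once attention has arrived or the task is resolved.
        if gaze_in(task, gaze) or not task.needs_action:
            return "clear_highlight"
        return None                       # keep the existing highlight
    if not task.needs_action:
        return None                       # nothing to guide towards
    if gaze_in(task, gaze):
        return None                       # user is already attending
    if other_guidance_active:
        return None                       # only one highlight at a time
    if now - task.ignored_since >= GRACE_PERIOD:
        return "show_highlight"           # outline, overlay and/or arrow
    return None
```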

4 Simulating Simple Examples of Human Behaviour

Fig. 5. Graphs (a)–(e) show the error of three simulated users according to the evaluation metrics given in Sect. 4.1; lower error means better performance. The ‘stay’ user ignores the agents’ advice until the current task is complete (there is no more action to be taken for the moment). The ‘follow’ user always takes the agents’ advice and moves to solve the recommended task (see Fig. 6 for details). Error is shown as a function of the “delay” period introduced to each decision, a restriction on the user’s ability. There is no delay parameter for the ideal user; we mark this minimal possible error as a horizontal line on the graphs for comparison. All results are averaged over 10 runs of 1 min each and normalised to the [0–1] interval. The shaded regions show 95% confidence intervals. Graph (f) shows the error for the ‘tunnelling’ user at two difficulty levels (medium and hard) on the resource management task. The difficulty is set in the configuration file by specifying different event frequencies. For all simulations, grace periods until a highlight is displayed are set to 2 s.

To demonstrate the flexibility of our agent system and provide some insight into how the system might perform with different user behaviours, we have implemented and evaluated four different kinds of ‘user’ agents, see Fig. 5. Each ‘user’ agent directly observes events from the ICU system and is able to provide simulated user input, e.g. mouse clicks, key input and eye movements. This setup also provides a basis for future researchers wishing to simulate more complex human behaviour.

The ideal user reacts immediately to MATBII events and is not constrained by any input delay, e.g. eye movement speed or response delay; it can simultaneously observe and react to changes in all tasks. This ideal agent is used as a baseline and achieves the highest possible performance (i.e. the lowest error rate with no need for guidance).

We also model a worst case scenario with a user agent that never moves their eyes from a sub-task (which in this case is the resource management task) and only makes responses to that sub-task, a case of cognitive tunnelling (‘tunnelling’).

The final two users are imperfect, in that they only respond to a sub-task once guidance is provided and they require some time to act. They will only attempt to solve a sub-task while fixating on its area, and their responses (including eye movement) take time. In our experiments an agent moves its eyes at a constant speed of 1000 px/s, mimicking the rapid saccades made by humans in between fixations, which we model here as the gaze remaining static in a given location. The relevant part of the two behaviours is presented in Fig. 6.

Fig. 6. Two excerpts from the simulated imperfect users showing the key difference in their behaviour. (a) The ‘follow’ user’s gaze always follows guidance when present, causing it to abandon its current, possibly unresolved task and move to another. (b) The ‘stay’ user remains focused on a task until no further action can be taken to resolve it, then follows guidance to reach the next task.

The two follow/stay behaviours are set up to correspond to two extremes of behaviour; we expect human behaviour to lie somewhere in between. With both, we vary the delay with which they are able to respond. With larger delay times, it is more likely that the ‘follow’ user will abandon a sub-task before it has been solved, while with the ‘stay’ user, other tasks will remain unsolved for a longer period.
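
The following sketch paraphrases the key difference between the two behaviours; it is our own illustration with hypothetical names, rather than the excerpts shown in Fig. 6. Both users respond only after an input delay and only to the sub-task they are fixating.

```python
# Illustrative sketch of the follow/stay difference (hypothetical names).

def next_target_follow(current_task, highlighted_task):
    """'follow' user: guidance always wins, even mid-task."""
    if highlighted_task is not None:
        return highlighted_task
    return current_task

def next_target_stay(current_task, highlighted_task):
    """'stay' user: finish the current task first, then follow guidance."""
    if current_task is not None and current_task.needs_action:
        return current_task
    return highlighted_task
```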

4.1 Evaluation Metrics

To measure the performance of each user agent we use the following metrics, which are normalised over time and averaged over components where applicable. The metrics represent the user’s error when solving tasks, with 0 being a perfect score for each metric; a sketch of how such metrics might be computed is given after the list.

  • Time that main fuel tanks are out of the acceptable range in the resource management task.

  • Deviation from the acceptable state of the scales (\(\mathcal{L}_1/\text{time}\)) in the system monitoring task.

  • Time that warning lights are in an incorrect state in the system monitoring task.

  • Deviation from the central acceptable box (\(\mathcal{L}_\infty/\text{time}\)) in the tracking task.

  • Time at least one warning (highlight or otherwise) is displayed on the overlay.
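
The sketch below illustrates how such time-normalised error metrics might be computed from sampled task state. It is our own illustration with hypothetical function names, not the ICUa evaluation code.

```python
# Illustrative sketch of time-normalised error metrics (hypothetical names).

def out_of_range_error(samples, dt, low, high):
    """Fraction of total time a value (e.g. fuel level) is outside [low, high]."""
    bad = sum(1 for v in samples if v < low or v > high)
    return bad * dt / (len(samples) * dt)

def l1_deviation_error(samples, dt, target, max_dev):
    """L1 deviation from the target per unit time, normalised to [0, 1]."""
    total = sum(min(abs(v - target), max_dev) for v in samples) * dt
    return total / (len(samples) * dt * max_dev)

# e.g. scale positions sampled every 100 ms, target mid-point 5, max deviation 5
print(l1_deviation_error([5, 6, 7, 5, 4], dt=0.1, target=5, max_dev=5))
```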

4.2 Simulation Results

The errors calculated using our evaluation metrics are shown in Fig. 5. The ideal user (see Fig. 5) has minimal error; any small error that exists is a result of the slight processing delay due to simulation speed (100 ms per agent cycle). If we look at the performance measure associated with the length of time that highlights are displayed, we see that this is zero for the ideal user, reflecting that our system does not display highlights when they are not required.

At the other extreme, our ‘worst case’ tunnelling user (Fig. 5) provides an upper bound on error for tasks other than the resource management task they are focusing on. The on/off nature of the warning lights is reflected in the constant maximum error across delays; the scales and tracking are more variable in their error, as it is possible for them to return within acceptable parameters at random. For this user, after the first grace period a highlight will always be displayed. They of course perform best on the resource management task, as this is the task of focus, and we can see the effect of response delay making their performance on it worse.

In the case of our imperfect users who are guided by the highlights (the follow and stay users in Fig. 5), but do not respond otherwise, their performance lies somewhere between the two more extreme ideal and tunnelling users, as expected. This reflects that a human who makes use of the highlights to guide them is of course not a perfect user, but is likely to perform better than a user who completely ignores the need for a response. With these users we can also see the effects of a delay in the response: as the response slows, the error generally increases. The advantage of staying on a task (‘stay’ user, Fig. 5) until it is ‘solved’ varies to some extent with the delay of the response. The warning lights and scales parts of the system monitoring task suggest an initial small advantage for always following the highlighting, which then disappears with delay. The more continuous nature of the tracking task produces a different pattern, with an apparent small initial advantage for the user who remains on task until it is solved, but with increased delay the user that follows the highlighting performs better. The highlighting metric reflects these two behaviours in that the user that follows the highlighting has fewer highlights on screen over time. Of course, we also see that following highlights will reduce performance relative to the tunnelling approach on the single task chosen to focus on (resource management).

In Fig. 5 (f) we illustrate how task difficulty can be manipulated in our system by changing the frequency of events. We show the results for our worst case ‘tunnelling’ user on the resource management task. With a short delay in the user response there is a relatively small difference in performance between low frequency and high frequency events, as the user is able to respond quickly enough to resolve the high frequency events. With increased response delay the user performs worse in each case, with consistently higher error for the more difficult condition.

Our simulations have shown how attention guidance may in principle improve performance for imperfect users in cases where users shift their attention immediately, compared to when they are unable to shift their attention due to cognitive tunnelling. Our guidance agents’ behaviour has been tailored in an attempt to provide the most useful feedback and avoid overloading the user. We have tested only one class of guidance behaviour, based on the principles outlined in Sect. 2, which works as a proof of concept. Our simulations also allowed us to visualise the effect of increasing task difficulty by increasing event frequency and how this depends on the delay in the user action.

In addition to our simulated user tests, we have tested the capacity of the system and found that ICUa was able to deal with up to a million events per second without raising any performance issues (for reference, the event load under normal operation does not exceed one thousand events per second with a high-throughput eye tracking device).

5 Discussion

We have successfully demonstrated, by modifying the widely used MATBII cockpit task simulator, how an information display and interface system can be monitored by agents and how a human user may be incorporated into the agent environment by monitoring their eye movements and responses. Our agents have been deployed to enact simple attention guidance in a simulated setting. We have demonstrated this important test case as a proof of concept of the architecture of such a system, in a simple task space known to replicate some of the problems that have been found with user inattention.

5.1 Simulation Summary

We used our agent system to build ‘user agents’ that synthetically simulate some simplified examples of human behaviour. This enabled us to demonstrate the behaviour of the system by summarising error patterns under different conditions. We can conclude that the system works as expected, and that following the guidance reduces the error compared to a worst case scenario where a user is only paying attention to a single task. The different simulated users showed different patterns across delay: changing the rules we implemented for following the highlights resulted in different performance error patterns. We also demonstrated how the configuration can be set to manipulate the difficulty of a task, with a resulting change in performance.

The user agents are designed to demonstrate only upper and lower bound performance, and the effect of following the attention guidance on performance. We expect human behaviour to be some combination of our simulated users. Under certain conditions humans will be able to respond, with some delay, to a sub-task that requires a response; sometimes they will only respond when there is a highlight; and sometimes the highlight may cause them to move their attention before they have solved a sub-task. This system is now suitable for experiments with human users to explore these scenarios and to ascertain optimal rules for the agents. For instance, the simulations suggest that highlights may not always be advantageous if the user follows them before they have solved the sub-task they are currently focusing on.

The inclusion of user agents opens up our system to further simulations using more complex examples of human behaviours that may occur under different conditions and to test how ideal display rules may vary with different examples of human behaviour.

5.2 Future Experiments with Human Users

ICUa runs on a desktop PC with an eye tracker attached and can record the performance of human participants under different specified conditions. A first step would be to test the current system with the existing simple rules and assumptions to determine if it is effective at guiding attention and thus improving performance in humans. From the simulations it is already evident that there will be certain conditions under which attention guidance is particularly useful. In a low workload condition user guidance may have less impact as there are fewer demands on attention, although studies also show negative effects of low expectancy - very infrequent, unexpected events can also be missed, especially if there are other tasks that require constant monitoring [68]. If events are happening too quickly for human users to successfully deal with them, there may be a floor effect where attention guidance no longer helps (as seen at the longer delay times in our simulations).

Experiments would involve manipulating the frequency of events in our system and also comparing highlighting alone vs arrows alone and the two presented together, to examine the cost-benefit of single vs multiple and central vs peripheral cues. We expect to find a ‘sweet spot’ where attention guidance works best. The system can be combined with subjective measures of workload such as the NASA-TLX [26] as used within the original MATBII [59].

The modified ICU interface allows eye movements to be measured during a task that is similar to MATBII, making it suitable for wider experimentation beyond our current set-up. This environment provides an ideal testing ground for different methods of attracting and maintaining attention. Attention guidance could be varied by choosing different colours of highlights, implementing synchronous flashing between the highlight and the arrow [68], or blurring areas that are not of interest [27]. The current system allows the associated eye movements, responses and performance to be measured under such changes. Moreover, by making use of the agent architecture, our simple rules for deploying attention guidance could be altered to observe the best effects on human performance. The use of agents provides a useful way of manipulating the rules for changing displays.

5.3 Potential Further Extensions and Applications

As highlighted, one strength of the agent-oriented approach and our agent model is its modularity - extra modules can be added in terms of additional interface tasks and associated inputs, but also in terms of physiological signals of attention and other measures of the human mental state we may want to represent. Not just visual but also auditory inputs, for example, are possible, and additional physiological measures can take us beyond tracking spatial visual attention, including measures that can be read from the eye tracker such as pupil size. Pupil size has been a useful measure for tracking vigilance, fatigue and workload [53].

As more complex inputs are added, so the agent behaviour repertoire can be expanded. More complex rules can be added, with agents performing calculations that exceed human capacity, for example determining what the human should best attend to when this is no longer intuitively clear, especially in a moment of high pressure, or taking over parts of the task and carrying out some of the required responses automatically. This could be done in an adaptive way [29], incorporating workload into the agent model to enable adaptive processes, making the most of agents’ human-like ability and explainability. The aim of the explainability is to help the human interpret the environment and the actions of the agents. Using agent behaviour that can be transparent to the human helps build trust, which is critical to optimal human computer interaction [19]. There is no explicit user modelling in ICUa currently; however, agents are particularly suited to more complex user modelling, such as that used to track learning through tutoring software [67], and our system is suited to this kind of extension.

Agents can also provide the basis for a learning framework. Whilst agent behaviour may itself adapt to ongoing conditions, such as sustained high workload or fatigue, as a way of achieving goals under different circumstances, a further degree of individualisation may be possible by enabling agents to learn from the past behaviour of the user.

Mobile eye trackers can be used to map eye position in real time to the surrounding environment recorded by a camera [64]. It has been suggested that gaze-based interactive displays could be useful in a cockpit setting [39], some aspects of which MATBII is set up to mimic. Current AI cockpit applications involve automating many systems, which requires the human user to hand over control. A future application of our system may be to provide a way of ensuring that human monitoring remains interactive, keeping the human in the loop in a cockpit environment. Increasingly, cockpits are augmented, often displayed in helmet-mounted heads-up displays that can be moved around and tailored to the user, something that could be incorporated into the agent system. The system we have developed aims to ensure that once a target for attention is known to the system, it is successfully processed by the human user. This emphasis means our work has applications in many systems where attention guidance might be called for, such as semi-autonomous vehicles, the remote operator room for automated vehicle systems, air traffic control, or even alerts that may go unnoticed or not fully comprehended in everyday office computer usage.

6 Conclusion

We have built an attention guidance system using agent environments as the underlying framework. Central to our work is the notion that an interactive computer system construed as an agent environment should represent the human user as an entity providing continuing feedback, so that the system can ensure that they can process information within their limited attentional resources in order to produce the necessary human responses. Our proof of concept prototype aims to keep the human in the loop, in this case primarily via their eye movements and with feedback from agents. Our agent-based approach to attention guidance presents some clear advantages, such as modularity, scalability and extensibility. We propose that our approach, as exemplified by our system, is suitable for a wide range of experimentation where humans interact with multi-display interfaces based on attention guidance, and this is our next step for continuing this research.