1 Introduction

Developments in automation have alleviated some of the workload pilots encounter when operating an aircraft. ‘Error proof’ or fully automated aircraft are still not in use, so pilots must remain engaged in making the best decisions (Orasanu, 2017). To this end, haptic inputs have been proposed as a human-machine interaction approach for conveying system state information to pilots to aid the pilot decision-making process. Following initial investigations of haptic input types, few published experimental works identify improvements in user performance and subjective workload using directional cue types. Existing works mainly investigate resistive force feedback guidance cues, including stick centering to aid path following and control (Alaimo et al., 2010; de Stigter et al., 2007; Olivari et al., 2012; Repperger et al., 1997; Schutte et al., 2012) and inverted resistive force cues that aid obstacle avoidance by pushing the stick away from the obstacle (Lam et al., 2004). In these studies, providing haptic information produced mixed workload results; however, some findings have shown an improvement in task performance associated with haptic information. For instance, using stick centering for a path-following task, de Stigter et al. (2007) identified a reduction in subjective workload together with an increase in physical workload. This was coupled with an increase in task performance and in the cognitive capacity available to perform a secondary task. In both studies the haptic cueing can be viewed, in its first-principle form, as a lateral and longitudinal stick tracking movement task. The increased physical workload (Lam et al., 2004), which is seen when high resistive forces are used to push the user in the desired direction, is not an ideal effect. This is especially important if this type of force feedback is required for an extended period of flight time.
Research is needed to determine whether directional force feedback cues can aid a tracking task and whether physical workload can be kept at a manageable level. In addition, de Stigter et al. (2007) state that having a secondary task suited to each participant’s ability level is critical and must be managed within the trial.

Research into side stick haptics has also examined vibrotactile cues, another form of haptic information, and has demonstrated that cueing users with a “shaker signal” can be a valuable source of stall warning information (Ellerbroek et al., 2016). Findings have identified that primary task performance improves with a non-directional vibrotactile cue and a stick centering shift; however, these evaluations did not show any clear indications of subjective workload changes.

In the current study, the benefits of a haptic directional force feedback cue, implemented alongside stick centering, are examined in a human-in-the-loop experiment. Haptic aiding was added to the primary task. The study involved non-pilots completing a task designed to assess the impact of haptic feedback on two simultaneously completed visual tasks (i.e., a visual/visual dual-task). A cognitive task capable of manipulating task load was included to emphasize the potential secondary performance benefits of the haptic implementation. Trials were designed to evaluate the performance, workload, and interface usability benefits of haptic directional cues. Quantitative (e.g., participant performance data, post-trial scale data) and qualitative (post-study questionnaire) data collection methods were employed in the current evaluation.

2 Methods

2.1 Participants

Eleven non-pilot aeronautical staff and students from Coventry University participated in the study. Participants’ average age was 32 years (SD = 15.01). All participants had flown a desktop aircraft simulator, with an average of 4.95 years of experience (SD = 5.75), and during a normal operating time would spend 1.45 h using a simulator (SD = 2.31). Of the eighteen participants, seventeen were male and one was female. The experiment was approved by the Coventry University Ethics Committee and was in line with the ethics guidelines of the British Psychological Society.

2.2 Materials

A PC ran a MATLAB (The MathWorks Inc, 2021) script that was created for the trials. The PC was an HP Z4 G4 workstation running Windows 10 Pro 64 with an Intel Core i7 9800X processor, 32 GB (2 × 16 GB) of memory and a 512 GB SSD hard drive. Each participant was positioned with a gaming keyboard on the left and the side stick (Microsoft Sidewinder) on the right. The participant's screen was a 1920 × 1080p Full HD display measuring 590 mm (width) × 332 mm (height). Participants were seated 825 mm from the screen, producing 46 screen pixels per degree of visual angle (VA). Overall, this produced a viewing area of 40.85 (width) × 23 (height) degrees of VA within the 1920 × 1080p screen.
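The reported viewing geometry can be approximately reproduced from the screen size and seating distance. The sketch below is illustrative only: the paper does not state how the conversion was done, so it assumes a flat screen and takes one degree of VA near the screen centre to span distance × tan(1°) millimetres. The function and constant names are hypothetical.

```python
import math

# Values taken from the text above.
SCREEN_WIDTH_MM = 590.0
SCREEN_WIDTH_PX = 1920
SCREEN_HEIGHT_PX = 1080
VIEWING_DISTANCE_MM = 825.0


def pixels_per_degree(size_mm, size_px, distance_mm):
    """Pixels spanned by one degree of visual angle near the screen centre.

    Assumption: flat screen viewed head-on, so one degree covers
    distance * tan(1 degree) millimetres at the centre of the display.
    """
    mm_per_degree = distance_mm * math.tan(math.radians(1.0))
    return mm_per_degree * size_px / size_mm


ppd = pixels_per_degree(SCREEN_WIDTH_MM, SCREEN_WIDTH_PX, VIEWING_DISTANCE_MM)
width_deg = SCREEN_WIDTH_PX / ppd    # horizontal extent in degrees of VA
height_deg = SCREEN_HEIGHT_PX / ppd  # vertical extent in degrees of VA
print(round(ppd, 1), round(width_deg, 1), round(height_deg, 1))
```

Under this assumption the result lands close to the reported figures; an exact arctangent of the full screen width would give a somewhat smaller horizontal extent.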

Psychophysics Toolbox extensions (Brainard, 1997) were used to draw the visuals to the screen. The software maintained the visual and stick guidance cues, with logging at 60 Hz, and allowed the retrieval of relevant stick position and keyboard inputs for subsequent performance analyses.

The two continuous tasks were presented simultaneously. Each task was offset from the screen centre by 12 degrees of VA. Thus, the tasks were spatially separated by 24 degrees of VA, remaining within participants' near-peripheral vision range (± 30 degrees of VA). The display area of each task was confined to an 8 (width) × 8 (height) degrees of VA area.

Fig. 1. Tracking (primary) task and Visual N-back (secondary) task presented on the right- and left-hand side of the screen, respectively.

2D Tracking (Primary) Task.

On the right-hand side of the screen the 2D tracking task was presented (see Fig. 1). This task served as the primary task, in which participants were required to keep the green dot, controlled with the active stick, on the moving target cross hair. The duration of the task was set by the number of N-back trials and was therefore 5 min. The dot and cross hair target positions were confined to the 8 × 8 VA task space. A non-harmonic wave frequency was used to define the 2D movement signal of the tracking cross hair. This was preferred to a typical sine wave defined movement trajectory, as non-harmonic wave frequencies reduce the predictability of a signal (Drop et al., 2016b; Scheer et al., 2014). The movement signal of the target cross hair was defined by combining the signals of 10 non-harmonic waves using Eqs. 1 and 2.

$$ S = A\sin \left( \omega \cdot P \left( t + \varphi \right) \right) $$
(1)
$$ WaveSignal = S_{1} + S_{2} + S_{3} + S_{4} + S_{5} + S_{6} + S_{7} + S_{8} + S_{9} + S_{10} $$
(2)
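A minimal sketch of how Eqs. 1 and 2 could be evaluated for one axis of the target is given below. The paper does not report the amplitudes, frequencies, phases or the meaning of P, so every parameter value here is a hypothetical placeholder, and P is treated as a plain scalar multiplier. Frequencies are scaled by square roots of distinct primes so that no component is an integer multiple of another.

```python
import math


def wave_component(t, amplitude, omega, p, phi):
    """Eq. 1: S = A * sin(omega * P * (t + phi))."""
    return amplitude * math.sin(omega * p * (t + phi))


def wave_signal(t, components):
    """Eq. 2: sum of the component signals S_1 ... S_n at time t."""
    return sum(wave_component(t, *c) for c in components)


# Hypothetical parameters (not from the paper), one tuple per component:
# (amplitude A, angular frequency omega, multiplier P, phase offset phi).
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
COMPONENTS = [
    (1.0 / (i + 1), 0.2 * math.sqrt(prime), 1.0, 0.1 * i)
    for i, prime in enumerate(PRIMES)
]

# Sample the one-dimensional signal at 60 Hz for one second.
samples = [wave_signal(k / 60.0, COMPONENTS) for k in range(60)]
```

A 2D trajectory could be obtained by using a separate parameter set for each axis; whether the study did this is not stated.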

The difficulty of the tracking task was manipulated asymmetrically using a staircase method. The target’s default movement speed was set to 0.3 degrees of visual angle per second (VAdeg/sec) and was incremented by 0.3 VAdeg/sec each time the participant kept the controlled green dot within 1.5 degrees of VA of the tracking target for 3 s. The difficulty of the task was lowered whenever the controlled green dot remained outside the 1.5 degree VA radius of the tracking target for 1 s. The difficulty of the 2D tracking task was capped at a speed of 3 VAdeg/sec.
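The staircase rule described above can be sketched as a small state machine. This is an illustration under stated assumptions rather than the study's implementation: the text gives the increment, thresholds and cap, but not the size of the downward step or the lowest allowed speed, so `DOWN_STEP` and `MIN_SPEED` are hypothetical, as is the class itself.

```python
START_SPEED = 0.3   # VAdeg/sec, default target speed (from the text)
UP_STEP = 0.3       # VAdeg/sec added after sustained good tracking (from the text)
MAX_SPEED = 3.0     # VAdeg/sec cap (from the text)
RADIUS_DEG = 1.5    # proximity radius around the target (from the text)
UP_HOLD_S = 3.0     # time inside the radius before stepping up (from the text)
DOWN_HOLD_S = 1.0   # time outside the radius before stepping down (from the text)
DOWN_STEP = 0.3     # assumption: the text does not give the decrement size
MIN_SPEED = 0.3     # assumption: speed never drops below the default


class Staircase:
    """Asymmetric staircase: slow to step up (3 s), quick to step down (1 s)."""

    def __init__(self):
        self.speed = START_SPEED
        self.time_inside = 0.0
        self.time_outside = 0.0

    def update(self, error_deg, dt):
        """Advance by dt seconds given the dot-to-target distance in degrees."""
        if error_deg <= RADIUS_DEG:
            self.time_inside += dt
            self.time_outside = 0.0
            if self.time_inside >= UP_HOLD_S:
                self.speed = min(MAX_SPEED, self.speed + UP_STEP)
                self.time_inside = 0.0
        else:
            self.time_outside += dt
            self.time_inside = 0.0
            if self.time_outside >= DOWN_HOLD_S:
                self.speed = max(MIN_SPEED, self.speed - DOWN_STEP)
                self.time_outside = 0.0
        return self.speed
```

Calling `update` once per frame with the current tracking error yields the target speed to use for the next frame.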

Performance variables taken from the tracking task included the maximum difficulty reached (Settling Maximum, Max), the minimum drop in performance (Settling Minimum, Min), the steady state (SS), the time taken to reach the initial level of tracking (Rise Time, RiT), and the time taken to reach the maximum level of tracking (Transient Time, TT). The rise time (RiT) was calculated from the step-response characteristics of participants’ staircase performance on the tracking task using the stepinfo function in MATLAB. RiT here was defined as the time taken for tracking performance to rise from 10% to 90% of the SS level achieved. Fig. 2 presents how these different metrics summarized the tracking performance of one of the participants. Only SS is considered in the analysis.

Fig. 2. Tracking step response with annotations of Rise Time (RiT), Transient Time (TT), Settling Minimum (Min), Settling Maximum (Max) and Steady State (SS)
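To make the 10% to 90% rise time idea concrete, a simplified sketch is shown below. It is not MATLAB's stepinfo and makes assumptions that function does not: the response is taken to start from zero, the steady-state level is supplied by the caller, and threshold crossings are read from the first sample at or above each level, without interpolation. The function name is hypothetical.

```python
def rise_time(times, values, steady_state, low=0.1, high=0.9):
    """Time between first reaching low*SS and first reaching high*SS.

    Assumptions: the response starts from zero and reaches both levels;
    crossings are taken at sample times (no interpolation).
    """
    t_low = next(t for t, v in zip(times, values) if v >= low * steady_state)
    t_high = next(t for t, v in zip(times, values) if v >= high * steady_state)
    return t_high - t_low


# Hypothetical staircase trace: difficulty (VAdeg/sec) sampled once per second.
trace_times = [0, 1, 2, 3, 4, 5]
trace_values = [0.0, 0.3, 0.6, 0.9, 1.2, 1.2]
example_rise = rise_time(trace_times, trace_values, steady_state=1.2)
```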

Haptic Characteristics.

Force feedback was delivered through a Microsoft Sidewinder side stick with stick centering. The haptic nudge cue and force profiles are shown in Fig. 3. The stick centering force was limited to allow a comfortable force profile, with a low increase in the physical force required to control the stick. There were no shifting forces, and a small dead band was included within the stick centering force profile.

The directional nudge was designed both to alert the participant to tracking task deviation and to provide corrective directional guidance. The cue used a discrete signal that produced a force in the direction of the preferred stick movement to reduce the error between the controlled dot and the target cross hair. A nudge was activated if the deviation of the stick-controlled dot from the target was greater than 1 degree of visual angle (VA). The nudge force cue was delivered as an X and Y directional tangential force. This depended on the x and y distance from the target, so that the user could perceive a directional cue (Fig. 3).

Fig. 3. Stick force feedback: (a) stick centering force (b) directional nudge cue force
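The nudge logic described above can be sketched as a function that maps the dot and target positions to X and Y force components. Only the 1 degree activation threshold comes from the text. The force magnitude, its units, and the choice to keep the magnitude constant while splitting it according to the x and y error are assumptions for illustration, and the names are hypothetical.

```python
import math

NUDGE_THRESHOLD_DEG = 1.0  # activation threshold (from the text)
NUDGE_FORCE = 1.0          # hypothetical magnitude, arbitrary units


def nudge_force(dot_xy, target_xy):
    """Return (fx, fy) pointing from the dot toward the target.

    No force is produced while the dot is within the threshold distance.
    Assumption: a fixed-magnitude force split by the x and y error.
    """
    dx = target_xy[0] - dot_xy[0]
    dy = target_xy[1] - dot_xy[1]
    distance = math.hypot(dx, dy)
    if distance <= NUDGE_THRESHOLD_DEG:
        return (0.0, 0.0)
    return (NUDGE_FORCE * dx / distance, NUDGE_FORCE * dy / distance)
```

For example, a dot at the origin with the target 3 degrees right and 4 degrees up would receive a force split 0.6 to 0.8 between the X and Y axes.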

2-Back (Secondary) Task.

A 2-back version of a visuospatial N-back task required participants to store and continuously update visuospatial information in working memory. The 2-back task was presented on the left-hand side of the screen (see Fig. 1) and consisted of a 3 × 3 grid (each square 2.5 × 2.5 VA), where target grid locations were illuminated grey for 500 ms. An inter-stimulus interval of 2500 ms was used, producing a duration of 3 s per N-back trial (Kane et al., 2007). The task consisted of 100 2-back trials and lasted for five minutes. A third of the total trials (32 trials) were randomly assigned as 2-back targets, that is, trials where the cued grid location matched the cued grid location from two trials earlier.

Fig. 4. Example of a 2-back condition

Correct responses were defined as instances where participants pressed the spacebar when a target grid location was presented (see Fig. 4). Incorrect responses were defined as instances where participants pressed the spacebar when a non-target grid location was presented (i.e., a false alarm). Misses were defined as instances where the participant did not press the spacebar when a target was shown. On each run an N-back score was calculated by subtracting the combined frequency of incorrect responses and misses from the number of correct responses (see Eq. 3).

$$ \text{N-back Score} = \text{Correct Responses} - \left( \text{Incorrect Responses} + \text{Misses} \right) $$
(3)
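The scoring rule in Eq. 3 can be illustrated with a short sketch. The data layout (a list of cued grid locations and a parallel list of spacebar presses) and the function name are assumptions made for the example and are not taken from the study software.

```python
def score_two_back(locations, pressed, n=2):
    """Eq. 3: correct responses minus (incorrect responses + misses).

    locations: cued grid location for each trial, in presentation order.
    pressed:   True where the spacebar was pressed on that trial.
    A trial is a target when its location matches the one n trials earlier.
    """
    correct = incorrect = misses = 0
    for i, location in enumerate(locations):
        is_target = i >= n and location == locations[i - n]
        if pressed[i] and is_target:
            correct += 1      # hit
        elif pressed[i]:
            incorrect += 1    # false alarm
        elif is_target:
            misses += 1       # target shown, no response
    return correct - (incorrect + misses)


# Hypothetical run: targets fall on trials 2, 3 and 6 (zero-based).
example_locations = [1, 2, 1, 2, 3, 4, 3]
example_presses = [False, False, True, False, True, False, True]
example_score = score_two_back(example_locations, example_presses)
```

In this example there are two hits, one miss and one false alarm, so the score is zero.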

Post-Trial Workload Questionnaire.

Participants completed the NASA Task Load Index (TLX) questionnaire (Hart & Staveland, 1988). The TLX is a long-established scale designed to capture subjective workload ratings across six workload dimensions: mental demand, physical demand, temporal demand, performance, effort, and frustration. Each workload dimension is measured on a scale from 0–20 (Hart, 2006), where higher ratings represent higher subjective workload. The abridged, non-weighted approach was chosen because evidence that dimension weighting improves the TLX’s sensitivity is inconclusive (Hart, 2006).
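A non-weighted TLX score can be illustrated as a simple aggregate of the six ratings. The paper does not state whether the overall score was computed as a mean or a sum, so the sketch below assumes an unweighted mean on the 0 to 20 scale. The dictionary keys and function name are hypothetical.

```python
TLX_DIMENSIONS = (
    "mental", "physical", "temporal", "performance", "effort", "frustration",
)


def raw_tlx(ratings):
    """Unweighted mean of the six TLX ratings, each on a 0 to 20 scale.

    Assumption: the overall score is the plain mean of the dimensions.
    """
    for name in TLX_DIMENSIONS:
        if not 0 <= ratings[name] <= 20:
            raise ValueError(f"{name} rating must be between 0 and 20")
    return sum(ratings[name] for name in TLX_DIMENSIONS) / len(TLX_DIMENSIONS)


# Hypothetical ratings from one post-trial questionnaire.
example_ratings = {
    "mental": 14, "physical": 6, "temporal": 10,
    "performance": 8, "effort": 12, "frustration": 4,
}
example_tlx = raw_tlx(example_ratings)
```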

Post-Study Usability Questionnaire.

Participants completed a 10-min questionnaire at the end of the experiment to provide feedback on the different haptic elements. The questionnaire began with a high-level consideration of the effectiveness of the cue modalities (visual alone, haptic alone, and multimodal) in supporting participants with the task. Questions then proceeded to a more specific, low-level review of the functional and physical qualities of the different side stick features. This structured approach was chosen to maximize the level of formative feedback that was achievable within the short time frame. The participants’ data were then grouped by stick preference, calibration and application acceptance. The differences in opinions and thoughts were then reviewed and summarized.

Scenario and Procedure.

The scenario for the current trials was non-specific, so that the cue type could be reviewed in a 2D tracking task using a haptic stick without application-specific effects.

The trial lasted approximately 1.5–2 h. Participants began the trial by receiving a verbal and visual presentation briefing to introduce them to the 2-back and haptic interface features. This was followed by a practice session that allowed the participants to familiarize themselves with the layout and the 2-back task. All participants completed three single tasks: the 2-back task, the tracking task (no haptics) and the tracking task (with haptics). Practice trials were then followed by six dual-task trials to evaluate the experimental factor Haptic Nudge presence (2 levels: on/off). In the single-task trials one task was blanked, while the remaining task kept its visual angle offset. After each trial, participants completed the NASA-TLX. At the end of the experiment the participants completed a debrief questionnaire to obtain insights into the usability benefits of the different haptic features.

Data Analysis.

Performance and NASA-TLX workload data were analyzed with a series of general linear mixed effects models (GLMMs) using the MATLAB Statistics and Machine Learning Toolbox. To examine the effects of the tracking task haptic directional cues on dual-task performance and workload, the data were fitted using a fixed factor for Nudge (2 levels: Nudges On, Nudges Off). In addition, change in dual-task performance and workload over the 3 dual-task runs was modelled using a continuous fixed effect called Run. GLMMs are a powerful statistical method that allows the analysis of multiple observations from each participant without violating the critical statistical assumption of independence. For repeated measures designs, common ANOVA methods, which average across individual participant observations, are discouraged in favour of these more robust GLMM methods (Graziotin et al., 2015; Gueorguieva & Krystal, 2004; Laird & Ware, 1982; Winter & Grawunder, 2012).

In the current analysis, we used maximal random effects structures that included random participant intercepts and slopes for all fixed effects that were included in the models. In this paper, we checked the significance (alpha = 0.05) of fixed effects by calculating p-values obtained by likelihood ratio tests. Visual inspection of residual plots was used to ensure no obvious deviations from homoscedasticity or normality.
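The likelihood ratio test mentioned above compares a model with a fixed effect against a reduced model without it. The sketch below shows only the arithmetic for a single dropped parameter (one degree of freedom), using the identity that the chi-square survival function with one degree of freedom equals erfc(sqrt(x / 2)). The log-likelihood values are hypothetical, and the sketch assumes both models were fitted by maximum likelihood to the same data. It does not reproduce the MATLAB analysis.

```python
import math


def likelihood_ratio_test(loglik_full, loglik_reduced):
    """Likelihood ratio statistic and p-value for one extra parameter (df = 1).

    Assumption: both models are nested and fitted by maximum likelihood.
    """
    statistic = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # Chi-square survival function with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods for models with and without one fixed effect.
lr_stat, lr_p = likelihood_ratio_test(-100.0, -101.9205)
effect_is_significant = lr_p < 0.05  # alpha = 0.05, as in the text
```

A statistic near 3.84 sits at the boundary of the 0.05 criterion for one degree of freedom, which is why the example values were chosen.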

3 Results

3.1 Tracking Performance

Steady state (SS) performance differences on the primary tracking task between trials with and without nudge cues were minimal; the SS difference between conditions was 0.05 VAdeg/sec. In addition, SS was relatively stable across the 3 runs. This was supported by the lack of a significant (p > 0.05) main effect of, or interaction involving, Nudge.

3.2 N-Back Performance

N-back score, the frequency of correct responses minus errors, is presented in Fig. 5. Overall, the presence of the haptics increased participant scores by a mean of 1.29, although there was considerable variability between participants. Across input modes and runs, participants scored highest during the third run with the haptics (mean/SD = 15.82/7.18). Similar variability was observed during run 2 with the haptics (mean/SD = 13.40/9.71). Across runs, as expected, scores were lower at the start than in the second and third runs, where learning had taken place. The GLMM analysis for N-back score revealed a significant interaction between Run and Haptics (p < 0.05), but no main effects for Nudge (p = 0.24) or Run (p = 0.67). The interaction represented a 3.31 mean increase (CI: 0.01 to 6.64) in N-back score for each run with the haptics.

Fig. 5. N-back score grouped by trial and presence of the haptic nudges. Score standard deviations are shown as error bars.

3.3 Subjective Workload (NASA-TLX) Performance

Overall, there was little difference in subjective workload ratings on the TLX between trials with and without nudges (mean TLX difference = 0.55 points). The GLMM analysis found no significant main effects or interactions for Nudge or Run.

3.4 Post Study Usability

Participant views were collected in a short questionnaire and were analyzed and grouped by stick preference, calibration, and application acceptance.

Stick Preference.

One participant struggled at the start to set up the stick comfortably but achieved this during the task. Two participants struggled to get used to the nudges but felt they got better with time. Two participants mentioned that they found the nudges distracting and that the nudges prompted them to look at the tracking task. Four participants found the nudges helpful and a useful source of information. One participant struggled to feel the nudge forces on the stick and felt the nudge on only a few occasions; all other participants could feel the transition between the stick centering and the nudge cue.

Calibration.

One participant mentioned that during the dual task they were overshooting the target because of the nudges, which impacted their performance on the 2-back task: “When tracking I was travelling towards the force it was overshooting. My brain sent a signal to my hand, but nudges caused overshoot. Nudges draw my attention away from 2-back task which caused me to lose track of the order.”

One participant mentioned the dead band in the stick centering as distracting. Three participants found the nudges too strong; one participant found them useful in the dual task but not in the single task when focused on the tracking task. Six participants were happy with the size of the nudges.

Seven participants found the nudges useful, and the stick centering was also considered helpful. One participant mentioned the ergonomic shape of the stick as useful.

Four participants thought it would be useful to calibrate the haptics, five were happy with the current configuration and did not want to change anything, and one was unsure.

Application Acceptance.

Four participants preferred a haptic stick to the usual aircraft controls. Three thought that haptics would be as beneficial as a regular yoke or side stick. Two mentioned that with experience the nudge cues would be useful. Participants mentioned priority layering, risk-based hierarchical cues that could be used in several phases of flight, user calibration of the stick force size, the need for training, and the need for timing between cues to prevent misinterpretation of the stick forces.

4 Discussion

In the current study, the benefits of side stick force feedback nudge cues were evaluated in a human-in-the-loop trial. Eleven non-pilot participants completed a dual-task experiment that required the completion of two simultaneously presented visual tasks: a primary 2D manual tracking task and a secondary visual-spatial 2-back memory task. The benefits of haptic guidance within the context of a dual-task paradigm were explored by augmenting the primary manual control task with directional force feedback guidance, delivered via the haptic side stick, on half of the dual-task trials. Enhanced visuospatial memory efficiency was expressed as improved performance in the absence of a corresponding self-reported workload increase.

Research findings on force feedback (i.e. stick stiffness) and vibrotactile (i.e. stick shaker) cues (Ellerbroek et al., 2016) have indicated that task performance improves with a non-directional cue without an increase in subjective workload, which supports this study’s findings using directional cues. Stick centering cues have been shown to reduce primary task error, secondary task decision time, and primary task workload (de Stigter et al., 2007), which was not seen in this study.

A possible explanation for the current and previous research findings can be found within Wickens’ Multiple Resource Theory (MRT) (Wickens, 2002, 2008). MRT describes the human brain as having several semi-independent cognitive resources that serve different sensory modalities (e.g., visual, auditory, or haptic), and proposes that tasks requiring the use of different resources can often be effectively performed together. Conversely, where two visual tasks are performed simultaneously, as in the current study, the tasks will compete with one another for the same resource and will produce interference. Hence, by augmenting one of these tasks with haptic information, it is possible that the degree of interference between tasks can be reduced. Specifically, the visual attention that participants must allocate to the target tracking task is reduced when haptic augmentation is provided, allowing further visual attention resources to be allocated to the visuospatial 2-back task. Some research suggests that the haptic senses may be a further independent information processing resource that can be used in parallel with the auditory and visual channels (e.g., Sklar and Sarter, 1999; Ho, Tan and Spence, 2005).

User feedback in the current study revealed that participants perceived haptic inputs through an active stick to be helpful. This is corroborated by a range of applied haptic studies where pilots have reported haptic feedback to be beneficial (Ellerbroek et al., 2016b; van Baelen et al., 2021), even in the absence of any objective benefit (Blundell et al., 2020; van Baelen et al., 2019). However, some participants remarked that the nudge force cue was distracting and that time was needed to learn to use it effectively. They also suggested that a suitable training period be given. Some participants found the stick forces too large or too small, and calibration to the pilot’s preferred force should be reviewed in future studies.

5 Conclusions

Overall, the findings of the study underline the benefits of adopting a human-centered design approach from early in the design process of complex aviation systems. Results from the study highlight that the haptic augmentation of a manual visual task will likely improve the efficiency of a second visuospatial task when the two tasks are conducted in concert. This consideration will be taken forward in future applications and experiments. These include increasing the spatial separation between the primary and secondary tasks to mid and far peripheral ranges, measuring visual time on task, and examining the impacts of turbulent conditions. Specifically, incorporating the end user in the design process results in the development of systems with improved acceptance, and this should be reviewed and taken forward with pilots within application-based experiments.