1 Introduction

The era of automated driving is upon us, as recognized recently by both human factors experts [1] and public opinion [2]. Moreover, just as manual driving comprises a multitude of activities in varying situations and scenarios, automated driving spans a wide range of functional allocation frameworks for the division of responsibilities between human and vehicle [3]. Far from entirely removing the human factors that are often lamented as culprits in crash causation [4], the introduction and evolution of automation in driving continues to present ironies of automation [5]: rather than replacing human performance issues, it only changes them [6].

Human-systems integration benefits are expected from monitoring in-vehicle human occupants, especially the driver (or would-be driver), and particularly in lower levels of automated driving that require his/her mental engagement and/or readiness to drive. But even in full automation, where pedals and steering wheel are removed entirely, the behavior of on-board occupants might serve as communicative feedback to the automated driving system so that it can better adapt its performance. For example, if the automation drives in a way that is uncomfortable to the human occupants, they may naturally exhibit their worry through greater attention to the driving scene, whereas increased trust might be seen in reduced attention [7]. In lower levels of automated driving, humans periodically need to become available to drive as conditions develop beyond the operational boundaries of the automation. During periods of manual, assistive, or supervisory driving control, the responsible human must continuously monitor the driving activity. Unfortunately, humans are known to be susceptible to decrements in sustained attention tasks [8], particularly when not in control, lacking prediction/feedback, and forced to discriminate between subtle differences over prolonged periods of time between target signal events [9]. In all of the above cases, it would be useful for a driving system to know whether, and by how much, a human is currently ready (and/or becoming ready) in terms of attention to the task of driving.

Many driving attention studies have been conducted in driving simulators, for both good and bad reasons, resulting in both high quality research and results that are open to criticism [10]. Especially for eye-tracking, the use of real life visuals is compelling in terms of face validity, and many studies employ previously recorded videos of driving rather than artificially generated graphics [11]. Moreover, with the advent of augmented reality, simulated items such as virtual hazards may soon be combined on or through the windshields of real life vehicles, enabling new approaches to driver training and research [12], e.g., exploiting real world dynamics while retaining more of the benefits of experimental exposure, control, and repeatability than were previously practical.

Furthermore, catalyzed by exponential gains in computational power and reductions in the size and cost of measurement equipment (especially cameras), on-road field operational and naturalistic studies have been growing over the last few decades [13]. Such studies benefit not only from real life visuals but from all the pertinent interactive behavior that comes with being transported in a real vehicle amid real life traffic complexities, social interactions, and driving difficulties and dangers. In addition, alongside the rapid market release of living laboratory beta-testing of automated driving systems (e.g., Tesla Autopilot) and other commercially available features from traditional automobile manufacturers, many on-road studies are being conducted specifically on ADAS and automated system interaction, and these often include eye-tracking and attention components.

Considering these recent technological developments and advances in conducting driving research, and especially the reduced cost of measurement equipment, another source of data worth considering is attentional behavior from passengers to complement that collected from drivers. Under normal driving, a driver presumably completes a traditional closed loop control-feedback system, whereby the consequences of exerted actions are compared for deviations from desired reference goals/targets, which in turn determine future driving control inputs. A passenger is, by definition, out of that control loop in the sense of lacking physical input devices, yet retains (mostly) the full range of visual feedback afforded to drivers and hence may at times show similar, and at other times disparate, eye behaviors. Presumably, then, passenger eye behavior data may reflect that of a driver who is not actuating the driving input devices (i.e., under various levels of automated driving). For example, an assumption of SAE level 2 driving automation systems [3] is that only the hand and foot activity of a driver should be relieved through automated lateral and longitudinal control, whereas the eye and mind responsibilities should remain with the driver. How far in or out of the loop such a driver might be could overlap with, or be informed by, measurable behavior from passengers in manually driven cars, without the safety risk of experimenting with less than 100% attentive drivers in supervisory control.

A unique benefit of approaches that pair passengers with drivers in the same vehicle is that both are, by definition, subject to the same external factors (e.g., weather, traffic, local road infrastructure, time of day), so that they may serve as pseudo controls of one another. While not yet widely discussed or mainstream in driving safety research, such use of passengers is emerging. Through the power of suggestion and suspension of disbelief, Wizard-of-Oz procedures [14] essentially treated on-road passengers as drivers/supervisors of automated driving systems, wherein the vehicle was actually driven by another human hidden behind a partition. Most relevant to the current study, [15] took simultaneous measurements from both the driver and passenger via contact physiological instruments (i.e., EOG, EEG) to assess the eye and attentional behavior of paired participants. The underlying motivation of [15] is an assumption shared by our present study: we believe that the differentiation in the eye data from such participants may inform the development of detection systems that monitor the visual behavior of drivers with automated systems (i.e., where there are breaks/disconnects in physical driving control but not necessarily in visual/mental attention).

2 Methods

2.1 Participants

Participants (16 pairs, 78% male, 22% female, mean age = 27.3, SD age = 2.4) were recruited from Delft University of Technology, and the study was run after obtaining written informed consent under the approval of the Human Research Ethics Committee (“On Road In Vehicle Eye Tracking: Drivers and Passengers”, 26-09-2016). Pairs were formed around a quasi-experimental variable of familiarity, such that in half of the pairs the participants knew one another well, whereas in the other half they were not known to one another in advance. However, analyses pertaining to this aspect have been reserved for adjunct publications. Each participant had normal or corrected-to-normal vision and reported being capable of and comfortable with driving, having obtained their initial driver’s license more than one year prior to the experiment.

2.2 Driving Route

The driving route began from the rear parking lot of the 3mE Faculty building of Delft University of Technology (Leeghwaterstraat) and proceeded across campus and southbound on Schoemakerstraat, then westward along Kruithuisweg to join the A4 highway northbound until exit 12 for route N211, at which point the route crossed over the highway to return along the same roads in the reverse direction (Fig. 1). The full route was completed as one trip of about 20.0 km, taking around 30 min on average, and was repeated per pair with a switch of driver/passenger role, for a total of 32 trips.

Fig. 1. The driving route in the experiment, covering mixed urban and highway roads. Representative screenshots are provided for various route segments at labeled points on the map.

2.3 Procedures

Drivers were given no instructions other than to drive as they normally would, in a safe manner. Passengers were initially given no instruction other than that they could do whatever they wanted and normally might do as a passenger. From the turnaround point (along the on-ramp to re-join the A4 highway southbound), however, the passengers were covertly given a piece of paper with an additional experimental instruction: “Please imagine that you are doing the driving. So try to pay attention and behave with your eyes as if you are currently driving. You do not need to move your hand/feet like a driver.” In keeping with naturalistic motivations, no restrictions were placed on conversation, use of electronic devices, etc.

2.4 Apparatus

Both passenger and driver participants wore UV shielded eye-tracking glasses from SensoMotoric Instruments (SMI), each connected by a single USB cable to its own dedicated Samsung Galaxy smartphone running only the eye-tracking software. The smartphones were held by ride-along experimenters in the back seats (Fig. 2). The car driven was a 2014 Toyota Prius Hybrid passenger vehicle with automatic transmission, and cruise control was not used. Subsequent publications are planned to detail additional equipment and data collected during this experiment from concurrent on-board telemetry of the research vehicle, including forward radar, GPS, steering, and pedal inputs.

Fig. 2. Passenger and driver equipped with minimally invasive eye-tracking glasses.

2.5 Eye-Tracking Data Measurement/Analysis

The SMI glasses recorded eye measurement samples at 60 Hz. Gaze data were indexed in a 960 wide × 720 tall pixel coordinate system (origin at the upper left corner) with respect to the image plane of the integrated forward-facing camera. It should be noted that the coordinate frame moved as the participant moved his/her head; re-analysis of the data in a rectified/resolved 3D world model remains outside the scope of the present study. Eye data were classified via the automatic categories provided by SMI BeGaze with the default settings, as follows. The “low speed event detection” algorithm derived saccades between fixations, which were identified using a minimum duration setting of 80 ms and a maximum dispersion area of 100 px. Blink events were registered as special fixation cases whenever the pupil diameter was less than 1 pixel and/or the horizontal and vertical gaze position equaled 0; blinks with durations shorter than 70 ms were discarded.
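As an illustration only, the blink rule described above can be sketched in a few lines of Python. This is not the BeGaze implementation, which is proprietary. The function names, the `(x, y, pupil_px)` sample layout, and the reading of "gaze position equaled 0" as both coordinates being zero are assumptions made for this sketch.

```python
from typing import List, Tuple

SAMPLE_RATE_HZ = 60.0   # sampling rate reported for the SMI glasses
MIN_BLINK_MS = 70.0     # shorter candidate blinks are discarded

Sample = Tuple[float, float, float]  # (gaze x, gaze y, pupil diameter in px); assumed layout


def is_blink_sample(x: float, y: float, pupil_px: float) -> bool:
    """Assumed reading of the rule: pupil < 1 px and/or gaze at (0, 0)."""
    return pupil_px < 1.0 or (x == 0 and y == 0)


def detect_blinks(samples: List[Sample],
                  sample_rate_hz: float = SAMPLE_RATE_HZ,
                  min_ms: float = MIN_BLINK_MS) -> List[Tuple[int, float]]:
    """Return (start_index, duration_ms) for each run of blink samples
    lasting at least min_ms."""
    dt_ms = 1000.0 / sample_rate_hz
    blinks: List[Tuple[int, float]] = []
    run_start = None
    for i, (x, y, pupil) in enumerate(samples):
        if is_blink_sample(x, y, pupil):
            if run_start is None:
                run_start = i          # a new candidate blink begins
        elif run_start is not None:
            duration = (i - run_start) * dt_ms
            if duration >= min_ms:
                blinks.append((run_start, duration))
            run_start = None
    if run_start is not None:          # recording ended during a blink
        duration = (len(samples) - run_start) * dt_ms
        if duration >= min_ms:
            blinks.append((run_start, duration))
    return blinks
```

At 60 Hz, a run of at least five consecutive blink samples (about 83 ms) is the shortest that passes the 70 ms cutoff under this sketch.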

Eccentricity scores were computed as a product of the duration and distance of raw gaze samples away from an individually calibrated central point in head coordinates. First, the Euclidean distance was computed from each gaze sample location to the central X/Y coordinate point (480, 360), then divided by 600 and multiplied by 100 to yield a percentage of the distance from the center to the corner of the coordinate grid. Next, these relative distances were rounded into bins of size 10, and over an entire analysis period of interest the modal percentage bin was selected as representative of a newly calibrated average center gaze location (i.e., allowing for this to lie somewhere off the coordinate grid center point). Finally, for a given analysis period of interest, whenever a sample fell outside of that central modal bin (and the previous sample did not), the duration was recorded until the next gaze sample that returned to the central modal area.
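The steps above can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study. The text does not fully specify how distance and duration are combined, so the sketch assumes that each off-center excursion contributes its duration (in seconds) multiplied by its mean offset from the modal bin (in percentage points), and that the score is the average over excursions. All function names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple

CENTER = (480.0, 360.0)   # grid center given in the text
HALF_DIAGONAL = 600.0     # divisor given in the text (center-to-corner distance)
BIN_SIZE = 10
SAMPLE_RATE_HZ = 60.0


def percent_distance(x: float, y: float) -> float:
    """Distance from the grid center as a percentage of center-to-corner."""
    return math.hypot(x - CENTER[0], y - CENTER[1]) / HALF_DIAGONAL * 100.0


def to_bin(pct: float) -> int:
    """Round a non-negative percentage to the nearest multiple of BIN_SIZE."""
    return int(pct / BIN_SIZE + 0.5) * BIN_SIZE


def modal_bin(gaze: List[Tuple[float, float]]) -> int:
    """Most frequent distance bin over the analysis period."""
    bins = [to_bin(percent_distance(x, y)) for x, y in gaze]
    return Counter(bins).most_common(1)[0][0]


def excursions(gaze: List[Tuple[float, float]],
               sample_rate_hz: float = SAMPLE_RATE_HZ) -> List[Tuple[float, float]]:
    """Return (duration_s, mean_offset_pct) per run of samples outside the modal bin."""
    home = modal_bin(gaze)
    dt = 1.0 / sample_rate_hz
    runs: List[Tuple[float, float]] = []
    current: List[float] = []
    for x, y in gaze:
        pct = percent_distance(x, y)
        if to_bin(pct) != home:
            current.append(abs(pct - home))   # still (or newly) off-center
        elif current:
            runs.append((len(current) * dt, sum(current) / len(current)))
            current = []
    if current:                               # period ended while off-center
        runs.append((len(current) * dt, sum(current) / len(current)))
    return runs


def eccentricity_score(gaze: List[Tuple[float, float]]) -> float:
    """Assumed aggregation: mean of duration x offset over all excursions."""
    runs = excursions(gaze)
    if not runs:
        return 0.0
    return sum(d * off for d, off in runs) / len(runs)
```

Note that the modal bin is a ring of distances around the grid center rather than a two-dimensional location, which follows the description in the text; a 2D modal cell would be an alternative design.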

In accordance with the analysis procedures used in the previous driver and passenger eye-tracking differentiation study [15], all replicated eye measurement dependent variables were individually normalized according to the equation \( V_{\mathrm{norm}} = (V - V_{p\mathrm{Low}}) / (V_{p\mathrm{High}} - V_{p\mathrm{Low}}) \), where \( V \) indicates the non-normalized variable, and \( V_{p\mathrm{Low}} \) (\( V_{p\mathrm{High}} \)) indicates the bottom (top) 20th percentile of the variable data population of each participant. Subsequently, cutoff thresholds of 0 and 1 were applied to bound the resultant \( V_{\mathrm{norm}} \) values. In the current study, however, eccentricity scores were not normalized, in order to explore the potential meaningfulness of off-nominal behavior captured in the data as a more robust differentiating measure of in/out-of-the-loop.
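A minimal sketch of this per-participant normalization is given below. The percentile method (linear interpolation between sorted values) is an assumption, since the text does not state which definition was used, and the function names are hypothetical.

```python
from typing import List


def percentile(values: List[float], q: float) -> float:
    """q-th percentile (0-100) by linear interpolation; assumed definition."""
    s = sorted(values)
    if len(s) == 1:
        return s[0]
    pos = q / 100 * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def normalize(values: List[float],
              low_q: float = 20.0,
              high_q: float = 80.0) -> List[float]:
    """Vnorm = (V - VpLow) / (VpHigh - VpLow), bounded to [0, 1]."""
    v_low = percentile(values, low_q)    # bottom 20th percentile
    v_high = percentile(values, high_q)  # top 20th percentile (i.e., the 80th)
    if v_high == v_low:
        raise ValueError("VpHigh equals VpLow; normalization is undefined")
    return [min(1.0, max(0.0, (v - v_low) / (v_high - v_low))) for v in values]
```

For example, with one participant's values 0 through 10, the 20th and 80th percentiles under this definition are 2 and 8, so a value of 5 maps to 0.5 while values at or beyond the percentiles are bounded to 0 or 1.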

The current study did not yet sectionalize different portions of the route into specific urban and highway scenario portions, but instead examined the three following time/event window periods of interest. Additionally, for the first two periods of interest with shorter time windows, eye tracking data was analyzed with half of the data (i.e. those from the first trip the paired participants drove) in order to help mitigate potential learning/bias effects of participant attentional/visual behavior.

(1a)/(1b) “Post/Pre Task” (±120 s): The portion of time immediately before, compared against the time period immediately after, the presentation of the passenger task instruction. Here the data was taken from passengers only. The “post-task” data (1b) are regarded as coming from “pseudo drivers” attempting to represent visual control, whereas the “pre-task” data (1a) are regarded as natural (untasked), freely varying passenger eye data.

(2) “Entering A4” (about 45 s): The first highway on-ramp and merging period, where it was assumed a driver would be likely to prioritize and evidence high levels of dedicated driving control visual behavior. Both driver and passenger eye data are included.

(3) “Gate to Task” (about 900 s): From the start of the trip (leaving the parking lot gate) up until the start of the passenger task manipulation. Both driver and passenger eye data are included.

3 Results

The results are organized first by dependent eye measure (saccade amplitude, blinks, and eccentricity) and, within each measure, by period of interest (pre/post passenger task instruction, entering the A4 highway, and gate to task).

3.1 Saccade Amplitude

Figure 3 suggests that across the three analysis periods of interest, smaller saccades appeared more frequently in drivers (or pseudo drivers) than in passengers, whereas the reverse was true of larger saccades. This relational difference was most apparent within the period of entering the A4 highway and least apparent across the full period leading up to the passenger instructional task. In the interest of replicating the results of similar previous work [15], the same proportional divisions for “small” and “large” saccades were taken and compared; see the dashed vertical boundary lines in Fig. 3, spanning from 0.05 to 0.2 and from 0.2 to 0.8 of the normalized range, respectively.
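The small/large division above can be illustrated with a short counting sketch. The handling of the shared boundary at 0.2 (assigned here to the "large" band) and the function name are assumptions made for illustration; the text gives only the band limits.

```python
from typing import List, Tuple

SMALL_BAND = (0.05, 0.2)  # normalized amplitude limits for "small" saccades
LARGE_BAND = (0.2, 0.8)   # normalized amplitude limits for "large" saccades


def saccade_rates(norm_amplitudes: List[float],
                  window_s: float) -> Tuple[float, float]:
    """Return (small_per_second, large_per_second) for one analysis window."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    # Assumption: a value of exactly 0.2 counts as large, not small.
    small = sum(1 for a in norm_amplitudes if SMALL_BAND[0] <= a < SMALL_BAND[1])
    large = sum(1 for a in norm_amplitudes if LARGE_BAND[0] <= a <= LARGE_BAND[1])
    return small / window_s, large / window_s
```

Saccades below 0.05 or above 0.8 of the normalized range fall in neither band and are not counted.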

Fig. 3. Relative frequency of saccade amplitude in each analysis period of interest.

A two-way analysis of variance (ANOVA) for role (“driver”, passenger) and analysis period (around passenger task change, enter A4 highway, gate to task) was conducted separately for the number of normalized small and large saccades (presented per second and re-scaled to absolute units in Fig. 4). For small saccades, the difference was in the hypothesized direction, with higher rates of small saccades per second for driver roles (m = 0.418, SD = 0.38) than for passengers (m = 0.366, SD = 0.37), F (1,117), but it did not reach significance in the present analysis (p = 0.50). A large increase in the rate of small saccades was observed in the period surrounding the passenger task change relative to the other two periods of analysis, but this also did not reach significance (p = 0.16). Lastly, the interaction effect did not reach significance (p = 0.90). Similarly, for rates of large saccades, the effect of role was in the expected direction, with higher rates of large saccades for passengers (m = 0.754) than for drivers (m = 0.673), but it did not reach significance (p = 0.49), nor did the effect of analysis period window (p = 0.26) or the interaction effect (p = 0.79).

Fig. 4. Number of small (left panel) and large (right panel) saccadic eye-movements.

3.2 Eye Blinks

A two-way analysis of variance (ANOVA) for role (“driver”, passenger) and analysis period (around passenger task change, enter A4 highway, gate to task) was conducted for eye blink duration (Fig. 5). Both main effects and their interaction were significant. Overall, drivers evidenced shorter blinks than passengers, F (1,15922) = 12.832, p < 0.001, \( \eta_p^2 = 0.001 \), and significant differences were found between the analysis period windows, F (2,15922) = 6.15, p < 0.01, \( \eta_p^2 = 0.001 \).
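As a consistency check on the reported effect sizes, partial eta squared can be recovered from an F statistic and its degrees of freedom using the standard identity \( \eta_p^2 = (F \cdot df_{\mathrm{effect}}) / (F \cdot df_{\mathrm{effect}} + df_{\mathrm{error}}) \). The short sketch below applies it to the two main effects reported above; the function name is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F statistic and its degrees of freedom."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# Main effect of role on blink duration: F(1, 15922) = 12.832
role_effect = partial_eta_squared(12.832, 1, 15922)

# Main effect of analysis period on blink duration: F(2, 15922) = 6.15
period_effect = partial_eta_squared(6.15, 2, 15922)

print(round(role_effect, 3), round(period_effect, 3))
```

Both values round to 0.001, matching the text. The very large error degrees of freedom mean that the effects are statistically significant while explaining only a small share of the variance in blink duration.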

Fig. 5. Mean duration of blinks, normalized (left panel) and in milliseconds (right panel).

Tukey’s post hoc analysis of period window showed that average blink durations were significantly lower in the full 900 s period from gate to task than in the ±120 s window periods around the passenger task instruction (p < 0.01). The interaction effect showed an increasing differentiation between drivers and passengers in more highly specified/controlled contexts (p < 0.01), \( \eta_p^2 = 0.001 \).

3.3 Eccentricity

A two-way analysis of variance (ANOVA) for role (“driver”, passenger) and analysis period (around passenger task change, enter A4 highway, gate to task) was conducted for the raw non-normalized scores of eccentricity (i.e., a product of off-center distance multiplied by off-center duration). Significant results were obtained for the main effect of role (Fig. 6), with passengers evidencing higher average eccentricity scores (m = 20.766, SD = 8.97) than drivers (m = 15.508, SD = 8.81), F(1,117), p < 0.005, \( \eta_p^2 = 0.079 \). Differences did not reach levels of significance regarding a main effect of analysis window period (p = 0.36), and the directional main effect of the driver vs. passenger role evidenced parallel differentiation without an interaction effect (Fig. 6).

Fig. 6. Eccentricity dimensional distribution (left) and average scores (i.e., the functional product of off-modal-center gaze sample distances multiplied by duration of time until returning to that center) across driver vs. passenger role and analysis period of interest (right).

4 Discussion/Conclusion

With regard to the recent driver/passenger study [15], the present study complements it by replicating and extending both its procedural methods and its results. In both cases, simultaneous eye-tracking information was extracted from both drivers and front passengers in the same vehicle on the same trip, and the trip was repeated with a reversal of role assignment. Relative to active in-control drivers, the major eye-tracking results of [15] found that passengers exhibited fewer small saccades and more large saccades, lower visual processing loads associated with large saccades, and longer blink durations indicative of reduced arousal, collectively reflecting a decrease of attention in passengers. The present study likewise sought and found such a relational reduction in attention to driving control from drivers to passengers, but employed less invasive and demanding eye measurement devices (i.e., eye-tracking glasses similar in form factor to sunglasses, attached by a single USB cable to a smartphone), as well as a direct instruction to task the passenger and his/her attention at a specific point in the drive. Additionally, the location information available from camera-based eye-tracking glasses, relative to the EOG electrodes of [15], allowed for direct distance measurements. Such gaze distance data was used in combination with time to derive a measure of eccentricity (i.e., the product of off-center sample distance multiplied by the duration of time off-center) to more robustly and meaningfully differentiate between the attentional behavior of in-control drivers and that of passengers, who have no direct vehicular control.

For driver state monitoring aims, it is not the direct, reliable classification of passenger eye behavior per se that is most valuable. What is particularly productive in our results are the main trends (e.g., >85%) evidenced in self-to-self comparisons (Fig. 7): eccentricity increases from the in-the-loop role of the driver to the more freely varying (either in- or out-of-the-loop) attention of the passenger. For example, the individualized proportionate increase in eccentricity was m = +78% (SD = 0.95) when comparing the same person as a tasked passenger (i.e., acting with their eyes like a driver) to him/herself as an untasked passenger, and m = +53% (SD = 0.39) when comparing the same individual across the longer role change from veridical driver to natural passenger. Such values may inform potential threshold ranges on which a driver state monitoring system could trigger. For example, such thresholds might be built into a system to evaluate whether the driving attention of a human supervisor of automated driving is lapsing into a more passive passenger role or, in another scenario, to check whether the human has reached the required level of attentional performance when returning to driving after a period away from it (i.e., an automated-to-manual transition of control).
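To make the thresholding idea concrete, the sketch below computes a self-to-self proportionate change in eccentricity and compares it against a trigger level. It is purely illustrative: the function names are hypothetical, and the default threshold of 0.53 simply reuses the mean driver-to-passenger increase reported above as a placeholder, not as a validated operating point.

```python
def relative_increase(baseline: float, current: float) -> float:
    """Proportionate change of current eccentricity relative to a person's
    own baseline (e.g., 0.53 means +53%)."""
    if baseline <= 0:
        raise ValueError("baseline eccentricity must be positive")
    return (current - baseline) / baseline


def looks_like_passenger(baseline: float,
                         current: float,
                         threshold: float = 0.53) -> bool:
    """Flag when eccentricity has risen by at least `threshold` over the
    individual's in-the-loop baseline. Threshold is a placeholder assumption."""
    return relative_increase(baseline, current) >= threshold
```

Under these assumptions, a driver with a baseline score of 10 would be flagged at a current score of 16 (+60%) but not at 12 (+20%). Given the large standard deviations reported, any practical threshold would likely need to be calibrated per individual.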

Fig. 7. Eccentricity across analysis periods of interest and across participants, in both self and partnered comparisons. Note: passenger eccentricity exceeded that of drivers in 14 of 16 self-cases (88%, upper left); in 14 of 16 cases (88%, upper right); in 20 of 28 cases (71%, middle); and in 28 of 30 cases (93%, bottom).

Furthermore, such results were obtained from raw gaze samples in spite of inherent noise and the lack of head tracking and/or sophisticated techniques to resolve and orient the data within a 3D world model. Within our present simplistic analysis, saccadic amplitude measures were most likely affected by any co-occurrence of head movement with a saccade (e.g., in the same direction), which would effectively reduce, cancel out, or otherwise confound the saccade amplitude measure compared to a saccade of the same size made with less or no head movement. This is not of practical concern for blinks (generally) or for saccade amplitude in [15], because measurement there is made at an origin regardless of head orientation. It is therefore interesting that our results show that a measure of modal eccentricity is still able to differentiate in-control drivers from passengers lacking control in spite of such head movement confounds. In general, such practical methodological measures and benefits are in line with both the raw gaze-based “unfiltered” percentage road center (PRC) and the central gaze-based PRC approaches described and validated in [16].

Our results regarding eccentricity should be taken with several caveats. In our present preliminary analyses, the “pre-task” passenger eye data included reading the instructions. Furthermore, if a driver were to concentrate on a secondary task so fully that it became the primary task, eccentricity might be expected to decrease rather than increase. Without world knowledge, the eccentricity measure is agnostic as to what specifically is being concentrated on, and instead reflects only the presence or absence of concentration/control. Lastly, the differentiating impact of blink duration changed across our analysis period segments, whereas that of eccentricity did not. Future studies should examine the attentional impact of contextual aspects of the driving scene, such as velocity, road curvature, and other traffic.