1 Introduction

In recent years, autonomous driving technology has rapidly advanced, continuously increasing the level of vehicle automation [1]. The development and adoption of autonomous vehicles are expected to enhance traffic efficiency [2], reduce environmental pollution [3], and decrease traffic accidents by approximately 90%, potentially saving about one million lives annually when fully implemented [4].

According to the classification standard of the Society of Automotive Engineers (SAE), driving automation is divided into six levels: no automation (L0), driver assistance (L1), partial automation (L2), conditional automation (L3), high automation (L4), and full automation (L5) [5]. In 2021, the SAE updated and refined this standard, further clarifying the functions of Automated Driving Systems (ADS) at each level of automation and the corresponding allocation of tasks between human and machine. An L3 conditional automation system can perform all dynamic driving tasks (DDT) within its Operational Design Domain (ODD) without requiring driver supervision. However, when the ADS fails, or the driving environment exceeds the ODD and can no longer support automated driving, the system issues a takeover request (TOR); the driver must respond promptly and take control of the vehicle to resume the dynamic driving task [6]. Consequently, while the L3 ADS is operating normally, drivers may engage in non-driving-related tasks (NDRT), such as using a phone, reading, or watching videos. Studies have shown that engaging in NDRT diminishes a driver’s understanding of the surrounding road environment, and this negative effect can persist after a takeover request.

From the perspective of human-machine interface design, the interface and medium through which humans and the autonomous driving system interact play a crucial role: how information is presented during the takeover process is vital for improving the driver’s takeover performance. From the perspective of explainable artificial intelligence, autonomous vehicles are a quintessential AI product, and research can either focus on the technology, developing new or improved machine learning techniques to enhance the transparency and interpretability of the system’s decisions [7], or adopt a user-centered approach, investigating how the content, manner, and timing of information display affect users’ understanding of autonomous vehicles’ decision-making behavior [8, 9].

This study aims to explore the impact of four information presentation strategies—no explanation, explaining the reason, explaining the outcome, and explaining both the reason and the outcome—on drivers’ situational awareness and takeover performance in L3 automated vehicles when the autonomous driving system issues a takeover request.

2 Related Work

2.1 Autonomous Driving Takeover

In the context of autonomous driving, the most prominent change compared to traditional manual driving is the emergence of takeover tasks [10]. A takeover is a new task that drivers must perform during automated driving, triggered either by system limitations or simply by the driver’s own intent [11, 12]. When the system reaches its capability boundaries—for example, because of sensor or actuator limitations or ambiguous environmental observations—drivers are required to intervene and take control within a reasonable transition time to avoid collisions [10]. Failure to take over from a malfunctioning system within the allowed time may lead to accidents. Drivers’ takeover performance is therefore crucial for road safety in automated driving, and takeover failures or poor performance pose potential risks to future road safety.

Previous research has indicated a close correlation between the design of warning systems in autonomous driving—in particular, the design of the takeover request (TOR)—and drivers’ situational awareness and takeover performance [13]. Zhang et al. investigated the design of the time available before the automation system reaches its limits, known as prelude time design [14]. Ma et al. studied the impact of progressive, hierarchical warning-stage designs on takeover performance [15]. Qu et al. explored the relationship between different ways of presenting warning information and driver trust [16]. Bazilinskyy et al. studied the effects of different warning modalities on drivers’ perception of warning information and on system takeover [17]. Thus, the impact of different takeover warning designs on drivers’ situational awareness and takeover performance has been explored in previous work. However, these studies have mainly focused on takeover system designs under specific conditions, and research on the effectiveness of, and differences among, multimodal takeover prompts remains insufficient.

Hence, the primary objective of this study is to elucidate how the type (multimodal) and structure of explainable information presented when an autonomous driving system initiates a takeover request influence drivers’ situational awareness and takeover performance in emergency situations. The research aims to provide design recommendations for takeover warning systems in autonomous vehicles and to extend the application of explainable artificial intelligence to diverse scenarios.

2.2 Explanations in Autonomous Vehicles

Currently, deep learning (DL) and artificial intelligence (AI) models play a crucial role in autonomous vehicles (AVs) and connected autonomous vehicles [18,19,20]. Deep learning models exhibit strong representation and generalization capabilities but are known for their inherent black-box character, which makes them challenging to interpret. This opacity hinders the ability to demonstrate the principles behind the decisions made by intelligent systems. Moreover, because of this opacity, system failures are often unpredictable and difficult to diagnose, which significantly undermines user trust in the system. Research has shown that trust is a key factor influencing user acceptance of and support for autonomous driving systems [21, 22]. Providing explanations for the decisions made by automated systems is therefore effective, if not essential, for enhancing user trust. Explanations not only improve the transparency and accountability of decision-making in autonomous driving but also aid in evaluating the role of autonomous vehicles before and after critical events (e.g., collisions or near misses).

Research on explanations for autonomous driving systems can be approached from two perspectives: a technology-centric view and a user-centric view [23]. From the technology-centric perspective, efforts focus on how to generate effective explanations in dynamic traffic situations. Despite significant progress, explanations for autonomous vehicles are far from perfect, and their impact on human drivers has sparked extensive research.

User-centric explainability research has explored the content, form, and timing of explanations in autonomous driving systems. Koo et al. investigated how different information explanation structures affect driver performance and experience [24]. Rezvani et al. found that the content of explanations must be limited, requiring appropriately scoped information to optimize system performance [25]. Yan et al. found that conveying the system’s perception of environmental information helps enhance driver situational awareness and trust [26]. Wang et al. developed a system using 3D auditory prompts to present advisory information, improving driver performance and situational awareness [27].

2.3 Measures of Situational Awareness in AV

The dynamic decision-making of human drivers is closely associated with Situational Awareness (SA), encompassing three cognitive levels: perception, comprehension, and projection [28]. Methods for measuring SA in automated vehicles include freeze-frame techniques, post-trial methods, and question probes, which can be employed to understand drivers’ decision-making processes [29].

Simultaneously, the level of trust that drivers place in autonomous driving systems is a critical factor for the safety of autonomous vehicle operation. Miscalibrated trust in the automated system can lead to adverse (even fatal) consequences [30]. In prior research, Muir developed a scale for assessing trust in automation [31], which Du et al. subsequently adapted to the autonomous driving context [32]. However, this scale measures only the driver’s general trust (GT) across the entire drive. Although general trust is paramount, situational trust (ST) is equally crucial in autonomous driving, as it may vary across driving scenarios [33]. Holthausen et al. developed the Situational Trust Scale for Automated Driving (STS-AD) and conducted preliminary validations in the AV context [34]. Both general trust and situational trust are included as measurement indicators in this study.

Research on the assessment of situational awareness in autonomous driving also involves studies on emotional experience measures, such as drivers’ preferences and anxiety as dependent variables. Drivers’ preferences for autonomous vehicles are often investigated when exploring specific technologies or features, which is also applicable to the study of explanations [35, 36]. Additionally, anxiety is another attitude indicator negatively correlated with the effectiveness of explanations [35, 37]. In our study, we concurrently include these subjective measurement indicators to assess drivers’ subjective attitudes throughout the entire driving process under specific explanation conditions.

3 Method

3.1 Participants

In total, 20 participants (8 male, 12 female) aged 22 to 29 (M = 24.7 years, SD = 2.13) were recruited for this experiment. Each participant tested all four HMI conditions (see Sect. 3.3). All participants held valid driver’s licenses and had held them for a mean of 4.7 years (SD = 2.105).

3.2 Apparatus and AD System

A Logitech G29 Driving Force driving simulator was used in this study (see Fig. 1). The physical setup included a steering wheel with input buttons and force feedback, a turn-signal lever, brake and gas pedals, and a shift lever. We created a simulated urban-road environment and scripted several driving takeover scenarios, and used an iPad as the central control screen to display takeover prompts.

Fig. 1. Experiment apparatus

3.3 Experiment Design

HMI Setting.

This study explores the impact of four information presentation strategies on driver takeover in L3 automated vehicles when the autonomous driving system issues a takeover request: no explanation, explaining the reason, explaining the outcome, and explaining both the reason and the outcome. We selected a scenario with a high probability of human intervention: the autonomous vehicle requires human control because hardware sensor failures (e.g., LiDAR or cameras) cause problems in environmental perception or positioning. Based on this scenario, the four HMI strategies were set as follows:

Group I (None): Displays “Take over” without any explanation.

Group II (Why): Displays the takeover icon and explains the reason for the autonomous driving system's error.

Group III (What): Displays the takeover icon and explains the outcome of the autonomous driving system's error.

Group IV (Why + What): Displays the takeover icon and explains both the reason for and the outcome of the autonomous driving system's error (Figs. 2 and 3).

Fig. 2. HMI design with four conditions (examples in perceived failure)

Fig. 3. The display effects of the HMI design on the central control screen

Experiment Environment.

In the simulation environment, an urban road scenario 5000 m long and 50 m wide was constructed. Before a takeover request was issued, the Level 3 (L3) autonomous driving system operated the vehicle at 50 km/h on a four-lane, two-way urban road. As the vehicle approached an intersection, the system initiated a takeover request, and participants had to respond promptly and take control of the vehicle to resume the dynamic driving task. The criterion for a successful takeover was whether the participant followed the navigation route displayed on the central control screen: a takeover was considered successful if the turning direction matched the navigation instruction, and a failure otherwise. The experiment had three possible turning directions: (1) turn left, (2) go straight, and (3) turn right (as shown in Fig. 4). Across the four trials, the direction of the navigation route was selected at random, and the sequence of the four HMI explanation strategies was also randomized to minimize learning effects. This success criterion can be expressed as a simple check, as sketched below.
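As a minimal illustration (the direction labels and function name here are ours, not part of the study's materials), the success criterion reduces to a direct comparison between the instructed and executed turn:

```python
from enum import Enum

class Direction(Enum):
    LEFT = "turn left"
    STRAIGHT = "go straight"
    RIGHT = "turn right"

def takeover_successful(instructed: Direction, executed: Direction) -> bool:
    """A takeover counts as successful iff the driver's turn matches the navigation instruction."""
    return instructed == executed

# Example: the navigation showed "turn left" and the driver turned left.
print(takeover_successful(Direction.LEFT, Direction.LEFT))  # True
```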

Fig. 4. Steering task for determining successful takeover

Measurements.

The Situation Awareness Rating Technique (SART), developed by Taylor in 1990, is a self-rating technique that can be administered easily and quickly [38]. The method rates 10 dimensions, from low to high, to calculate the SA value; these 10 dimensions can be grouped into three main parts, commonly called the 3D SART: Demand (demands on attentional resources), Supply (supply of attentional resources), and Understanding (understanding of the situation). Participants rate each dimension on a scale from 1 (low) to 7 (high). The overall score is then calculated using the formula SA = U − (D − S).
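As a minimal sketch (the function name and example ratings are hypothetical, not data from this study), the 3D SART score is computed as follows:

```python
def sart_3d(understanding: float, demand: float, supply: float) -> float:
    """3D SART score: SA = U - (D - S), each dimension rated from 1 (low) to 7 (high)."""
    return understanding - (demand - supply)

# Hypothetical ratings: Understanding = 6, Demand = 5, Supply = 4
print(sart_3d(understanding=6, demand=5, supply=4))  # SA = 6 - (5 - 4) = 5
```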

3.4 Procedure

The whole study was mainly divided into four sessions, including pre-test questionnaires, a training session, a formal test drive, and post-drive questionnaires and interviews (see Fig. 5).

Fig. 5. Experiment procedure.

Introduction and Pre-test Questionnaire.

After an initial introduction, participants gave informed consent upon arrival and completed a demographic questionnaire. Before being guided to the driving simulator, each participant received a brief explanation of the AD system condition and the HMI settings on the screen.

Training Session.

In this phase, participants were acquainted with the simulator's controls, the Human-Machine Interface (HMI) elements, and the Automated Driving (AD) system settings. To align their understanding with the objectives and tasks of the forthcoming formal test drive, participants drove the same route in the simulator; however, the training session did not involve any actual tasks. Each participant was given approximately five minutes to become proficient at operating the simulator.

Formal Test Drive.

During the formal experiment, participants completed four distinct simulated driving tasks, each involving a different type of interpretable information, within a Level 3 (L3) autonomous driving simulator. The experiment was structured into three stages:

L3 Autonomous Driving Phase (30–60 s): Here, participants were involved in non-driving related tasks while the vehicle operated under L3 autonomy.

Takeover Request (100–150 m before an intersection): In this stage, the simulator replicated a scenario where the autonomous vehicle failed to detect lane lines, necessitating an urgent manual takeover by the participant. The vehicle's display provided interpretable information alongside auxiliary decision-making data to assist in this process.

Manual Takeover and Navigation Stage: Participants were required to manually take control of the vehicle, using the provided interpretable and decision-making information to navigate the planned route.

Across the four simulation trials, the location and timing of the takeover event varied slightly. Additionally, the sequence of presentation for interpretable information, auxiliary decision-making data, and the planned route was randomized. After each trial, participants took a brief rest of 1–2 min. Meanwhile, the experimenter documented and saved the data, and prepared the simulator for the next scenario.

Post-test Questionnaires and Interviews.

After the formal test drive, participants filled out questionnaires on situation awareness. We then conducted interviews to explore how they made decisions and how they felt in the different scenarios. Finally, participants were given gifts as compensation and dismissed. The whole study lasted about 30 min.

4 Results

We obtained a total of 20 sets of valid experimental data, including 12 sets from drivers with extensive driving experience and 8 sets from drivers with limited experience; 12 sets were from female drivers, and 8 sets were from male drivers. The analysis that follows will focus on two dimensions: situational awareness and takeover performance.

4.1 Driver Situational Awareness

The comparison of participants’ SA scores under the different explanation methods is shown in Fig. 6. Shapiro-Wilk tests indicated that the data for all groups were normally distributed (p = 0.686, 0.478, 0.486, and 0.381, all > 0.05). The differences in SA scores among the explanation methods were statistically significant (one-way ANOVA, F = 6.557, p = 0.001). Post-hoc tests revealed that the SA score for “Why” was 3.750 points higher than for “None” (95% CI: 0.96–6.54), a statistically significant difference (p = 0.009); “What” was 5.050 points higher than “None” (95% CI: 2.26–7.84), also significant (p = 0.001); and “Why + What” was 5.650 points higher than “None” (95% CI: 0.96–6.54), also significant (p < 0.001).
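For reference, this analysis pipeline can be sketched as follows. The score arrays below are placeholders, not the study's data, and since the text does not name the post-hoc procedure, Tukey's HSD from statsmodels is used here as one common choice:

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Placeholder SA scores for each explanation condition (20 participants each);
# in the actual analysis these would be the per-condition SART scores.
rng = np.random.default_rng(0)
scores = {
    "None":       rng.normal(12, 4, 20),
    "Why":        rng.normal(16, 4, 20),
    "What":       rng.normal(17, 4, 20),
    "Why + What": rng.normal(18, 4, 20),
}

# 1) Shapiro-Wilk normality test for each condition
for name, x in scores.items():
    w, p = stats.shapiro(x)
    print(f"Shapiro-Wilk {name}: W = {w:.3f}, p = {p:.3f}")

# 2) One-way ANOVA across the four conditions
f, p = stats.f_oneway(*scores.values())
print(f"ANOVA: F = {f:.3f}, p = {p:.3f}")

# 3) Post-hoc pairwise comparisons (Tukey HSD)
values = np.concatenate(list(scores.values()))
groups = np.repeat(list(scores.keys()), 20)
print(pairwise_tukeyhsd(values, groups))
```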

Figure 7 shows the distribution of SA scores among participants with different levels of driving experience under the various explanation methods. There were no significant differences in SA scores between experience levels for any of the four explanation methods (independent-samples t-tests, p = 0.494, 0.413, 0.400, and 0.345, respectively). Among participants with extensive driving experience, SA scores did not differ significantly across the explanation methods (one-way ANOVA, F = 2.160, p = 0.112). Among participants with limited driving experience, however, SA scores differed significantly across explanation methods (one-way ANOVA, F = 4.398, p = 0.009).

Figure 8 presents the distribution of SA scores by gender under the various explanation methods. There were no significant gender differences under the “None,” “What,” and “Why + What” explanation methods (independent-samples t-tests, p = 0.739, 0.555, and 0.667, respectively). Under the “Why” explanation method, however, a significant difference was observed, with males scoring higher (M = 19.13, SD = 3.00) than females (M = 15.00, SD = 4.65; t = 2.210, p = 0.04).

Fig. 6. Distribution of SA overall scores for four HMI setting types

Fig. 7. Distribution of SA overall scores for different driving experiences

Fig. 8. Distribution of SA overall scores for different genders

4.2 Driver Takeover Performance

This paper assesses drivers’ takeover performance by determining whether the driver successfully took over control. The criterion for successful takeover is whether participants turned according to the navigation route displayed on the central control screen. A takeover is considered successful if the turning direction matches the navigation direction; otherwise, it is deemed a failure (Table 1).

Table 1. Chi-square test of takeover performance overall scores for four HMI setting types

As shown in the table above, a chi-square test of the relationship between explanation scheme and takeover performance indicates that the schemes significantly affect performance (χ² = 24.689, p < 0.001). A comparison of the percentages reveals that the “Why + What” scheme achieved a 90.00% success rate on the takeover driving task, well above the overall average of 53.75%. The “What” scheme achieved a 70.00% success rate, also above the average. The success rates for the “None” and “Why” schemes were 20% and 35%, respectively, both below the average.
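The reported statistic can be reproduced from the success counts implied by the percentages (20 trials per condition, i.e., 4, 7, 14, and 18 successes); a minimal sketch using SciPy:

```python
from scipy.stats import chi2_contingency

# Success/failure counts per scheme, reconstructed from the reported
# success rates over 20 trials each: None 20%, Why 35%, What 70%, Why+What 90%.
table = [
    [4, 16],   # None
    [7, 13],   # Why
    [14, 6],   # What
    [18, 2],   # Why + What
]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.3f}, dof = {dof}, p = {p:.4f}")  # chi2 ≈ 24.689, dof = 3
```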

5 Discussion

In this study, we investigated the impact of four explanation options on drivers’ takeover: no explanation (None), explanation of reasons (Why), explanation of results (What), and explanation of both reasons and results (Why + What), presented when the autonomous driving system in an L3 automated vehicle issues a takeover request. Combining the results above with subjective statements from the interviews, we discuss implications for presenting explanations in driving takeover situations. The limitations of this study and prospects for future work are also clarified in this section.

5.1 The Impact of Interpretable Information Types

The four explanation types—no explanation (None), explanation of reasons (Why), explanation of results (What), and explanation of both reasons and results (Why + What)—differed significantly in their impact on drivers’ situational awareness and takeover performance.

Situational awareness encompasses the driver’s ability to perceive and understand the surrounding environment, its dynamic changes, and its future states. Research indicates that a high level of situational awareness can significantly improve the efficiency and safety of drivers taking over from autonomous driving systems [39]. However, long-term reliance on automation may reduce drivers’ situational awareness, diminishing their attention and alertness to the surrounding environment. Without any explanation when the system issues a takeover request, drivers cannot promptly assess the vehicle’s status or formulate action plans. The no-explanation condition (None), serving as the baseline, reveals the challenges drivers face when taking over without sufficient information.

The low takeover performance and situational awareness scores in this scenario emphasize the importance of information presentation for drivers to make quick and accurate decisions in emergency takeover scenarios. Explaining the reasons for an autonomous driving system's failure (Why) enables drivers to understand the root cause of the failure and why the car is requesting takeover, significantly improving situational awareness and takeover performance over providing no explanation.

Explaining the results of an autonomous driving failure (What) directly conveys the capability deficits caused by the vehicle’s malfunction. Drivers can use these deficits to predict and understand the vehicle’s status and the surrounding road environment, significantly improving situational awareness and takeover performance compared with no explanation. This study posits that directly explaining the results (What) better assists drivers in assessing the vehicle’s status, without requiring them to deduce the limitations from the root cause of the failure. In terms of takeover performance, the success rate for explaining the results (What) was significantly higher than for explaining the reasons (Why).

Finally, explaining both the reasons and results (Why + What) clearly offers the best outcomes in terms of situational awareness and takeover performance. However, this study suggests that the cognitive load for drivers with (Why + What) is the highest. In more complex and urgent scenarios, the performance of the explanation scheme that includes both reasons and results (Why + What) warrants further in-depth investigation.

5.2 The Impact of Driving Experience and Gender

Experienced drivers showed no significant differences in situational awareness across various explanation methods, suggesting that adaptability or familiarity with driving tasks may reduce the need for detailed explanations. In contrast, drivers with less experience exhibited significant differences in situational awareness based on the explanation method, indicating that the type and depth of information provided can be crucial in supporting their understanding and performance.

In the “Why” explanation method, gender differences were particularly pronounced, with male drivers scoring higher in situational awareness than female drivers. This finding suggests that men and women may process and utilize explanatory information differently, possibly due to differences in information processing or cognitive preferences. Of course, more experiments are needed to further investigate this.

5.3 Limitations and Future Work

In this section, we note some limitations to better interpret our findings. First, most participants recruited in this study were university students, so our findings are based on a relatively young and less experienced sample. For more representative results, further experiments should be conducted with participants from a more general driving population.

Second, the experiment was conducted in a fixed-base driving simulator with light traffic. Although most participants rated its immersion and physical fidelity highly, some reported being less cautious because they faced no real danger in the simulator, and the lack of physical motion cues might have influenced their awareness in lane-changing scenarios. Future studies should involve denser traffic.

Finally, we assessed drivers’ takeover performance solely based on whether they successfully took over control. In the future, we will employ more measurement methods to evaluate drivers’ takeover performance, such as gaze movement, takeover time, and the number of gazes.

6 Conclusion

This research highlights the importance of effective information presentation in the human-machine interfaces of autonomous vehicles, particularly in L3 systems. The study indicates that simply displaying a takeover request with icons and text is insufficient; it is also necessary to explain the reasons for, and outcomes of, errors in the autonomous driving system. Compared with explaining the reason (Why), explaining the outcome (What) does not significantly affect situational awareness recovery but significantly enhances takeover performance. Takeover performance was best when both reasons and outcomes (Why + What) were explained, although the associated cognitive load warrants further study. Our findings also show that gender plays a role in the effectiveness of these strategies, with “Why” explanations being more beneficial for male drivers. This research is relevant not only for designing human-machine interaction interfaces for L3 autonomous vehicle takeovers but also underscores the necessity of tailoring information delivery to driver-specific characteristics for optimal performance and safety.