1 Introduction

With the introduction of automated vehicles, mobility could be improved in terms of traffic congestion, traffic flow and overall comfort for the vehicle user (Montgomery 2018). Partly automated driving functions are one step towards this future of fully automated vehicles. These functions are able to perform parts of the dynamic driving task, but the vehicle user still serves as a fallback. The user has to continuously monitor the vehicle functions and the surrounding traffic environment. With rising automation capabilities, the driving related tasks of the vehicle user will decrease and the role will change from that of a driver to that of a passenger (van den Beukel 2016). According to SAE Level 4 (High Driving Automation), highly automated vehicles (AVs) perform all parts of the dynamic driving task and bear the responsibility for a safe and hazard-free trip within so-called Operational Design Domains (ODD) (SAE 2021). There is an ongoing discussion about whether AVs can ever be integrated into general road traffic (Tabone et al. 2021). A particularly challenging environment is urban traffic, in which an AV will be surrounded by various traffic participants such as other AVs, manually driven vehicles, pedestrians and cyclists. Even though urban traffic is generally organized by formal rules, there are situations in which communication needs to take place in order to resolve the situation and maintain traffic flow. Examples are deadlock situations at unmarked intersections or bottleneck scenarios. Traffic participants might use headlights or gestures, or adapt their driving behavior by accelerating or decelerating in advance, to show their intention and resolve these situations (Färber 2016; Miller et al. 2022; Rettenmaier et al. 2021). The AV would have to register these signals from other road users, derive meaning from them and integrate them into its own driving behavior.
It is foreseeable that an AV will not be capable of solving all urban situations, either because of lacking sensor information or because of problems with identifying the meaning of communication in such situations. Problems might arise not only from a failure to register and process communication, but also from complex situations in which the general rules of right of way cannot be applied, for example intersections with construction work or blocked lanes. In all of these situations, an AV might be unable to process the situation appropriately. As the AV user is not obliged to take over responsibility, the AV would have to stop driving and execute what is called a Minimal Risk Maneuver (MRM) to achieve a Minimal Risk Condition (MRC), in which the vehicle is in a stopped and safe condition (Aptiv et al. 2019; SAE 2021). In a possible traffic scenario, this would mean that the AV halts, is not able to continue the drive and subsequently initiates an MRM by diverting onto a sidewalk. This maneuver effectively stops the automated drive. Consequently, the user would need to maneuver the vehicle back onto the street, which creates inconvenience and could disrupt traffic flow.

A possible solution to these problems is the involvement of a human agent to resolve sensor or processing failures of an AV. One approach is involvement via teleoperation, in which an external agent acts from a control room that is physically remote from the situation. This approach has been investigated (Gnatzig et al. 2012; Kettwich et al. 2021). Even though this solution is pursued by some, issues such as infrastructure availability and latency could be a problem (Neumeier et al. 2019; Zhang 2020). Another solution would be the involvement of the human user in the AV. The user of the AV is present in the situation, receives signals from other traffic participants and has access to further environmental information. In these situations, the user is able to provide the AV with the necessary input to bridge the system limits of the AV and to directly continue the drive. In a deadlock situation, for example, the user could cooperate with the vehicle by assessing whether the situation is safe and then giving the AV affirmation to continue, whereupon the AV would resume its prior maneuver. Furthermore, the user could temporarily take on the full driving task until the AV has full control again. In both approaches, the vehicle user and the AV would work cooperatively to continue the drive and bridge the system limits to overcome the boundary of the ODD. This cooperation process would ensure a direct continuation of the automated drive and prevent the AV from performing an MRM, which would effectively stop the automated drive. However, it is unclear how this cooperation process should be implemented.

The two examples described above shift different parts of the dynamic driving task between the AV and the user: more parts when the user performs the whole driving task, fewer parts when the user only gives affirmation to perform a maneuver.

Which solution is best depends on what users prefer. Users might want to be involved as little as possible, seeing the cooperation process as a disturbance and trusting the system to be capable of handling the situation while feeling safe and comfortable. Alternatively, they might want to be in total control during the process, not trusting the AV to be capable of handling the scenario and feeling safer when handling the situation themselves. Users could also prefer a middle way in which they are more involved in the cooperation process but do not have to take on all parts of the driving task. How much the user wants to be involved in this process is not clear and needs research.

In this paper, we introduce four different cooperation strategies to bridge automation limits in inner-city traffic and examine them in a driving simulator. The cooperation strategies differ in how the parts of the driving task are distributed between AV and user, ranging from pressing a single button to driving completely manually. The four cooperation strategies are evaluated against several criteria in order to examine the question of how much vehicle users want to be involved in a cooperation to overcome automation limits. In general, it is examined how many parts of the driving task the user is willing to perform when bridging system limits.

2 Related work

Hoc (2001) describes cooperation as two agents being in a cooperative situation, both having individual goals and being capable of interfering with the other in terms of goals, resources or procedures. Further, each one tries to manage the interference to facilitate the individual activities and/or the common task. Additionally, Hoc (2001) describes that the individual activities of the agents are not independent and are facilitated by cooperation. This cooperation can be between two humans or between a human and a machine. For automated driving, this cooperation can be found in so-called cooperative human–machine interfaces. These interfaces have already been studied in partially automated driving, mainly focusing on the topic of situation awareness. Under the term “cooperative control”, Flemisch et al. (2014) investigated the concept of horse-mode (H-Mode), in which the driver takes over the vehicle control jointly with the automation (Flemisch et al. 2003). By means of a joystick with haptic feedback, the driver controls the vehicle together with the automation. The metaphor H-Mode refers to riding a horse that has its own will, perceives the environment independently and navigates within it. However, it is still under the agency of the rider. Regarding driving the vehicle, this metaphor can also be applied to the degree of involvement. In “loose rein” mode, the automation would contribute more strongly to the driving task, and in “tight rein” mode, it would contribute more weakly (Flemisch et al. 2003). Regardless of the mode, the vehicle user is constantly involved in controlling the driving task.

Another concept that corresponds to “cooperative control” is called “conduct-by-wire”. In this concept, the vehicle user chooses among several maneuver commands at regular intervals during an automated drive (Flemisch et al. 2014). When driving on the highway, the user could choose whether to initiate a lane change or whether the vehicle should maintain the current maneuver (Kauer et al. 2010). What differs from the H-Mode concept is the level at which cooperation takes place during the driving task. According to Michon (1985), the driving task can be divided into three levels: strategic (planning), tactical (maneuvering) and operational (control). In the conduct-by-wire concept, cooperation only occurs at the strategic and tactical levels, in the selection of maneuvers. Control in H-Mode is also carried out at the operational level. Both approaches pursue the idea that the vehicle user continuously participates in the driving task. This could have the advantage that situation awareness can be maintained through continuous involvement, so that the so-called “out-of-the-loop” problem could be prevented (Endsley and Kiris 1995). However, continuous involvement is directly opposed to the promises of automation, most prominently that of spending time on anything other than the driving task. Therefore, cooperative systems that involve the user must shift from continuous to temporary involvement.

Walch et al. (2019) describe another cooperative approach, which focuses on temporary involvement. In their concept, the vehicle user is only involved when the automated system reaches its system limits. The authors distinguish the concept of cooperation from that of shared control. They differentiate the two based on the duration of the interaction between human and machine and how long the human is kept in the “control loop” (Walch et al. 2019). Interaction concepts that imply continuous involvement are grouped under shared control, and those with temporary involvement are classified under the term cooperation. More specifically, they state that shared control can be carried out at all levels of the driving task (strategic, tactical, operational), whereas cooperation is not carried out at the operational level. In a driving simulator study, Walch et al. (2016) investigated such a cooperation system, which serves to bridge automation system boundaries. During a highly automated drive on a country road, participants experienced a situation in which a vehicle had broken down after an intersection. The system informed the participants about the situation, and they were able to choose how the situation should be managed based on two to three suggestions. The participants were able to choose whether they wanted to enter a junction, to overtake the vehicle via the oncoming lane or to take over manually. The number of possible maneuvers, the complexity of the situation, the interaction with the system and the legal feasibility of the maneuver were varied. The results show that participants preferred to have the driving maneuver carried out by the automation. Furthermore, longer interaction times were found in the language modality, with a larger selection of maneuvers and when there were no restrictions on legal navigability.

All of the presented studies on cooperative human–machine interfaces involve the human user in the driving task to prevent automation failures. They use different strategies, such as continuously involving the user to prevent the loss of situation awareness. These cooperative human–machine interfaces primarily focus on scenarios where the vehicle is in motion. However, current research lacks depth when it comes to stationary situations of Level 4 AVs in urban traffic. In urban traffic, the vehicle would halt and analyze the situation before opting to perform an MRM. During this period, the vehicle is stationary and the urgency for human intervention is reduced. While the definition of cooperation by Walch et al. (2016) provides a starting point by suggesting temporary user cooperation at the tactical and strategic levels, it overlooks the operational level, leaving uncertainty about the full range of user preferences for involvement. Do users want minimal involvement to ensure comfort, or do they prefer more involvement to feel safer during the cooperation process? This aspect of user involvement has not yet been sufficiently explored. Particularly for future Level 4 vehicles in urban traffic, examining user involvement to bridge system limits is crucial for understanding user needs. Investigating this aspect could lay the necessary groundwork for the future design of AV technology. Moreover, adopting a cooperative approach may be crucial for the successful deployment of Level 4 vehicles in cities, potentially leading to benefits such as improved road traffic safety, more efficient traffic flow, and enhanced comfort.

3 Research question

In this paper, we examine the question of to what extent the user of automation is willing to cooperate with an AV to bridge system limits in inner-city traffic scenarios, i.e., how much of the driving task the user of the AV is willing to perform during the cooperation. The following research question is therefore formulated:

R1: How much do users of an AV want to be involved in a cooperation with the AV to overcome automation limits?

To answer this question, a driving simulator study was conducted in which participants experienced multiple drives with an AV in inner-city traffic. During each of the drives, five traffic scenarios occurred in which the AV reached its system limits. The traffic scenarios depicted situations in which future AVs might have problems detecting and understanding the communication and behavior of other traffic participants. All these traffic scenarios were either unclear for an AV or had no clear yielding rules, so that the user had to be involved to solve the situation. The traffic scenarios were not time critical, but required action to continue the drive. In four drives, participants experienced four cooperation strategies, one per drive, to bridge the system limits in these five traffic scenarios. The cooperation strategies differed in terms of how much of the driving task the vehicle user needs to perform during the cooperation in the given traffic scenarios. The more parts of the driving task the user performs, the higher the involvement of the user.

In the present study, the research question is answered by evaluating the different cooperation strategies, with their different degrees of involvement, on several dimensions of interest. These dimensions are introduced in the following. By examining these dimensions, we hope to provide a more comprehensive understanding of the cooperation process and of the factors that might play a role during the cooperation.

In general, we see comfort as one important dimension for evaluating highly automated driving. Automated driving according to SAE Level 4 enables a gain in comfort, as the user is no longer involved in the driving task and can devote the journey time to non-driving related tasks. The cooperation strategies bridge system limits of a Level 4 vehicle, but this comes with a possible interruption of the automated drive. The temporary interruption of the automated journey and the degree of involvement during the cooperation process could have a negative impact on the comfort of using the automation.

When involvement is required, users are interrupted in performing the non-driving related task, for example playing games. This specific aspect is addressed by the dimension discomfort. Less involvement could lead to less discomfort when dealing with a traffic scenario that requires involvement.

The mental demand of using the cooperation strategies is of further interest. Depending on the cooperation strategy, more or fewer parts of the driving task have to be performed by the user. Even if the usability does not differ between the strategies, the mental demand should differ between the strategies due to the number of parts of the driving task to be performed. However, the extent to which this is the case is of interest and should be investigated.

As the cooperation strategies are implemented by using different HMI elements, one further dimension of interest is the usability of these elements, because it cannot be ruled out that negative usability ratings of an HMI element have negative effects on the general evaluation of the cooperation strategies.

Furthermore, automation should in general be perceived as safe so that it is actually used by users. If the perceived safety is too low, this would possibly prevent the use of automation (Xu et al. 2018). In this case, the user might prefer to drive manually. Also, the perception of safety during the cooperation process presumably plays an important role; if the cooperation process is perceived as unsafe, involvement is likely to be rejected.

Since trust can be seen as a key factor for the use of automation in general (Choi & Ji 2015; Lee & See 2004), trust in automation might also play a role when evaluating the cooperation strategies of user and AV in the frame of the present study.

Another dimension of interest is the personal benefit during the cooperation process. If users experience that the cooperation with the AV provides a personal benefit, a higher willingness to cooperate is more likely (Pastoors & Ebert 2019). If users see no benefit, they might completely reject the thought of being involved.

Lastly, the users’ personal preference among the cooperation strategies should give us more direct insight into the question of how much users want to be involved and may reveal differences that the other dimensions do not capture.

Addressing these dimensions, we hypothesize that comfort will be rated higher when users are less involved during the cooperation. Users have to perform fewer parts of the driving task, resulting in higher perceived comfort because they have to do less during the cooperation.

H1: Cooperation strategies with higher involvement of the user will be rated less comfortable than cooperation strategies with lower involvement of the user.

We further hypothesize that discomfort increases with involvement, mirroring the relationship expected for comfort. The more users have to do during the cooperation, the more discomfort the cooperation will cause.

H2: Cooperation strategies with higher involvement of the user will be rated higher for discomfort than cooperation strategies with lower involvement of the user.

Further, we hypothesize that the more users are involved in the cooperation, the higher the mental demand is. The higher involvement comes with performing more parts of the driving task. Therefore, users have to do more in terms of perceiving, thinking, and deciding during the cooperation, resulting in a higher mental demand.

H3: Cooperation strategies with higher involvement of the user will be rated higher for mental demand than cooperation strategies with lower involvement of the user.

As usability serves as a control dimension for a potential impact of the HMI elements, it is not addressed with a specific hypothesis.

For perceived safety we hypothesize that this dimension will be rated higher when users are more involved in the cooperation due to the users having more control during the cooperation.

H4: Cooperation strategies with higher involvement of the user will be rated higher in terms of perceived safety than cooperation strategies with lower involvement.

While we do not formulate a specific hypothesis regarding trust in automation, it is important to acknowledge its potential impact. Significant differences in trust across the four cooperation strategies could highlight critical weaknesses of a given degree of involvement. If trust is rated lower for one strategy than for another, this strategy, and thus its level of involvement, might be inappropriate for users.

Further, we hypothesize that the personal benefit experienced during cooperation will be rated higher when users are less involved during the cooperation.

H5: Cooperation strategies with lower involvement of the user will be rated higher for personal benefit than cooperation strategies with higher involvement of the user.

Regarding personal preference, we also do not formulate a hypothesis, because the hypotheses defined above point in different directions for personal preference. Higher involvement would lead to higher perceived safety, but also to lower perceived comfort, higher discomfort and lower personal benefit. Even though we do not formulate a hypothesis, we think that this dimension might reveal how important users consider each of the examined dimensions.

4 Methodology

4.1 Study environment

The experiment was conducted in the static driving simulator at the Würzburg Institute for Traffic Sciences (WIVW). The vehicle is identical to a production-type Opel Insignia. In order to simulate a realistic steering torque, a SensoDrive steering motor is integrated. Five projectors are installed to provide a 300° screen image and thus nearly all-round visibility. The exterior and interior mirrors are implemented as LCD displays. An operating room is used to control the system. From there, the test driver can be observed via a video system, and contact with the test supervisor is maintained via an intercom system. The simulator is run by the WIVW driving simulation software SILAB, which allows driving scenarios to be designed or customized for research questions as needed. An automation display was attached in the center of the vehicle’s console. The display showed a 2D representation of the environment, called World in Miniature (WIM) due to its low-fidelity depiction of the driving environment. In the experiment, the display was used to present AV sensor information. The WIM displayed the position of the AV, other vehicles, pedestrians, cyclists, motorcyclists, roads and buildings. Furthermore, general information was positioned at the top of the display, including current speed, system status, and navigation information. In this study, two variants of the WIM were introduced as a between-subjects factor. The variants differed in the amount of information shown during the cooperation process. However, this aspect is out of scope for this paper. Possible influences of the variants were checked statistically by performing mixed ANOVAs that included the display variant as a factor. No significant interaction effects and no significant main effects of this factor were detected. Possible influences of the variants on the strategies were therefore ruled out.

4.2 Traffic scenarios

In total, the participants experienced five traffic scenarios, in which the automation reached its system limits. The scenarios were chosen as examples for traffic scenarios, in which the automation might have problems detecting signals of other traffic participants or has problems analyzing the situation due to the behavior of other traffic participants. The scenarios were based on realistic situations, but not on their actual frequency in real traffic. In every scenario, the system asked the vehicle user for cooperation by means of a display message on the automation screen and an auditory signal. All traffic scenarios required the involvement of the user to continue the drive. The traffic scenarios were not designed to be time critical and the participants did not have to stop or react to any sudden hazards during the ongoing cooperation. After the system limits were bridged by cooperation, the AV took full control of the driving again. The five traffic scenarios were always presented in a fixed order and were embedded in an inner-city track with a parking lot as start and finish point. The drive took about 15 min in total.

In the following, the traffic scenarios are listed, ordered by occurrence in the drive. Images of the traffic scenarios are shown in Fig. 1.

Fig. 1 Traffic scenarios in the driving simulator from the user's point of view

4.2.1 PED_NOTCROSS

The AV approaches a pedestrian crossing with a pedestrian standing on the left curb. The AV stops, intending to let the pedestrian cross the street. The pedestrian performs a hand gesture, indicating that he does not want to cross the street. The AV is not able to recognize the gesture and asks for cooperation.

4.2.2 CYC_PHONE

A cyclist approaches from the right arm of an intersection having the right of way. The AV stops. The cyclist slows down, coming to a stop at the intersection close to the curb but still on the road. The cyclist is standing and looking at her phone. The AV does not know if the cyclist is still relevant in this situation and asks for cooperation.

4.2.3 MV_CROSSING

The AV approaches a vehicle that is stopped on the same lane in front of a zebra crossing. The AV comes to a stop and waits for the vehicle to continue the drive. The driver of the stopped car performs a waving gesture with an outstretched arm, indicating that the AV should overtake. The AV does not understand the gesture and asks for cooperation.

4.2.4 MV_BOTTLENECK

The AV approaches a bottleneck with an oncoming vehicle and stops. The oncoming vehicle flashes its headlights, indicating that it will yield and the AV should drive. The AV is not able to make a decision and asks for cooperation.

4.2.5 MV_BUS

The AV approaches a T-intersection, where a vehicle on the oncoming lane blocks the way. A bus having the right of way arrives from the right side of the intersection. The AV stops to yield to the right of way of the bus. The bus wants to turn left in direction of the AV, but cannot, because the standing vehicle on the oncoming lane is blocking the pathway. The bus driver is performing a hand gesture, indicating for the AV to move along so that he in turn can overtake the standing vehicle. The AV is not able to perceive the gesture and asks for cooperation.

4.3 Cooperation strategies

In the experiment, four cooperation strategies were investigated that differ in the involvement of the user during the cooperation. The involvement was varied through the allocation of the driving task between the AV and the user. Higher involvement means the user has to perform more parts of the driving task. Conversely, less involvement means the user has to perform fewer parts of the driving task. The parts of the driving task that were shifted between user and automation in this experiment are lateral and longitudinal movement, monitoring the situation, and decision-making in the form of assessing the situation. The cooperation strategies were developed against the background of shifting parts of the driving task between the user and the automation. For the concrete design of the strategies, we chose to incorporate existing HMI elements that were already present in the car’s interior, namely the gear shift and the electric handbrake. These HMI elements are accessible and usable. By repurposing these familiar elements, we avoid developing new HMI components that may not be as usable for the users. We try to control for a possible detrimental effect of the repurposing by assessing the usability of the cooperation strategies in the experiment.

In the following, the four cooperation strategies are presented, ranging from the strategy with the least involvement to the strategy with the most involvement. Table 1 lists the four cooperation strategies with their HMI elements, a picture and the user's involvement.

Table 1 Overview of cooperative strategies, ordered from least to most involvement of the user from left to right

The first strategy is called ACKNOWLEDGEMENT. In this strategy, the user needs to assess the situation and decide whether it is safe to continue. To confirm the continuation, the user has to press a button on a touchscreen where the automation view is displayed. The button features a forward arrow and is positioned right next to the text asking for cooperation. The button has to be pressed once to confirm the continuation. Users do not have to keep pressing the button. Lateral and longitudinal movement as well as monitoring the environment during the situation are handled by the AV.

With the MONITORING strategy, the user needs to assess the situation and monitor the environment during the cooperation. The user operates the built-in electric handbrake button to continue the drive. The system accelerates to 15 km/h and keeps this speed as long as the user actuates the adapted handbrake button. If the button is released, the vehicle stops. The lateral and longitudinal movement are controlled by the AV.

In the Longitudinal Guidance (L.GUIDANCE) strategy, the user has to assess the traffic scenario, monitor the environment, and control the longitudinal movement. For this, an adapted gear selector which is configured as a 1-Degree-of-Freedom (DOF) lever is used. The lever controls the longitudinal velocity of the vehicle. When the lever is pushed forward, the speed increases up to a value of 30 km/h. A medium activation allows the vehicle to drive at a lower speed. The users can thus set their own speed. When the lever is released, it is pushed back into its original position by an integrated spring and the speed is reduced to 0 km/h. Lateral movement is controlled by the AV.
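The speed behavior of the MONITORING and L.GUIDANCE strategies can be sketched as two small functions. This is only an illustration and not the simulator implementation: the function names, the normalized lever deflection between 0 and 1, and the linear mapping from deflection to speed are assumptions. Only the 15 km/h and 30 km/h values and the stop-on-release behavior are taken from the descriptions above.

```python
# Values taken from the strategy descriptions.
MONITORING_SPEED_KMH = 15.0
L_GUIDANCE_MAX_SPEED_KMH = 30.0


def monitoring_target_speed(button_held: bool) -> float:
    """Hold-to-run: 15 km/h while the handbrake button is actuated, else stop."""
    return MONITORING_SPEED_KMH if button_held else 0.0


def l_guidance_target_speed(deflection: float) -> float:
    """Map lever deflection to a target speed.

    Assumption: deflection is normalized to [0, 1] and mapped linearly.
    A released lever springs back to 0, which yields 0 km/h.
    """
    clamped = min(max(deflection, 0.0), 1.0)
    return clamped * L_GUIDANCE_MAX_SPEED_KMH
```

Under these assumptions, a half-deflected lever in L.GUIDANCE yields the same target speed as a held button in MONITORING; the difference is that the user can choose any speed between 0 and 30 km/h.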

With the strategy MANUAL, the user takes over the complete driving task. This means the user is fully responsible for assessing the situation, monitoring the environment, and controlling the longitudinal and lateral movement. The user operates the steering wheel as well as the accelerator and brake pedals to control the vehicle dynamics.
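The task allocation of the four strategies can be summarized in a small data structure. The following is a sketch for illustration only: the class and function names are hypothetical, and counting the user's tasks is an assumed, purely ordinal way to express involvement, not a measure used in the study.

```python
from dataclasses import dataclass

# The four parts of the driving task that are shifted between user and AV.
ASSESS = "assess situation"
MONITOR = "monitor environment"
LONGITUDINAL = "longitudinal control"
LATERAL = "lateral control"
ALL_TASKS = frozenset({ASSESS, MONITOR, LONGITUDINAL, LATERAL})


@dataclass(frozen=True)
class Strategy:
    name: str
    hmi_element: str
    user_tasks: frozenset


# Ordered from least to most involvement, as in the text.
STRATEGIES = [
    Strategy("ACKNOWLEDGEMENT", "touchscreen button", frozenset({ASSESS})),
    Strategy("MONITORING", "electric handbrake button",
             frozenset({ASSESS, MONITOR})),
    Strategy("L.GUIDANCE", "1-DOF gear selector lever",
             frozenset({ASSESS, MONITOR, LONGITUDINAL})),
    Strategy("MANUAL", "steering wheel and pedals", ALL_TASKS),
]


def av_tasks(strategy: Strategy) -> frozenset:
    """Tasks that remain with the AV during the cooperation."""
    return ALL_TASKS - strategy.user_tasks


def involvement(strategy: Strategy) -> int:
    """Assumed ordinal involvement score: number of tasks the user performs."""
    return len(strategy.user_tasks)
```

Read this way, each strategy adds exactly one part of the driving task to the user's side compared with the previous one, which is what orders the strategies from least to most involvement.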

4.4 Non-driving related task

In the experimental drive, participants were instructed to perform a non-driving related task by using a tablet with preinstalled games except during the traffic scenarios. The games were “Block Puzzle”, “openSudoku”, “Memory HD” and “Okay?” and are openly accessible. The participants were instructed to play the games, as it is quite likely that non-driving related activities will take place during highly automated driving. Furthermore, all participants should experience the traffic scenarios similarly, having to relocate their attention to analyze the situation. Throughout the experiment, participants were monitored to make sure the tablet was used when not in the cooperation process. If participants stopped using the tablet, they were reminded to resume playing the games.

4.5 Variables and experimental design

In this experiment, the cooperation strategy was the subject of the evaluation and served as the independent variable with four levels. The evaluation criteria comfort, discomfort, mental demand, usability, perceived safety, trust in automation, personal benefit, and personal preference served as dependent variables.

As the participants should be able to compare the four strategies and derive a personal preference from this comparison, we chose a within-subject design with four measurement times corresponding to the four strategies. Every participant experienced each of the four cooperation strategies in a randomized order in four separate drives. Each drive contained the five traffic scenarios in a fixed order (as described above).
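The resulting session structure can be sketched as follows. The randomization procedure is not specified in the text, so the simple per-participant shuffle, the seeding scheme and the function name `build_schedule` are assumptions made for illustration; a counterbalanced scheme would be an equally plausible reading.

```python
import random

STRATEGIES = ("ACKNOWLEDGEMENT", "MONITORING", "L.GUIDANCE", "MANUAL")
# Fixed scenario order, as listed in Section 4.2.
SCENARIOS = ("PED_NOTCROSS", "CYC_PHONE", "MV_CROSSING",
             "MV_BOTTLENECK", "MV_BUS")


def build_schedule(participant_id: int, base_seed: int = 0):
    """Return four drives: randomized strategy order, fixed scenario order.

    Assumption: an independent shuffle per participant, seeded so that the
    schedule is reproducible.
    """
    rng = random.Random(base_seed * 100_000 + participant_id)
    order = list(STRATEGIES)
    rng.shuffle(order)
    return [(strategy, SCENARIOS) for strategy in order]


if __name__ == "__main__":
    for drive_no, (strategy, scenarios) in enumerate(build_schedule(1), start=1):
        print(drive_no, strategy, " > ".join(scenarios))
```

Each participant thus encounters 4 drives with 5 scenarios each, i.e. 20 cooperation situations in total.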

4.6 Questionnaire

The questionnaire was designed with regard to the research question of the extent to which users want to be involved in a cooperation process. Since, to our knowledge, there is no questionnaire that addresses the different dimensions of our research question, we composed a questionnaire that is short enough to evaluate each cooperation strategy in our experimental setting, but contains all relevant aspects for evaluating the cooperation strategies and thus answering the research question.

The dimensions were partly addressed by items of established questionnaires. These items were adapted to fit the frame of the experiment and relate to the cooperation strategies. In addition, we formulated our own items when we could not find suitable items in established questionnaires. For comfort, we formulated two questions. For discomfort, we formulated one question that tries to capture the aspect of being interrupted in the non-driving related task when having to cooperate. For mental demand, we used the corresponding dimension of the NASA-TLX questionnaire and adapted it to our experiment (Hart and Staveland 1988). To measure usability, we included two items from the System Usability Scale (SUS) and adapted them for the experiment. Perceived safety was measured by a self-formulated item. Trust in automation was measured by the two representative items from Körber's Trust in Automation questionnaire (TiA) (Körber 2019). Personal benefit was assessed by one self-formulated item. Further, we incorporated two questions regarding the general comfort and the general perceived safety of the overall drive, in order to examine whether the strategies might influence the experience of the overall drive.

Summarizing, we composed a questionnaire with 12 items total, including two questions for measuring the general comfort and general perceived safety for the overall drive and ten questions for the use of the cooperation strategies within the traffic scenarios, covering comfort, discomfort, mental demand, usability, perceived safety, trust in automation, and personal benefit of the strategies. Table 2 shows the 12 items of the questionnaire.

Table 2 Questionnaire items—items that are coded negative are marked with a (–)

Regarding personal preference, in the final interview we directly asked the participants, after they had experienced all four cooperation strategies, to rank the strategies from 1 to 4. They were also asked to give reasons for the first and the last rank, so that pro and contra arguments could be gathered for the cooperation strategies. Participants were also asked to recall the traffic scenarios by means of screenshots and to select a preferred cooperation strategy for each scenario. If this differed from the overall preference, participants were asked to give reasons for their scenario-specific preference.

4.7 Participants

In total, N = 36 participants from the WIVW driver panel were recruited. The participants were trained in the simulator and had previously participated in different driving simulator studies. The sample consisted of 14 women and 22 men. The average age was 45 years (SD = 14.36). All participants had at least two hours of experience with highly automated driving from prior driving simulator experiments. Furthermore, every participant had a valid driver’s license.

4.8 Procedure

At the beginning of the experiment, the participants were greeted and informed about the aims of the study. They filled out the informed consent form and proceeded with an introductory drive, in which they experienced an automated drive in an inner-city traffic environment without any interruptions. The introductory drive served to familiarize the participants with the driving simulator and the automation. Afterwards, the instructional drive for the first cooperation strategy was started. Before this drive, the participants were instructed about the capabilities and the usage of the cooperation strategy. As part of this instruction, they experienced the cooperation strategy on an empty street and were able to familiarize themselves with it. Following the instruction, the first experimental drive was started. After the participants had experienced the five traffic scenarios with one of the four cooperation strategies, the AV came to a stop in a parking lot. The participants were then interviewed about the drive in an adjoining room. They had the possibility to take a short break before the next experimental block. Following the interview, the next block was started with an instructional drive and an experimental drive for the next cooperation strategy. In this experimental drive the participants experienced the same five traffic scenarios with the new cooperation strategy. The procedure was then repeated for every remaining strategy, so that participants were interviewed after each of the four drives. In the final interview, they ranked the cooperation strategies by overall preference and selected a preferred strategy for each individual traffic scenario. Furthermore, participants were asked about their reasoning for these choices. The experiment lasted approximately 2.5 h, and participants received 45 Euros as compensation.

4.9 Data collection and analysis

Questionnaire data and answers to open questions were recorded. The questionnaire consisted of self-formulated items and adapted items from existing questionnaires (Trust in Automation Scale, System Usability Scale, NASA-TLX) (Körber 2019; Brooke 1996; Hart and Staveland 1988). Participants rated the items on a 16-point verbal-numeric scale in a two-stage process (Heller 1982) (see Fig. 2). First, they chose a verbal category. Second, they chose one of the three numeric ratings within that verbal category. This scale was used throughout the questionnaire to ensure a seamless response process during the experiment. In addition, our participants had encountered this scale in several previous studies and were thus very familiar with it.

Fig. 2 16-point scale

For the analysis of the questionnaire, repeated-measures ANOVAs were conducted for each item using R 4.1.2 (R Core Team 2021) and the package rstatix (Kassambara 2021). Normality was tested with the Shapiro–Wilk test. Post-hoc tests were performed using pairwise t-tests with Bonferroni correction. Greenhouse–Geisser correction was applied if the assumption of sphericity was violated. If the assumption of normality was violated, a Friedman test was conducted instead, followed by Wilcoxon rank-sum tests with Bonferroni correction as post-hoc tests. Kendall’s W was calculated as the effect size of the Friedman test.
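To make the non-parametric part of this pipeline concrete, the following is a minimal sketch of a tie-corrected Friedman test with Kendall's W in plain Python. It is not the authors' code (the analysis was run in R with rstatix); the function names and the toy ratings are hypothetical and serve only as illustration. Rows are participants and columns are the four cooperation strategies.

```python
from collections import Counter


def average_ranks(values):
    """Rank one participant's ratings (1 = lowest); tied ratings share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def friedman_chi2(data):
    """Tie-corrected Friedman chi-square; rows = participants, columns = conditions."""
    n, k = len(data), len(data[0])
    rank_rows = [average_ranks(row) for row in data]
    rank_sums = [sum(r[j] for r in rank_rows) for j in range(k)]
    expected = n * (k + 1) / 2  # rank sum of each condition under the null hypothesis
    numerator = 12 * sum((s - expected) ** 2 for s in rank_sums)
    # Tie term: sum of t^3 - t over all groups of tied ranks within each row.
    ties = sum(t ** 3 - t for r in rank_rows for t in Counter(r).values())
    return numerator / (n * k * (k + 1) - ties / (k - 1))


def kendalls_w(chi2, n, k):
    """Kendall's W as effect size of the Friedman test: W = chi2 / (n * (k - 1))."""
    return chi2 / (n * (k - 1))


# Hypothetical ratings of four participants on the 16-point scale (not study data).
toy = [
    [14, 13, 12, 10],
    [13, 14, 11, 9],
    [12, 12, 10, 11],
    [15, 13, 12, 8],
]
chi2 = friedman_chi2(toy)
w = kendalls_w(chi2, n=len(toy), k=len(toy[0]))
print(f"chi2({len(toy[0]) - 1}) = {chi2:.2f}, W = {w:.2f}")
```

The statistic is referred to a chi-square distribution with k − 1 degrees of freedom, which is why the results below are reported as χ²(3) for the four strategies.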

5 Results

The results are presented in the following order. First, the ratings of the overall drive are presented: general comfort and general perceived safety. Subsequently, the results of the ratings regarding the cooperation process are presented, in the order comfort, discomfort, mental demand, usability, perceived safety, trust in automation, and personal benefit. The ratings for the individual items were not normally distributed, with the exception of the ratings on mental demand. Accordingly, boxplots are used for the items with non-normal distributions and a mean plot with standard deviation for the normally distributed item. The section ends with the results on personal preference.

5.1 General rating on comfort and perceived safety

The general comfort for the overall drives was rated high (Acknowledgement: Mdn = 13.0, IQR = 12.0–14.0; Monitoring: Mdn = 13.0, IQR = 12.0–14.0; L.Guidance: Mdn = 13.0, IQR = 11.0–14.0; Manual: Mdn = 12.0, IQR = 11.0–14.0). No significant difference between cooperation strategies was detected (χ²(3) = 4.81, p = 0.187, W = 0.04) (see Fig. 3 general comfort).

Fig. 3 Boxplots of ratings for general comfort and general perceived safety

The general perceived safety for the drives as a whole was rated as high to very high with all cooperation strategies (Acknowledgement: Mdn = 13.0, IQR = 12.0–14.0; Monitoring: Mdn = 12.5, IQR = 12.0–14.0; L.Guidance: Mdn = 14.0, IQR = 13.0–14.0; Manual: Mdn = 13.0, IQR = 12.0–15.0). A significant difference between the cooperation strategies was detected (χ²(3) = 8.47, p = 0.037, W = 0.08). Post-hoc tests revealed a significant difference between the strategies Monitoring and L.Guidance (z = 2.51, p = 0.012, r = 0.42) (see Fig. 3 general perceived safety).
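The effect sizes reported for the two general items can be related to their test statistics with a short calculation. The sketch below assumes the common conventions W = χ² / (N(k − 1)) for Kendall's W and r = z / √N for the post-hoc effect size, with N = 36 participants and k = 4 strategies. The text does not state the formula used for r, so that part is an assumption; under it, the reported values are reproduced.

```python
import math

N = 36  # participants (Sect. 4.7)
K = 4   # cooperation strategies


def kendalls_w_from_chi2(chi2, n=N, k=K):
    """Kendall's W from a Friedman chi-square statistic."""
    return chi2 / (n * (k - 1))


def r_from_z(z, n=N):
    """Effect size r from a z statistic (assumed convention: r = z / sqrt(N))."""
    return z / math.sqrt(n)


w_comfort = round(kendalls_w_from_chi2(4.81), 2)  # general comfort
w_safety = round(kendalls_w_from_chi2(8.47), 2)   # general perceived safety
r_posthoc = round(r_from_z(2.51), 2)              # Monitoring vs. L.Guidance

print(w_comfort, w_safety, r_posthoc)  # prints 0.04 0.08 0.42
```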

5.2 Comfort

Significant differences were detected for both comfort items. For item Comfort 1 “The operation in the cooperation situations was convenient.” a significant difference was found between the cooperation strategies (χ²(3) = 20.0, p < 0.001, W = 0.19) (Acknowledgement: Mdn = 14.0, IQR = 12.0–15.0; Monitoring: Mdn = 14.0, IQR = 12.0–15.0; L.Guidance: Mdn = 12.5, IQR = 9.75–14.00; Manual: Mdn = 11.0, IQR = 9.0–14.0). Significant differences between the cooperation strategies Monitoring-Manual (z = 2.51, p = 0.012, r = 0.42), Acknowledgement-Manual (z = 2.43, p = 0.015, r = 0.41) and Acknowledgement-L.Guidance (z = 2.12, p = 0.034, r = 0.35) were found. For item Comfort 2 “I found driving through the cooperation situations to be comfortable.” a significant difference between the cooperation strategies was also found (χ²(3) = 15.1, p = 0.002, W = 0.14) (Acknowledgement: Mdn = 12.0, IQR = 12.0–14.0; Monitoring: Mdn = 13.0, IQR = 10.0–14.0; L.Guidance: Mdn = 12.0, IQR = 9.75–13.00; Manual: Mdn = 10.0, IQR = 8.75–13.25). Significant differences were detected between the cooperation strategies Monitoring-Manual (z = 2.48, p = 0.013, r = 0.41) and Acknowledgement-Manual (z = 2.06, p = 0.039, r = 0.34) (see Fig. 4).

Fig. 4 Boxplots of ratings for comfort

5.3 Discomfort

A significant difference between the cooperation strategies was found for the item discomfort “I found driving through the cooperation situations to be disruptive.” (χ²(3) = 9.97, p = 0.019, W = 0.09) (Acknowledgement: Mdn = 1.0, IQR = 0.0–3.0; Monitoring: Mdn = 0.5, IQR = 0.0–3.0; L.Guidance: Mdn = 1.0, IQR = 0.00–3.25; Manual: Mdn = 3.0, IQR = 0.00–6.25). Significant differences between the cooperation strategies L.Guidance-Manual (z = 2.20, p = 0.028, r = 0.37), Monitoring-Manual (z = 2.58, p = 0.010, r = 0.43) and Acknowledgement-Manual (z = 2.17, p = 0.03, r = 0.36) were detected (see Fig. 5).

Fig. 5 Boxplots of ratings for discomfort

5.4 Mental demand

Regarding the mental demand item “I had to perceive, think and decide a lot in the cooperation situations.” a significant difference was found between the cooperation strategies (F(3,105) = 5.49, p = 0.002, ηp² = 0.14) (Acknowledgement: M = 6.19, SD = 3.66; Monitoring: M = 6.31, SD = 3.69; L.Guidance: M = 7.08, SD = 3.76; Manual: M = 8.06, SD = 3.54). Post-hoc tests revealed significant differences between the cooperation strategies Monitoring-Manual (t(35) = , p = 0.023, d = 0.52) and Acknowledgement-Manual (t(35) = , p = 0.038, d = 0.49) (see Fig. 6).
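The reported partial eta squared can be recovered from the F statistic and its degrees of freedom through the standard identity ηp² = (F · df_effect) / (F · df_effect + df_error). The sketch below also derives the degrees of freedom from the design (36 participants, four strategies); the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    effect_term = f_value * df_effect
    return effect_term / (effect_term + df_error)


n_participants = 36
n_strategies = 4

# Degrees of freedom of a one-way repeated-measures ANOVA.
df_effect = n_strategies - 1                        # 3
df_error = (n_participants - 1) * (n_strategies - 1)  # 105

eta_p2 = partial_eta_squared(5.49, df_effect, df_error)
print(round(eta_p2, 2))  # prints 0.14
```

A Greenhouse–Geisser correction would scale both degrees of freedom by the same factor, so it leaves this ratio unchanged.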

Fig. 6 Mean plots with standard deviation of ratings for mental demand

5.5 Usability

For the usability items no significant differences were found. For the item Usability 1 “I find the system easy to use in the cooperation situations.” no significant differences between the cooperation strategies were detected (χ²(3) = 5.99, p = 0.112, W = 0.06) (Acknowledgement: Mdn = 13.5, IQR = 12.0–15.0; Monitoring: Mdn = 13.0, IQR = 13.0–15.0; L.Guidance: Mdn = 13.0, IQR = 12.0–15.0; Manual: Mdn = 13.0, IQR = 11.0–14.0). Also, no significant differences were detected for item Usability 2 “I had difficulties operating the system in the cooperation situations.” (χ²(3) = 6.75, p = 0.080, W = 0.06) (Acknowledgement: Mdn = 0.0, IQR = 0.0–1.0; Monitoring: Mdn = 0.0, IQR = 0.0–1.0; L.Guidance: Mdn = 0.0, IQR = 0.0–2.0; Manual: Mdn = 0.0, IQR = 0.0–1.0) (see Fig. 7). The item Usability 2 is coded negative.

Fig. 7 Boxplots of ratings for usability (Usability 2 is coded negative)

5.6 Perceived safety

For item perceived safety “I felt safe in the cooperation situations.” no significant difference between the cooperation strategies was detected (χ²(3) = 3.01, p = 0.390, W = 0.03) (Acknowledgement: Mdn = 12.5, IQR = 11.0–13.0; Monitoring: Mdn = 13.0, IQR = 12.0–14.0; L.Guidance: Mdn = 13.0, IQR = 12.0–14.0; Manual: Mdn = 13.0, IQR = 12.0–14.0) (see Fig. 8).

Fig. 8 Boxplots of ratings for perceived safety

5.7 Trust in automation

No significant differences were found for the items related to trust in automation. For item Trust 1 “I trust the system in cooperation situations.” no significant difference between the cooperation strategies was detected (χ²(3) = 3.30, p = 0.347, W = 0.03) (Acknowledgement: Mdn = 12.0, IQR = 9.0–14.0; Monitoring: Mdn = 12.0, IQR = 11.0–14.0; L.Guidance: Mdn = 12.0, IQR = 11.0–13.00; Manual: Mdn = 12.0, IQR = 10.0–14.0). Also, for item Trust 2 “I can rely on the system in cooperation situations.” no significant difference between the cooperation strategies was found (χ²(3) = 5.34, p = 0.149, W = 0.05) (Acknowledgement: Mdn = 13.0, IQR = 10.75–14.00; Monitoring: Mdn = 12.0, IQR = 11.75–14.00; L.Guidance: Mdn = 12.5, IQR = 11.00–13.25; Manual: Mdn = 12.0, IQR = 10.75–14.00) (see Fig. 9).

Fig. 9 Boxplots of ratings for trust in automation

5.8 Personal benefit

For the item personal benefit “I have benefited from the cooperation with the vehicle.” a significant difference between the cooperation strategies was found (χ²(3) = 15.6, p = 0.001, W = 0.14) (Acknowledgement: Mdn = 12.5, IQR = 11.0–13.0; Monitoring: Mdn = 13.0, IQR = 12.0–14.0; L.Guidance: Mdn = 13.0, IQR = 12.0–14.0; Manual: Mdn = 13.0, IQR = 12.0–14.0). Significant differences between the cooperation strategies Monitoring-Manual (z = 2.23, p = 0.026, r = 0.37) and Acknowledgement-L.Guidance (z = 2.06, p = 0.039, r = 0.34) were detected (see Fig. 10).

Fig. 10 Boxplots of ratings for personal benefit

5.9 Personal preference

The cooperation strategies were ranked differently by the participants (see Fig. 11). The Acknowledgement strategy was chosen as 1st rank most often (11), followed by Monitoring (10), Manual (8) and L.Guidance (7). As 2nd rank, the Monitoring strategy was chosen most often (17), followed by L.Guidance (8), Acknowledgement (7) and Manual (4). As 3rd rank, L.Guidance was chosen most often (12), followed by Manual (11), Acknowledgement (8) and Monitoring (5). As last rank, the Manual strategy was chosen most often (13), followed by Acknowledgement (10), L.Guidance (9) and Monitoring (4). Participants gave reasons for their choice of first and last rank. Table 3 lists the clustered arguments for why participants preferred or disliked a cooperation strategy, thereby giving pro and contra arguments for each strategy. The count behind each argument is the number of unique participants who mentioned it. The counts do not correspond exactly to the preference numbers, because participants could mention several arguments for one cooperation strategy. Furthermore, participants also named pro arguments for the cooperation strategy they ranked lowest; these were counted as pro arguments for that strategy and included in the clustering. Only arguments mentioned by more than two participants are included.

Fig. 11 Preference rankings for strategy overall
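The rank counts reported above can be condensed into a mean rank per strategy. This summary is not part of the original analysis; it is a simple arithmetic sketch on the reported counts, and it treats the ordinal ranks as equally spaced, which is an assumption.

```python
# Number of participants who assigned rank 1, 2, 3 and 4 to each strategy,
# taken from the counts reported in the text.
rank_counts = {
    "Acknowledgement": [11, 7, 8, 10],
    "Monitoring": [10, 17, 5, 4],
    "L.Guidance": [7, 8, 12, 9],
    "Manual": [8, 4, 11, 13],
}


def mean_rank(counts):
    """Mean rank from a list of counts for ranks 1..len(counts)."""
    total = sum(counts)
    return sum(rank * c for rank, c in enumerate(counts, start=1)) / total


mean_ranks = {name: mean_rank(counts) for name, counts in rank_counts.items()}

# Print strategies from best (lowest) to worst (highest) mean rank.
for name, value in sorted(mean_ranks.items(), key=lambda item: item[1]):
    print(f"{name}: {value:.2f}")
# Monitoring: 2.08
# Acknowledgement: 2.47
# L.Guidance: 2.64
# Manual: 2.81
```

Each strategy's counts sum to 36, matching the sample size. Acknowledgement is ranked first most often, whereas Monitoring has the lowest mean rank because it is rarely ranked third or last.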

Table 3 Pros and cons of the four cooperation strategies, arguments mentioned more than 2 times

After the overall preference rankings, participants were asked which cooperation strategy they would prefer in each traffic scenario. Figure 12 shows the preferred cooperation strategy per traffic scenario.

Fig. 12 Preference rankings for strategies for each traffic scenario

These rankings show a different distribution than the overall preference rankings and also differ from the distribution of 1st ranks in the overall preference. This indicates that the traffic scenario affects the preferred strategy. A possible explanation derives from the reasons participants gave for their change in preference, as participants divided the traffic scenarios into safer and less safe ones. Table 4 shows how often a safe or unsafe situation was mentioned as a reason for preferring, in an individual traffic scenario, a cooperation strategy other than the overall preferred one.

Table 4 Overview of naming safe or unsafe scenario as a reason for choosing another cooperation strategy for a specific traffic scenario than the overall preferred strategy

In traffic scenarios considered unsafe, cooperation strategies with more involvement and a higher degree of control were preferred. The argument of an unsafe traffic scenario combined with a higher desire for control was mentioned a total of 15 times. In traffic scenarios that participants classified as clear and safe, cooperation strategies with less involvement and consequently less control were preferred. This was particularly true for the Acknowledgement strategy, for which an unambiguous and safe traffic scenario was given as the reason for the preference a total of eight times. For the Manual strategy, on the other hand, participants put forward the argument that the traffic scenario in question was unsafe 27 times.

In Table 5, the changes in preference of cooperation strategy are listed individually for the respective traffic scenarios. This table shows the change from the preferred overall strategy toward a cooperation strategy with more or less involvement of the user. In the first traffic scenario PED_NOTCROSS, more than twice as many participants switched to a cooperation strategy with less involvement than in the other direction to a cooperation strategy with more involvement.

Table 5 Overview of number of preference changes from the overall preferred strategy to strategies with more or less involvement as well as number of no change of preference per traffic scenario

This is consistent with the results in Fig. 12, where the lower-involvement cooperation strategies Acknowledgement and Monitoring were the preferred cooperation strategies in the traffic scenario PED_NOTCROSS. Furthermore, participants gave the safety of the traffic scenario as a reason for choosing the Acknowledgement strategy. The other four traffic scenarios showed the opposite picture. Here, more participants switched towards more control than vice versa. This matches the participants’ safety assessment of the individual traffic scenarios. CYC_PHONE, MV_CROSSING, MV_BOTTLENECK and MV_BUS were rated as less safe due to possible erratic behavior of the other traffic participants or possible obstructions from which sudden hazards could emerge.

6 Discussion

General comfort was rated high to very high, with no significant difference between the cooperation strategies. Furthermore, general perceived safety was rated high to very high in all drives with each cooperation strategy. Although a significant difference was found between Monitoring and L.Guidance, all cooperation strategies seemed to be perceived as safe in terms of general perceived safety. This indicates that the cooperation process within the automated drive in this experiment did not have a major detrimental effect on general comfort and general perceived safety. We have no explanation for the significant difference between the L.Guidance strategy and the Monitoring strategy. Perhaps more perceived control led to this difference; however, the results of the experiment give no clear indication that would support this explanation.

Regarding comfort during the cooperation, post-hoc tests revealed significant differences between the Manual strategy and both the Monitoring and the Acknowledgement strategies. In both cases, the Monitoring and Acknowledgement strategies were rated higher than Manual in terms of comfort. No significant differences were detected between the Acknowledgement and Monitoring strategies. The L.Guidance strategy lay between the Manual strategy and the two strategies Monitoring and Acknowledgement. For one item it differed significantly from the Acknowledgement strategy. The results therefore support hypothesis H1, that cooperation strategies with higher involvement of the user will be rated as less comfortable than strategies with less involvement of the user.

For discomfort, post-hoc tests revealed significant differences between the Manual strategy and all other strategies. No significant differences were found between the cooperation strategies Acknowledgement, Monitoring and L.Guidance. The Manual strategy thus seems to cause more discomfort than the other strategies. With this strategy, the user has to perform the whole driving task during the cooperation, which possibly explains the higher discomfort ratings. The results indicate a relationship between involvement and discomfort, but because the three other strategies did not differ from each other, the relationship may not be strong enough to fully support hypothesis H2, that cooperation strategies with higher involvement of the user are rated as causing more discomfort than cooperation strategies with lower involvement.

For mental demand, significant differences were found between the strategies Manual and Acknowledgement as well as Manual and Monitoring. The L.Guidance strategy lies between Manual and the two strategies Monitoring and Acknowledgement, without differing significantly from any of them. This indicates that mental demand rises with more involvement of the user. Users have to perform more parts of the driving task, resulting in a higher mental demand for the cooperation process. This supports hypothesis H3, that cooperation strategies with higher involvement of the user will be rated higher for mental demand than cooperation strategies with lower involvement of the user.

In terms of usability, no significant differences were found between the cooperation strategies. This suggests that the use of different HMI elements repurposed for the cooperation strategies did not have a significant negative impact on the overall usability of the strategies.

For perceived safety, no significant differences were found between the strategies either. Although the ratings for the Acknowledgement strategy were slightly lower, all cooperation strategies were rated highly by users. This finding argues against hypothesis H4, that cooperation strategies with higher involvement of the user will be rated higher in terms of perceived safety. The systems worked as intended, and no unexpected behavior of other traffic participants that would have required a reaction occurred, which led to an overall high perceived safety. However, the preference data for the cooperation strategies in the individual traffic scenarios seemingly showed a relation between involvement and perceived safety. Participants preferred cooperation strategies with higher involvement in traffic scenarios that were rated as less safe. This is shown in Table 5, where participants changed their preference to a cooperation strategy with more involvement for all traffic scenarios except PED_NOTCROSS. In PED_NOTCROSS, by contrast, participants preferred a cooperation strategy with less involvement, stating that the scenario was safe. Participants wanted to have more control in unsafe traffic scenarios, and the cooperation strategies with higher involvement allow this. Participants explicitly named the higher control as a positive aspect of the L.Guidance and Manual strategies. Based on these data, a relationship between perceived safety and involvement cannot be ruled out. One possible reason for the lack of significant differences in the questionnaire data may be the very high perceived safety in all drives and the lack of criticality in the traffic scenarios.

In terms of trust in automation, no significant differences were found between the cooperation strategies. Trust was generally rated high across strategies. For item Trust 1, the range of ratings was wider for the Acknowledgement strategy, but no significant differences were found between the strategies. Trust is one key factor for the acceptance of AVs. The high ratings are therefore encouraging for the cooperation process with every strategy, because one prerequisite for acceptance seems to be already fulfilled. However, apart from the cooperation situations, the automation worked flawlessly, which possibly influenced this result.

For personal benefit, participants rated the Manual strategy significantly lower than Monitoring. Furthermore, L.Guidance was rated significantly lower than Acknowledgement. Although the ratings were generally high, there seems to be a discrepancy between the two strategies with less involvement (Acknowledgement and Monitoring) and the two strategies with more involvement (L.Guidance and Manual). Participants see a higher benefit when they are less involved, as with the strategies Acknowledgement and Monitoring.

This may be due to the users having lower costs to cooperate with the automation in the traffic scenario. They have to perform fewer parts of the driving task and see the ratio of costs and benefits as more favorable when they have to do less. The results seem to support hypothesis H6, that cooperation strategies with lower involvement of the user will be rated higher in terms of personal benefit than cooperation strategies with higher involvement.

The cooperation strategies were rated significantly differently on the dimensions of comfort, discomfort, mental demand, and personal benefit. The direction of the ratings is generally in favor of the strategies with lower involvement. This is also seen in the preference data for the cooperation strategies as well as in the comments on the positives of the cooperation strategies. The cooperation strategy most often ranked first is Acknowledgement, followed by the Monitoring strategy. These two cooperation strategies require the least user involvement. The reasons participants gave for their first-ranked cooperation strategy most frequently mention that it is comfortable to use. There were no remarks regarding comfort for the L.Guidance strategy and negative remarks for the Manual strategy. The participants thus seem to rank the cooperation strategies in terms of comfort, discomfort, mental demand and personal benefit. The number of mentions regarding comfort highlights the importance of this aspect in the cooperation process.

Based on the results, cautious conclusions can be drawn. Involvement of the user seems to have an impact on comfort, discomfort, mental demand, and personal benefit. Furthermore, the influence of perceived safety should not be prematurely excluded. Cooperation strategies with lower involvement of the users are generally preferred. However, there seems to be a relationship between the perceived safety of a traffic scenario and the need for more control (i.e., more involvement). Based on these findings, we suggest incorporating safety and comfort aspects when designing the human–machine cooperation for a system with the task of bridging system limits. The Monitoring strategy is probably the most preferred of the strategies presented in this experiment, as it combines the positives of comfort and perceived safety: it is comfortable to use and offers sufficient control of the vehicle in the traffic scenarios, so that the user feels safer.

The research question of how much users of an automated vehicle want to be involved in a cooperation to overcome automation limits cannot be fully answered here. Users seem to prefer cooperation strategies with lower involvement, but they want to be more involved if the traffic scenario is seen as unsafe. In this experiment, the traffic scenarios did not become critical and did not require an intervention by the user or the automation. More critical traffic scenarios might shift the preference towards cooperation strategies with more involvement of the user. Further, it is unclear how the frequency of cooperation requests might influence preference and the willingness to be involved.

7 Limitations

The cooperation strategies were presented in randomized order. However, it should be noted that potential sequence effects were not accounted for in the statistical analysis. Therefore, the possibility that the position of each cooperation strategy in the sequence influenced the results cannot be entirely ruled out. Also, the traffic scenarios were presented in a fixed order for all participants. Order effects might have influenced participants’ ratings.

Furthermore, for the evaluation of the cooperation strategies, we used a self-devised questionnaire. Even though items of established questionnaires were used, the items regarding comfort, discomfort, perceived safety, and personal benefit have not been validated. Further, the two comfort items and the inversely coded discomfort item show the same ordering and similar significant effects across the strategies. This consistency suggests that these items may be capturing aspects of a single underlying construct. We therefore advise caution in interpreting these specific dimensions as distinct. Even though the questionnaire revealed significant differences between the cooperation strategies, it should be tested in a detailed validation study.

Additionally, the strategies were implemented using different HMI elements and also different capabilities. An influence of the perceived quality of the product or of individual preferences regarding HMI elements or capabilities cannot be ruled out. Furthermore, the electrical handbrake and the gear shifter were repurposed to move the vehicle. This might have affected the ratings, because such familiar elements may be easier to map to driving functions. However, we tested the usability of the strategies and did not find significant differences. Therefore, a possible effect seems to be negligible.

Further, the participants experienced two variants of an information display. The presence of the display and the differences between the variants might have influenced the evaluation of the strategies. However, a mixed ANOVA was used to check for potential influences of this factor, and neither a significant difference between the variants nor an interaction effect was found. The sample seems to be sufficient and balanced. However, previous experience with automated driving might influence the participants’ behavior. Lastly, the participants were used to being in a driving simulator setting. This might have an influence on the external validity of the results.

Despite these possible limitations, this study indicates potential differences between individual strategies, thereby contributing to a better understanding of user involvement as a means to overcome automation limits. This supports the advancement of AVs and provides valuable information for the development of cooperative human–machine interfaces.