1 Introduction

Modern vehicles are far more than a means of transportation: they offer numerous comfort functions as well as entertainment and communication options. As a result, the driver is often engaged in activities that are not directly related to the primary task of operating the vehicle. These activities should nevertheless be given lower priority than driving tasks such as manoeuvring the vehicle and maintaining safety [7].

Interacting with in-vehicle infotainment and entertainment systems differs from desk-based human-machine interaction (HMI). It is a secondary task for the driver and is constrained by the driver’s limited ‘working’ area and accessible space. Since driving is a demanding, challenging and, most importantly, responsible task, the driver should neither be distracted nor lose oversight of the vehicle controls when interacting with the In-Vehicle Infotainment System (IVIS) [29]. This can be achieved through careful IVIS design that takes into account the driver’s limited capacity to perform several tasks at a time. Drivers primarily use visual, tactile and auditory communication channels to interact with IVIS, corresponding to three primary human senses through which people perceive their surroundings. These senses differ considerably from one another, and each has unique features and limitations that need to be considered when designing user interfaces and their input-output components.

The most common, traditionally used input interfaces for IVIS are buttons, knobs and handles on the steering wheel and dashboard. This type of interaction is useful for a limited set of functions, but with growing complexity, new functionalities, and new digital screens and head-up displays, new HMI concepts have emerged in recent years. Some vehicles now also provide alternative input interfaces such as touchscreens, multifunctional rotary knobs, speech recognition, gesture recognition, free-hand interaction or touchpads [26, 30].

Despite the great potential of these new input interfaces, it remains a challenge to integrate them successfully into the IVIS and to achieve a satisfactory level of user satisfaction and acceptance. Quality aspects such as stimulation and novelty of a product are also very important to IVIS users [24]. To maximize the usability of IVIS without compromising safety, the interface design should follow and adapt to the characteristics of the tasks performed while driving. New input interfaces should therefore be designed so that they do not negatively affect the driver’s primary activities, which are operating the vehicle and maintaining safety [7]. On the contrary, they should be designed to increase driving safety and, above all, to reduce the driver’s cognitive load. It has been shown that increased cognitive load can affect driving performance and thus reduce driving safety [10, 19]. Therefore, when evaluating interaction designs for IVIS, it is also necessary to assess the cognitive load imposed on the driver. For example, designs that support the driver’s natural and intuitive actions and tasks in a vehicle should replace complicated and unnatural systems that require intensive learning and impose high cognitive load.

Motivated by the lack of such broader and holistic evaluation of new interaction designs [26, 30], we built a prototype of an IVIS that can be operated with three different input interfaces and shows its output on a modern head-up display. The three input interfaces are a “free-hand gesture-based” interface, a “touch-based” interface and a conventional “buttons on the steering wheel” interface. We compared all three input types in a driving simulation study, assessing user experience, usability, driving safety and cognitive load. Furthermore, since traffic has changed significantly in past decades due to the ever-increasing number of vehicles and roads, we also evaluated the performance of all three types of interfaces in different traffic conditions.

2 Related work

Due to the specific characteristics of the driving environment, interaction with IVIS has been evaluated from many angles, ranging from usability and user experience to the attentional and safety effects that such interaction can have on the driver and on driving performance.

Studies of usability and user experience found that performing tasks with free-hand gestural input can result in half as many mistakes [6], shorter task completion times [6], and fewer and shorter eye glances [9, 14] compared to the use of buttons or rotary knobs [6]. Studies of touch systems, on the other hand, showed that although multimodal mid-air gestures exhibit safer secondary task dwell patterns, their use leads to longer task completion times and a higher workload compared to direct touch interfaces [14, 25]. It was also shown that many participants associated touchpads with higher workload and rated them as not pleasurable to use when compared to a touch screen, a button on a steering wheel or a rotary knob [20]. Users also rated gestural input as more pleasant and perceived it as less distracting than the use of a touch screen [9, 25].

Furthermore, it was found that both usability and user experience also depend on the type and difficulty of the task, and not solely on the input modality [2, 14]. Overall, using a touch screen results in the shortest task completion times for the majority of tasks. However, for tasks such as adjusting the volume, using the rotary button is much more time-effective [14]. For simple menu selection tasks (e.g. selecting a preset radio station), the touch screen is much more effective and is also preferred by users compared to a touchpad or rotary knob. However, for tasks that require precise value selection (e.g. setting the temperature to an exact value or dialling a number), both performance and user feedback were more favourable for the touchpad [2]. Differences were also found between different input gestures within the same input modality. For example, with gestural input, an angle scroll gesture results in the shortest task completion times for both short and long menu tasks compared to a swipe or height control gesture [25]. Furthermore, it has been suggested that free-hand interaction can improve the driver’s user experience when interacting with IVIS, because introducing a new interaction modality (an alternative to the visual and tactile modalities, which are mainly used for driving) can fulfil psychological needs and motives such as security, competence and pleasure stimulation [23].

Feedback was also recognized as an important factor for the safety and user experience of new interaction designs. Norberg and Rahe [28] argue that it is possible to develop a touchpad-based interaction design that users would accept, as long as the design takes into consideration the advantages and disadvantages of such an interface. For example, they suggest that a touchpad should be mounted at an angle, with the back edge slightly higher than the front edge, to give a consistent feeling of resistance when swiping the fingers on the touchpad; that haptic or auditory (or bi-modal) feedback should be used; and that the surface should be slightly rough to decrease friction. Vilimek and Zimmer [38] also showed that multimodal feedback can be superior to the traditional rotary push-button, and that task performance can be increased and visual distraction reduced by using alternative auditory and tactile feedback, which new interaction designs often offer. Despite the great potential and proven benefits of multimodal interaction, which includes auditory interaction in addition to vision and touch, most vehicle manufacturers today still prefer primarily visual-manual interaction designs and expose the large majority of in-vehicle functionalities through these two modalities. Therefore, in this study we decided to explore different possibilities of the visual and tactile communication channels and to exclude multimodal or voice-based interaction designs.

In terms of driving safety and the effect of interacting with IVIS on the primary task of driving, Geiger et al. [6] did not find any differences in the performance of the driver’s primary task, a simulated steering task. The study by Bach et al. [14], which was performed in both simulated and real environments, showed similar results for longitudinal control; however, their results showed significantly more lateral errors when using the traditional buttons and rotary knobs compared to gestural and touch interaction designs. When comparing a touchscreen, touchpad, rotary knob and buttons on the steering wheel, it was found that the touchscreen has the greatest impact on lateral, longitudinal and speed control [20], and also causes the greatest distraction [17, 33].

Based on the literature review, we can conclude that each interface type or interaction design shows several advantages and disadvantages when compared to the traditionally used input designs. Most of the available research focuses mainly on the usability and user experience of the evaluated systems, and very few studies also explore the effects of their use on driving performance and driving safety. Additionally, each study was completed under different experimental conditions with different task forms and levels of difficulty, which does not allow direct comparison of results between studies and evaluated systems. Intrigued by the complexity behind in-vehicle interaction designs, we conducted a study that evaluates three different input modalities for IVIS by comparing user experience, usability, impact on driving safety and the driver’s cognitive load. We compared two relatively novel and non-traditional input modalities, based on free-hand interaction and touchpad-based interaction, and one more traditional interface in the form of a set of buttons on a steering wheel. While the first two types of interfaces can currently be found mainly in the most modern and typically high-priced vehicles, the third type is very commonly implemented and available in the majority of vehicles on the market. We additionally tested the interfaces in two different driving conditions in order to observe how more demanding traffic conditions may impact their usability and affect driving performance. With this study, we try to answer the following research questions:

  • Which type of interface is the safest to use while driving considering different driving situations and different types of IVIS tasks (has the lowest impact on driving performance and driver’s cognitive load)?

  • Which interaction design provides the best driving experience (the shortest task completion times and subjective user experience ratings)?

The study was performed in a driving simulator which provided a controlled and safe study environment for all participants.

3 Method

3.1 Participants

Thirty participants (16 male and 14 female), 20 to 42 years old (M = 27.3 years, SD = 6.2 years), took part in the study. All participants had a valid driving license (held for M = 9 years, SD = 6 years). The participants were recruited through direct emails to all employees of the University of Ljubljana, Faculty of Electrical Engineering, and through the faculty’s official social media pages and communication channels. As a result, most of the participants were employees and students of the University. Before the experiment, each participant was informed about the study procedure, the intent of the study and the form of data collection. Each participant who agreed to take part signed a consent form which included all of these details. The study was conducted in compliance with the Code of Ethics of the University of Ljubljana, which provides guidelines for studies involving human beings. Each participant received an incentive for taking part in the experiment in the form of a small practical reward (worth 5 EUR).

3.2 Apparatus

3.2.1 Driving simulator

This study was performed in a high-fidelity Nervtech [27] driving simulator. The simulator consists of a real car seat and a Fanatec [4] steering wheel and pedals (Wheel Base V2, Fanatec ClubSport Steering Wheel Porsche 918 RSR and Fanatec ClubSport pedals). This setup provides highly realistic force feedback and a realistic driving experience. For the visuals, a triple-screen configuration was used, consisting of three identical curved 48-inch Samsung HD TV screens covering a 120° horizontal field of view. The simulation software was OKTAL SCANeR Driver Training [1], which ran on a high-end gaming computer with an i7-6700K CPU and a GeForce GTX 980Ti graphics card. The driving scenarios were custom made, and the infotainment system and controls were implemented directly in the simulator software.

3.2.2 Eye tracker

To assess the driver’s cognitive load, the participants’ pupillary response and activity were observed using Tobii Pro Wireless glasses 2 [34], as it has been shown that pupillary dilations indicate cognitive effort [13, 16]. The pupil data was recorded at a 100 Hz sample rate.

3.3 Interaction designs

A custom-developed system consisting of different input interaction designs was used in the study. The output was always the same – a visual head-up display (HUD), which appeared in the lower left corner of the windshield of the simulated vehicle. It had a hierarchical menu structure, with 3-5 levels, depending on the task. The selected option (line) was highlighted with a noticeable green colour (Fig. 1).

Fig. 1 HUD used in the study

Three interaction designs for input were used in this study: buttons on a steering wheel, a touchpad and a free-hand interface. Each design provided the same four commands: up, down, confirm selection and return.

3.3.1 Hand tracker

The free-hand gesture interaction system used a Leap Motion Controller [22]. The device tracked the position and orientation of the right-hand palm, from which the vertical orientation data was extracted. To select an option on the HUD, the driver had to change the pitch of the palm, pointing it up or down to select a line in the menu. To confirm the selection, the palm had to be held still for 1000 milliseconds. Confirmation was indicated visually by a loading bar in a slightly different colour behind the selected element (Fig. 2). The driver could initiate the return command by rolling the palm clockwise by 90 degrees.

Fig. 2 Loading bar with a different colour behind the selected element
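The dwell-based confirmation described above can be illustrated with a short Python sketch. It is not the implementation used in the study and does not call the Leap Motion SDK: the class `FreeHandMenu`, the absolute mapping of palm pitch onto menu lines and the value of `PITCH_RANGE_DEG` are assumptions made for illustration only. Only the 1000 ms dwell time and the 90 degree roll for the return command are taken from the text.

```python
DWELL_MS = 1000          # hold time for confirmation, as stated in the text
ROLL_RETURN_DEG = 90.0   # clockwise roll for the return command, as stated in the text
PITCH_RANGE_DEG = 30.0   # hypothetical: pitch of +/-30 degrees spans the whole menu


class FreeHandMenu:
    """Turns (time, pitch, roll) samples into menu events (illustrative only)."""

    def __init__(self, n_lines):
        self.n_lines = n_lines
        self.selected = None
        self.selected_since_ms = None

    def _line_for_pitch(self, pitch_deg):
        # Assumed mapping: palm pointing up selects the top line (index 0).
        clamped = max(-PITCH_RANGE_DEG, min(PITCH_RANGE_DEG, pitch_deg))
        fraction = (PITCH_RANGE_DEG - clamped) / (2 * PITCH_RANGE_DEG)
        return min(self.n_lines - 1, int(fraction * self.n_lines))

    def update(self, t_ms, pitch_deg, roll_deg):
        # A large clockwise roll takes priority and resets the dwell timer.
        if roll_deg >= ROLL_RETURN_DEG:
            self.selected = None
            self.selected_since_ms = None
            return ("return", None)
        line = self._line_for_pitch(pitch_deg)
        if line != self.selected:
            # The highlight moved, so the dwell timer restarts.
            self.selected = line
            self.selected_since_ms = t_ms
            return ("select", line)
        if t_ms - self.selected_since_ms >= DWELL_MS:
            # The same line was held long enough: confirm and restart the timer.
            self.selected_since_ms = t_ms
            return ("confirm", line)
        return ("hold", line)


menu = FreeHandMenu(n_lines=4)
events = [
    menu.update(0, 30.0, 0.0),
    menu.update(500, 28.0, 0.0),
    menu.update(1000, 29.0, 0.0),
    menu.update(1100, -30.0, 0.0),
    menu.update(1200, 0.0, 95.0),
]
```

In this sketch the elapsed fraction `(t_ms - selected_since_ms) / DWELL_MS` would be the natural quantity to drive a loading bar such as the one in Fig. 2.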

3.3.2 Touchpad

The touchpad system was implemented on an Android smartphone (Samsung Galaxy S4), whose touchscreen was used as a touchpad. A simple custom application turned the screen black and sent all touch data to the computer’s main system. The up and down commands were performed by sliding forward and backward, confirmation was a tap gesture, and the return command was initiated by sliding left. The smartphone was placed on the right side next to the driver’s seat (Fig. 3), where touchpads are often placed in vehicles (for example, in many Audi models) and in similar research studies [20]. This placement enabled easy and comfortable interaction with the touchpad.

Fig. 3 The smartphone used as touchpad was placed on the right side of the driver’s seat, enabling comfortable access while resting the arm on the arm rest (top view)
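The mapping from touch gestures to the four commands can be sketched as follows. This is a minimal illustration rather than the application used in the study: the function `classify_touch`, the pixel threshold `TAP_MAX_TRAVEL_PX` and the coordinate convention (y decreasing in the forward direction, x decreasing to the left) are assumptions.

```python
TAP_MAX_TRAVEL_PX = 20  # hypothetical: shorter movements count as a tap


def classify_touch(start, end):
    """Map one touch (start and end point in pixels) to an IVIS command.

    Assumed convention: sliding forward decreases y, sliding left decreases x.
    Returns "up", "down", "confirm", "return", or None for unused gestures.
    """
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    if abs(dx) <= TAP_MAX_TRAVEL_PX and abs(dy) <= TAP_MAX_TRAVEL_PX:
        return "confirm"                      # tap
    if abs(dy) >= abs(dx):
        return "up" if dy < 0 else "down"     # slide forward / backward
    return "return" if dx < 0 else None       # slide left; slide right is unused


commands = [
    classify_touch((100, 400), (102, 120)),   # long forward slide
    classify_touch((100, 100), (100, 400)),   # long backward slide
    classify_touch((100, 100), (105, 95)),    # tap
    classify_touch((300, 200), (50, 210)),    # slide left
]
```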

3.3.3 Button on a steering wheel

The third interaction design used a button incorporated on the Fanatec steering wheel, which could be pushed in four different directions (Fig. 4). Up and down were used as such, right was used for confirmation, and left for returning to the previous level. All other buttons on the steering wheel were inactive.

Fig. 4 Steering wheel used in the study for the button-based interaction design. The used button enabled four discrete commands – up, down, left and right as indicated with the red arrows

3.4 Experiment design

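The study had a 2 × 3 mixed-factorial design, with traffic conditions (two levels) as the between-subject factor and interaction design (three levels) as the within-subject factor.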

The participant’s primary task was safe driving. Half of the participants (group A) drove on a less demanding road (easy driving conditions), and the other half (group B) were exposed to more demanding driving conditions.

Group A drove on a country road with no traffic. The participants’ task was to follow a leading vehicle. The road was mainly straight, with curved sections between the straight parts. The curves had a very large radius, enough to keep the driver engaged in the driving task without requiring more demanding manoeuvres. The leading vehicle changed its speed slowly and periodically between 50 and 90 km/h.

Group B drove in a city with traffic on a four-lane road (two lanes in the driving direction and two in the opposite direction). Participants had to follow directions from a navigation system that gave both visual and auditory instructions. The intersections in the city had traffic lights which were programmed to turn green when the participant approached the intersection. The surrounding traffic was also programmed not to block the driver, so the driver could focus on following the navigation commands.

In both groups the experiment was conducted in simulated day time.

In addition to the driving task, the participants were asked to perform a set of tasks with each of the interaction designs while driving. Three different but comparable sets of tasks were used, one for each trial, to avoid a learning effect. Each set consisted of four “easy” and four “difficult” tasks. The easy tasks required fewer steps for completion and the final step was on the third hierarchical level (e.g. Temperature -> Seat warmers -> On). The difficult tasks required more steps for completion and the final step was on the fifth hierarchical level of the system (e.g. ).

The sequence of the interaction designs and tasks was counterbalanced to avoid sequence learning effects and confounds. The complete list of tasks is shown in Table 1.

Table 1 Task set – the sequence of performed tasks was counterbalanced
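One common way to counterbalance the order of three interaction designs is to cycle through all six possible sequences across participants. The sketch below illustrates this idea only; the text does not state which counterbalancing scheme was actually used, and the function `order_for` and the participant indexing are hypothetical.

```python
from itertools import permutations

INTERFACES = ("free hand", "touchpad", "buttons")

# All 3! = 6 possible presentation orders of the three interaction designs.
ORDERS = list(permutations(INTERFACES))


def order_for(participant_index):
    """Assign presentation orders by cycling through all six sequences."""
    return ORDERS[participant_index % len(ORDERS)]


first_six = [order_for(i) for i in range(6)]
```

With six participants, each design appears exactly twice in each position; with group sizes that are not a multiple of six the balance is only approximate.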

3.5 Variables

Interaction design type (free hand, touchpad and buttons) and traffic difficulty (easy and difficult) were defined as independent variables. The dependent variables were divided into two groups: one related to driving safety and the other to the usability and user experience of the interface designs. The driving safety variables were speed limit violation, acceleration and lane deviation. The second group of variables comprised cognitive load, task completion time and user experience.

3.5.1 Driving safety

For the evaluation of driving safety, we used a model which takes into consideration the violation of speed limits, acceleration and lane deviation [35].

The simulator software recorded the driving data at a rate of 20 Hz. The data included time, position, speed, acceleration, and the lateral position on the lane.

Speed limit violation

To estimate the speed limit violation during the whole measurement period and take into consideration the duration and seriousness of the violations, the average normalized exceeded speed is calculated. The normalization is necessary to take into account sections with different speed limits. It is defined as

$$ V(t)=\begin{cases}\frac{v(t)}{v_{limit}(t)}-1, & v(t)\ge v_{limit}(t)\\ 0, & v(t)<v_{limit}(t)\end{cases} $$

where v(t) is the vehicle’s momentary speed and \( v_{limit}(t) \) the momentary speed limit. The average value of V(t) during an observed period of time is the result or the “score” of driving safety regarding the speed limit violation.
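As an illustration, the score defined above can be computed from sampled data with a few lines of Python. This is a minimal sketch rather than the code used in the study; the function name `speed_violation_score` and the sample values are assumptions, and the samples are assumed to be equally spaced in time so that a plain mean approximates the time average.

```python
def speed_violation_score(speeds, limits):
    """Mean of V(t): relative excess over the speed limit, 0 when not exceeding it."""
    if not speeds or len(speeds) != len(limits):
        raise ValueError("speeds and limits must be non-empty and of equal length")
    total = 0.0
    for v, v_limit in zip(speeds, limits):
        if v >= v_limit:
            total += v / v_limit - 1.0
    return total / len(speeds)


# Hypothetical samples in km/h, covering sections with 50 and 90 km/h limits.
speeds = [48.0, 50.0, 55.0, 90.0, 99.0]
limits = [50.0, 50.0, 50.0, 90.0, 90.0]
score = speed_violation_score(speeds, limits)
```

In this example two of the five samples exceed the limit by 10%, so the score is 0.2 / 5 = 0.04. The normalization makes an excess of 5 km/h in a 50 km/h zone count the same as an excess of 9 km/h in a 90 km/h zone.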

Acceleration

When observing the vehicle’s acceleration, driving below an estimated acceleration value of 1.5 m/s² is considered safe driving [31]. The factor of maximum acceleration excess was evaluated and defined as

$$ A(t)=\begin{cases}\frac{a(t)}{a_{max}}-1, & \left|a(t)\right|\ge a_{max}\\ 0, & \left|a(t)\right|<a_{max}\end{cases} $$

where a(t) is the momentary acceleration and \( a_{max} \) the maximum safe acceleration. The average value of A(t) during the observed period of time represents a numeric score of the participant’s driving safety.
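The acceleration score can be sketched in the same way. The code below is an illustration, not the implementation used in the study. One point is an assumption: the condition in the definition uses the magnitude of the acceleration, so the sketch also uses the magnitude in the numerator, which makes hard braking contribute a positive excess. The function name and sample values are hypothetical.

```python
A_MAX = 1.5  # maximum safe acceleration stated in the text (taken here as m/s^2)


def acceleration_violation_score(accelerations, a_max=A_MAX):
    """Mean of A(t), using |a(t)| so that braking and accelerating are treated alike."""
    if not accelerations:
        raise ValueError("accelerations must be non-empty")
    total = 0.0
    for a in accelerations:
        magnitude = abs(a)  # assumption: magnitude is used in the numerator as well
        if magnitude >= a_max:
            total += magnitude / a_max - 1.0
    return total / len(accelerations)


# Hypothetical samples in m/s^2; negative values represent braking.
accelerations = [0.5, -1.0, 3.0, -2.25]
score = acceleration_violation_score(accelerations)
```

Here the samples 3.0 and -2.25 exceed the threshold by 100% and 50%, giving a score of (1.0 + 0.5) / 4 = 0.375.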

Lane deviation

Lane deviation is a measure of traffic safety which represents the lateral stability of the vehicle. Safe driving behaviour in this category is defined as keeping the vehicle in the centre of the lane during the whole driving period. Lane deviation is calculated as the standard deviation of lane position (SDLP) as

$$ \mathrm{SDLP}=\sqrt{\sum \limits_{i=1}^N\frac{{\left({x}_i-\overline{x}\right)}^2}{N-1}}, $$

where \( x_i \) are the measurements of the lateral vehicle position and \( \overline{x} \) the mean lane position.

Driving in the centre of the lane results in low values of the standard deviation, whereas higher values represent unstable lateral vehicle control and indicate swerving during the observed time period.
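SDLP as defined above is the sample standard deviation (with N − 1 in the denominator) of the lateral position. The following sketch spells out the formula term by term; the function name `sdlp` and the sample values are assumptions for illustration.

```python
import math


def sdlp(lateral_positions):
    """Sample standard deviation of lateral lane position (N - 1 denominator)."""
    n = len(lateral_positions)
    if n < 2:
        raise ValueError("at least two samples are required")
    mean_position = sum(lateral_positions) / n
    squared_deviations = sum((x - mean_position) ** 2 for x in lateral_positions)
    return math.sqrt(squared_deviations / (n - 1))


# Hypothetical lateral offsets from the lane centre, in metres.
positions = [0.1, -0.1, 0.3, -0.3]
value = sdlp(positions)
```

The result is identical to what `statistics.stdev` from the Python standard library returns for the same data, since both use the N − 1 denominator.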

3.5.2 Usability and user experience

Usability (cognitive load and task completion times)

Although usability is usually observed through effectiveness, efficiency and satisfaction [5], in this study we only observed efficiency (through task completion times) and effectiveness (cognitive load).

The task completion time was defined as the difference between the moment when the participant was instructed to start performing the task and the moment when the correct function in the IVIS was selected. The instructions were given vocally by the experimenter.

The cognitive load was assessed by measuring the changes in the participant’s pupil size. A previous study showed that in a driving simulator with stable light conditions (dark room, constant light source and three simulator screens), different levels of the driver’s cognitive load can be assessed even with a low-cost eye tracking device [3]. The illumination of the driver’s visual field was homogeneous, and the TV screens emitted homogeneous illumination across all changing scenes, because the simulated driving scenes were all in simulated bright daylight. Therefore, the pupil data could not be significantly confounded by reactions to changing light, and the pupil size reflects the reaction to different levels of cognitive load. In addition, the output of the IVIS was presented on a HUD, which minimized the number of glances away from the screens. In this study, we observed the pupil diameter of participants in all conditions and compared the results among the different input designs within subjects.

User experience

The User-Experience Questionnaire (UEQ) [36] was used to evaluate the user experience of each interaction design, as it has proved to be a simple way to evaluate user experience [21], is available in 20 languages (results are less affected by semantic differentials), has been used in other related studies [12, 32, 37] and features a benchmark [17, 33]. This allows the results of this study to be compared with other available and future work and reveals whether the evaluated product offers a sufficiently positive user experience to be successful on the market. The questionnaire evaluates six different areas: attractiveness, perspicuity, efficiency, dependability, stimulation and novelty. These areas are evaluated with 26 questions, in each of which the participant chooses on a scale from −3 to 3 between two opposite descriptions of one feature. For example, the pairs for evaluating attractiveness include pleasant-unpleasant, attractive-unattractive and friendly-unfriendly. Since this study was performed in Slovenia with Slovenian-speaking participants, we used the Slovenian version. User experience was evaluated immediately after each trial to assess the interface tested in that particular trial. The participants filled out the UEQ in an HTML web form presented on the central TV of the driving simulator.

3.6 Experiment procedure

Upon arrival, the participants were first informed about the purpose of the study and asked to sign a consent form, confirming that they understood the purpose of the study and agreed voluntarily to take part in it. After the introduction, the participants were asked to fill in a demographic questionnaire and a driving experience questionnaire. Before the start of the experiment, each participant performed a 5 min test drive in the simulator to become acquainted with the simulation environment. During the experiment, all participants were asked to drive safely and with the same attitude as they would drive a real vehicle. They were given no specific information on how safety would be assessed or how their driving performance would be evaluated.

After this warm-up procedure, the participants started the study, which was divided into three trials, one for each input interface. For each input device, they were given detailed instructions on its use and 3 min to test the device and perform some demo interactions with the system. This time also served to explore and become familiar with the menu structure and the HUD output interface.

Each trial lasted approximately 5 min and included a set of secondary tasks of varying difficulty (see Table 2 for details). The participants started the first task 1 min after the beginning of the drive and finished the last task 30 s before the end of the drive. There were 30-s breaks between the secondary tasks, during which participants were asked to focus solely on their driving performance.

Table 2 Experiment procedure for one participant

Immediately after each trial, the participants filled out the UEQ, evaluating their user experience with the tested interaction design. After repeating this process for all three interfaces, the study was concluded and the participants were asked to provide some general comments about the study and the experimental setup. The complete procedure for a participant is shown in Table 2.

4 Results

4.1 Driving safety

Driving safety was observed through three variables: speed limit violation, acceleration violation, and SDLP. Each score was calculated separately for intervals where participants only drove and intervals where they drove and performed additional tasks using one of the evaluated interaction designs.

Additionally, safety distance violations were observed in the country road scenario, where the participants’ task was to follow a leading vehicle. We did not find any statistically significant differences in safety distance violations meaning that participants were successful in keeping the recommended safety distance (at least 2 s of time-to-collision).

4.1.1 Speed limit violation

We observed potential changes in speed limit violations between different interaction designs and different traffic conditions.

A two-way ANOVA was conducted to compare the main effects of performing secondary tasks and the interface design as well as the interaction effect between those two on the speed limit violation. Performing tasks had two levels (only driving, driving + task) and the interface design had three levels (free hand, touchpad, buttons) of interaction. No effect was statistically significant at the .05 significance level, except the effect of performing tasks vs. just driving in the city scenario.
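For readers unfamiliar with the procedure, the F ratios of a two-way ANOVA can be sketched in plain Python as follows. This is an illustration under stated assumptions and not the analysis script used in the study: it assumes a balanced design (the same number of observations in every cell) and independent observations, and the function name `two_way_anova_f` and the sample data are hypothetical. The p-values would additionally require the F distribution, which is omitted here.

```python
from statistics import mean


def two_way_anova_f(cells):
    """F ratios for a balanced two-way design.

    cells[i][j] holds the observations for level i of factor A
    (e.g. only driving vs. driving + task) and level j of factor B
    (e.g. free hand, touchpad, buttons).
    """
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    if n < 2 or any(len(row) != b or any(len(c) != n for c in row) for row in cells):
        raise ValueError("a balanced design with n >= 2 per cell is required")

    grand = mean(x for row in cells for cell in row for x in cell)
    a_means = [mean(x for cell in row for x in cell) for row in cells]
    b_means = [mean(x for row in cells for x in row[j]) for j in range(b)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    # Sums of squares for both main effects, the interaction and the error term.
    ss_a = b * n * sum((m - grand) ** 2 for m in a_means)
    ss_b = a * n * sum((m - grand) ** 2 for m in b_means)
    ss_ab = n * sum(
        (cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    ss_error = sum(
        (x - cell_means[i][j]) ** 2
        for i in range(a) for j in range(b) for x in cells[i][j]
    )

    df_a, df_b = a - 1, b - 1
    df_ab = df_a * df_b
    df_error = a * b * (n - 1)
    ms_error = ss_error / df_error
    return {
        "A": ss_a / df_a / ms_error,
        "B": ss_b / df_b / ms_error,
        "AxB": ss_ab / df_ab / ms_error,
        "df_error": df_error,
    }


# Hypothetical data: 2 levels of A, 3 levels of B, 2 observations per cell.
cells = [
    [[1, 3], [2, 4], [3, 5]],
    [[3, 5], [4, 6], [5, 7]],
]
result = two_way_anova_f(cells)
```

Under these assumptions the error degrees of freedom are a · b · (n − 1). For example, 15 observations in each of the 2 × 3 cells give 6 × 14 = 84, which matches the F(1, 84) and F(2, 84) reported for the country road.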

On the city road, the main effect of performing tasks yielded an F ratio of F(1, 60) = 6.272, p < .05, indicating a significant difference between only driving (M = 0.0045, SD = 0.00683) and driving + task (M = 0.0012, SD = 0.00254). The main effect of interface design yielded an F ratio of F(2, 60) = 0.172, p > .05, indicating that the effect of interaction design was not significant: free hand (M = 0.0023, SD = 0.00579), touchpad (M = 0.0030, SD = 0.00575) and buttons (M = 0.0033, SD = 0.00471). The interaction effect was not significant, F(2, 60) = 0.161, p > .05.

On the country road, the main effect of performing tasks yielded an F ratio of F(1, 84) = 1.826, p > .05, indicating no significant difference between only driving (M = 0.0005, SD = 0.00157) and driving + task (M = 0.0002, SD = 0.00079). The main effect of interface design yielded an F ratio of F(2, 84) = 0.766, p > .05, indicating that the effect of interaction design was not significant: free hand (M = 0.0006, SD = 0.00164), touchpad (M = 0.0002, SD = 0.00101) and buttons (M = 0.0003, SD = 0.00099). The interaction effect was not significant, F(2, 84) = 1.182, p > .05 (Figs. 5 and 6).

Fig. 5 Speed limit violation on a country road (comparing three different interaction designs in two conditions – driving and driving + performing a task on the IVIS)

Fig. 6 Speed limit violation in the city (comparing three different interaction designs in two conditions – driving and driving + performing a task on the IVIS)

4.1.2 Acceleration violation

The results for all conditions showed acceleration violation of less than 0.05%. The model used for the evaluation of driving performance considers such results as very safe driving behaviour. There were no significant differences for acceleration violation between trials with different interaction designs. Similarly, no significant differences were found between the country and city driving conditions.

4.1.3 Standard deviation of lane position

SDLP for trials with each interaction design is shown in Fig. 7 (country road) and in Fig. 8 (city road).

Fig. 7 SDLP on a country road (comparing three different interaction designs in two conditions – driving and driving + performing a task on the IVIS)

Fig. 8 SDLP in the city (comparing three different interaction designs in two conditions – driving and driving + performing a task on the IVIS)

A two-way ANOVA was conducted for each group to compare the main effects of performing secondary tasks and the interface design, as well as the interaction effect between the two, on SDLP. Performing tasks had two levels (only driving, driving + task) and the interface design had three levels (free hand, touchpad, buttons). We found statistically significant effects of performing tasks and of interface design on SDLP at the .05 significance level on the country road, but no significant effects on the city road.

On the city road, the main effect of performing tasks yielded an F ratio of F(1, 60) = 0.343, p > .05, indicating no significant difference between only driving (M = 0.4810 m, SD = 0.11378 m) and driving + task (M = 0.5016 m, SD = 0.16007 m). The main effect of interface design yielded an F ratio of F(2, 60) = 0.181, p > .05, indicating that the effect of interaction design was not significant: free hand (M = 0.5029 m, SD = 0.13488 m), touchpad (M = 0.4774 m, SD = 0.13175 m) and buttons (M = 0.4936 m, SD = 0.15231 m). The interaction effect between the two independent variables on SDLP was not significant, F(2, 60) = 0.228, p > .05.

On the country road, the main effect of performing tasks yielded an F ratio of F(1, 84) = 3.946, p < .05, indicating a significant difference between only driving (M = 0.2352 m, SD = 0.05441 m) and driving + task (M = 0.2641 m, SD = 0.08759 m). The main effect of interface design yielded an F ratio of F(2, 84) = 5.737, p < .05, indicating that the effect of interaction design on SDLP was significant: free hand (M = 0.2845 m, SD = 0.08488 m), touchpad (M = 0.2314 m, SD = 0.06494 m) and buttons (M = 0.2330 m, SD = 0.05547 m). The interaction effect between the two independent variables on SDLP was not significant, F(2, 84) = 1.389, p > .05.

Post hoc Bonferroni comparisons indicated that the mean SDLP for the trials with the free-hand system (M = 0.2845 m, SD = 0.08488 m) was significantly higher than with the touchpad (M = 0.2314 m, SD = 0.06494 m), p < .05, and with the buttons-based system (M = 0.2330 m, SD = 0.05547 m), p < .05. There were no statistically significant differences in SDLP between the touchpad and the buttons system, p > .05.
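The Bonferroni correction used in such post hoc comparisons multiplies each raw p-value by the number of comparisons (here three pairs of interfaces) and caps the result at 1. The sketch below illustrates only this adjustment; the raw p-values are hypothetical and are not the values obtained in the study.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: p * m, capped at 1, for m comparisons."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for three pairwise comparisons:
# free hand vs. touchpad, free hand vs. buttons, touchpad vs. buttons.
raw = [0.01, 0.012, 0.5]
adjusted = bonferroni(raw)
significant = [p < 0.05 for p in adjusted]
```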

4.2 Usability and user experience

4.2.1 Cognitive load

The pupil data were first pre-processed to remove outliers (extreme values) and erroneous values (zero and null values). The analysis used the mean pupil size collected during all time periods when tasks were performed. For the statistical tests, the mean of the left and right pupil sizes was used.
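The pre-processing steps above could be sketched as follows. The text does not specify the outlier criterion, so the three-standard-deviation cut-off used here is an assumption, as are the function name `mean_pupil_size` and the choice to drop a sample when either eye has an invalid reading.

```python
from statistics import fmean, pstdev


def mean_pupil_size(left, right, z_max=3.0):
    """Mean pupil diameter (mm) over a recording, after cleaning.

    left, right: per-sample diameters for each eye; None or 0 marks an
    erroneous reading. z_max is an ASSUMED outlier threshold in standard
    deviations; the source does not state which rule was used.
    """
    # Average both eyes per sample, skipping null and zero readings.
    samples = [
        (l + r) / 2
        for l, r in zip(left, right)
        if l is not None and r is not None and l > 0 and r > 0
    ]
    if not samples:
        raise ValueError("no valid pupil samples")

    # Drop extreme values that lie more than z_max SDs from the mean.
    mu, sigma = fmean(samples), pstdev(samples)
    kept = [s for s in samples if sigma == 0 or abs(s - mu) <= z_max * sigma]
    return fmean(kept)
```

Calling the function once per participant and condition would yield one mean pupil size per cell of the design, which is the quantity entered into the statistical tests.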

A two-way ANOVA was conducted to compare the main effects of performing secondary tasks and of the interface design, as well as the interaction effect between the two, on pupil size. Performing tasks had two levels (only driving, driving + task) and the interface design had three levels (free hand, touchpad, buttons). No effect was statistically significant at the .05 significance level, except the effect of performing tasks in the country road scenario.

The main effect of performing tasks on the city road yielded an F ratio of F(1, 59) = 0.347, p > .05, indicating no significant difference between only driving (M = 4.3886 mm, SD = 0.6215 mm) and driving + task (M = 4.4854 mm, SD = 0.67259 mm). The main effect of interface design yielded an F ratio of F(2, 59) = 0.124, p > .05, indicating that the effect of interface design was not significant: free hand (M = 4.3879 mm, SD = 0.64745 mm), touchpad (M = 4.4328 mm, SD = 0.65098 mm) and buttons (M = 4.4880 mm, SD = 0.66010 mm). The interaction effect between the two independent variables on pupil size was not significant, F(2, 59) = 0.019, p > .05.

The main effect of performing tasks on the country road yielded an F ratio of F(1, 83) = 9.960, p < .05, indicating a significant difference between only driving (M = 3.8268 mm, SD = 0.32999 mm) and driving + task (M = 4.0677 mm, SD = 0.37597 mm). The main effect of interface design yielded an F ratio of F(2, 83) = 0.419, p > .05, indicating that the effect of interface design was not significant: free hand (M = 3.9904 mm, SD = 0.37026 mm), touchpad (M = 3.9416 mm, SD = 0.40522 mm) and buttons (M = 3.9055 mm, SD = 0.34633 mm). The interaction effect between the two independent variables on pupil size was not significant, F(2, 83) = 0.106, p > .05.

In the next step, we focused solely on the driving periods with secondary tasks, again studying the changes in pupil size. A two-way ANOVA was conducted to compare the main effects of the driving scenario and of the interface design, as well as the interaction effect between the two, on pupil size. The scenario had two levels (country road, city road) and the interface design had three levels (free hand, touchpad, buttons). The effect of the driving scenario on pupil size was statistically significant at the .05 significance level, but the effect of interface design and the interaction effect were not significant.

The main effect of driving scenario yielded an F ratio of F(1, 527) = 9.960, p < .01, indicating a significant difference between city road (M = 4.5159 mm, SD = 0.69505 mm) and country road (M = 4.0519 mm, SD = 0.41947 mm). The main effect of interface design yielded an F ratio of F(2, 527) = 0.184, p > .05, indicating that the effect of interface design was not significant: free hand (M = 4.2827 mm, SD = 0.55685 mm), touchpad (M = 4.2306 mm, SD = 0.61644 mm) and buttons (M = 4.2252 mm, SD = 0.61681 mm). The interaction effect between the two independent variables on pupil size was not significant, F(2, 527) = 2.359, p > .05.

The significant increase in pupil size between the driving and driving + task conditions on the country road can be seen in Fig. 9. In Fig. 10, however, there are no differences between the driving and driving + task conditions on the city road.

Fig. 9 Mean pupil size in countryside driving (comparing three different interaction designs in two conditions – driving and driving + performing a task on the IVIS)

Fig. 10 Mean pupil size in city driving (comparing three different interaction designs in two conditions – driving and driving + performing a task on the IVIS)

4.2.2 Task completion time

A two-way ANOVA was conducted to compare the main effects of the driving scenario and of the interface design, as well as the interaction effect between the two, on task completion time. The scenario had two levels (country road, city road) and the interface design had three levels (free hand, touchpad, buttons). All effects on task completion time were statistically significant at the .05 significance level.

The main effect of driving scenario yielded an F ratio of F(1, 36) = 18.164, p < .05, indicating a significant difference between city road (M = 21.43 s, SD = 10.369 s) and country road (M = 15.40 s, SD = 7.042 s). The main effect of interface design yielded an F ratio of F(2, 36) = 51.785, p < .01, indicating that the effect of interface design was significant: free hand (M = 27.58 s, SD = 9.407 s), touchpad (M = 17.64 s, SD = 2.892 s) and buttons (M = 10.01 s, SD = 2.640 s). The interaction effect between the two independent variables on task completion time was also significant, F(2, 36) = 5.072, p < .05. Bonferroni post hoc comparisons indicated significant differences between all interface designs (p < .01) (Fig. 11).

Fig. 11 Task completion times

4.2.3 User experience

The UEQs were filled out by each participant after finishing the trial with each of the evaluated interaction designs. In total, we collected six different groups of results and calculated the averages with the UEQ Data Analysis Tool provided on the UEQ homepage. Figures 12, 13, and 14 show the benchmark comparison for each device. The black line is the UEQ score for each set of questions. The coloured bars represent the benchmark categories. The reference values are obtained from a comparative database, available in the UEQ Data Analysis Tool, which contains data from 9905 subjects from 246 different studies.

Fig. 12 UEQ scores for free hand interaction: a country road; b city road

Fig. 13 UEQ scores for touchpad interaction design: a country road; b city road

Fig. 14 UEQ scores for buttons interaction design: a country road; b city road

Figures 12, 13, and 14 show the scores of the UEQ scales. Free-hand interaction has negative results in terms of Attractiveness, Perspicuity, Efficiency and Dependability, but high scores in terms of Stimulation and Novelty. The touchpad interface has moderately positive results on all scales. In contrast to those two interfaces, the buttons interface has very high scores in terms of Attractiveness, Perspicuity, Efficiency and Dependability, but low scores in terms of Stimulation and Novelty.

4.3 Result summary

In this section, we present the summary of the most important results. Table 3 summarizes the results of different interaction designs in varying driving conditions while Table 4 focuses on the impact of secondary tasks.

Table 3 Effect of interaction design in different driving conditions
Table 4 Effect of secondary tasks in different driving conditions

5 Discussion and conclusions

In this study, we evaluated three different types of input modalities for interaction with an in-vehicle infotainment system. A head-up display was used to present visual information to the driver to reduce visual distraction and eyes-off-the-road events. We observed the impact of three different input modalities on driving performance, input efficiency, and user experience. Additionally, we tested the proposed interfaces in two different driving environments varying in traffic conditions and driving complexity.

The results of the driving performance showed differences between the tested interfaces only in the city, which represented a more demanding driving scenario. In the city driving scenario, there was a significantly higher standard deviation of lane position for trials with the free-hand input design compared to the buttons-based system and the touchpad. The free-hand interaction proved to be less safe and resulted in much worse vehicle control. Although both the free-hand and touchpad interaction designs required the driver to take one hand off the wheel, a much higher standard deviation of lane position was observed only for the free-hand system. This could be a result of the interaction design, which requires drivers to perform different mid-air gestures that involve not only taking a hand off the wheel but also certain movements of the driver's entire torso. The speed violation results, on the other hand, are noteworthy because there were more violations in the conditions without secondary tasks. Rather than as speeding in those conditions, however, we interpret this as a consequence of drivers slowing down while performing secondary tasks and directing their attention to the interface.

In the country road driving scenario, the results did not reveal any significant differences in any of the observed parameters related to driving performance. Both lateral and longitudinal (speed and acceleration) control remained relatively stable and enabled safe driving. The mean speed violations were below 1%, which means less than 1 km/h of average violation of the speed limit.

The efficiency of different input designs was measured as task completion time for different secondary tasks as well as cognitive workload induced by different interaction designs. The latter was assessed through changes in pupil size for each individual driver. The results showed that the driver’s cognitive load increased significantly in the time intervals with secondary tasks (i.e. interaction with IVIS through one of the three interface designs) compared to the time intervals with just the driving task. Interestingly, this effect was observed only in the simple driving scenario while there were no significant differences between driving and driving + IVIS interaction in the complex city-based scenario.

Furthermore, the results on task completion times also showed significant differences among the tested interfaces. The interface with buttons on the steering wheel proved to be the fastest among all input designs in both driving scenarios. The longest task completion times, on the other hand, were recorded for the free-hand system. The results for the free-hand interaction design were, however, also influenced by the design itself: the confirmation gesture was holding the palm still for 1000 ms, which prolonged the completion time of every task performed with this interaction design.

There was an increase in task completion times for all of the interfaces when the tasks were performed in more complex city driving conditions compared to the simpler country road scenario. Again, the highest increase in task completion times when comparing different driving conditions was found for the free-hand input design.

The results of the UEQ showed that there were significant differences in the perceived user experience when comparing different input designs. The attractiveness score of the touchpad and free-hand system was much higher than the attractiveness score of the buttons design. The participants gave the highest score for the free-hand design also for the stimulation and novelty categories, which represent hedonic quality and non-task related quality aspects. The touchpad was evaluated slightly worse and the buttons had a neutral score in this category. However, perspicuity, efficiency and dependability scores, which represent the pragmatic quality of the system and describe task-related quality, were best for the buttons design. The touchpad had a lower, but still comparable result, whereas free-hand interaction had the poorest score in these categories.

The results from this study suggest that using a buttons-based input design on the steering wheel is the most efficient type of interaction compared to the touchpad and free-hand input designs. On the other hand, the free-hand interaction design proved to be very attractive to the participants. However, the efficiency of that device was very poor. This could be because the free-hand input design was the only interface that did not provide haptic feedback. The touchpad and the buttons input designs provided haptic feedback for every selection, which may have resulted in fewer glances at the HUD compared to the free-hand input design, where drivers probably looked more often to make sure that their commands (each part of the tasks) were completed successfully. Such distractions, however, could also be eliminated for free-hand interactions by using free-hand devices that provide haptic feedback generated using ultrasound [8, 11].

Besides the impact of secondary tasks on driving performance and cognitive load, the road type and geometry can also have an effect on those variables, as already shown in a study by Jeong et al. They showed that SDLP and cognitive load are affected by an interaction of road geometry and secondary task characteristics [15]. Kun, on the other hand, published results that report just the opposite: higher SDLP on highways than in a city [18].

From this study, we can conclude that although newer input designs can be very attractive, their implementation into the vehicle's infotainment system should be done wisely and carefully. Furthermore, the results of this study showed that the use of conventional buttons on the steering wheel still shows the best results for interacting with the infotainment system in both easy (country road) and more complex (city road) driving situations. However, it is important to also take into consideration the fact that many vehicles today have buttons on the steering wheel for interaction with the IVIS, and that many of the participants could have been more familiar with this interaction design than with the newer designs. Although we tried to minimize the novelty effect by first presenting each interaction design and showing the participants how to use it, prior knowledge may still have had an effect on the results. An additional limitation of this study could be the use of a driving simulator instead of a real vehicle. However, simulators provide a more controllable and repeatable environment without putting the participants at risk, which was one of the main reasons why this study (like many related studies) was performed in a simulated environment. Longer driving trials could reduce the simulator novelty effect, but could also increase the risk of simulation sickness and of participants losing motivation to complete the tasks in the study.

Future research will take into consideration all the findings from this study, especially possible improvements to specific interface implementations, such as the implementation of multimodal interaction and more diverse gesture recognition [17, 33] or active feedback [11]. We see great potential particularly in a better implementation of our free-hand interaction system, which proved to be very attractive and well accepted by users, but lacked robustness and efficiency. Our study showed that each interface design has its own pros and cons, which strongly depend on the current driving conditions and the driver's mental state. Therefore, we also want to combine these interface designs into a multimodal input interface which will enable drivers to freely select the most convenient interaction mode at any time. Although safety should play the most important role when selecting the most appropriate interface in different driving conditions, the driver's preferences and personal judgment should also be considered whenever possible.