1 Introduction

The use of mobile phones in cars has risen significantly in the last decade. This rise is attributed to the need for drivers to feel connected to the outside world [1]. Vehicle information exchange takes place inside the vehicle, typically in the form of conversations, and outside the vehicle, enabled by vehicle telematics. In both cases, speech is the main channel of these interactions. When driving with passengers, a driver engages in conversations that establish social communication in the vehicle. When driving alone, drivers have also been found to feel the urge to use their mobile phones [2], performing socializing activities such as calling, texting, or checking email and social networks. Vehicle manufacturers are continuously integrating features that help drivers stay socially connected to the outside world. Studies show that use of these features increases the driver’s cognitive load and impairs driving performance [3], causing slow reaction time, poor judgment, and poor lane maintenance due to reduced lateral and longitudinal vehicle control and reduced critical event detection [4–7]. Driver distraction is defined as the diversion of a driver’s attention away from activities that are critical for safe driving. Such distraction is caused by a competing activity [8], object, or event within the internal or external vehicle environment [9]. More than 34 US states and over 52 countries have passed distracted driving legislation banning the use of mobile devices while driving. But banning the direct use of mobile devices while driving has only caused further distraction and increased dangers. Drivers who use their mobile phones while driving take extra measures to hide handheld operation for fear of being caught, increasing the chances of an accident. The Federal Communications Commission (FCC) is working to recognize and help develop technologies that could potentially reduce driver distraction [10].

This paper introduces Voiceing™, a voice-activated application that uses speech to send and receive voice messages and to send emails while driving, filling the void in vehicular social communication safely. This research investigates four different conditions of socializing inside the vehicle vs. outside of the vehicle while driving a simulator: (1) using Voiceing™ to send and receive voice messages; (2) interacting with a passenger, which represents an in-vehicle human interaction condition; (3) interacting with a familiar person over the phone, which represents a human interaction condition outside of the vehicle; and (4) interacting with an unfamiliar person at a call center, which represents an interaction with a trained professional. A fifth condition, texting while driving, serves as a comparison. The rest of this paper presents previous work on distracted driving, introduces the Voiceing™ technology, explains the methods and tools used to measure driving distraction, and finally describes the results comparing the above-mentioned modalities.

2 Distracted Driving

Driver distraction occurs when a driver is delayed in recognizing information that is necessary to safely accomplish the driving task [11]. The “100 car naturalistic study” found that secondary tasks were performed during 40 % of all trips [11]. Many drivers find it difficult to resist the temptation of using mobile devices to stay connected while driving. Text messaging, for example, has been found to affect lateral control and reaction time, degrading the latter by up to 35 % [12]. In parallel, the attention given to the use of In-vehicle Infotainment Systems (IVISs) has grown considerably and raised safety concerns [13], and studies have demonstrated that the use of IVISs contributed to 25–30 % of crash risk [14].

The cause of such distractions relates to the fact that humans have a limited amount of available cognitive resources, according to the Dual-Task Paradigm [15]. In driving conditions, a great deal of those resources is allocated to the primary driving task, leaving little capacity for secondary actions. However, the multiple resource theory of attention has shown that different pools of resources can be used in parallel, allowing multitasking when the actions performed draw on different modalities [16]. This principle supports the design of hands-free, eyes-free vehicular interfaces. Auditory interfaces are currently the preferred medium for in-vehicle warnings [17] and menu navigation [18]. Their distraction effects have been shown to be lower when voice interfaces are compared to manual interfaces [19]. This paper analyzes the effect of using Voiceing™ on driving performance as opposed to the effects of talking to a person inside the vehicle (passenger), talking to a person outside the vehicle (call center), and texting while driving.

3 Voiceing™

Since one of the speech conditions investigated in this research is voice messaging, this section explains Voiceing™, the voice user interface benchmarked in this study. Voiceing™ is a hands-free, eyes-free voice-activated system developed by the Human-Centered Computing lab at Clemson University. It allows for sending and receiving short messages using speech. The uniqueness of this tool lies in its independence from any particular mobile platform: it can be used with any telephone number, mobile or landline. Users of Voiceing™ can send voice messages to, and receive replies from, non-Voiceing™ users. The system offers diverse features, such as using voice to send emails and short voice messages, set reminders, and make calls. To interact with it, a user calls their unique Voiceing™ number and interacts with the speech interface. Once a user creates an account, s/he can set the delivery of voice replies to one or multiple delivery modalities according to preference. Participants used a dummy account created for the purpose of this study. The interaction begins when the Voiceing™ system prompts the participant with “how may I help you?”. The participant then says “send a text to John Doe”. The system prompts them to record their message, verifies it with the participant, and then sends it out to the contact(s). When the user receives a voice message, the system calls the participant. When the call is answered, the system prompts “you have a message from John Doe” and then plays the voice recording of the message.

4 Experiment Design

This study examines the effects of various social communication methods on driving performance and distraction in a driving scenario on a simulator. Different conditions were investigated and compared to a driving-only baseline and to a texting-while-driving condition, since texting has already been shown to be highly distracting. The conditions were: (1) using Voiceing™ to send and receive voice messages, (2) interacting with a vehicle passenger, (3) interacting with a familiar person over the phone, and (4) interacting with an unfamiliar person at a call center.

4.1 Apparatus and Study Logistics

The driving simulator ran on a desktop PC running Windows 7. The driving controls used in the study consisted of (1) a Logitech G27 dual-motor force feedback racing wheel mounted on a table; (2) a 40 in. HD LCD screen positioned in line with the racing wheel and directly in front of the participant; and (3) steel gas, brake, and clutch foot pedals placed on the ground under the table. The simulator was set up with automatic transmission. An adjustable desk chair allowed the seat height to be set to each participant’s liking. A second desktop PC running Windows 7 was used in connection with a Sony WCS-999 wireless microphone system that the participant wore during the drive. Skype version 4.2, a VoIP software application, was used to place telephone calls. Figure 1 depicts the experiment setup. The software used to collect performance data was the Lane Change Task simulation and analysis software (LCT) [20].

Fig. 1. Experiment setup

4.2 Demographics

Participants were recruited from the university campus and were required to be licensed drivers. A total of 127 participants, 53 females and 74 males, took part in this study. Participants ranged in age from 20 to 59, with a mean age of 25.1 years (SD = 10.19). Only one percent of the participants reported a disability, and it did not affect driving skills. All participants held a driver’s license, with a mean driving experience of 7.62 years. Participation in the study was voluntary. Subjects received $10 for their participation.

4.3 Design

The experiment used a single factor, repeated measures within-subjects design. The independent variable examined was the method of communication as a secondary task while driving the simulator. Driver distraction was measured as a combination of driving performance measures and a subjective cognitive load assessment. User experience measures were also collected in the post-experiment survey. All participants were fluent in English; 70 % of them were native speakers and 30 % spoke English as a second language.

4.4 Secondary Task Conditions

This study analyzed the interaction of a driver in four hands-free conditions, each involving a different modality of interaction while driving, plus a texting condition for comparison. In the hands-free conditions, participants were asked to press a button on the steering wheel in order to engage with a person or a system. The conditions were:

  1. Interacting with Voiceing™: Participants were asked to interact with a pre-determined contact, R3 in Fig. 2, an experimenter not present in the same room, by sending and receiving voice messages. Participants called the Voiceing™ system by pressing a button on the steering wheel and interacted with the system, sending and receiving voice messages until they reached the end of the track.

    Fig. 2. Researchers and participant placement

  2. Interacting with a familiar person on the phone: Participants were instructed to call, and have a friendly conversation with, Michael Jackson for the length of a track. We used the name Michael Jackson for ease of remembering. Researcher R3 in Fig. 2 played Michael Jackson. Participants were instructed to discuss any topic that came to mind. R3 stimulated the conversation by asking further questions such as “what did you eat this morning” to keep the participant engaged in the conversation.

  3. Interacting with a human passenger: Participants were presented with a problem scenario and were asked to collaborate with their passenger to find a solution. Researcher R2 in Fig. 2 played the passenger and sat next to the driver. R2 used a web-based system to look up answers to the participant’s specific questions. When presented with the problem, it was up to the participant to explain it to R2 and seek help solving it.

  4. Interacting with a call center operator: The participant was presented with a problem scenario and was asked to consult a call center operator for information on how to solve the problem at hand. The participant initiated the call by pressing the button on the steering wheel. As shown in Fig. 2, the call center operator, R3, was able to provide the driver with road assistance information according to the questions asked.

  5. Texting while driving: The participant was asked to send a continuous series of text messages from their own mobile phone while performing the LCT task. The messages were sent to R3, who engaged in a text message conversation with the driver.

5 Procedure

First, participants were asked to consent to participating in the study. Once consent was obtained, participants were provided with information about the study and asked to complete a pre-survey that collected demographic data, as well as previous experience operating the simulator, using voice technologies, and texting habits. Next, participants practiced driving the simulator. A practice session involved driving three consecutive tracks on the LCT simulator to become familiar with how to operate it. When the practice was completed, an interaction task was randomly assigned to each participant to counterbalance the conditions. All conditions required participants to keep performing the lane changing and maneuvering task while they carried out the requirements of each condition.

An example of a problem scenario presented in the passenger and call center conditions is “This is your first time operating the vehicle and you don’t know how to place a call. Consult the passenger”. While driving the simulator, the participant asked the partner (passenger or call center agent) questions like “how do I make a phone call”. The interlocutor then queried a web-based search engine for the correct answer and read it back to the participant. If the question had no answer, the participant was asked to rephrase it. This process was repeated with various scenarios until the end of the track was reached.

After all tasks were completed, the participant was asked to complete a post-survey questionnaire. Cognitive workload was assessed with the NASA-TLX, a multi-dimensional 20-point Likert scale that measures six workload dimensions (mental demand, physical demand, temporal demand, performance, effort, and frustration level). The participant also completed a usability questionnaire using a modified version of the technology acceptance model, TAM [21].

6 Results

6.1 Driver Performance Results

A within-subject baseline was computed in the training phase, and each secondary task was compared against the individual baseline to obtain driver performance metrics. All performance metrics were extracted by running the raw LCT data files through a data mining script in NI DIAdem [22]. Metrics collected following this procedure included mean lateral deviation, standard deviation (SD) of the lateral deviation, reaction time, and wrong lane changes.
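
As an illustration of how the first two metrics can be derived, the following minimal sketch computes the mean and SD of lateral deviation from paired position samples. It is not the DIAdem script used in the study: the function name, the input format (actual and reference lateral positions in metres, sampled at the same instants), and the example values are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def lateral_deviation_metrics(actual, reference):
    """Return (mean, SD) of the absolute lateral deviation.

    `actual` and `reference` are hypothetical inputs: lateral positions of
    the vehicle and of the ideal path, sampled at the same instants.
    """
    if len(actual) != len(reference) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    deviations = [abs(a - r) for a, r in zip(actual, reference)]
    # Population SD is used here; a sample SD (statistics.stdev) is an
    # equally plausible choice and the source does not say which was used.
    return mean(deviations), pstdev(deviations)


# Made-up samples: the driver drifts, then lands on the new lane centre.
actual = [0.0, 0.5, 1.0, 3.5]
reference = [0.0, 0.0, 0.0, 3.5]
mean_dev, sd_dev = lateral_deviation_metrics(actual, reference)
print(f"mean deviation = {mean_dev:.3f} m, SD = {sd_dev:.3f} m")
```

Comparing each task against the individual baseline then amounts to subtracting the baseline value of each metric from the task value for the same participant.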

Results on mean lateral deviation, shown in Fig. 3, show lower mean values for the driving-alone baseline than for the secondary tasks.

Fig. 3. Mean lateral deviation on the LCT (error bars represent 95 % confidence interval)

A one-way analysis of variance (ANOVA) was performed to compare the differences. The results (p < 0.001) indicated significant differences across secondary tasks. A Bonferroni test for pairwise comparisons found significant differences between the baseline and texting tasks (p = 0.007) as well as between the baseline and Call Center tasks (p = 0.001). No significant difference was found between the baseline and Voiceing™.
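
For readers unfamiliar with the procedure, the sketch below shows the arithmetic behind a one-way ANOVA F statistic and a Bonferroni adjustment of pairwise p-values. It uses made-up numbers, not the study data, and it stops at the F statistic because turning F into a p-value requires an F distribution, which the Python standard library does not provide. The function names are hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Illustrative data only: three conditions, three observations each.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(groups)
adjusted = bonferroni([0.01, 0.2, 0.5])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The Bonferroni adjustment is deliberately conservative: with many pairwise comparisons it lowers the chance of a false positive at the cost of statistical power.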

Figure 4 shows that the mean values of the standard deviation (SD) of lateral deviation were lowest during the use of Voiceing™ and, surprisingly, during the texting task. These results suggest that during these two tasks participants were focusing on the interaction and kept tighter control of the steering wheel than during the baseline. The phone and passenger conversations resulted in the highest SD values, suggesting looser control of the vehicle during these tasks. However, the ANOVA reported no statistically significant differences between tasks.

Fig. 4. SD of lateral deviation on the LCT (error bars represent 95 % confidence interval)

Reaction time in the LCT is calculated from the time the sign becomes legible until the time the user starts the lane change maneuver. Given that the point of recognition for the LCT is known, the reaction time was calculated using steering angle measures. Figure 5 displays the mean reaction time values for each task. The fastest reaction times were achieved during the single-drive baseline, and reaction times increased as participants performed secondary actions. An ANOVA comparing mean reaction times showed significant differences between tasks. A subsequent post hoc test showed significance only for texting while driving (p = 0.038). The reaction time during conversation with the passenger was shorter than during phone conversations or Voiceing™ usage.
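
One plausible way to derive reaction time from steering angle is sketched below: the maneuver is taken to start at the first sample, at or after the moment the sign becomes legible, where the absolute steering angle reaches a threshold. The 5-degree threshold, the function name, and the sample data are assumptions for illustration; the source does not state how the onset of the maneuver was detected.

```python
def reaction_time(times, steering_angles, sign_time, threshold_deg=5.0):
    """Return seconds from `sign_time` to the start of the lane change.

    The start is approximated as the first sample at or after `sign_time`
    whose absolute steering angle reaches `threshold_deg` (an assumed
    value). Returns None if the threshold is never reached.
    """
    for t, angle in zip(times, steering_angles):
        if t >= sign_time and abs(angle) >= threshold_deg:
            return t - sign_time
    return None


# Made-up trace sampled every 0.5 s; the sign becomes legible at t = 0.5 s.
times = [0.0, 0.5, 1.0, 1.5, 2.0]
angles = [0.0, 1.0, 0.5, 2.0, 8.0]
rt = reaction_time(times, angles, sign_time=0.5)
print(f"reaction time = {rt} s")
```

A fixed threshold is sensitive to small steering corrections made during lane keeping, so in practice it would need to sit above the normal corrective range.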

Fig. 5. Reaction times on the LCT (error bars represent 95 % confidence interval)

Particularly illustrative behaviors were found in the count of wrong lane changes for each task, as can be seen in Fig. 6. A wrong lane change means the participant missed the sign. The mean values show that there were, on average, close to zero mistakes during the single-task drive. The number of wrong lane changes increased by an average of 40 % when performing secondary tasks such as conversations with a passenger or the operator of a call center. However, while performing secondary actions that required hand and eye attention, such as texting, the number of wrong lane changes increased dramatically. A Kruskal-Wallis ANOVA showed very high significance, p < 0.001, across tasks. The Tukey-Kramer test showed significant differences between the baseline and texting (p < 0.001). Texting also presented significant differences compared to the Voiceing™ and Call Center conditions.
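
Count data such as wrong lane changes are typically skewed and full of ties, which is why a rank-based test is a natural choice here. The following sketch computes the Kruskal-Wallis H statistic with the standard tie correction on made-up counts. It is an illustration of the test named above, not the analysis code used in the study, and the function name is hypothetical.

```python
from collections import Counter


def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic."""
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)
    # Assign each distinct value the average of the ranks it occupies.
    ranks = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    rank_term = sum(sum(ranks[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    # Tie correction; it is zero (and H undefined) if all values are equal.
    ties = Counter(pooled)
    correction = 1 - sum(t ** 3 - t for t in ties.values()) / (n ** 3 - n)
    return h / correction


# Illustrative counts of wrong lane changes for three conditions.
h_no_ties = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
h_ties = kruskal_wallis_h([[0, 0, 1], [0, 1, 2], [3, 4, 4]])
print(f"H without ties = {h_no_ties:.2f}, H with ties = {h_ties:.2f}")
```

Under the null hypothesis H is approximately chi-squared distributed with one fewer degrees of freedom than the number of groups, which is how a p-value would be obtained.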

Fig. 6. Wrong lane changes on the LCT (error bars represent 95 % confidence interval)

To further investigate the effects across modalities, the two driving actions of which the LCT is composed, lane keeping and lane changing, were analyzed separately. The effects of the different secondary actions were studied during these phases. Results for the mean lateral deviation are presented in Figs. 7 and 8. During the lane keeping phases, the driver had to concentrate on maintaining the vehicle in the center of the lane. The results (Fig. 7) showed that mean deviations were highest when the driver was interacting with people. Lane deviation was lowest for the Voiceing™ task, suggesting that participants kept better control than even during the single-drive baseline. The results were surprisingly similar for the texting condition. Based on the lateral deviation values, interacting with human partners proved more distracting during the lane keeping task. Statistical analysis reported very high significance in the group comparison, p < 0.001, and post hoc tests showed significant differences between the baseline and all tasks except Voiceing™. The conversational tasks involving human partners also differed significantly from Voiceing™. The analysis of SD during lane keeping showed the same pattern as the mean deviation values. In the lane changing phases, however, the results were the opposite of those for lane keeping (see Fig. 8). The secondary tasks that involved interaction with people showed better performance than those that involved interacting with technology. Furthermore, results for the phone conversation and the call center showed a slight improvement in vehicle control during lane changes compared to the baseline. Post hoc pairwise comparisons reported significant differences between the baseline and Voiceing™, p < 0.05, and very high significance, p = 0.001, between the baseline and texting. Likewise, significant differences were found between the human-to-human modalities and both texting and Voiceing™.

Fig. 7. Mean deviation during lane keeping (error bars represent 95 % confidence interval)

Fig. 8. Mean deviation during lane changing (error bars represent 95 % confidence interval)

6.2 Cognitive Load Results

After performing the secondary driving tasks, participants were asked to rate their cognitive workload using the NASA-TLX questionnaire [23]. The overall cognitive workload was calculated by giving equal weight to the six workload dimensions of the TLX questionnaire. Figure 9 shows the mean values per task on a normalized Likert scale from 0 (lowest) to 7 (highest).
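
The equal-weight calculation can be sketched as a plain average of the six dimension ratings followed by a rescaling. The linear rescaling from a 0 to 20 rating range onto the 0 to 7 range, the dictionary keys, and the example ratings are assumptions; the source does not specify how the normalization was done.

```python
DIMENSIONS = (
    "mental", "physical", "temporal", "performance", "effort", "frustration",
)


def overall_workload(ratings, scale_max=20, target_max=7):
    """Equal-weight TLX score, linearly rescaled to 0..target_max.

    `ratings` maps each dimension name to a rating in 0..scale_max.
    Both the rating range and the linear rescaling are assumptions.
    """
    raw = sum(ratings[d] for d in DIMENSIONS) / len(DIMENSIONS)
    return raw / scale_max * target_max


# Made-up ratings for one participant in one condition.
ratings = {
    "mental": 10, "physical": 4, "temporal": 8,
    "performance": 6, "effort": 12, "frustration": 8,
}
score = overall_workload(ratings)
print(f"overall workload = {score:.2f} on a 0-7 scale")
```

Equal weighting corresponds to what is often called the raw TLX, which skips the pairwise weighting step of the original NASA-TLX procedure.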

Fig. 9. Average TLX ratings for subjective cognitive load (error bars represent 95 % confidence interval)

The subjective cognitive load reported by the participants during the baseline drive was lower than when they were asked to perform a secondary task. The most demanding task was clearly texting, and the rest of the conditions presented similar mean values. Statistical analysis followed to determine the significance of the differences in mean values. The Tukey-Kramer procedure revealed highly significant pairwise differences, p < 0.001, between texting while driving and the baseline, and also between texting and all other tasks. While effort, temporal, physical, and mental demand showed results similar to the overall measure, performance and frustration levels clearly favored the Voiceing™ task above the rest. The highest performance score was reported for the Voiceing™ task, where participants believed they drove better than during the baseline task (Fig. 10).

Fig. 10. Subjective performance level reported on the TLX questionnaire

Inversely, the frustration levels reported during the different conditions were lowest for the baseline and Voiceing™ tasks (Fig. 11). The highest frustration was experienced while participants were chatting on the phone, which suggests that participants were aware of the difficulty of performing accurately on the LCT while on the phone.

Fig. 11. Subjective frustration level reported on the TLX

7 Conclusion

The results presented in this paper suggest that, among the ways of fulfilling the socializing needs of drivers, voice-activated applications have the least effect on driving performance. Even compared with human conversations in the vehicle and over the telephone, the results are favorable towards Voiceing™. The driving performance metrics were also supported by the participants’ behavioral intention of use: the majority rated Voiceing™ as a preferred communication medium, especially when compared to texting while driving. These results suggest that social activity in the vehicle is inherently spoken, even though many of the participants were very skilled at texting. The self-reported performance measures also indicate that participants felt comfortable using the Voiceing™ application.

All in all, these results suggest that speech technology is the most promising interaction method for enabling social communication in the vehicle. A driver-centric design for in-vehicle communication systems has the potential to lower driver distraction. Implementing a system such as Voiceing™ improves communication without significantly impacting driver cognitive load or driver distraction. We hope to use these results when designing other socially inspired communication devices and other in-vehicle social services. Future improvements to the natural speech interaction of the voice user interface, and the inclusion of context awareness in the application, could further mitigate the effects of interacting with a vehicle system while satisfying the communication needs of drivers.