1 Introduction

Trust can be conceptualized as a belief that an entity will act with benevolence, integrity, predictability, or competence [14]. Recent studies indicate that people have negative attitudes toward utilizing autonomous platforms [7, 11]. With the growth and the increase in the complexity of autonomous systems in the 21st century, managing the trust of users in such systems has become an important concept when designing new autonomous systems [18, 19]. Numerous studies in the domain of trust and AI have suggested that the management and the constant improvement of this mutual trust between autonomous systems and their users will be one of the primary challenges the industry professionals will face when trying to popularize the use of fully autonomous systems [3, 4, 9]. These discoveries highlight the necessity and urgency of conducting research to better understand the evolution of trust between humans and growing autonomous technologies, and to provide technologies that are responsive to human trust.

Google-Waymo, Tesla, Mercedes-Benz, and others have been developing semi- or fully autonomous vehicles, and they predict this technology will be deployed in the near future [23]. It is known that the widespread adoption of autonomous technologies depends on consumers experiencing and maintaining positive emotional responses toward autonomy.

We therefore approach this problem using electroencephalography (EEG) signals. EEG signal analysis is a methodology that researchers use to monitor brain activity and relate those signals to emotional states [6, 12, 15, 25, 26]. In our experimental study, the emotional responses of fifty (50) human subjects are evaluated through EEG analysis while they are in a self-driving car (SDC) simulator. The simulator is utilized to elicit negative emotions in order to evaluate the level of passenger fear, stress, and anxiety in response to actions of the SDC. This information can be used to develop controllers for SDCs so that they become responsive to passengers’ emotional states and adjust their behavior accordingly whenever needed [18].

1.1 Motivation, Novelty and Contribution

The uniqueness of our experiment lies in the immersiveness of the SDC simulator, its sequential-and-structured data collection approach, and the way it correlates basic emotions such as fear, stress and anxiety to real-time trust. Prior research works have not implemented a VR environment with real videos from roads and highways, not computer-generated or animated videos, along with a motion chair to create a SDC simulation while monitoring basic emotions by EEG and measuring real-time trust by an objective approach. This work builds on prior work from [21]. Our earlier work [24] illustrated that our simulator is highly effective for collecting real-time data from subjects. The previous work only required subjects to self-report their trust levels, while the usage of an additional EEG brainwave monitor has potential to provide meaningful data to validate the self-report metrics.

In fact, the way that our research correlates basic emotions to real-time trust through objective as well as subjective data collection is unique. Basic emotions such as fear, stress, anxiety, sadness, happiness and excitement are well-studied. There are known brain activities and/or hormones, e.g., oxytocin, cortisol and serotonin, associated with these basic emotions. They usually emerge in real-time with visible signs and gestures. However, human factors such as trust/distrust or satisfaction/frustration [1, 2] are shaped over time, and often with invisible signs. That is, there are no obvious connections between these human factors and brain activities and/or hormones, although prior studies show indirect connections, e.g., oxytocin increases trust in humans [10]. Our results showed that the elements of basic emotions collected in real-time by an EEG headset were consistently associated with real-time trust measures collected by our objective data collection approach. Indeed, we studied sub-dimensions of trust by triggering elements of basic emotions that could affect trust when observed.

The results of our experiment unveiled that there is a direct correlation between the real-time trust in SDCs and emotional responses. In other words, the trust level and trust rebuild after faulty behaviors depend on the driving style as well as the reaction of the SDC to passengers’ emotions. Using the mean EEG beta/alpha wave band power, i.e., a quantification of the emotional state, as an indicator of feeling stressed and anxious in an SDC appears to be effective when the stress-inducing event is dramatic and strong within a small time interval, but it becomes more difficult to obtain meaningful data when the incident does not elicit a very strong response or the time interval is long.

Our results therefore illustrate that trust in SDCs, and accordingly, social acceptability can be achieved if SDCs become responsive to emotional responses by using proper operation modes, e.g., normal, cautious, and alert modes. For instance, the SDC can avoid busy roads or highways, drive in the right-hand lane at a speed lower than the speed limit, or avoid overtaking other cars when it is in the cautious mode. Our results are expected to inform the design and operation of a control module that monitors the emotional state of passengers, using computational models [17, 20], and adjusts the AI control parameters accordingly in semi- or fully autonomous vehicles [18]. Moreover, these results can be utilized to better understand passengers’ expectations from semi- or fully autonomous vehicles [5, 19].

2 Self-driving Car Simulator Setup

The SDC Simulator is a safe platform to expose human subjects to a variety of driving scenarios. It is built by fusion of an Oculus Rift headset with an Atomic A3 Full Motion Simulator and offers a combination of complete visual, audio, and motion immersion that creates a convincingly realistic simulation. This platform allows for participants to safely be exposed to unique driving situations that would otherwise offer potential risk to passengers if performed with real vehicles. Figure 1 shows a participant in our VR SDC simulator.

Fig. 1. Participant using the SDC simulator.

Our driving scenarios were randomly recorded using the GoPro Fusion Camera and edited using the GoPro Fusion Studio to produce 360\(^\circ \) video. The videos of driving scenarios were exported from Fusion Studio at 4k resolution as MP4s along with 360\(^\circ \) MP3 audio files. The Oculus Rift headset outputs 1080\(\,\times \,\)1200 resolution per eye, at 90 Hz refresh rate, a 110\(^\circ \) field of view, and has headphones which output a 3D audio effect. Our human subjects could freely move their heads 360\(^\circ \) to see the complete scene while using the Oculus Rift VR headset. See Fig. 2 for the view inside the simulator.

Fig. 2. View from the simulation. Each frame represents the participant’s view as they turn their head to look around, illustrating the 360\(^\circ \) view inside the simulator.

Our motion simulator could move up to 71\(^\circ \) per second across a full 27\(^\circ \) dual-axis movement range. The Atomic A3 Motion simulator has been used previously by NASA to create realistic moon rover simulations. The combination of complete visual, audio, and movement immersion provides a convincingly realistic simulation.

3 Research Methodology

3.1 Sequential-and-Structured Data Collection

The exposure of participants to the various segments builds on prior work from [16]. Two structured data collection templates to measure trust with autonomous systems were used in the present work. Segments are categorized into five distinct groups, as shown below.

  1. Initial Trust: Segments that capture the initial trust of the passengers in the first few minutes of the interaction.

  2. Trust Escalation: Segments where the subject’s trust is increased: Involves smooth and predictable driving by the SDC without any complications.

  3. Trust Reduction: Segments where the human subject’s trust is decreased: Involves the SDC driving aggressively.

  4. Trust Mutation: A sequence of mild incidents (e.g., a rapid lane change by the SDC) followed by critical incidents (e.g., stop-sign violation or near collision with another car) and vice versa, can be negative/positive incidents.

  5. Re-Building Trust: Segments designed to rebuild trust between the passenger and the SDC. Involves the SDC driving predictably and calmly after trust-damaging incidents.

3.2 Experimental Design

Participants were randomly placed in one of two possible SDC simulation scenarios that were based on templates from prior work in [16]. Each scenario is made up of 5 segments. Specific scenario-segment pairs are denoted with a two-letter abbreviation followed by the scenario and segment numbers, e.g., TR\(_{\mathrm {I-II}}\) denotes trust reduction segment 2 of scenario 1, as shown in Tables 1 and 2.

Table 1. Simulation Scenario 1.
Table 2. Simulation Scenario 2.

Once the participant is in the SDC simulator, the EPOC+ EEG is attached to the participant. An initial 1 min baseline is taken of the participant’s brainwaves with no visual, audio, or motion stimulus. After the baseline, the scenario begins, and the EEG records the participant’s brainwaves for each segment. Each segment is an exposure to an approximately 2 min SDC driving simulation. After each segment, the participant is presented with a Likert Scale that appears inside the Oculus Rift. The participant selects their response by focusing their gaze on the desired answer for five seconds while wearing the Oculus Rift.

During the response interval, the participant responds to the question “On a scale of 1–5 with 1 being the lowest and 5 being the highest, after this simulation, what is your level of trust in the self-driving car?”. After the participant responds, the application moves on to the next segment until the simulation scenario is complete. Tables 1 and 2 define the scenario and segment pairings.

An initial trust/trust escalation segment involved the SDC moving slowly and predictably while adhering to the rules of the road. A trust reduction segment involved the SDC along with Human-Driving Cars (HDC) moving erratically and unpredictably, breaking rules of the road including speeding, tailgating, and sudden lane changes. In the NM\(_{\mathrm {I-IV}}\) segment, the SDC ran through a non-visible stop sign and nearly collided with another car and then proceeded to drive through a residential neighborhood. In the NM\(_{\mathrm {II-IV}}\) segment, the SDC ran through a stop sign unexpectedly and detected a pedestrian and a bicyclist crossing a crosswalk and abruptly came to a stop. A rebuild trust segment involved the SDC driving defensively and adhering to rules of the road. Note that HDCs were involved in all scenarios.

In reality, it takes months and even years to repair or rebuild damaged trust. In the case of minor issues, it may take several months to rebuild trust. In critical situations, it may take years. If the focus of the project were on repairing trust between humans, we could probably execute a multi-year clinical study to conduct this research. However, it would be challenging to run similar studies on rebuilding trust between a human and an SDC. For that reason, we made it clear that, after the trust-damaging incident in segment 4, the human subject should assume that the behavior of the car in segment 5 will be repeated for many months.

It is predicted that after the initial trust/trust escalation segments, the participants will respond with high levels of trust in the SDC, and after trust reduction segments, the participant will respond with low levels of trust in the SDC. It is also predicted that after the negative trust mutation segment, the participant will report a drastic decrease in trust. For EEG response, it is predicted that when trust damaging events occur in the simulation a high beta/alpha ratio will be observed in participants, indicating negative emotional response, and loss of trust. In trust building segments, it is expected that a low beta/alpha ratio will be observed in participants, indicating positive emotional response, calmness, and trust in the SDC simulator. It is also expected that spikes in the power of the beta/alpha ratio will correlate in time to when negative incidents occur inside the SDC simulator, e.g., in segment NM\(_{\mathrm {I-IV}}\) when the SDC does not stop at an intersection and is nearly in a devastating car accident.

4 Experiment and Technical Results

Fifty (50) human subjects were recruited to participate in the 11-minute VR autonomous driving simulation. Subjects were each given $25 gift cards. Each subject was randomly placed in either Scenario 1 or Scenario 2.

4.1 Participant’s Response

Scenario 1. Figure 3 shows subjects’ responses to “On a scale of 1–5 with 1 being the lowest and 5 being the highest, after this simulation, what is your level of trust in the SDC?” after each segment in Scenario 1. The green box shows the quartiles of the dataset. The yellow whiskers extend to show the rest of the distribution, except for points that are determined to be outliers using a method that is a function of the inter-quartile range. Pink lines are the median response.

In the initial trust segment (IT\(_{\mathrm {I-I}}\)), participants responded with a mean score of 4.68 ± 0.47, followed by a mean score of 3.72 ± 0.96 in the first trust reduction segment (TR\(_{\mathrm {I-II}}\)). After exposure to the further trust reduction segment (TR\(_{\mathrm {I-III}}\)), the score decreased slightly to 3.68 ± 1.25, followed by a large decline to 1.92 ± 0.84 when exposed to the negative trust mutation segment (NM\(_{\mathrm {I-IV}}\)). Finally, trust levels increased to 3.96 ± 0.92 in the rebuild trust segment (RT\(_{\mathrm {I-V}}\)). The largest change across segments was between the negative trust mutation (NM\(_{\mathrm {I-IV}}\)) and the initial trust segment (IT\(_{\mathrm {I-I}}\)), consistent with the expectation that erratic driving has the potential to severely reduce trust.
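The summary statistics and box-plot elements reported for each segment can be sketched as below. This is an illustration only: the response list is made up (it is not data from the study), the "±" value is assumed to be a sample standard deviation, and the 1.5 × IQR outlier rule is assumed as the common box-plot convention, since the text only says the rule is a function of the inter-quartile range. The function name `summarize` is hypothetical.

```python
import statistics


def summarize(responses, whisker=1.5):
    """Summarize Likert responses (1-5) for one segment.

    Returns the mean, sample standard deviation, median, quartiles, and
    the points flagged as outliers by an assumed whisker * IQR rule.
    """
    q1, median, q3 = statistics.quantiles(responses, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return {
        "mean": statistics.mean(responses),
        "sd": statistics.stdev(responses),
        "median": median,
        "quartiles": (q1, q3),
        "outliers": [r for r in responses if r < low or r > high],
    }


# Made-up responses for a single segment, for illustration only.
example_responses = [5, 5, 5, 4, 5, 4, 5, 5, 2, 5]
summary = summarize(example_responses)
```

Reporting the median and quartiles alongside the mean is useful here because a 5-point scale is bounded and the responses are often skewed toward one end.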

Fig. 3. Subjects reported trust level in the SDC.

Scenario 1 performed as expected. Participants scored the initial-trust and rebuild-trust segments with high levels of trust. Participants scored the trust reduction segments with lower levels of trust, and the negative trust mutation segment with the lowest level of trust. As expected, the negative trust mutation had the lowest trust levels and was significantly lower than all other segments.

An interesting result is the difference between the initial trust segment and the final segment designed to rebuild trust. While participants scored their level of trust after RT\(_{\mathrm {I-V}}\) at 3.96 ± 0.92, a high value, it is significantly lower than the initial trust value (4.68 ± 0.47), representing a 15% decrease. This seems to indicate that participants trusted the SDC less after being exposed to trust-damaging segments.

Segment NM\(_{\mathrm {I-IV}}\) was the most drastic portrayal of a malfunction in an SDC. In this segment the SDC runs through a stop sign and comes within inches of colliding with another vehicle. The view from the participant’s perspective of this incident can be seen in Fig. 4.

Fig. 4. During the (NM\(_{\mathrm {I-IV}}\)) segment an incident occurs where the SDC does not stop at an intersection and nearly collides with another vehicle. The frames from top to bottom show the progression of the incident from the perspective of the participant.

Scenario 2. Figure 5 shows subjects’ responses to “On a scale of 1–5 with 1 being the lowest and 5 being the highest, after this simulation, what is your level of trust in the SDC?” after each segment in Scenario 2. The green box shows the quartiles of the dataset. The yellow whiskers extend to show the rest of the distribution, except for points that are determined to be outliers using a method that is a function of the inter-quartile range. Pink lines are the median response.

In the initial trust segment (IT\(_{\mathrm {II-I}}\)), participants responded with a mean score of 4.40 ± 0.69, followed by a mean score of 4.64 ± 0.68 in the first trust escalation segment (TE\(_{\mathrm {II-II}}\)). After exposure to trust reduction (TR\(_{\mathrm {II-III}}\)), the score decreased to 3.68 ± 1.12, followed by a further decline to 3.64 ± 1.13 when exposed to the negative trust mutation (NM\(_{\mathrm {II-IV}}\)). Finally, trust levels increased to 4.04 ± 1.11 in the Rebuild Trust segment (RT\(_{\mathrm {II-V}}\)).

Fig. 5. Subjects reported trust level in the SDC.

In NM\(_{\mathrm {II-IV}}\), the SDC approaches a crosswalk and stops for a pedestrian to cross the street. This was the only segment that involved the SDC operating near a pedestrian. Participants reported low levels of trust after this segment and commented that they especially did not trust the SDC near pedestrians. In Scenario 2, while participants scored their level of trust after RT\(_{\mathrm {II-V}}\) at 4.04 ± 1.11, a high value, it is significantly lower than the initial trust segment (4.40 ± 0.69), representing an 8% decrease. This seems to indicate that participants trusted the SDC less after being exposed to trust-damaging segments.
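The percentage decreases quoted for the two scenarios follow from the reported mean scores. A short check is sketched below, using only the means given in the text; the helper name `relative_decrease` is introduced here for illustration.

```python
def relative_decrease(initial, final):
    """Percentage drop from the initial mean score to the final mean score."""
    return (initial - final) / initial * 100.0


# Mean trust scores reported in the text (initial trust vs. rebuild trust).
scenario_1 = relative_decrease(4.68, 3.96)  # IT_I-I  -> RT_I-V
scenario_2 = relative_decrease(4.40, 4.04)  # IT_II-I -> RT_II-V
```

Rounded to the nearest percent, these give the 15% and 8% figures reported for Scenario 1 and Scenario 2, respectively.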

The results of the experiment were generally consistent with our expectations. The participants reported higher trust levels after experiencing initial trust and trust escalation segments and reported distrust after the trust reduction segments, as well as high distrust after the negative trust mutation segment. Participants did not trust the SDC around pedestrians. Finally, participants in both groups were able to partially rebuild their trust after the trust-damaging negative trust mutation segments.

4.2 EEG Response and Data Cleaning

Based on prior research [8, 22, 26], the ratio of the average power between the beta and alpha waves was used as the main feature to determine emotional state. A high beta/alpha power ratio indicates a negative emotional response. A low beta/alpha ratio indicates a positive emotional response. The average power in the delta (0.5–4 Hz), theta (4–8 Hz), alpha (8–12 Hz), and beta (12–30 Hz) bands is computed from the power spectral density (PSD) of the raw EEG data.

Wang et al. [26] found that the features carrying the most information on emotional response were mainly located on the right occipital and parietal lobes in the alpha band, and on the parietal and temporal lobes in the beta band. The P7, T7 and O1 electrodes were therefore the channels used for signal processing and data analysis. In the future, more channels and the interaction between channels will be analyzed.

The power spectral density of the raw EEG data, sampled at 128 Hz, was computed in 4-second blocks using Welch’s method with a 50% overlap. The beta-to-alpha power ratio was computed for each 4-second block by taking the ratio of the mean power in the respective frequency bands. A low-pass and high-pass variance filter was applied to reduce signal noise from participant head movement.
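The block-wise computation described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the processing code used in the study. It computes a Hann-windowed periodogram for each 4-second block with 50% overlap (the per-block estimates that Welch’s method would average), takes the ratio of mean beta power to mean alpha power, and converts it to dB. The function names, the half-open band edges, and the `var_limits` thresholds that stand in for the variance filter are all assumptions introduced here.

```python
import numpy as np

FS = 128              # sampling rate in Hz, as stated in the text
BLOCK_SECONDS = 4     # block length in seconds, as stated in the text
ALPHA = (8.0, 12.0)   # alpha band edges in Hz
BETA = (12.0, 30.0)   # beta band edges in Hz


def block_psd(block, fs=FS):
    """One-sided PSD of one block (even length assumed), periodic Hann window."""
    n = len(block)
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * np.arange(n) / n)
    spectrum = np.fft.rfft((block - block.mean()) * window)
    psd = np.abs(spectrum) ** 2 / (fs * np.sum(window ** 2))
    psd[1:-1] *= 2.0  # fold negative frequencies into the one-sided estimate
    return np.fft.rfftfreq(n, d=1.0 / fs), psd


def band_mean(freqs, psd, band):
    """Mean PSD over the half-open band [low, high)."""
    low, high = band
    return psd[(freqs >= low) & (freqs < high)].mean()


def beta_alpha_ratio_db(signal, fs=FS, block_seconds=BLOCK_SECONDS,
                        var_limits=None):
    """Beta/alpha mean-power ratio in dB for each retained block.

    var_limits is an assumed (low, high) pair: blocks whose sample
    variance falls outside it are dropped, as a stand-in for the
    variance filter mentioned in the text.
    """
    signal = np.asarray(signal, dtype=float)
    n = int(block_seconds * fs)
    step = n // 2  # 50% overlap between consecutive blocks
    ratios = []
    for start in range(0, len(signal) - n + 1, step):
        block = signal[start:start + n]
        if var_limits is not None:
            if not (var_limits[0] <= block.var() <= var_limits[1]):
                continue  # block rejected by the assumed variance filter
        freqs, psd = block_psd(block, fs)
        ratio = band_mean(freqs, psd, BETA) / band_mean(freqs, psd, ALPHA)
        ratios.append(10.0 * np.log10(ratio))
    return np.array(ratios)


# Synthetic single-channel signal: a 10 Hz (alpha) and a 20 Hz (beta) tone.
t = np.arange(16 * FS) / FS
eeg = np.sin(2 * np.pi * 10 * t) + 2.0 * np.sin(2 * np.pi * 20 * t)
ratios_db = beta_alpha_ratio_db(eeg)
```

Because the ratio uses the mean (not the total) power in each band, the wider beta band dilutes a narrow-band beta component, so absolute dB values depend on the chosen band edges. Comparisons between segments processed with the same settings are unaffected by this.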

EEG Analysis Scenario 1. Figure 6 shows the mean beta/alpha power for all participants in Scenario 1 across segments after the data was cleaned with the variance filter. Figure 7 shows the change in the mean beta/alpha power between segments for Scenario 1. From the TR\(_{\mathrm {I-II}}\) to NM\(_{\mathrm {I-IV}}\) segments the mean beta/alpha power is increasing (TR\(_{\mathrm {I-II}}\): −2.10 dB, TR\(_{\mathrm {I-III}}\): −1.78 dB, NM\(_{\mathrm {I-IV}}\): −1.54 dB), indicating that participants are feeling increasingly stressed and anxious. This is expected as the segments become increasingly stressful, and this matches the subjects’ ratings of their mean trust in the SDC (TR\(_{\mathrm {I-II}}\) Mean Trust Score: 3.72 ± 0.68, TR\(_{\mathrm {I-III}}\) Mean Trust Score: 3.68 ± 1.25, NM\(_{\mathrm {I-IV}}\) Mean Trust Score: 1.92 ± 0.84).

As predicted, the NM\(_{\mathrm {I-IV}}\) segment had a higher mean beta/alpha power (−1.54 dB) than any preceding segment, matching the lowest mean reported trust score of 1.92 ± 0.84. In the RT\(_{\mathrm {I-V}}\) segment a large increase in the mean beta/alpha power across participants is observed. An interesting observation is that the largest change in the beta/alpha power between segments occurs from NM\(_{\mathrm {I-IV}}\) to RT\(_{\mathrm {I-V}}\). The largest change in how participants rated their trust levels also occurred between these two segments (NM\(_{\mathrm {I-IV}}\) 1.92 ± 0.84 to RT\(_{\mathrm {I-V}}\) 3.96 ± 0.92).

The only segment in Scenario 1 that deviated from expectation was the IT\(_{\mathrm {I-I}}\) segment, which had a mean beta/alpha power (−1.59 dB) that was higher than that of the TR\(_{\mathrm {I-II}}\) (−2.10 dB) and TR\(_{\mathrm {I-III}}\) (−1.78 dB) segments. It would be expected that the IT\(_{\mathrm {I-I}}\) segment would have low beta/alpha power, since subjects reported having high levels of trust in the SDC in this segment (IT\(_{\mathrm {I-I}}\) 4.68 ± 0.47). The IT\(_{\mathrm {I-I}}\) segment was the subjects’ first experience in the simulator, and there may have been increased stress while they became acclimated to the simulator.

Fig. 6. The \(\beta /\alpha \) mean power across segments.

Fig. 7. Change in \(\beta /\alpha \) power across segments.

EEG Analysis Scenario 2. Figure 8 shows the mean beta/alpha power for all participants in Scenario 2 across segments after the data was cleaned with the variance filter. Figure 9 shows the change in the power of the mean beta/alpha power between segments for Scenario 2. Scenario 2 did not perform as expected. It was predicted that in segments that did not induce stress or anxiety and that participants rated with high levels of trust in the SDC, IT\(_{\mathrm {II-I}}\) 4.40 ± 0.69, TE\(_{\mathrm {II-II}}\) 4.64 ± 0.68 and RT\(_{\mathrm {II-V}}\) 4.04 ± 1.11, a lower mean beta/alpha power would be observed, and in segments that elicit stress and anxiety and that participants rated with low levels of trust in the SDC, TR\(_{\mathrm {II-III}}\) 3.68 ± 1.12 and NM\(_{\mathrm {II-IV}}\) 3.64 ± 1.13, a higher mean beta/alpha power would be observed. This was not the case as segments IT\(_{\mathrm {II-I}}\) −0.46 dB and TE\(_{\mathrm {II-II}}\) −0.37 dB had relatively higher mean beta/alpha power compared to TR\(_{\mathrm {II-III}}\) −0.83 dB and NM\(_{\mathrm {II-IV}}\) −1.01 dB.

However, the anticipated effect of a decrease of mean beta/alpha power from a stressful to non-stressful segment was observed between segments NM\(_{\mathrm {II-IV}}\) −1.01 dB and RT\(_{\mathrm {II-V}}\) −2.47 dB. Between segments of Scenario 2 participants self-reported the largest change in trust levels to be between segments NM\(_{\mathrm {II-IV}}\) 3.64 ± 1.13 and RT\(_{\mathrm {II-V}}\) 4.04 ± 1.11, and it was found that the EEG is able to capture the strongest effect between segments.

Fig. 8. The \(\beta /\alpha \) mean power across segments.

Fig. 9. Change in \(\beta /\alpha \) power across segments.

EEG Analysis at Moment of Near Collision. From the analysis of the EEG data in Scenario 1 and in Scenario 2, it appears that using the mean beta/alpha power to indicate whether a subject is stressed or anxious is effective at capturing large effects; however, it misses more nuanced changes. In addition, averaging subjects’ data over a long 2 min time interval cannot resolve quickly changing conditions or events. A more precise alternative would be to analyze individual participants’ EEG data at specific inflection moments.

Participants reported the lowest levels of trust in segment NM\(_{\mathrm {I-IV}}\) (Mean Trust Score: 1.92 ± 0.84), which contains the most stressful and trust-damaging incident, where the SDC malfunctions and nearly causes a car accident at an intersection. The incident is outlined in Fig. 4. Since Participant 50 had the cleanest EEG signal, due to having a shaved head and limited head movement, this single participant’s mean beta/alpha power during the stressful incident of nearly experiencing a car crash is plotted in Fig. 10. A much shorter time interval of 16 s is used to analyze the EEG data.

Fig. 10. Beta/alpha power change at point of near collision inside the SDC simulator.

As expected, a large increase in the mean beta/alpha power is observed during the small time window that coincides with the near car accident. Analyzing subjects individually at specific moments using small time intervals allows for more meaningful information retrieval. It is worth noting that being able to selectively evaluate subjects with clean EEG signals (having less interference due to hair and limited head movement) makes it easier to gain meaningful results.
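One way to locate such a spike inside a short window is sketched below. It is a hypothetical helper, not the analysis used for Fig. 10: it assumes the beta/alpha ratio has already been computed per block in dB, assumes 4-second blocks with 50% overlap inside the 16 s window, and uses the first few blocks as a baseline. The ratio values in the example are invented purely to exercise the code.

```python
import numpy as np


def block_times(n_blocks, block_seconds=4.0, overlap=0.5):
    """Centre time (s) of each block, measured from the window start."""
    step = block_seconds * (1.0 - overlap)
    return np.arange(n_blocks) * step + block_seconds / 2.0


def find_spike(ratios_db, times, baseline_blocks=2):
    """Return (time of the peak, rise in dB over the mean of the first blocks)."""
    ratios_db = np.asarray(ratios_db, dtype=float)
    baseline = ratios_db[:baseline_blocks].mean()
    peak = int(np.argmax(ratios_db))
    return float(times[peak]), float(ratios_db[peak] - baseline)


# Invented per-block ratios (dB) for a 16 s window: 7 overlapping blocks.
example_ratios_db = [-2.0, -2.0, -1.5, 1.0, 0.5, -1.0, -1.8]
times = block_times(len(example_ratios_db))
peak_time, rise_db = find_spike(example_ratios_db, times)
```

The peak time can then be compared with the video timestamp of the incident to check whether the rise coincides with the near collision. Shorter blocks would sharpen the timing at the cost of coarser frequency resolution.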

5 Conclusion and Future Work

As stated earlier, social acceptability of autonomous vehicles is a prominent challenge that academic communities and industries are now facing. Therefore, research is needed to better understand the evolution of trust between humans and growing autonomous technologies. In this paper, an experiment on fifty (50) subjects was conducted where participants were exposed to driving scenarios designed to induce positive and negative emotional responses, as sub-dimensions of trust. Emotions were picked up by EEG signals from a certain area of the brain, and simultaneously, trust was measured based on a 5-point Likert scale.

The results of our experiment unveiled that there is a direct correlation between passengers’ real-time trust in SDCs and their emotional responses. In other words, the trust level and trust rebuild after faulty behaviors depend on the driving style as well as the reaction of the SDC to passengers’ emotions. In our experiment, the participants reported higher trust levels after experiencing initial trust and trust escalation segments and reported distrust after the trust reduction segments, as well as high distrust after the negative trust mutation segment. A notable observation was that participants especially did not trust the SDC around pedestrians. Interestingly, these observations were consistent with the emotional responses picked up by the EEG signals for the strongest effects, although the mean beta/alpha power in Scenario 2 deviated from expectation in several segments.

We utilized signal processing parameters for power spectral density estimates and the resultant EEG beta/alpha wave band power ratios based on a limited investigation of spectral processing methods, windows and overlaps. However, it may be that better results can be obtained by optimizing the signal processing parameters, and it is likely that parameters best suited for responsive detection of emotional changes in short intervals will differ from those that work best for longer-term emotional states.

The EEG data has a low signal-to-noise ratio. Therefore, it would be challenging to manually classify a data set, making the usage of a Machine Learning (ML) based classifier necessary for large data sets. Prior research works have illustrated that Linear Support Vector Machine (LSVM) and multilayer perceptron (MLP) neural networks are possible candidates to classify EEG data [13]. Other physiological sensors (heart rate, skin perspiration, etc.) will be used in future work to detect stress. EEG in combination with other physiological sensors may create a more robust system for detecting stress and anxiety in participants. Future work will also dedicate more time to analyzing the interaction of EEG signals with various electrode positions.

Our results therefore illustrate that trust in SDCs, and accordingly, social acceptability can be achieved if SDCs become responsive to emotional responses, e.g., by using proper operation modes such as normal, cautious, and alert modes. Future work will utilize ML classifiers to determine the human subjects’ emotional state based on a combination of sensors. As stated earlier, our results are expected to inform the design and operation of a control module that monitors the emotional state of passengers and adjusts the AI control parameters accordingly in semi or fully-autonomous vehicles [18].