Introduction

About 7% of individuals suffer from specific phobias during their lifetime (Eaton et al., 2018). Cognitive behavioral therapy (CBT) is considered a treatment of choice for specific phobias, with exposure therapy as a key component (Abramowitz et al., 2019). Virtual reality exposure therapy (VRET) can be defined as the virtual counterpart of exposure in vivo. In VRET, patients are exposed to anxiety-provoking stimuli via a head-mounted display, which is connected to a computer system that generates real-time three-dimensional images for the patient. Depending on the system, the patient can interact with and/or react to the stimuli presented in the virtual environment. While the efficacy of VRET for specific phobias has been well established (Carl et al., 2019; Emmelkamp & Meyerbröker, 2021; Morina et al., 2015a, 2015b), there is a lack of knowledge on specific predictors of treatment outcome (Meyerbröker, 2021).

One of the factors that has been considered to predict treatment outcome in anxiety disorders in general is anxiety sensitivity. Anxiety sensitivity refers to the fear of anxiety-related sensations (Reiss & McNally, 1985) and has been assumed to be a predisposing factor in the development and maintenance of anxiety disorders (Naragon-Gainey, 2010; Olatunji & Wolitzky-Taylor, 2009). Individuals with elevated levels of anxiety sensitivity mostly respond with fear to psycho-physiological arousal because they fear harmful consequences (Smits et al., 2008). In studies exploring the relationship between anxiety sensitivity and fear of flying among students with increased levels of fear of flying, results indicated that anxiety sensitivity moderates the association between somatic sensations and in-flight anxiety (Vanden Bogaerde & De Raedt, 2008, 2011). Given that patients with fear of flying often experience situationally bound panic attacks or sympathetic arousal, it seems likely that higher levels of anxiety sensitivity predict a poorer treatment outcome. However, in a study with patients with fear of flying (Busscher et al., 2013), no moderating effect of anxiety sensitivity on the relationship between somatic sensations and flight anxiety was found. This is not in line with earlier findings, in which anxiety sensitivity was a significant predictor of treatment outcome in CBT (Blakey et al., 2017). To date, no research has examined the role of anxiety sensitivity in VRET for specific phobias.

Another factor that has been associated with successful outcome of exposure therapy is self-efficacy (Böhnlein et al., 2020). According to the social learning theory of Bandura (1977), the efficacy of exposure-based treatments is explained by strengthening self-efficacy through successful coping experiences. Few studies have been conducted into self-efficacy and VRET. In a study conducted with patients with acrophobia, it was found that VRET plus coping self-statements led to a linear increase in self-efficacy (Krijn et al., 2007). These results were extended by Meyerbröker and Emmelkamp (2008), who found that self-efficacy in patients with specific phobias changed over the course of four sessions of VRET without cognitions being addressed. Two studies have investigated the predictive role of self-efficacy in VRET. In the first study, an increase in perceived self-efficacy was a significant predictor of improvement in general outcome (Côté & Bouchard, 2009). In the second study, conducted with patients with social anxiety disorder, self-efficacy was significantly associated with treatment outcome, but changes in self-efficacy did not significantly predict symptom improvement (Kampmann et al., 2019). Thus, it remains unclear what self-efficacy contributes to successful exposure treatment.

One general therapy factor that has inconsistently been associated with therapy outcome in the treatment of anxiety disorders is therapeutic alliance (Luong et al., 2020). In several meta-analyses of the therapeutic alliance, small to moderate effect sizes (0.15 to 0.27) were found for the prediction of therapy outcome for client and therapist ratings (for an overview see Flückiger et al., 2018; Priebe et al., 2011). Existing studies on the role of the therapeutic alliance in VRET are rather conflicting. In a small series of patients, Meyerbröker and Emmelkamp (2008) found a positive relationship between the quality of the working alliance and treatment outcome in patients with fear of flying, but not in patients with acrophobia. In a trial treating small animal phobias (Wrzesien et al., 2013), no negative influence of VR was found on the therapeutic alliance. In another study, a direct comparison between VRET and exposure in vivo was made and no differences in evaluation of the therapeutic alliance were found in patients with social anxiety disorder (Ngai et al., 2015). As therapeutic alliance is a promising prognostic indicator in exposure therapy (Buchholz & Abramowitz, 2020), it is important to evaluate its role in VRET.

To sum up, there is some evidence that anxiety sensitivity and self-efficacy are associated with favourable treatment outcome in the treatment of specific phobias. Given the importance of these factors and the promising role of the therapeutic alliance, the aim of the present study was to examine whether pre-treatment anxiety sensitivity and self-efficacy (as measured before and during treatment) predict treatment outcome in VRET for fear of flying. Additionally, the predictive value of the therapeutic alliance in VRET was investigated. We hypothesized, first, that both pre-treatment anxiety sensitivity and pre-treatment self-efficacy would predict treatment outcome following VRET. Secondly, initial improvement in self-efficacy during treatment would predict therapeutic changes during VRET. Thirdly, the quality of the therapeutic alliance as assessed after the second session would predict better treatment outcome following VRET.

Methods

Participants

The methods of the study have been described in detail in the randomized controlled trial (VRET + yohimbine vs VRET + placebo; Meyerbröker et al., 2012). Given that findings were non-significant between groups on all dependent measures, the two groups were pooled together for the present study to enhance statistical power, as has been done in other studies (Norr et al., 2018).

Participants (n = 67) had to fulfil diagnostic criteria for fear of flying according to the DSM-IV (APA, 2000). Please note that when the trial was started, the official classification system in the Netherlands was still the DSM-IV. One participant was excluded from the data analyses because the collected data were considered unreliable owing to drug use on therapy days. A total of 18 patients dropped out during the study for the following reasons: the virtual reality did not provoke anxiety (n = 11), medical reasons (n = 4), personal circumstances (n = 1), simulator sickness (n = 1), and astigmatism (n = 1). The mean age of patients was 36.71 years (SD = 11.74). The majority of the patients were female (71% vs 29% male). In order to enhance normality, four outliers who differed by more than two standard deviations from the mean on the baseline assessment of fear of flying were excluded. Completer analyses included the full remaining sample of 45 participants who completed treatment.

Participant Selection

The study was advertised via flyers in pharmacies and general practitioners' offices and via the internet (www.vliegangstbehandeling.nl). The sample consisted of individuals who referred themselves for treatment to the Department of Clinical Psychology of the University of Amsterdam. Besides free treatment, participants received no further compensation. The study was registered with clinicaltrials.gov (NCT00734422) and all study procedures were approved by the Institutional Ethics Review Board.

Inclusion and Exclusion Criteria

To participate in the study, patients had to fulfil current Diagnostic and Statistical Manual of Mental Disorders (DSM-IV; APA, 2000) criteria for a diagnosis of specific phobia, situational type (i.e., fear of flying). Further, patients had to be between 18 and 65 years of age (at the time of the study there was still a lack of knowledge about the use of virtual reality with elderly people) and to have sufficient fluency in Dutch to follow the treatment protocol and complete the assessments.

Patients were excluded in case of presence of a medical condition or medication that would contraindicate participation in VR (i.e., pregnancy, seizure disorder, respiratory disorder, cardiovascular disease, pacemaker, hypertension). Additional exclusion criteria were an unstable dose of pharmacological medication, a history of psychosis, bipolar disorder or post-traumatic stress disorder as assessed with the SCID-I (First et al., 1996).

Measures

Clinician-Rated Assessment Instruments

The Structured Clinical Interview for DSM-IV (SCID-I) was used to assess the DSM-IV diagnosis of specific phobias, as well as to identify diagnoses for exclusion (First et al., 1996; Dutch version: Groenestijn et al., 1999). The SCID-I is considered the gold standard for clinical diagnoses and has demonstrated strong reliability and validity (Spitzer et al., 1992). The inter-rater reliability for axis-I disorders is good to excellent (Lobbestael et al., 2011).

The Flight Anxiety Situations Questionnaire (FAS)

The FAS is a commonly used 32-item self-report inventory designed to measure anxiety related to flying experienced in different situations (Van Gerwen et al., 1999). The FAS consists of three subscales: the Anticipation scale, which represents situations before the actual flight; the In-flight scale, which refers to situations during a flight; and the Generalized Flight scale. The internal consistency and concurrent validity of the FAS are good to excellent (Cronbach’s alpha ranging from 0.88 to 0.97). The mean in a clinical sample was 102.42 (SD = 22.48), while in a non-clinical group the mean was 39.84 (SD = 11.92; Nousi et al., 2008).

Self-Efficacy Questionnaire

The Self-Efficacy Questionnaire (Krijn et al., 2007) was used to measure the degree of self-efficacy subjects experienced while coping with the phobic situation. The self-efficacy questionnaire is a 5-item self-report measure for flying situations. The items are designed around five themes: the capability to (1) reduce fear, (2) think clearly, (3) have control over your actions, (4) have control over anxious thoughts and images, (5) stay in the situation for at least two minutes while panicking or with intense fear. Patients can rate their answers on a scale ranging from 0 (no problem at all) to 100 (not capable of staying calm). The internal consistency of the self-efficacy questionnaire in the current sample was good (Cronbach’s alpha 0.82).

Working Alliance Inventory

The Working Alliance Inventory (WAI; Horvath & Greenberg, 1989) measures the working alliance as defined by Bordin (1979), independent of a therapist's theoretical orientation. It is a self-report instrument consisting of 36 items. The questionnaire was completed at two different phases of therapy (at sessions 2 and 4). There are three scales that reflect congruence on goals, tasks, and the emotional bond between client and therapist. Each item is rated on a 5-point scale (1 = never, 5 = always). The total score ranges from 36 to 180, with higher scores reflecting a stronger working alliance. The authors of the WAI client version have reported good psychometric properties of this instrument (Horvath & Greenberg, 1989). Our objective was to assess the therapeutic relationship as early in treatment as possible, in order to minimize confounding treatment effects. Accordingly, we chose to use only the WAI scores after the second session, assuming that at this time the provided treatment is unlikely to have produced any significant effects yet.

Anxiety Sensitivity Index

The Anxiety Sensitivity Index (ASI; Reiss et al., 1985) is a 16-item self-report questionnaire measuring fear of anxiety-related symptoms. Each item is rated on a five-point Likert scale ranging from 0 (very little) to 4 (very much). The ASI is scored by summing all items; possible scores range from 0 to 64, with higher scores reflecting higher levels of anxiety sensitivity. The ASI has good to excellent internal consistency, with Cronbach’s alpha ranging from 0.82 to 0.91 (Peterson & Reiss, 1993).

Subjective Units of Discomfort

Subjective Units of Discomfort (SUDs; Wolpe, 1990) were used to monitor anxiety level during exposure. SUDs are a self-reported index of the intensity of anxiety, discomfort, or distress experienced in a specified moment. SUDs were rated on a continuous 0–10 scale, where 0 represents no distress and 10 represents extreme (maximum) distress.

Computer Equipment and Virtual Environments

The virtual reality exposure therapy was given in a basement laboratory at the Department of Clinical Psychology of the University of Amsterdam. The virtual world was generated by an Optilex 755 Intel C2D. The projection of the worlds into the glasses (Cybermind Visette Pro) was stereographic. The tracking was done with Ascension Flock of Birds.

The fear of flying environment was a virtual aircraft in which subjects could take a seat in different positions. This environment was supported by two real aircraft seats and part of an airplane fuselage with windows. The aircraft seats vibrated during take-off, landing, and turbulence via connected subwoofers. The VR world was identical to the one used in the Krijn et al. (2007) study.

Procedures

Screening

Potential participants were contacted via the telephone. Eligible participants who were willing to participate were invited for an intake session that included the SCID-I (First et al., 1996).

Treatment

Eligible participants returned 1 week after the intake session to start treatment. A total of four virtual exposure therapy sessions was planned. Sessions were held every week, each session consisting of two virtual flights of 25 min each. A ten-minute break between virtual flights was applied to prevent cybersickness. No cognitive restructuring took place during exposure sessions.

Treatment was provided by master's-level students who were trained in the use of the exposure protocol by the first author (KM). All treatment sessions were supervised weekly by the last author (PE). Therapists were not aware of the aim of the current study.

Before starting treatment, participants were provided with instructions about an anxiety hierarchy and the general rationale of exposure therapy. Additionally, patients were made familiar with the virtual reality equipment. For each treatment session, participants were instructed to remain in the virtual flying environment for as long as possible. The total exposure duration was 2 times up to 25 min with a break of 10 min to prevent patients from experiencing cybersickness.

During the first two treatment sessions, the flights were completed with VR settings for fair weather conditions without turbulence. In the third and fourth treatment sessions, the virtual flights started with a delay because of “technical problems with the oil-facilities, which had to be solved before starting the flight”. During the flight, difficult weather conditions were generated (such as turbulence and a thunderstorm). During exposure, participants were asked every 3 min to rate their anxiety on a scale from zero to ten. A detailed protocol ensured that SUDs were taken from each patient at exactly the same moment of exposure.

Data Analysis

Standardized Residuals of the Main Outcome

In order to adjust for baseline variance in the main outcome measure (FAS), residualized change scores were constructed. Residualized change scores partial out pre-treatment scores from post-treatment scores and are viewed as superior to simple pretest–posttest change scores (Prochaska et al., 2008).

Residualized change scores of the FAS were constructed using a simple linear regression model in which post FAS scores were predicted by pre FAS scores. The standardized residuals (Zres_FAS) for each case were saved from this model. Accordingly, variability among residuals can be considered independent from the pre-scores (Segal et al., 2006).
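The construction described above can be illustrated with a minimal sketch. It assumes that the standardized residual is the raw residual divided by the residual standard error, which is one common definition; the statistical package used in the study may define it slightly differently. The function name and the scores below are hypothetical and are not study data.

```python
import numpy as np

def residualized_change(pre, post):
    """Standardized residuals from regressing post scores on pre scores."""
    pre = np.asarray(pre, dtype=float)
    post = np.asarray(post, dtype=float)
    # Design matrix: intercept plus the pre-treatment score.
    X = np.column_stack([np.ones_like(pre), pre])
    beta, *_ = np.linalg.lstsq(X, post, rcond=None)
    resid = post - X @ beta
    # Assumed standardization: divide by the residual standard error
    # (n - 2 degrees of freedom for intercept and slope).
    dof = len(pre) - 2
    return resid / np.sqrt(np.sum(resid ** 2) / dof)

# Made-up illustrative FAS-like scores (hypothetical, not from the trial).
pre = [110, 95, 120, 101, 130, 88]
post = [85, 70, 100, 60, 105, 75]
z = residualized_change(pre, post)
print(np.round(z, 2))
```

By construction, the resulting scores are uncorrelated with the pre-treatment scores, so negative values indicate lower post-treatment anxiety than would be expected from baseline alone.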

Statistical Analyses

All hypotheses were tested with hierarchical multiple regression models, with condition (VRET + yohimbine vs VRET + placebo) always included in the first block and the hypothesized predictors included in the second block. The dependent variable was the residualized change score of the FAS, used as the indicator of treatment outcome so that outcome was adjusted for pre-treatment scores.
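As an illustration of the two-block approach, the sketch below computes the change in R² when a second block of predictors is added to a first block, together with the corresponding F statistic for the change. It is a sketch on simulated data under stated assumptions: the helper names, the number of predictors, and all values are hypothetical and do not reproduce the study's analysis or results.

```python
import numpy as np

def r_squared(X, y):
    """R-squared of an ordinary least squares fit with an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

def r2_change_test(block1, block2, y):
    """R-squared for step 1 and step 2, plus the F statistic for the change."""
    y = np.asarray(y, dtype=float)
    block1 = np.asarray(block1, dtype=float).reshape(len(y), -1)
    block2 = np.asarray(block2, dtype=float).reshape(len(y), -1)
    r2_step1 = r_squared(block1, y)
    r2_step2 = r_squared(np.column_stack([block1, block2]), y)
    k_added = block2.shape[1]
    # Residual degrees of freedom of the full (step 2) model.
    df_resid = len(y) - block1.shape[1] - k_added - 1
    f_change = ((r2_step2 - r2_step1) / k_added) / ((1.0 - r2_step2) / df_resid)
    return r2_step1, r2_step2, f_change, (k_added, df_resid)

# Simulated data (hypothetical): 45 cases, a binary condition, three predictors.
rng = np.random.default_rng(0)
n = 45
condition = np.arange(n) % 2
predictors = rng.normal(size=(n, 3))
outcome = 0.5 * predictors[:, 0] + rng.normal(size=n)

r2_1, r2_2, f_change, df = r2_change_test(condition, predictors, outcome)
print(f"R2 step 1 = {r2_1:.3f}, R2 step 2 = {r2_2:.3f}, "
      f"F change({df[0]}, {df[1]}) = {f_change:.2f}")
```

Entering condition first means that any variance attributed to the second block is over and above what treatment condition already explains.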

Results

Means and Standard Deviations at Pre- and Post-assessment

Scores on anxiety related to flying as measured with the FAS improved significantly from pre-treatment (M = 107.65, SD = 20.66) to post-treatment (M = 80.66, SD = 20.75). The mean and standard error of the mean of anxiety levels for each session are presented in Fig. 1. Results of a repeated measures ANOVA indicated that the most significant improvement on the FAS was reported after session 1 and after session 4 [Wilks’ Lambda = 0.25, F(4, 40) = 30.06, p < 0.001, partial eta squared = 0.75].

Fig. 1 Mean and standard error of the mean on the Flight Anxiety Situations Questionnaire (FAS), measured before treatment and after each therapy session. Note: FAS scores range from 32 to 160

Cognitions as Predictor for General Improvement

A hierarchical multiple regression analysis was used to examine how the variables were associated with general improvement. Multicollinearity among potential predictors was assessed by using the statistics of variance inflation factor (VIF). A variance inflation factor exceeding 10 for a variable was regarded as indicating multicollinearity. Only variables with sufficient variability and without collinearity with other variables were selected and included in the final model and fitted simultaneously.
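The multicollinearity check can be sketched as follows. For each predictor, the variance inflation factor is 1 / (1 − R²), where R² comes from regressing that predictor on all other predictors. The function name and the simulated predictors are hypothetical; the third column is deliberately constructed to be nearly collinear with the first so that the cut-off of 10 is exceeded.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF per column: 1 / (1 - R^2) from regressing it on the other columns."""
    X = np.asarray(X, dtype=float)
    vifs = []
    for j in range(X.shape[1]):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(len(target)), others])
        beta, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ beta
        r2 = 1.0 - resid @ resid / np.sum((target - target.mean()) ** 2)
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

# Simulated predictors (hypothetical): column c is almost a copy of column a.
rng = np.random.default_rng(1)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + 0.05 * rng.normal(size=200)
vifs = variance_inflation_factors(np.column_stack([a, b, c]))

for name, vif in zip("abc", vifs):
    flag = "multicollinear" if vif > 10 else "ok"
    print(f"{name}: VIF = {vif:.1f} ({flag})")
```

In this simulated example the two nearly identical predictors are flagged, while the independent predictor has a VIF close to 1.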

Hierarchical multiple regression analysis was done with condition being included in the first step and predictors being included in the second step, predicting residualized change scores on the FAS. As expected, condition did not predict outcome [F(1, 45) = 0.01, p = 0.92]. The inclusion of the second set of variables (step 2) in the regression model produced an increase in explained variance [R² change = 0.35, F(4, 38) = 5.18, p < 0.001], leading to a final regression model explaining 35% of variance (adjusted R² = 0.27). Results on the relative contribution of each predictor entered in this model are presented in Table 1, which shows that anxiety sensitivity was the strongest predictor of general improvement.

Table 1 Hierarchical multiple regression analysis for predicting treatment outcome

Anxiety Sensitivity, Initial Improvement on Self-efficacy and the Quality of the Therapeutic Alliance will Significantly Contribute to a Better Treatment Outcome

A second hierarchical multiple regression analysis was used to examine how the variables were associated with general improvement when pre-treatment self-efficacy was excluded. Multicollinearity among predictors was assessed using the variance inflation factor (VIF). The same procedure was followed as in the first analysis.

Again, condition was included in the first step and the predictors were included in the second step. Treatment condition was not a significant predictor of residualized change scores on the FAS in the first block [F(1, 42) = 0.01, p = 0.92]. The inclusion of the predictors produced a significant increase in explained variance [R² change = 0.29, F(4, 39) = 5.30, p < 0.001], leading to a final regression model explaining 35% of variance (adjusted R² = 0.29). Results on the relative contribution of each predictor entered in this model are presented in Table 2, which shows that anxiety sensitivity is the strongest predictor of general improvement. Evaluating each of the independent variables in the equation, all three variables made a statistically significant contribution (see Table 2).

Table 2 Hierarchical multiple regression analysis for predicting treatment outcome

Recall that self-efficacy scores were reported at the beginning and end of each session. The mean and standard error of the mean of between- and within-session scores of self-efficacy are presented in Fig. 2. A one-way repeated measures ANOVA was conducted to compare scores of self-efficacy at each treatment session. There was a significant effect of time [Wilks’ Lambda = 0.34, F(7, 35) = 9.61, p < 0.001, partial eta squared = 0.66].

Fig. 2 Mean and standard error of the mean on the Self-Efficacy Questionnaire, measured within each treatment session. Note: Self-Efficacy Questionnaire scores range from 0 to 50

Discussion

The primary aim of the study was to examine the role of potential predictors such as anxiety sensitivity, self-efficacy, and the therapeutic alliance in VRET. As predicted, anxiety sensitivity and initial improvement in self-efficacy, but not pre-treatment self-efficacy, predicted treatment outcome. Additionally, the quality of the therapeutic alliance significantly predicted general improvement in patients with fear of flying.

More specifically, the predictive value of anxiety sensitivity was found in our study with VRET in fear of flying. Given that several studies have reported that CBT can successfully reduce levels of anxiety sensitivity and that anxiety sensitivity potentially represents more of a transdiagnostic factor in cognitive behavioural therapy (Asnaani et al., 2020; Smits et al., 2019), this is an important finding. In our study, anxiety sensitivity proved to be a significant predictor of anxiety reduction after treatment, which is in line with earlier findings (e.g., Blakey et al., 2017). Although results until now have indicated that anxiety sensitivity plays a more crucial role in panic disorder and agoraphobia (Gallagher et al., 2013; Ino et al., 2017) than in other anxiety disorders, our findings suggest that anxiety sensitivity is an important predictor of treatment outcome in fear of flying as well, suggesting that it is indeed a more transdiagnostic factor (Smits et al., 2019).

Another important finding of our study is that self-efficacy increased significantly during and across sessions. Corroborating earlier findings (e.g., Côté & Bouchard, 2009; Kampmann et al., 2019; Krijn et al., 2007; Meyerbröker & Emmelkamp, 2008), we found initial changes in self-efficacy to predict anxiety reduction. Our data with patients with fear of flying, however, do not support earlier findings that pre-treatment self-efficacy has a predictive value regarding treatment outcome (Williams et al., 1984, 1985, 1989). Thus, the present findings suggest that VRET can effectively reduce specific phobia regardless of pre-treatment levels of self-efficacy and that treatment outcome is greater in patients who experience a significant increase in self-efficacy at the beginning of treatment. This is an important finding, as VRET seems capable of significantly enhancing and strengthening self-perception of one's coping efficacy, which has been reported to be an essential factor contributing to therapy outcome in specific phobias (Böhnlein et al., 2020). Similar results were reported by Kampmann et al. (2019), who also concluded that an increase in self-efficacy among patients with generalized social anxiety disorder was associated with better treatment outcome in VRET. Furthermore, Morina et al. () reported that two sessions of VRET among students with high levels of social anxiety led to higher self-efficacy 3 months after exposure relative to their pre-treatment scores.

With regard to the quality of the therapeutic alliance in VRET, we replicated earlier findings (Meyerbröker & Emmelkamp, 2008) in a larger group of patients with fear of flying. In line with previous findings (Meyerbröker & Emmelkamp, 2008; Wrzesien et al., 2013), we found that the quality of the therapeutic alliance worked as a partial predictor of therapeutic change. Our finding on the predictive value of therapeutic alliance signifies the importance of therapeutic alliance in VRET and contradicts the idea that the use of technology in VRET might interfere with the development of a good therapeutic relationship. The present data show that the quality of the therapeutic alliance is an important component of VRET, thus corroborating the general findings in psychotherapy research (e.g., Flückiger et al., 2018; Priebe et al., 2011).

Several limitations of this study are noteworthy. First, patients were originally randomized to VRET plus yohimbine or VRET plus a non-active placebo (Meyerbröker et al., 2012). Given that no differences between groups were found, we pooled the data to enhance statistical power for the purpose of this study. A post-hoc power analysis revealed that with the current completers sample sufficient statistical power of 0.81 was achieved. Second, the virtual environments used in our study were non-interactive. However, the virtual environments were supported by tactile and auditory stimuli and contained crucial anxiety triggers for fear of flying, e.g., the sound instructing passengers to fasten their seatbelts. Additionally, warnings of turbulence were announced, followed by actual turbulence simulated through subwoofers that provided sound and movement of the actual aircraft seats. Third, although we found a significant decline in fear of flying at post-treatment, the mean of the post-assessment of fear of flying is still above the non-phobic norms (Nousi et al., 2008). A possible explanation for this significant yet not sufficient change might be the relatively brief period between the first assessment and post-assessment (treatment was given within a period of 4 weeks), which did not enable most patients to try a real flight. This is in line with Craske et al. (2014), who reported that sufficient time between treatment sessions enhances treatment outcome, as it provides patients with the opportunity to practice in changing circumstances. Therefore, in future research potential long-term changes following VRET for fear of flying need to be investigated. Given that exposure in vivo and exposure in virtual reality seem to produce similar results, VRET represents a promising research paradigm within psychotherapy research.

In summary, our results demonstrate that initial improvements in self-efficacy in VRET partially predict general change in treatment outcome. Further, anxiety sensitivity acts in patients with fear of flying as a predictor of treatment outcome corroborating more fundamental research. Finally, our results confirm and extend pilot findings that the therapeutic alliance has a predictive value in VRET.