Introduction

Over 20 years ago, Cognitive Therapy and Research published the results of a randomized clinical trial comparing cognitive behavioral group therapy (CBGT) to a credible treatment alternative (Heimberg et al. 1990). The journal then published the 5-year follow-up study (Heimberg et al. 1993), which remains to this day the most comprehensive assessment of improvement following short-term cognitive behavioral group therapy. At follow-up, patients who received cognitive behavioral group therapy fared better than those who received the alternative treatment: they had fewer symptoms as assessed by patients and by independent assessors, less interference from symptoms as assessed by clinicians, and less anxiety and better performance on a speech task as rated by judges blinded to the type of treatment received. While acknowledging important limitations to the study, such as small sample size and selective attrition, the authors concluded that “CBGT is an effective and durable approach to the treatment of social phobia” (p. 336).

Subsequent research reinforces the notion that the improvement seen immediately following cognitive behavioral treatment for social phobia endures. This evidence has accumulated to the point that a meta-analysis of psychological interventions for social anxiety disorder concluded that treatment gains remain stable over time and may improve somewhat (Acarturk et al. 2009). The follow-up periods of the studies included in the meta-analysis, however, are relatively brief, ranging from 1 to 18 months, with the majority of studies containing follow-up periods between 1 and 3 months. Since Heimberg et al.’s (1993) study, only two follow-up studies of randomized clinical trials have shown that short-term cognitive-behavioral therapy for social anxiety disorder produces treatment gains that are maintained for 5 years or longer (Mörtberg et al. 2011; Willutzki et al. 2012). One study (Willutzki et al. 2012) used self-report measures exclusively, and neither study included a behavioral avoidance test. And whereas the Heimberg et al. (1993) paper included audience ratings of performance and anxiety during a speech task, it did not include a direct measure of participants’ behavior (e.g., length of speech).

Improvement following cognitive behavioral therapy for social anxiety disorder not only appears to last, but also is robust across various formats, including individual therapy, group therapy, and internet-based treatment. The present study is a follow-up of the first randomized clinical trial to test another format for delivering cognitive behavioral therapy for social anxiety disorder: virtual reality exposure therapy (VRE; Anderson et al. 2013). Participants in that trial were diagnosed with social anxiety disorder and were randomly assigned to 8 weeks of waiting or to 8 weekly sessions of either Exposure Group Therapy (EGT; Hofmann 2007) or VRE for social anxiety disorder (Anderson et al. 2005), each delivered according to a treatment manual. Waitlist participants were subsequently randomly assigned to EGT or VRE. Benefit from treatment was assessed via standardized self-report measures, clinician-rated diagnosis, and a behavioral avoidance task. Results showed that, relative to waitlist, people completing either active treatment improved on all but one measure (self-reported fear of negative evaluation for VRE, and length of speech during a behavioral avoidance task for EGT). At 3-month follow-up, the majority of treated participants were rated by independent assessors as in “full” or “partial” remission. At 12-month follow-up, participants in both groups reported continued improvement on all standardized self-report questionnaires in the year following treatment, and 65% reported feeling “very much” or “much” improved. The results from this study are consistent with existing literature on the robustness of cognitive behavioral therapy for social anxiety disorder, but are also unique, as no other study has examined VRE for social anxiety disorder. One study (Wallach et al. 2009) used virtual reality to treat social fear; however, no diagnosis or clinical cut-off score was used, and the 1-year follow-up assessment consisted entirely of self-report questionnaires (Safir et al. 2012).

This is the first study to evaluate the durability of virtual reality exposure therapy and exposure group therapy for social anxiety disorder over the long term (6 years, on average, after treatment) using multimodal assessment. We hypothesized that symptom severity, as measured by standardized self-report questionnaires, a diagnostic interview, and avoidance during a speech task, would be lower at long-term follow-up than at pre-treatment. Participants’ global ratings of improvement are also presented.

Methods

Participants

Of the 65 potentially eligible participants (anyone who completed either treatment in the parent study), seventeen could not be contacted, four declined participation, and sixteen agreed to take part in the study but never did so, yielding a sample of participants (N = 28) who completed VRE (n = 13) or EGT (n = 15) 6.0 years earlier, on average (range 4.6–6.8 years). See Fig. 1 for participant flow.

Fig. 1 Participant flow chart for long-term follow-up assessment

Participants were predominantly female (71%), middle-aged (M = 42 years, SD = 13.17, range = 19–69 years), and well-educated, with 75% having attained an undergraduate degree or higher. Of the participants, 35.7% reported an average income of 50,000 or more and 43% reported being married. Participants self-identified as “Caucasian” (n = 16; 57%), “African-American” (n = 8; 28.6%), “Latino” (n = 2; 7%), or “Other” (with no additional self-report of race; n = 2; 7%).

Measures

Standardized self-report measures with well-established psychometric properties were used to measure fear of public speaking (Personal Report of Confidence as a Speaker [PRCS; Paul 1966]) and fear of negative evaluation (Fear of Negative Evaluation-Brief Form [BFNE; Leary 1983]). For each measure, higher scores represent greater levels of fear. The internal consistency of these measures at long-term follow-up was very good for both the PRCS (α = .87) and the BFNE (α = .89).

The Structured Clinical Interview for the DSM-IV (SCID; First et al. 2002) was administered to assess remission from social anxiety disorder.

Patient Global Improvement (PGI; Guy 1976), a one-item, face-valid measure, assessed participants’ sense of overall improvement since beginning treatment: “Compared to how I felt before beginning the study, I now am….” Ratings are made on a seven-point scale ranging from 1 (very much improved) to 7 (very much worse).

Behavioral Avoidance Test. The behavioral avoidance test was based on a standardized speech assessment protocol that demonstrates good test–retest reliability for physiological, behavioral, and cognitive measures of social anxiety assessed during the task, including speech length, the behavioral outcome assessed in the present study (Beidel et al. 1989). For the speech task, participants are given one of several index cards (a different card for each administration of the task). Each card lists five controversial topics (e.g., abortion, capital punishment, same-sex marriage). Participants have 3 min to choose up to three of the five topics listed on the card and prepare notes for the speech. They are asked to try to speak for 10 min, but understand that they can stop at any time. Participants rate their peak anxiety (0–10) during the task, with higher numbers indicating higher anxiety.

Treatments

Virtual Reality Exposure Therapy and Exposure Group Therapy treatments were designed to be as similar as possible, with the obvious exception that one used in vivo exposure in a group format and the other administered virtual reality exposure individually. VRE used a virtual conference room (~5 audience members), a virtual classroom (~35 audience members), and a virtual auditorium (100+ audience members). In EGT, groups included up to five participants and two therapists. During exposure, participants gave a videotaped speech in front of the group. Every effort was made to equate time in exposure across treatment groups. There were five study therapists, and all therapists administered both treatments.

Both treatments began with a treatment rationale and psychoeducation about social anxiety disorder. During sessions 2–8, both treatments addressed specific aspects of social anxiety disorder identified in the psychopathology literature, including self-focused attention, perceptions of self and others, perceptions of emotional control, rumination, and realistic goal setting for social situations, through the use of such techniques as cognitive preparation and challenging of cost and probability biases. Session 8 also included relapse prevention. Homework was assigned for both treatments, including a daily mirror task, a daily record of social situations, and identification of cognitive biases.

Procedure

This research complied with the Code of Ethics of the World Medical Association and was approved and monitored by the Institutional Review Board at Georgia State University in Atlanta, Georgia.

A brochure was sent to participants’ last known address, and participants were later contacted by phone and/or email to schedule an in-person assessment. The assessment took place at Georgia State University and included standardized self-report measures, the speech task, and the SCID. One participant could not attend the in-person assessment and completed the SCID by phone. Clinical interviews were conducted by advanced doctoral students in clinical psychology who knew that participants had completed treatment, but did not know which one. All interviews were recorded, and a randomly selected subset (n = 5) was reviewed by a licensed psychologist to calculate inter-rater reliability (100% agreement for primary diagnosis, with one disagreement on severity).

Results

All dependent variables were screened for errors, outliers (defined as scores greater than three standard deviations from the mean), and missing values. No outliers were identified. Comparisons between those who did and did not complete the follow-up assessment were made using a series of chi-square and t test analyses, which revealed no significant differences in pre-treatment or post-treatment symptom severity on any measure, nor on any demographic variable (all ps > .10). Comparisons of effect sizes also showed no differences between completers and non-completers on symptom severity or participant characteristics, with the exception of small-to-medium effects for pre-treatment BFNE and age (pre-treatment PRCS: Cohen’s d = .05; post-treatment PRCS: Cohen’s d = .06; pre-treatment BFNE: Cohen’s d = .32; post-treatment BFNE: Cohen’s d = .05; Age: Cohen’s d = .31; Gender: Cramér’s V = .02; Ethnicity: Cramér’s V = .01).
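The screening rule and the completer versus non-completer effect sizes described above can be illustrated with a minimal sketch. It assumes the sample standard deviation for the three-standard-deviation rule and the pooled-standard-deviation form of Cohen's d; the paper does not state which variants were used. The function names and the scores are hypothetical and are not study data.

```python
from math import sqrt
from statistics import mean, stdev


def flag_outliers(scores, threshold=3.0):
    """Return scores lying more than `threshold` sample SDs from the mean."""
    m, sd = mean(scores), stdev(scores)
    return [x for x in scores if abs(x - m) > threshold * sd]


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled sample SD."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = stdev(group_a) ** 2, stdev(group_b) ** 2
    pooled_sd = sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (mean(group_a) - mean(group_b)) / pooled_sd


# Hypothetical questionnaire scores, invented for illustration only.
completers = [40, 44, 46, 50, 52, 55]
non_completers = [38, 42, 45, 47, 51, 53]

outliers = flag_outliers(completers + non_completers)
d = cohens_d(completers, non_completers)
print(f"outliers: {outliers}, Cohen's d: {d:.2f}")
```

By the usual conventions, values of d near .2 are read as small and values near .5 as medium, which is the sense in which the pre-treatment BFNE and age differences are described as small-to-medium.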

Standardized Self-Report Measures

A series of 2 × 3 Treatment Type (VRE, EGT) × Time (pre-treatment, post-treatment, follow-up) repeated measures ANOVAs tested the hypotheses that (1) self-report measures of symptom severity (BFNE, PRCS) would be lower at long-term follow-up than at pre-treatment and that (2) self-report measures of symptom severity at post-treatment and follow-up would not differ.

As shown in Table 1, there was a significant main effect of Time for each measure: BFNE, F (2, 50) = 6.08, p < .01, partial η² = .20; PRCS, F (2, 52) = 24.76, p < .001, partial η² = .49. Post hoc analyses using Bonferroni-corrected pairwise comparisons showed that, compared to pre-treatment, the PRCS was significantly lower at post-treatment (M difference = −10.85, p < .001) and at long-term follow-up (M difference = −8.61, p < .001). Compared to pre-treatment, the BFNE was not significantly lower at post-treatment (M difference = −3.70, p = .10), but it was at long-term follow-up (M difference = −5.98, p = .01). There were no differences in self-reported symptoms between post-treatment and long-term follow-up on either measure: BFNE (M difference = 2.28, p = .58), PRCS (M difference = −2.23, p = .66).

Table 1 Effect Sizes and 3 × 2 (Time × Treatment type) ANOVA comparing self-report ratings at pretreatment (Pre), posttreatment (Post), and 6-year follow-up (FU)

There was neither a main effect of Treatment Type, F (1, 25) = .12, p = .74, partial η² = .00, nor a Time × Treatment Type interaction, F (2, 50) = .81, p = .45, partial η² = .03, for the BFNE. There was, however, a significant main effect of Treatment Type for the PRCS, F (1, 26) = 6.04, p = .02, partial η² = .19, qualified by a marginally significant Time × Treatment Type interaction, F (2, 52) = 3.15, p = .05, partial η² = .11. Simple main effects showed that participants completing EGT reported significantly lower PRCS scores than participants completing VRE at post-treatment, F (1, 26) = 5.62, p < .05, but only marginally lower scores at long-term follow-up, F (1, 26) = 4.11, p = .05.
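As a consistency check on the effect sizes reported in this section, partial eta squared can be recovered from an F statistic and its degrees of freedom through the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity to the values reported in the text; the function name and the labels are illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error, partial eta squared as reported in the text)
reported = [
    ("BFNE, Time", 6.08, 2, 50, 0.20),
    ("PRCS, Time", 24.76, 2, 52, 0.49),
    ("BFNE, Treatment Type", 0.12, 1, 25, 0.00),
    ("BFNE, Time x Treatment Type", 0.81, 2, 50, 0.03),
    ("PRCS, Treatment Type", 6.04, 1, 26, 0.19),
    ("PRCS, Time x Treatment Type", 3.15, 2, 52, 0.11),
]

for label, f_value, df_effect, df_error, eta_reported in reported:
    eta = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{label}: recomputed {eta:.2f}, reported {eta_reported:.2f}")
```

Under this identity, each recomputed value rounds to the figure given in the text.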

Behavioral Avoidance Task

All but two participants completed the speech task: one person declined to do the speech task and one person completed the follow-up assessment by phone. Behavioral data for all three time points (pre, post, follow-up), however, were complete for only a small subset of participants (n = 9). Because the sample size is too small to test statistical significance, the results are presented descriptively. As shown in Fig. 2, the duration of the speech across assessment points showed a different pattern for each treatment. VRE showed little-to-no change from pretreatment to posttreatment and noticeable improvement from posttreatment to follow-up. EGT showed improvement from pretreatment to posttreatment and lost some of that improvement by follow-up (but still showed overall improvement in speech duration from pretreatment). Specifically, the EGT group spoke longer (183 s, on average) at posttreatment (M = 554.5, SD = 97.6) than at pretreatment (M = 371.7, SD = 203.7), and somewhat shorter (41 s) at long-term follow-up (M = 513.5, SD = 175.8) than at posttreatment. The VRE group spoke only slightly longer (6 s) at posttreatment (M = 281.9, SD = 175.9) than at pretreatment (M = 275.9, SD = 183.1) and noticeably longer (101 s) at long-term follow-up (M = 382.9, SD = 219.0) than at posttreatment. Overall, participants in both treatments spoke longer at follow-up than at pretreatment. Peak anxiety during the speech task, however, did not seem to change over time. Self-reported anxiety ratings at pretreatment (VRE: M = 7.69, SD = 2.25; EGT: M = 7.60, SD = 1.88), posttreatment (VRE: M = 7.10, SD = 2.55; EGT: M = 6.44, SD = 2.79), and long-term follow-up (VRE: M = 7.70, SD = 1.77; EGT: M = 6.83, SD = 2.08) differed by no more than 1.2 points within either group (see Fig. 3).
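The changes in speech duration quoted above follow directly from the group means. A minimal sketch of that arithmetic, using only the means reported in the text (in seconds), is given below; the dictionary layout and function name are illustrative.

```python
# Mean speech duration in seconds, as reported in the text.
speech_means = {
    "EGT": {"pre": 371.7, "post": 554.5, "follow_up": 513.5},
    "VRE": {"pre": 275.9, "post": 281.9, "follow_up": 382.9},
}


def duration_changes(means):
    """Differences in mean speech duration between assessment points."""
    return {
        "pre_to_post": round(means["post"] - means["pre"], 1),
        "post_to_follow_up": round(means["follow_up"] - means["post"], 1),
        "pre_to_follow_up": round(means["follow_up"] - means["pre"], 1),
    }


changes = {group: duration_changes(m) for group, m in speech_means.items()}
for group, diff in changes.items():
    print(group, diff)
```

The pre-to-follow-up differences computed this way are descriptive only, since complete behavioral data were available for a small subset of participants.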

Fig. 2 Time spent speaking during speech task at pretreatment (Pre), posttreatment (Post), and long-term follow-up (LTF) for exposure group therapy (EGT) and virtual reality exposure (VRE)

Fig. 3 Peak anxiety ratings (0–10) during speech task at pretreatment (Pre), posttreatment (Post), and long-term follow-up (LTF) for exposure group therapy (EGT) and virtual reality exposure (VRE)

Treatment Remission

Twenty-four participants completed the diagnostic interview, just over half of whom (54.2%, n = 13) were classified as being in remission, as they no longer met diagnostic criteria for social anxiety disorder. Severity ratings for the participants who continued to meet criteria for the disorder (n = 11) were mild (n = 5), moderate (n = 5), or severe (n = 1). The majority of participants (all but 2) did not have co-morbid Axis I disorders.

The majority of participants (68%, n = 23) rated themselves as “very much improved” (44%, n = 15) or “much improved” (24%, n = 8).

Additional Treatment

One participant began medication for social anxiety during the follow-up period, and one participant who had been stabilized on medication for social anxiety before beginning treatment discontinued it during the follow-up period. No one sought additional psychological treatment for social anxiety during the long-term follow-up period.

Discussion

This is the first study to evaluate the extent to which VRE and EGT for social anxiety disorder produce benefits that endure over the long term. Data from all sources (self-report, clinician-rated, and behavioral) show improvement an average of 6 years after treatment was completed. Our findings are consistent with follow-up studies of other forms of cognitive-behavioral treatment for social anxiety disorder. The medium (i.e., BFNE = .68) to large (i.e., PRCS = 1.35) effect sizes for our sample are comparable to those of Mörtberg et al. (2011), who found large effects (d = 1.29–1.61) for a composite score of social anxiety measures over a 5-year follow-up, and Hedman et al. (2011), who found medium to large effects (ds = .63–1.32) at 6 months following completion of an internet-based treatment.

The present study is the longest follow-up study of EGT or VRE for social anxiety and of VRE for any anxiety disorder. Our results are consistent with a small body of research supporting the robustness of VRE for treating anxiety. Another study, for example, found that participants who completed VRE for fear of flying maintained or improved upon treatment gains following a fear-relevant event, the September 11, 2001 terrorist attacks (Anderson et al. 2006). Social stimuli are more difficult to create within computerized environments than stimuli for fear of flying; the current study shows that exposure to social fears in a virtual world can have a measurable impact on social behavior in the real world, long after treatment is completed.

Only one other follow-up study of social anxiety treatment has used a behavioral avoidance test (Heimberg et al. 1993), and a behavioral avoidance task has never before been used as a part of a long-term follow-up assessment of VRE for any anxiety disorder. All but two participants completed the speech task: one could not come to the assessment in person, and another declined to do the speech task. Although speech duration is a rather sterile means of indexing behavior change in a clinically meaningful way, those participants who completed the speech task spoke about 2 min longer at long-term follow-up than at pretreatment, despite reporting average peak anxiety ratings of 7 out of 10. The behavioral data reinforce a core tenet of exposure therapy: to feel the fear and do it anyway. The behavioral data, combined with clinician-rated diagnostic remission, participants’ perceptions of improvement, and the data showing that only one person sought additional treatment during the follow-up period, suggest that treatment produces real-life benefits that endure for years.

There are limitations to the study, the most important of which we note below. As is true of all long-term follow-up studies of randomized clinical trials for social phobia, there was no control condition to which long-term follow-up data could be compared. In our parent study, participants initially assigned to the control condition were randomly assigned to treatment following the 8-week waiting period; given the negative impact of the disorder and the availability of effective treatment, it is unethical to do otherwise. Given the chronicity of social anxiety disorder, it seems unlikely that improvements are due to spontaneous recovery, but the methodology does not allow us to be sure. Also, our primary self-report measure focused on the core fear of social anxiety (fear of negative evaluation) rather than symptoms more broadly.

The sample size for the study was not nearly large enough to determine non-inferiority of one treatment to another, so it is premature to conclude that the treatments are equally effective, especially given the difference between treatments on one self-report questionnaire. Our sample size (N = 28) is, however, comparable to or larger than those of 2 of the 3 existing studies of the long-term durability of CBT for social anxiety (Heimberg et al. 1993, N = 19; Mörtberg et al. 2011, N = 27), but smaller than that of the Hedman et al. (2011) study, which conducted a non-inferiority analysis with a sample size of 48.

Whereas this is the first long-term follow-up study of cognitive behavioral treatment for social anxiety disorder to include a behavioral avoidance test, the sample size precludes drawing conclusions about the long-term impact of treatment on behavior. Clearly, further research using behavioral measures with good psychometric properties, including ecological validity, is essential to evaluate the long-term impact of cognitive behavioral therapy for social anxiety disorder on actual behavior.

The attrition rate for the current study (55%) is higher than one would like, but it is comparable to that of Heimberg et al. (1993; 52.5%), the only other long-term follow-up study using an in-person assessment. Unlike the Heimberg et al. study, our results do not appear to be affected by selective attrition, as there were no significant differences on any symptom measure or demographic factor at pre-treatment or post-treatment between those who did and did not complete the follow-up assessment. The fact remains, however, that the participants who no longer had valid contact information, did not respond, or declined to participate may have differed from those completing the follow-up assessment in a systematic way that was not measured.

There are some limitations to generalizability, as well. The treatments (both VRE and EGT) included exposure to public speaking fears only, which is the most common—but not the only—social fear among people with social anxiety disorder. The use of a behavioral avoidance task other than a speech would have increased our confidence that treatment effects generalize to other social fears. The improvement in self-reported ratings of fear of negative evaluation, however, provides some evidence of generalizability, along with the diagnostic remission data. The ethnic/racial diversity of the participants in our study represents the diverse community in which the study was conducted; such populations, however, are underrepresented in treatment efficacy research (Whaley and Davis 2007), including virtual reality exposure therapy.

Despite these limitations, this is the first study to evaluate the sustainability of treatment gains made following virtual reality exposure therapy and exposure group therapy. The positive effects of both treatments, measured in a variety of ways, endure for quite some time. Our results are consistent with other long-term follow-up studies of a variety of forms of cognitive-behavioral treatment for social anxiety, and add to this literature with the inclusion of a direct measure of behavior. Our study shows that improvement endures across self-report, clinician-rated, and behavioral data, providing strong evidence that a little bit of CBT can go a long way. Most people who could benefit from existing treatments, however, do not access them (Grant et al. 2005). It is time to turn our attention to better dissemination, access, and uptake of cognitive-behavioral based treatments for those who suffer with this debilitating disorder.