The Diagnostic and Statistical Manual of Mental Disorders (DSM) defines Primary Insomnia (PI) as a chronic difficulty initiating and/or maintaining sleep and/or non-restorative sleep, not exclusively occurring during the course of another sleep or mental disorder (American Psychiatric Association [DSM-IV-TR], 2000). Other classification systems, such as the International Classification of Sleep Disorders (ICSD-2, 2005), distinguish three subtypes of primary insomnia: psychophysiological insomnia, sleep-state misperception (or paradoxical insomnia) and idiopathic insomnia. Although insomnia can be acute, PI is diagnosed when the sleep disturbance is present for 1 month or longer and with a frequency of at least three times a week. A detrimental impact upon daytime functioning, due to associated symptoms such as excessive fatigue, lowered mood and impaired cognitive function, is an essential feature of the PI definition. More recently, chronic insomnia has also been documented as negatively impacting several domains of health-related quality of life (HRQoL) (Kyle & Espie, 2010; Kyle, Morgan, & Espie, 2009).

Until now, pharmacotherapy with benzodiazepine receptor agonists and/or sedative antidepressants has been the most frequently used treatment strategy for chronic insomnia (NIH, 2005). However, worldwide efforts are being made to disseminate non-pharmacological treatment strategies. Cognitive Behavioral Therapy for Insomnia (CBT-i), a multi-component treatment consisting of cognitive and behavioral strategies, is currently gaining considerable ground. Its approach fits well with a behavioral model of chronic insomnia in which maintaining factors such as maladaptive sleep habits and sleep-disruptive beliefs, alongside (conditioned) arousal, are considered crucial sources of chronification (Spielman, Caruso, & Glovinsky, 1987). Results from several meta-analyses point to a short-term efficacy of CBT-i equal to that of pharmacotherapy, and there is growing evidence that CBT-i outperforms pharmacotherapy in the long run (Morin et al., 2006). These promising results, together with the well-known adverse consequences of routine medication use (Riedel et al., 1998), make CBT-i a treatment format highly valued by professionals as well as by patients.

In their 1999 review, Morin et al. concluded that 70–80% of PI patients benefit from a non-pharmacological treatment, 50% achieve clinically meaningful improvements, and about one-third become good sleepers (Morin et al., 1999). This implies that not all patients profit to the same extent from CBT-i, and that some even fail to respond at all. Because these results come mainly from research settings, documenting outcomes in large samples of patients seeking treatment in various clinical settings has recently been advocated (Morin et al., 2006; Perlis et al., 2000). The search for predictors of CBT-i treatment response has received some attention in previous research (Morin et al., 1999; Murtagh & Greenwood, 1995), but few demographic and clinical variables have been reliably associated with outcome. Again, the majority of data on predictive variables comes from research trials where characteristics of ‘real world’ patients may be diluted (Espie, Inglis, & Harvey, 2001). In their clinical effectiveness study, Espie, Inglis, Tessier, and Harvey (2001) examined which pretreatment variables predicted sleep improvements after CBT-i. They found that greater baseline sleep disturbance and higher baseline anxiety and depression predicted a good outcome. Hypnotic use did not differentiate responders from non-responders and demographic data were of no predictive value. A sound understanding of the relation between predictors and outcome can result in the identification of those individuals for whom CBT-i may or may not be indicated and is therefore of utmost clinical importance. Notwithstanding the importance of outcome data expressed as pre- to post-treatment mean differences in the variables of interest, Riemann and Perlis (2009) recently emphasized the need to interpret data in terms of clinical relevance, such as percentages of responders and non-responders. Furthermore, since impaired daytime functioning is the primary reason to seek treatment (e.g. Morin et al., 2006), including aspects of daytime functioning and HRQoL as outcome parameters is indispensable in clinical insomnia research. However, only a few studies to date have reported on treatment outcomes beyond sleep quality, and results are ambiguous (Omvik et al., 2008). These aforementioned aspects of effectiveness in clinical practice, predictors of outcome, and clinical relevance of treatment results are typical features of recent psychotherapy research in general (Westbrook & Kirk, 2005).

Our study objectives were to examine the effectiveness of CBT-i in PI patients who sought treatment in a clinical setting. In addition to improvements in sleep quality, we examined daytime functioning and physical and mental HRQoL immediately after treatment and at 6 months follow-up. Finally, we tried to identify which baseline characteristics were predictive of treatment response.

Methods

Participants

Participants in this study were 157 consecutive patients with PI, either self-referred or referred by their general practitioner for insomnia treatment in the Leuven University Centre for Sleep/Wake disturbances (LUCS, Belgium). Only patients who explicitly met the DSM-IV-TR criteria for PI were included in this study (APA, 2000): all patients with evidence of current medical and/or psychiatric causes or comorbidities were excluded. Of the original 157 patients, 19 were excluded from the sample: 17 patients dropped out during treatment, 1 patient received a diagnosis of a sleep disorder during treatment and 1 patient started a medical treatment with an impact on sleep. These patients were therefore excluded from the sample post factum (Table 1). To be eligible for CBT-i treatment, patients were allowed to use a maximum of 1 benzodiazepine and/or 1 sedating antidepressant. Usage of non-sedating antidepressants, explicitly prescribed for sleep maintenance problems (and not as a mood regulator), was not a reason for exclusion. A statement of formal agreement was obtained from the Ethical Committee of the University Hospitals of Leuven.

Table 1 Characteristics of the study population

Measures

At baseline, post-treatment, and at 6 months after treatment, patients completed six measures as part of routine outcome monitoring, including assessments of sleep quality (PSQI), daytime function (CIS-20), HRQoL (SF-36), sleep-disruptive beliefs (DBAS-16) and psychological well-being (GHQ, PANAS). Graphical sleep logs were used, not as a formal outcome measure, but mainly as a key treatment tool. The questionnaires were sent by mail to each participant and were anonymized to reduce social desirability in the answering patterns. Patients were informed that obtained data would be used for research purposes.

Pittsburgh Sleep Quality Index (PSQI)

This 19-item questionnaire (Buysse, Reynolds, Monk, Berman, & Kupfer, 1989) assesses sleep quality and disturbances over a 1 month interval. Seven different components are measured: (a) subjective sleep quality, (b) sleep onset latency, (c) total sleep time, (d) sleep efficiency, (e) sleep disturbances, (f) use of sleep medication, and (g) daytime dysfunction. The total score reflects the severity of the sleep disturbance, with a range from 0 to 21. Scores greater than 5 are indicative of severe sleep difficulties. Psychometric data of the Dutch version of the PSQI revealed acceptable internal homogeneity and validity (Sillis & Cluydts, 1992). To obtain correct post-treatment scores, the standard interval of one month was changed to 1 week.

Checklist Individual Strength (CIS-20)

The CIS-20 (Vercoulen et al., 1994) is a 20-item self-report questionnaire assessing the severity of daytime impairments over the previous 2 weeks. There are four subscales: subjective experience of fatigue, concentration, motivation and physical activity. Each item is scored on a Likert scale (score 1–7). Total scores range from 20 to 140, with 76 as the cut-off point for severe fatigue in employees (Bültman, de Vries, Beurskens, Bleijenberg, & Vercoulen, 2000). The CIS-20 has been translated into Dutch and validated by Vercoulen et al. (1994). The questionnaire has been used in cancer survivors and showed good reliability, discriminative validity and sensitivity to change (van der Lee & Garssen, 2010). Immediately after treatment, a 1 week interval was used.
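As an illustration of the scoring rule described above, the following minimal sketch sums 20 item scores and applies the cut-off of 76. It assumes that items have already been keyed so that higher values indicate greater impairment (any reverse-keying of individual items is not described here), and it treats a total above 76 as severe, in line with the ">76" notation used in the Results. The function names are hypothetical and not part of the instrument.

```python
from typing import Sequence

CIS20_ITEMS = 20
SEVERE_CUTOFF = 76  # cut-off cited in the text (Bültman et al., 2000)


def cis20_total(item_scores: Sequence[int]) -> int:
    """Sum 20 item scores (each 1-7).

    Assumption: items are already keyed so that higher = more impaired.
    """
    if len(item_scores) != CIS20_ITEMS:
        raise ValueError("CIS-20 requires exactly 20 item scores")
    if any(not 1 <= score <= 7 for score in item_scores):
        raise ValueError("each item score must lie between 1 and 7")
    return sum(item_scores)


def is_severely_impaired(total: int) -> bool:
    """Apply the '>76' rule (assumed reading of the cut-off)."""
    return total > SEVERE_CUTOFF
```

With this reading, the minimum and maximum totals are 20 and 140, matching the range given above.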

Short-Form Health Survey (SF-36)

The SF-36 (Ware & Sherbourne, 1992) is a generic health status instrument frequently used to assess HRQoL (Léger, Scheuermaier, Philip, Paillard, & Guilleminault, 2001). This 36-item self-report questionnaire comprises eight scales covering the following domains: physical functioning, physical role limitations, perceived general health, bodily pain, mental health, emotional role limitations, social functioning, and vitality. A physical component summary score and a mental component summary score can be computed by adding the four physical subscales and adding the four mental subscales. Scores range from 0 to 100, with lower scores reflecting greater impairment. A validated Dutch version was used (Aaronson et al., 1998). To appropriately measure the perceived HRQoL immediately after treatment, the acute (1-week) recall version was used instead of the standard 4-week recall version.

Dysfunctional Beliefs and Attitudes Scale (DBAS-16)

The abbreviated DBAS version (Morin, Vallières, & Ivers, 2007) is a self-report questionnaire on maladaptive beliefs and attitudes about sleep. For exploratory purposes only, we translated the original questionnaire into Dutch and constructed a Likert scale (score 1–5). The score is the simple average of all 16 item scores, with a stronger endorsement of sleep-disruptive beliefs interpreted as more maladaptive. Scores range from 1 to 5.

General Health Questionnaire (GHQ-12)

The GHQ-12 (Goldberg, 1972) is the 12-item version of the General Health Questionnaire. The GHQ is one of the most widely used self-report tests for assessing psychological health. This questionnaire can be used as a screening instrument and a state questionnaire for psychological wellbeing. A Dutch version was supplied by Koeter and Ormel (1991). We used a bimodal scoring (0–0–1–1), so scores ranged from 0 to 12. The most common scoring methods of the GHQ are bimodal (0-0-1-1) and Likert scoring styles (0-1-2-3) (Montazeri et al., 2003).
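The difference between the two scoring methods mentioned above can be made concrete with a short sketch. It assumes that each of the 12 responses is recorded as the index (0 to 3) of the chosen answer option, ordered from least to most symptomatic; the function and constant names are hypothetical.

```python
from typing import Sequence, Tuple

BIMODAL: Tuple[int, int, int, int] = (0, 0, 1, 1)  # scoring used in this study
LIKERT: Tuple[int, int, int, int] = (0, 1, 2, 3)   # alternative scoring style


def ghq12_score(responses: Sequence[int],
                weights: Tuple[int, int, int, int] = BIMODAL) -> int:
    """Score 12 responses, each an option index 0-3 (assumed encoding)."""
    if len(responses) != 12:
        raise ValueError("GHQ-12 requires exactly 12 responses")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("each response must be an option index 0-3")
    return sum(weights[r] for r in responses)


# Hypothetical respondent alternating between the second and third option.
example = [1, 2] * 6
bimodal_total = ghq12_score(example)           # range 0-12
likert_total = ghq12_score(example, LIKERT)    # range 0-36
```

Under bimodal scoring the two lowest options contribute nothing, which is why the total is bounded by 12 rather than 36.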

Positive and Negative Affect Schedule (PANAS)

The PANAS is a 20-item self-report measure of positive and negative affect developed by Watson, Clark, and Tellegen (1988). Positive affect reflects the extent to which one experiences pleasurable engagement with the environment. Negative affect reflects subjective distress and unpleasurable engagement. A trait and a state version are available. In this study, a validated Dutch trait version was used (Engelen, De Peuter, Victoir, Van Diest, & Van den Bergh, 2006).

Treatment

All participants were treated using the CBT-i program adapted from the original protocol developed by Morin (1993) and by Morin and Espie (2004). The treatment protocol consisted of six 2-h group sessions, held once a week and containing 7–8 participants. All sessions were led by 1 clinical psychologist who was trained and supervised by a senior clinical psychologist/cognitive behavioral therapist with expertise in CBT-i delivery (first author). During the first session participants received general information on sleep and insomnia, education on sleep hygiene and information on sleep medications, and they were instructed how to complete the sleep diary. By using an individualized reduction schedule, patients were encouraged to gradually taper off their hypnotic medications (benzodiazepines/sedative antidepressants). In the second session, the core behavioral techniques were introduced. From week 2 until week 6, patients were asked to adhere to sleep restriction rules and to follow stimulus control guidelines. Session three targeted cognitive dimensions of insomnia using traditional cognitive restructuring techniques to influence erroneous and/or unhelpful sleep-related beliefs and attitudes. The fourth session focused on relaxation: progressive muscle relaxation and imagery-induced relaxation were taught. A relaxation CD was offered to encourage practice at home. Session five introduced cognitive control techniques such as the worry-chair and thought-blocking methods. Finally, the sixth and last session consisted of a review of the different treatment components and the development of an individualized relapse prevention sheet.

Each session followed the same time frame. We started with a detailed review and discussion of the sleep log of each participant, to check adherence to the stimulus control guidelines and to the sleep restriction regimen. Moreover, solutions for certain impediments to treatment compliance were suggested. Next, the session-specific therapeutic component was introduced and illustrated with a typical case. Patient-therapist interactions were encouraged to ensure a good understanding of each treatment aspect. Patient-patient interactions were welcomed to promote mutual support. To end, each participant was given a homework assignment to practice specific skills, and individualized sleep restriction rules were recommended based upon sleep log data. In addition to the weekly sessions, patients received a comprehensive manual.

Exploratory Data Analyses

We used a linear model for repeated measures to evaluate the evolution over time in sleep, daytime function and physical and mental HRQoL. Results of analyses were corrected for age and gender. The model uses an unstructured covariance matrix for the three repeated measurements over time, which allowed us to include subjects with missing data in the analysis (Verbeke & Molenberghs, 1997). Tukey adjustments were carried out for multiple comparisons between the points in time. Further, we examined whether the evolution over time in outcome parameters (sleep, daytime function, physical and mental HRQoL) differed as a function of demographic aspects (age, gender), characteristics of the sleep disturbances (duration, severity, type, medication usage) and baseline values on daytime function, physical and mental HRQoL, sleep-disruptive beliefs, and psychological health. For this purpose, for each of the candidate moderating variables, we used a separate linear model containing an interaction between the factor and time (and correcting for age and gender). For each measure, we computed Cohen’s d effect sizes (Cohen, 1977). Effect sizes above 0.8 are interpreted as large, above 0.5 as moderate and above 0.2 as small. The alpha level was set at 1% because of the large number of separate linear models. All analyses were performed using SAS software, version 9.2 of the SAS System for Windows. Copyright © 2002 SAS Institute Inc. SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc., Cary, NC, USA. The SAS-procedure PROC MIXED was used for the linear model.
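The effect-size interpretation can be sketched as follows. The exact variant of Cohen's d used in the analyses is not specified here, so this example assumes the common pooled-standard-deviation form, d = (M2 − M1) / SD_pooled, applied to two sets of scores; the label "negligible" for values of 0.2 or below is likewise an assumption added for completeness, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence


def cohens_d(before: Sequence[float], after: Sequence[float]) -> float:
    """Standardized mean difference using the pooled SD (assumed variant)."""
    n1, n2 = len(before), len(after)
    s1, s2 = stdev(before), stdev(after)
    pooled_sd = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    return (mean(after) - mean(before)) / pooled_sd


def label_effect(d: float) -> str:
    """Map |d| onto the thresholds stated in the text (0.2, 0.5, 0.8)."""
    size = abs(d)
    if size > 0.8:
        return "large"
    if size > 0.5:
        return "moderate"
    if size > 0.2:
        return "small"
    return "negligible"  # label not used in the text; added as an assumption
```

The sign of d depends on the direction of scoring, so the label is based on the absolute value.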

Results

Demographic Data

Table 1 presents the inclusion/exclusion flowchart as well as the main demographic characteristics of our study population. Two-thirds of our patient sample were women and the mean age was 44 years. The majority had mixed insomnia complaints with a mean duration of over 7 years. Seventy-six percent of the patients were using sleep medication.

Results of Statistical Analyses

Summary data on all means with 95% confidence intervals and p-values are provided in Table 2 for sleep and in Table 3 for global daytime function and for the physical and mental HRQoL component. More detailed scores on subscales of daytime function and on domains of HRQoL can be obtained from the authors on request. Highly significant improvements after treatment and at follow-up were found for sleep, daytime function and on mental and physical HRQoL. No significant changes were observed between post-treatment and follow-up, except for a significant augmentation in total sleep time. A significant increase in medication use was observed from post-treatment to follow-up.

Table 2 Means, p-values and effect sizes on SOL, TST, SE, SQ, PSQI total and medication use
Table 3 Means, p-values and effect sizes on daytime function and health-related quality of life (HRQoL)

Results of Clinical Analyses on Sleep, Daytime Function and HRQoL

Sleep

Moderate to large pre–post and pre–follow-up effect sizes were found on all sleep parameters. The post-treatment to follow-up comparison did not yield a significant effect size, except for total sleep time (TST). Compared to a mean sleep onset latency (SOL) of 48.4 min pre-treatment, a SOL of 24.7 and 26.9 min was reached post-treatment and at follow-up. The initial total sleep time (TST) of 325 min increased to 353 min after treatment and to 388 min at follow-up. Sleep efficiency (SE) increased from 68% pre-treatment to 84% after treatment, and decreased towards 83% at follow-up. Sleep quality (SQ) improved from pre to post and remained stable at follow-up (resp. 2.0, 1.1, 1.1). The PSQI total diminished from 12.6 pre-treatment to 7.1 and 6.8, respectively at post-treatment and follow-up. Finally, and importantly, the percentage of medication-free patients increased significantly: 24% were medication-free before treatment and this percentage increased to 71% after treatment, but at follow-up it decreased to 59%. Table 2 gives pre–post, pre–follow-up and post–follow-up effect sizes on all the above mentioned sleep parameters. Figure 1 depicts the frequency rates in medication use pre- and post-treatment and at follow-up.

Fig. 1 Frequency of medication use (%)

Daytime Functioning

Large pre–post and pre–follow-up effect sizes were observed on daytime impairments (CIS-TOT). The post-treatment to follow-up comparison did not yield a significant effect size (Table 3). When using the traditional cut-off score of severe impairment (>76) (Bültman et al., 2000), 74% of our sample were considered as very impaired before treatment. Immediately after treatment and at follow-up these percentages dropped significantly to 40 and 35% respectively.

HRQoL

The physical HRQoL component score (SF-PHYS) yielded pre–post and pre–follow-up moderate effect sizes. Pre–post and pre–follow-up comparisons on the mental HRQoL component (SF-MENT) yielded large effect sizes. On post–follow-up differences, only small effect sizes were observed (Table 3).

Predictors of Outcome

Neither age nor gender, nor type or duration of the sleep disturbance, significantly affected the evolution of sleep, daytime function or HRQoL. The same held for baseline medication use and pre-treatment levels of dysfunctional sleep-related beliefs. Table 4 gives an overview of all significant predictors of sleep, daytime function, and physical and mental HRQoL. Because of our conservative significance threshold (p < .01), several predictors were on the edge of significance and are therefore also listed in Table 4.

Table 4 Significant predictors of sleep, daytime function and health-related quality of life (HRQoL)

Predictors of Change in Sleep

More severe sleep disturbance and higher levels of daytime dysfunction predicted larger improvements in sleep quality from pre to post and from pre-treatment to follow-up. Patients with lower levels of positive affect and lower physical HRQoL showed larger pre–post and pre–follow-up improvements in insomnia severity compared to patients with higher levels of positive affect and physical HRQoL.

Predictors of Change in Daytime Function

Pre to post and pre to follow-up improvements in daytime functioning were larger when pre-treatment daytime impairment was greater.

Predictors of Change in Physical and Mental HRQoL

High baseline sleep disturbance, substantial daytime impairment, pre-treatment low positive affect, low levels of psychological well-being and impaired physical and mental HRQoL were associated with larger pre–post and pre–follow-up improvements in physical and mental HRQoL. High negative affect only predicted changes in mental HRQoL.

Discussion

At baseline, our patients suffered from severe insomnia as demonstrated by a mean total PSQI score above 12. After treatment, 41% of our sample reached a PSQI score below 5, which is indicative of normal sleep (Buysse et al., 1989), 83% obtained a SOL below the clinical cut-off of 30 min and 51% reached an SE at or above the clinically relevant threshold of 85% (Morin et al., 2006). These outcomes remained remarkably stable at 6 months follow-up (resp. 41%, 82%, 50%). A slight but significant relapse in sedating medication use observed at follow-up should somewhat temper our results. However, as depicted in Fig. 1, the percentage of daily usage remained low and the percentage of medication-free patients was still high at follow-up. Based on these data, we can conclude that, in addition to the large reduction in medication use, up to half of our patient group reported normalized sleep values and more than 80% achieved sleep improvements of clinical relevance, in terms of normalized SOL. This latter finding is of particular importance since research has pointed to prolonged SOL as the only objective sleep abnormality seen in an insomniac population compared to normal sleepers, besides prolonged wakefulness (Edinger et al., 2003).
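The clinical thresholds used in this paragraph (PSQI below 5, SOL below 30 min, SE of 85% or more) can be expressed as a small sketch that converts individual scores into the kind of percentages reported above. The patient records are invented for illustration only and do not come from the study data; the class and function names are hypothetical. Note that a PSQI total of exactly 5 is counted as not normalized here, following the "below 5" wording.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SleepSummary:
    psqi_total: int   # PSQI total score (0-21)
    sol_min: float    # sleep onset latency in minutes
    se_pct: float     # sleep efficiency in percent


def clinical_flags(s: SleepSummary) -> Dict[str, bool]:
    """Apply the three thresholds named in the text."""
    return {
        "psqi_normal": s.psqi_total < 5,
        "sol_normal": s.sol_min < 30,
        "se_normal": s.se_pct >= 85,
    }


def percent_meeting(patients: List[SleepSummary], key: str) -> float:
    """Percentage of patients meeting one criterion."""
    hits = sum(clinical_flags(p)[key] for p in patients)
    return 100.0 * hits / len(patients)


# Invented example records (not study data).
patients = [
    SleepSummary(psqi_total=4, sol_min=20, se_pct=90),
    SleepSummary(psqi_total=7, sol_min=25, se_pct=86),
    SleepSummary(psqi_total=12, sol_min=45, se_pct=70),
    SleepSummary(psqi_total=3, sol_min=28, se_pct=80),
]
summary = {k: percent_meeting(patients, k)
           for k in ("psqi_normal", "sol_normal", "se_normal")}
```

Reporting each criterion separately, as done above, makes visible that a patient can normalize on one parameter (for example SOL) without doing so on another (for example SE).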

Concerning daytime functioning, we found in our sample substantially elevated baseline scores on daytime impairments compared to healthy employees in the general population (Bültman et al., 2000; Vercoulen et al., 1994), i.e. at baseline almost 75% crossed the clinical cut-off of severe daytime impairments. After treatment and at follow-up, although many patients improved, 40% and 35%, respectively, still remained severely impaired.

Regarding HRQoL, we found clear pretreatment impairments especially on emotional domains of HRQoL at baseline, compared to two large samples in the general Dutch population (Aaronson et al., 1998). After treatment, nearly all domain scores closely resembled normative data or significantly moved toward these normative scores. These data support former findings on the correlation of insomnia and decreased quality of life (Kyle et al., 2009) and show that CBT-i can improve functioning in both physical and mental domains of HRQoL.

Taken together, these outcome data demonstrate the clinical effectiveness of CBT-i, not only in ameliorating night time sleep disturbance, but also in improving HRQoL and daytime functioning. However, up to 60% did not achieve normalized sleep and up to 40% kept on struggling with severe daytime impairments after treatment. Several explanations could account for these observations. First, the follow-up period of 6 months could be too short to reveal the complete treatment response. Daytime impairments have often resulted after many years of sleep disturbance and may therefore not be expected to resolve within 6 months. Second, referring to the recent view on PI as being a 24 h disorder characterized by hyperarousal (Bonnet & Arand, 2010; Horne, 2010), our data on continuing daytime impairments despite substantially improved sleep, could reflect that daytime performance deficits are not solely attributable to the sleep disturbance per se, but result more likely from heightened levels of arousal during day- and nighttime. Finally, the observation of continuing daytime dysfunction in 40% of our sample compared to generally normalized HRQoL could be explained by the fact that the experience of improved sleep (irrespective of explicit normalization) is associated with augmented perceived self-efficacy which has formerly been shown to be a consistent predictor of short- and long-term quality of life (Marks, Allegrante, & Lorig, 2005; Strecher, DeVellis, & Rosenstock, 1986). The slight but significant relapse in medication use needs to be viewed as a troublesome clinical outcome. A persistent failure to adequately cope with situational insomnia could account for this relapse. Offering the opportunity to participate in post-treatment booster sessions could remediate the risk for relapse and prevent a recommencement of medication use.

In our search for predictors of treatment outcome, we determined no significant influence of age, gender, medication use, and type or duration of the sleep disturbance. These findings support the idea of overall clinical effectiveness of this treatment and replicate former data on predictors (Espie, Inglis, & Harvey, 2001). On the other hand, the finding that pre-treatment levels of dysfunctional sleep-related beliefs did not play a predictive role differs from some former research showing dysfunctional thinking as a positive indication for CBT-i (Edinger, Carney, & Wohlgemuth, 2008; Espie, Inglis, & Harvey, 2001). However, Jansson-Fröjmark and Linton (2008) concluded from their study on CBT-i that sleep-related beliefs play a less unequivocal predictive role than previously envisioned. Moreover, in our study severely impaired sleep did predict larger improvements on sleep, and on physical and mental HRQoL. Also, pronounced daytime impairments were associated with more improved daytime functioning and predicted greater improvements on sleep and on physical and mental HRQoL. Although these findings may result from a statistical phenomenon such as regression to the mean, an alternative and plausible explanation could be that patients who are more distressed are more motivated to comply with difficult treatment strategies, such as sleep restriction. Of note, adherence to treatment techniques has been shown to be related to an improved perception of sleep (Vincent & Hameed, 2003). Finally, lower psychological well-being at baseline also predicted larger improvements on physical and mental HRQoL. One hypothesis could be that patients with higher psychological burden profited more from the mutual support of the group approach compared with patients with no such complaints. Another hypothesis could be that these patients gained more from learned coping skills, such as relaxation exercises and cognitive control techniques, which led to larger improvements. Both the paucity of specific predictors and the observed trend of better outcomes in patients with more severe complaints lead to the conclusion that CBT-i can be viewed as a ‘broad-spectrum’ treatment strategy especially fruitful for those affected with severe night- and daytime complaints.

Although this study has major strengths, notably the large clinical sample, the long-term outcome data and the assessment of daytime functioning and HRQoL, a number of limitations (most of them linked to the clinical setting) need to be mentioned. First, the absence of a control group hampers the drawing of firm conclusions about the specific effect of CBT-i as such. Since more severe complaints of poor sleep and higher daytime dysfunction predicted larger improvements post treatment, the possibility of regression to the mean could not be ruled out. Other potential confounds related to the absence of a control group, such as the impact of passage of time or the influence of other non-specific factors, could not be precluded either. However, these aspects have been extensively investigated in former research and these data suggest that the treatment gains in CBT-i are related to active ingredients of therapy (Morin et al., 1994; Murtagh & Greenwood, 1995). Second, our findings are based upon self-reported data. Objective testing of sleep and of daytime functioning could result in a more accurate view of the actual level of dysfunction. However, as put forward by Riemann and Perlis (2009), treatment outcome is best assessed using the same assessment strategy that is required to establish diagnosis, in this case, by subjective report. Third, analysis of sleep diary derived data definitely could have added strength to our conclusions since it captures the typical night-to-night variability of the insomnia complaint to a better extent. Furthermore, diary data would shed light on the actual therapy compliance and could therefore add an essential variable to the prediction model. Finally, from this database, we could not identify which characteristics were associated with a poor outcome on CBT-i. In light of the hyperarousal perspective of chronic insomnia (Riemann et al., 2010), objective measurements of hyperarousal characteristics during day- and nighttime could have yielded more sophisticated information on predictors of CBT-i (non-) responders.

Conclusion

In conclusion, the therapeutic optimism of the last decade regarding CBT-i treatment is mainly built on data from efficacy studies. Therefore, calls have been made to include unselected, heterogeneous patients being treated in routine settings (Morin et al., 2006). This study should be considered a true effectiveness trial in a ‘real world’ clinical setting. Based on our outcomes, we endorse CBT-i as a beneficial treatment to improve night- and daytime functioning, applicable for PI patients encountered in actual clinical practice.

This study demonstrates the beneficial impact of CBT-i on sleep, daytime function and HRQoL in PI patients treated in a clinical setting, particularly for those with severe sleep disturbance, pronounced daytime impairment, and low psychological well-being. The lack of other outcome predictors such as type and duration of insomnia, demographic aspects, medication use and distorted sleep-related beliefs suggests that CBT-i may be considered an effective treatment strategy that may benefit a broad array of patients, including those who are severely affected. However, future research on moderating and mediating variables performed in well controlled designs is needed to unravel specific treatment mechanisms and to dismantle factors inhibiting adequate treatment response.