Introduction

Many questionnaires are available to appraise various dimensions in low back pain (LBP) patients. The dimensions that were singled out for use in a variety of settings, including routine clinical care, quality management, and research, are pain (back and leg), symptom-specific function, generic well-being, social disability and work disability, along with satisfaction with treatment [1]. The Core Outcome Measures Index (COMI) was originally proposed to condense the measures of these dimensions into a short, easy-to-use questionnaire [1]. This 7-item, quick-to-answer questionnaire was shown to be a reliable and valid instrument [2–4] for assessing multidimensional outcomes in patients suffering from LBP. A sum score from 0 (best health status) to 10 (worst health status) can be computed from the items. It is routinely used in SpineTango, the European Spine registry [5].

Such a multidimensional assessment draws on the biopsychosocial model [6], which has been widely advocated during the last three decades in the conceptualization of the etiology and prognosis of LBP [7]. However, the COMI does not fully cover the psychological risk factors for the development of disability following the onset of LBP, as described by the “yellow flags” [8]. Both the biopsychosocial model and the yellow flags emphasize the importance of psychological factors and, among those, the role of distressed affect. Indeed, as Nicholas et al. [8] pointed out in their reappraisal of psychological yellow flags, taken as a whole the evidence shows a clear relationship between these factors and future clinical and occupational outcomes. Among these yellow flags, distress, depressed mood and anxiety were consistently associated with the transition from acute to chronic pain problems [9]. Depression in particular appears to be associated with a number of negative outcomes [10]. However, anxiety and depression commonly coexist, and when the disorders co-occur, patient disability is greater than when either one alone is present [10, 11]. Along the same lines, studies have highlighted that negative emotions such as anxiety and depression increase the risk of new-onset chronic pain [12] and that these emotions contribute to the deterioration in physical function [13]. Furthermore, the emotional burden of pain and of depressive and anxiety disorders, separately or conjointly, is high [10, 14].

These data raise the issue of psychological distress as belonging to core outcomes. Anxiety and depression were selected as indicators of psychological distress because of the large body of literature stressing their impact on, and relevance to, pain and function. Therefore, adding items related to anxiety and depression to the COMI was particularly relevant. However, such modifications could alter the psychometric properties of the questionnaire. This study was set up to explore the psychometric properties of two new items (addressing anxiety and depression, respectively) and the impact of this new version on the psychometric properties of the questionnaire.

Methods

Study design

This study was part of a larger research project aimed at providing additional knowledge on the psychometric properties of the French version of the COMI by examining the responsiveness of this self-assessment questionnaire to patients’ treatment [15]. A prospective 6-month multicentre cohort study involving patients from non-surgical spine centers was conducted in three French-speaking countries (France, Belgium and Switzerland). Patients were included if they had LBP, with or without radiating leg pain, for at least 4 weeks with an intensity of at least 3 on a 0–10 visual analogue pain scale and were fluent in French. They were excluded if they had a diagnosis of specific LBP (tumor, infection, spondylarthropathy or trauma) or co-morbidities with a functional impact (e.g., heart failure, knee osteoarthritis).

During their first appointment for consultation at one of the centers, patients were invited to participate. After signing an informed consent form, they were instructed to complete a questionnaire booklet (Fig. 1 shows the timeline of the assessment procedure). In addition, a short version of the booklet was given to the patient with the instruction to complete it and mail it back to the centre 1 week later for the study of reproducibility and reliability (hereafter referred to as the short-term follow-up). The follow-up evaluation was scheduled 6 months later (long-term follow-up). The choice of treatment was left to the decision of each investigator. The study was approved by the Medical Ethic Committee of the University Hospitals of Geneva, Switzerland.

Fig. 1 Timeline of the assessment procedure

Core Outcome Measures Index

The validated French version of the COMI contains 7 questions evaluating 5 dimensions (pain, function, symptom-specific well-being, quality of life and disability) [3, 15]. Two numeric rating scales are used to evaluate back and leg pain, and the higher of the two values is used as the pain score. The other items are rated on 5-point Likert scales. Two questions evaluate social and work disability, and their average is used as the disability score. Each of these other dimensions (function, well-being, quality of life, and disability) is then scaled from 0 to 10, with each incremental step of the 5-point Likert scale being allocated 2.5 points. All five dimension scores thus range from 0 (best condition) to 10 (worst condition). The mean of the five dimension scores, which also ranges from 0 to 10, is referred to as the sum score [2–4].
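The scoring rule described above can be sketched in Python. This is only an illustration under stated assumptions: the Likert answers are assumed to be coded 0 to 4 (so that each step is worth 2.5 points) and the pain ratings 0 to 10; the function and parameter names are hypothetical and are not part of the published instrument.

```python
def likert_to_10(step):
    """Rescale a 5-point Likert answer, assumed coded 0-4, to the 0-10 range."""
    if step not in (0, 1, 2, 3, 4):
        raise ValueError("Likert answers are assumed to be coded 0-4")
    return step * 2.5


def comi_sum_score(back_pain, leg_pain, function, wellbeing,
                   quality_of_life, social_disability, work_disability):
    """Mean of the five COMI dimension scores, each on a 0-10 scale."""
    pain = max(back_pain, leg_pain)  # higher of the two 0-10 pain ratings
    # Disability is the average of the rescaled social and work items.
    disability = (likert_to_10(social_disability)
                  + likert_to_10(work_disability)) / 2
    dimensions = [pain, likert_to_10(function), likert_to_10(wellbeing),
                  likert_to_10(quality_of_life), disability]
    return sum(dimensions) / len(dimensions)


# Invented answers, for illustration only.
score = comi_sum_score(back_pain=6, leg_pain=3, function=2, wellbeing=3,
                       quality_of_life=2, social_disability=1, work_disability=2)
print(score)
```

With these invented answers the dimension scores are 6, 5, 7.5, 5 and 3.75, so the sum score is their mean.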

Two items, addressing the importance of anxiety and depression during the past week (Table 1), were created and added to the 7 original items, forming the new questionnaire (COMIAD). Their wording, in particular the examples provided to focus respondents on the issue at stake, was in part inspired by the items of the Hospital Anxiety and Depression Scale [16], and answers were given on a 5-point Likert scale, ranging from ‘not at all’ to ‘extremely’, in congruence with the other items of the COMI [4]. In order to compute the sum score, a new item called “psychological distress” was created by retaining the higher of the anxiety and depression values. It was then scaled from 0 to 10, to parallel the procedure of the COMI, with each incremental step of the 5-point Likert scale being allocated 2.5 points. Hence, the COMIAD sum score is computed as the sum of the six subscales (pain, function, symptom-specific well-being, general well-being, disability, and psychological distress) divided by six, and thus ranges from 0 (“best health status”) to 10 (“worst health status”) [4].
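As a rough sketch of the COMIAD computation just described, the following Python function adds the psychological distress subscale to five dimension scores that are already on a 0 to 10 scale. The 0 to 4 coding of the anxiety and depression answers and all names are assumptions made for illustration.

```python
def comiad_sum_score(comi_dimensions, anxiety, depression):
    """COMIAD sum score from the five 0-10 COMI dimension scores plus the
    anxiety and depression answers (assumed coded 0-4)."""
    if len(comi_dimensions) != 5:
        raise ValueError("expected the five COMI dimension scores")
    for answer in (anxiety, depression):
        if answer not in (0, 1, 2, 3, 4):
            raise ValueError("Likert answers are assumed to be coded 0-4")
    # Psychological distress: higher of the two answers, 2.5 points per step.
    distress = max(anxiety, depression) * 2.5
    return (sum(comi_dimensions) + distress) / 6


# Invented values, for illustration only.
print(comiad_sum_score([6, 5, 7.5, 5, 3.75], anxiety=3, depression=1))
```

Taking the maximum of the two answers mirrors the way the pain subscale retains the higher of the back and leg pain ratings.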

Table 1 Translation of the 2 questions added to the original COMI questionnaire to explore psychological distress

Additional assessment

The validated French versions of the Roland and Morris Disability Questionnaire (RMDQ) [17], the Dallas Pain Questionnaire (DPQ) [18] and the EuroQol 5-dimension questionnaire (EQ-5D) [19] were used to assess the external validity of the COMI. Two additional questionnaires were included to assess the construct validity of the anxiety and depression measures, i.e., the French validated versions of the Beck Depression Inventory (BDI) and of the State-Trait Anxiety Inventory (STAI-S/-T) [20–23].

At the short-term follow-up, in addition to the COMIAD, the clinical evolution was evaluated through a question on a 7-point Likert scale (from “strong improvement” to “strong deterioration”).

At the 6-month follow-up, patients were asked to fill in a questionnaire booklet identical to the one received at baseline (Fig. 1). Treatments administered since inclusion were also recorded, along with a question evaluating the patient’s perceived efficacy of treatment on a 5-point Likert scale (from “no effect” to “excellent effect, almost no symptoms at all”). Patients were also asked to state whether they considered their present state as satisfactory through the following statement: “Taking into account all activities you have to perform in your daily life, your amount of pain and the level of physical disability, if you were to remain the same for the next months, would this be acceptable for you?” [24]. This question is central to the determination of the patient acceptable symptom state (PASS) [25].

Statistical analysis

In line with published recommendations [26], a sample size of 150 patients was planned. This sample size allows a good analysis of the factorial structure of the 7-question index (>20 patients per question). Detecting an effect size of 0.25 for the improvement in condition between baseline and 6-month follow-up with an alpha of 0.05 and a power of 0.8 requires a sample size of 128. Thus, a sample size of 150 patients is sufficient to detect a small effect size while allowing for a 10 % drop-out. Data entry was checked. Missing data were treated according to the specific recommendations for each questionnaire. COMI scores were computed only when all data were present. The analyses described below are based on published recommendations [26].
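The order of magnitude of this sample size can be checked with a short Python sketch. It assumes a two-sided test on paired (within-patient) changes and uses the normal approximation, which is not necessarily the method used in the study; the approximation gives a slightly smaller figure than the 128 reported, a gap that is compatible with a calculation based on the t distribution. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def paired_sample_size(effect_size, alpha=0.05, power=0.80):
    """Normal-approximation sample size for a two-sided test of a
    standardized within-patient change (assumed design)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return ceil(((z_alpha + z_power) / effect_size) ** 2)


n = paired_sample_size(0.25)
print(n)

# 150 patients with a 10 % drop-out leave 135 patients, above the 128 required.
print(150 * (1 - 0.10) >= 128)
```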

Floor and ceiling effects. Floor and ceiling effects were determined for each item of the COMI, for the 2 questions on anxiety and depression, and for the COMIAD sum score by computing the percentage of answers at each extremity of the scale, for the total score and for each subscale.
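A minimal Python sketch of this computation is given below. The scores are invented, and the function name and the default bounds of 0 and 10 are assumptions chosen to match the scaling of the subscales.

```python
def floor_ceiling_percentages(scores, minimum=0, maximum=10):
    """Percentage of answers at the lowest and highest possible value."""
    if not scores:
        raise ValueError("no scores provided")
    n = len(scores)
    floor = 100 * sum(s == minimum for s in scores) / n
    ceiling = 100 * sum(s == maximum for s in scores) / n
    return floor, ceiling


# Invented subscale scores, for illustration only.
scores = [0, 2.5, 5, 10, 10, 7.5, 5, 5]
print(floor_ceiling_percentages(scores))
```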

Construct validity. The external construct validity of the COMIAD was explored by investigating correlations between each subscale and the corresponding validated full-length questionnaire (e.g., BDI for the depression subscale) using Spearman rank correlation coefficients, corrected for ties. Spearman’s Rho coefficients were interpreted as follows: Rho 0.81–1.0 = excellent, 0.61–0.80 = very good, 0.41–0.60 = good, 0.21–0.40 = fair, and 0–0.20 = poor [27, 28]. Hypotheses were pre-specified: at least good correlations were expected between the COMIAD anxiety subscale and STAI-state, and between the COMIAD depression subscale and the BDI. In addition, a better correlation of the COMIAD anxiety subscale with STAI-state than with STAI-trait was postulated. For the sum score, our hypothesis was that the correlation with the total score of the DPQ, a multidimensional questionnaire that also includes psychological domains, would be higher for the COMIAD than for the COMI.
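The correlation and its interpretation can be illustrated with the following Python sketch. It computes Spearman's Rho as the Pearson correlation of average ranks, which is the usual way of handling ties. The function names are hypothetical, and two choices are assumptions: values falling between two published bands (e.g., 0.805) are assigned to the lower band, and the absolute value of Rho is used.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


def interpret_rho(rho):
    """Map Rho to the interpretation bands quoted in the text."""
    r = abs(rho)  # assumption: the sign is ignored
    if r > 0.80:
        return "excellent"
    if r > 0.60:
        return "very good"
    if r > 0.40:
        return "good"
    if r > 0.20:
        return "fair"
    return "poor"


# Invented item answers and full-length questionnaire scores.
rho = spearman_rho([0, 1, 2, 3, 4], [1, 3, 2, 8, 9])
print(rho, interpret_rho(rho))
```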

Internal validity. The internal validity of the COMIAD score was first assessed using principal component analysis (PCA). PCA determines the number of underlying components in a set of items. Since the COMI is supposed to provide a score for a single dimension, we expected the PCA to yield a single component. The reliability of the scale was then determined using Cronbach’s alpha coefficient.
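For the reliability step, Cronbach's alpha can be computed from the item variances and the variance of the total score, alpha = k/(k − 1) × (1 − Σ item variances / variance of the total), where k is the number of items. The Python sketch below applies this standard formula to invented data; the function name and the data layout (one row of item scores per respondent) are assumptions.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; each row holds one respondent's item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Invented answers from three respondents to two items.
print(cronbach_alpha([[1, 2], [2, 1], [3, 3]]))
```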

Reproducibility. Reproducibility was assessed by comparing baseline and short-term follow-up responses among those patients who reported no or only minimal change since inclusion. It was computed using the weighted kappa for single items and the intraclass correlation coefficient (ICC) for the total score, and using the Bland–Altman plotting method, which indicates the smallest detectable difference (SDD; i.e., the amount of detectable change above the random measurement error). The 95 % limits of agreement were calculated using the Bland and Altman method [29], defined as the mean of the difference between the 2 measures ± 1.96 times the standard deviation of this difference.
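The limits of agreement defined above can be sketched in Python as follows. The scores are invented and the function name is hypothetical; the sample standard deviation is used, which is an assumption about the exact estimator.

```python
from statistics import mean, stdev


def bland_altman_limits(first, second):
    """Return (lower limit, mean difference, upper limit) of agreement."""
    diffs = [b - a for a, b in zip(first, second)]
    bias = mean(diffs)                # mean test-retest difference
    half_width = 1.96 * stdev(diffs)  # 1.96 x SD of the differences
    return bias - half_width, bias, bias + half_width


# Invented sum scores at baseline and at the short-term follow-up.
baseline = [5, 6, 4, 7]
retest = [6, 5, 5, 6]
print(bland_altman_limits(baseline, retest))
```

A mean difference close to zero indicates no systematic shift between the two assessments, while the half-width of the interval reflects random measurement error.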

Responsiveness, sensitivity to change and additional characteristics. The smallest detectable difference (SDD) was obtained by multiplying by 1.96 the standard deviation of the difference in score between baseline and the 1-week follow-up among patients declaring no or minimal change [27, 30]. The minimal clinically important improvement (MCII) was determined using an anchoring method based on the patient’s assessment of response to treatment at last follow-up through a 5-point Likert scale (0 = no effect, 1 = slight effect, 2 = moderate effect, could be better, 3 = good effect, still with some symptoms, 4 = excellent effect) [31]. Patients were then divided into those for whom the treatment did not provide change (answers 0 and 1) and those for whom it did (answers 2–4), and the threshold was determined by subtracting the mean change score of the first group from that of the second group. The relationship between the change in COMIAD sum score and MCII was also assessed using a receiver operating characteristic (ROC) curve and the determination of the area under the curve (AUC). The MCII was defined as the change in COMIAD sum score whose coordinates on the ROC curve were closest to the top left corner (Euclidean method). The standardized variation of the items and the total score was assessed by effect sizes (mean difference divided by the standard deviation of the change).
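The ROC-based steps can be illustrated with a short Python sketch. It assumes that the change score is computed as baseline minus follow-up, so that larger values mean greater improvement, and that a patient is classified as improved when the change is at or above the candidate threshold. All names and data are invented for illustration.

```python
from math import hypot
from statistics import mean, stdev


def roc_closest_to_corner(changes, improved):
    """Threshold whose ROC point is closest to the top left corner.
    Returns (threshold, sensitivity, specificity)."""
    positives = [c for c, flag in zip(changes, improved) if flag]
    negatives = [c for c, flag in zip(changes, improved) if not flag]
    best = None
    for threshold in sorted(set(changes)):
        sensitivity = sum(c >= threshold for c in positives) / len(positives)
        specificity = sum(c < threshold for c in negatives) / len(negatives)
        # Euclidean distance to the point (sensitivity = 1, specificity = 1).
        distance = hypot(1 - sensitivity, 1 - specificity)
        if best is None or distance < best[0]:
            best = (distance, threshold, sensitivity, specificity)
    return best[1:]


def auc(changes, improved):
    """Probability that an improved patient has a larger change score than
    a non-improved patient (ties count for one half)."""
    positives = [c for c, flag in zip(changes, improved) if flag]
    negatives = [c for c, flag in zip(changes, improved) if not flag]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def effect_size(changes):
    """Mean change divided by the standard deviation of the change."""
    return mean(changes) / stdev(changes)


# Invented change scores and anchor answers (True = treatment provided change).
changes = [0.5, 1.0, 1.5, 3.2, 2.0, 3.0, 3.5, 4.0, 5.0]
improved = [False, False, False, False, False, True, True, True, True]
print(roc_closest_to_corner(changes, improved))
print(auc(changes, improved))
print(effect_size(changes))
```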

The PASS was determined using an anchoring method based on the patient’s answer to the statement “Taking into account all activities you have to perform in your daily life, your amount of pain and the level of physical disability, if you were to remain the same for the next months, would this be acceptable for you?” [25]. The threshold for PASS was defined as the 75th percentile of the COMI sum score at last follow-up of patients answering “yes” to the previous statement [25]. The relationship between the change in COMIAD sum score and the PASS was also assessed using ROC curve and AUC.
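The percentile-based threshold can be sketched in Python as follows. Several percentile definitions exist; the "inclusive" interpolation method of the standard library is used here as an assumption, and the scores and function name are invented for illustration.

```python
from statistics import quantiles


def pass_threshold(scores, satisfied):
    """75th percentile of the sum score among patients answering yes."""
    accepted = [s for s, ok in zip(scores, satisfied) if ok]
    # quantiles(..., n=4) returns the three quartiles; index 2 is the 75th percentile.
    return quantiles(accepted, n=4, method="inclusive")[2]


# Invented sum scores at last follow-up and answers to the PASS statement.
scores = [1.0, 2.0, 3.0, 4.0, 5.0, 8.0, 9.0]
satisfied = [True, True, True, True, True, False, False]
print(pass_threshold(scores, satisfied))
```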

The psychometric properties of the COMIAD were then compared to those of the original COMI questionnaire.

Results

Eleven centers recruited 168 patients from May 2009 to June 2010. Four centers recruited 5 or fewer patients, and there were at least 2 centers in each country (France, Belgium and Switzerland) recruiting more than 15 patients. The short-term follow-up questionnaire (for the study of reproducibility) was answered by 138 out of 168 patients at a mean time of 12.8 days (SD 32.0). The long-term follow-up was answered by 142 patients at a mean time of 5.5 months (SD 1.5).

Patients’ characteristics at baseline

Patients (n = 168; Table 2) had a mean (SD) age of 45.5 (12.2) years; there were slightly fewer men than women (43.9 versus 56.1 %). For the vast majority (82 %), the present episode had lasted for more than 3 months. Fifteen percent had symptoms and signs compatible with lumbar radiculopathy. Twenty-five patients had had previous back surgery, a discectomy for half of them. Overall, the clinical characteristics of the patients (i.e., pain, disability, quality of life) evolved positively between baseline and last follow-up (Table 3).

Table 2 Baseline characteristics of patients (n = 168)
Table 3 Pain, function and quality of life related characteristics of patients at baseline and after treatment

Floor and ceiling effects

The depression and anxiety items had 2.4 % missing values and at least one item was missing in 4.8 % of the COMI and COMIAD questionnaires (Table 4). No significant floor (1.2 %) or ceiling effects (1.2 %) were observed for COMIAD sum score.

Table 4 Item characteristics of COMI and COMIAD at baseline (n = 168 patients)

Internal validity

The first component of a PCA of the 6 items explained 59.8 % of the variance. While the eigenvalues for the second and third factors were slightly above 1, the scree plot favored a one-factor solution. The reliability, as measured by Cronbach’s alpha, was 0.88.

Construct validity

Concerning the construct validity, all hypotheses were fulfilled (Table 5). The item measuring anxiety had a good correlation with state anxiety, and this association was lower with trait anxiety. The item measuring depression had a very good correlation with the BDI. As expected, the COMIAD sum score had a higher correlation with the DPQ, the STAI and the BDI, than the original COMI sum score.

Table 5 Construct validity (correlation coefficient at baseline)

Reproducibility

Out of the 138 patients who responded to the short-term follow-up questionnaire, 132 answered that they had observed no or minimal change from inclusion and were thus included in the test–retest analysis. Test–retest agreement was acceptable for the anxiety and depression items (weighted kappa 0.68 and 0.62, respectively). The test–retest agreement for the sum score was very high (ICC 0.79; 95 % CI 0.71–0.84).

Responsiveness, sensitivity to change and additional characteristics

The SDD was 1.7 for the anxiety item and 1.9 for the depression item. The SDD was 2.1 for the COMIAD sum score (see the Bland and Altman plot of the COMIAD score, Fig. 2), and the mean difference between scores among stable patients was very low (<0.5). The MCII for the total score was 2.7 based on the distance to the upper left corner of the ROC curve. The AUC for the prediction of the patient’s own assessment of response to treatment by the change in COMIAD total score was 0.79, meaning that a patient reporting no or slight effect had a 79 % chance of having a lower COMIAD sum score than a patient who reported a treatment effect. The effect size of the COMIAD sum score was 0.96.

Fig. 2 Bland and Altman plot showing limits of agreement between mean COMIAD score at baseline and after 1 week among stable patients

The patient acceptable symptom state (PASS) for the total score (scale from 0 to 10) was 2.92. This threshold on the COMIAD sum score at the last follow-up correctly classified 90.6 % of the patients who declared being unsatisfied with their present state and 74.3 % of patients reporting being satisfied.

The AUC for the prediction of PASS by the change in COMIAD total score was 0.83 (Fig. 3), meaning that a patient who is unsatisfied has an 83 % chance of having a lower COMIAD sum score change than a patient who is satisfied.

Fig. 3 Relationship between the change in COMIAD sum score and the patient acceptable symptom state (PASS) assessed using ROC curve analysis. The area under the curve for the prediction of PASS by the change in COMIAD total score was 0.84, meaning that a patient who is unsatisfied has 84 % chance of having a lower COMIAD sum score change than a patient who is satisfied

Comparison between COMIAD and COMI

As measured by Cronbach’s alpha, COMIAD and COMI had similar internal validity (Table 6). Reproducibility, responsiveness and sensitivity to change characteristics were also similar for the 2 versions. However, as expected, the addition of psychological distress measured by anxiety and depression improved construct validity and the association with the Dallas Pain Questionnaire was also higher.

Table 6 Comparative properties of COMI and COMIAD

Discussion

The COMI is a well-validated multidimensional questionnaire that is extremely useful both in the clinic and in research because it is short, easy to use and makes it possible to explore several dimensions [4, 5]. One of its limitations could be the absence of the psychological dimension, an important element of the biopsychosocial model used in the field of LBP for more than 20 years. Anxiety and depression are two key psychological aspects, and their identification in LBP patients is recommended as their presence may influence therapeutic orientation (e.g., indication to use cognitive-behavioral therapies [32, 33], introduction of antidepressant or anxiolytic medication [34, 35]).

This study demonstrates that the addition of two questions exploring anxiety and depression is important and does not alter the psychometric properties of the questionnaire. Both new items showed good psychometric properties, with no floor or ceiling effects and high correlations with full-length scales. Furthermore, the MCII and the smallest detectable difference remained very similar. Construct validity of the COMIAD was greatly improved not only for depression and anxiety, but also for measures of pain. The PASS for the COMIAD sum score, set at 2.92, discriminates satisfied and unsatisfied patients just as well as the COMI sum score, correctly classifying more than 90 % of the patients who declared themselves unsatisfied and 74.3 % of the patients who declared being satisfied after treatment.

To the best of our knowledge, the only other multidimensional self-administered questionnaire specifically developed to assess severity and treatment efficacy in LBP patients is the DPQ [36]. The 16-item DPQ evaluates the impact of LBP on four aspects of daily life, i.e., daily activities, work and leisure activities, anxiety/depression, and social interest. The questionnaire provides one score (0–100) for each of the four aspects, but it is not designed to provide a total score like the COMIAD. Furthermore, the values of SDD, MCII and PASS have not yet been established, and the DPQ includes roughly twice as many items as the COMIAD.

The strength of this study is its multicentre setting that includes several French-speaking countries, thus increasing the generalizability of our results. The external validity of each new item was checked against a well-known and fully validated questionnaire in its respective field (anxiety or depression). However, this study does not definitively settle the question of anxiety and depression as core outcomes or as predictors of LBP, even if their importance among psychological risk factors has been largely acknowledged [37–39]. In a multicausal perspective, the identification of a factor as a predictor at one stage does not preclude the possibility that this factor may also be a consequence of pain. A biopsychosocial perspective on the onset and evolution of LBP [7] provides ample justification for the role of anxiety and depression at the various stages of the LBP process and as a possible outcome measure of a specific treatment for LBP. The introduction of anxiety and depression in the COMI increases the construct validity of the instrument for the measures of pain, thus providing further ground for the investigation of these variables in the clinical context. In particular, the inclusion of psychological distress in the COMI may increase the ability of the COMI to predict return to work or, conversely, longer sickness leaves [38, 40]. It may also increase the ability of the COMI to predict treatment effects [14, 39, 41, 42], in particular in those situations where yellow flags and distressed affect can hinder treatment benefits, from the assessment phase through treatment planning and implementation [8].

The limitation of this study is the relatively short time to the last follow-up. The addition of the two items measuring anxiety and depression also requires further investigation in a clinical cohort to determine more precisely their role in the evaluation of treatment effects. Further psychological factors may also deserve consideration, in particular fear of movement and self-efficacy; where treatment explicitly targets such variables, they may become part of the core outcomes assessed after that specific treatment for back pain.

In conclusion, the COMIAD, a modified version of the COMI that includes 2 additional items exploring psychological dimensions (anxiety and depression), has good psychometric properties. With this new version of the COMI, researchers and clinicians now have access to a short, easy-to-answer questionnaire that allows an evaluation that better fits the biopsychosocial paradigm.