Background

Low back pain (LBP) continues to be a major health problem in developed countries, causing sickness absence and the need for disability pensions [1]. Only a few patients with back pain have pathoanatomically well-defined diseases. Known causes of specific back pain are vertebral fracture, tumour, infection, inflammatory diseases, nerve root compression, spondylolisthesis, spinal stenosis and definite instability [2]. Evidence suggests that fewer than 15% of individuals with back pain can be assigned to one of these specific back pain categories [3]. Chronic LBP disability is mostly attributed to non-specific low back pain (NSLBP) [2, 4] and often depends on psychosocial factors [4]. The assessment of patients with LBP should include diagnostic triage to exclude specific spinal pathology as well as the assessment of work-related factors, psychosocial distress, patient expectations, and extreme symptom reporting [5]. Modern treatment guidelines for NSLBP define return to work as the primary rehabilitation goal [5, 6].

Perceived self-efficacy is a relevant psychosocial factor contributing to the outcome in patients with chronic musculoskeletal pain [7]. According to Bandura [8], perceived self-efficacy affects how people behave in difficult situations, and people who doubt their capabilities shy away from tasks which they view as personal threats. Patients with back pain tend to have experienced difficulties with manual material handling [9], which may affect their perceived functional ability. The assessment of perceived functional ability for work-related activities in patients with NSLBP plays an important role in their management, and is proposed to be a predictor for return to work [10].

The use of self-reported measures to assess perceived functional ability is limited by literacy level [11] and depends on linguistic abilities. Text-based questionnaires are often impossible to administer in European rehabilitation settings, where patients with different mother tongues are treated. A possible approach to overcome this problem is the use of picture-based questionnaires such as the spinal function sort (SFS) [10]. The SFS has shown high practicability in rehabilitation settings where patients of pan-European origin are treated, and is recommended for work-related rehabilitation [12].

The clinical utility of the SFS has so far only been reported in patient samples from the USA [13, 14]. The reliability and validity of the SFS were investigated in English-speaking patients with back pain from the United States [10] and Australia [15], and good psychometric properties were reported. No studies have investigated the validity of the SFS in European patients.

Purpose of this study

The aim of this study was to test the validity of the SFS by assessing internal consistency, unidimensionality, concurrent and predictive validity, and responsiveness in a European rehabilitation setting for patients with NSLBP.

Method

Study design, data collection, and subjects

This validation study is embedded within a randomised controlled trial investigating the effectiveness of a function-centred treatment compared with a pain-centred treatment, with 3 and 12 months of follow-up for working status. Patients were eligible if they had non-acute NSLBP, were between 20 and 55 years of age, had at least 6 weeks of sick leave during the previous 6 months, and were referred to the Rehabilitationsklinik Valens (Switzerland) for inpatient rehabilitation [16, 17]. The study was approved by the ethics committee of the Canton of St. Gallen, Switzerland (EKSG 03/035).

Measurements

All measurements were taken by a blinded research assistant at the patients’ entry to and discharge from rehabilitation. Work status at 3 and 12 months of follow-up was assessed with a questionnaire sent to employers and the patients’ primary physicians, who were blinded to the patients’ group assignments. Patients were classified as returned to work if they worked at least 1 day in the follow-up period.

Spinal function sort

The SFS assesses perceived ability to perform work tasks that involve the spine in various ways. It was developed in the United States by Matheson et al. [18] and consists of 50 graphically depicted tasks with simple descriptions. The patient is instructed to look at each drawing and rate each task on a separate evaluation sheet on a 5-point scale from “able” to “restricted” to “unable”. The SFS is scored by the assessor and yields a single rating of perceived functional ability ranging from 0 to 200. The scores were categorised according to the work demands as defined by the Dictionary of Occupational Titles (DOT), thereby allowing a comparison between perceived functional ability and work [10]. Patients with an SFS score of <100 are categorised as having minimal working capacity (see Table 1).

Table 1 Transformation of SFS scores to DOT categories [10]

Body functions

Back strength was assessed with the Biering-Sorensen test, which measures how long a patient lying prone can hold the unsupported upper body horizontal while the lower body is strapped to the treatment table [19]. Back-related range of motion was measured as the finger-to-floor distance in centimetres, and with straight leg raise tests in degrees against a horizontal line using a bubble inclinometer.

Work-related activity

A functional capacity evaluation (FCE) as described by Isernhagen [20] was performed. FCE is a standardised battery of clinical tests intended to measure a patient’s safe physical ability for work-related activity. For the purpose of this study, five lifting tests were performed to assess the maximum safe ability to handle weight: “Floor to Waist”, “Waist to Overhead”, “Horizontal”, and “Single Handed Carry” (right and left).

Personal beliefs

Pain intensity was measured with a numeric rating scale. The Fear Avoidance Belief Questionnaire (FABQ) was used to assess how patients were affected by fear and avoidance beliefs. The questionnaire provides a score for fear of physical activity ranging from 0 (no fear) to 24 (maximum fear) and one for work activities ranging from 0 to 42 [21]. Translated versions of the FABQ were available in German, Italian, Serbo-Croatian, Albanian, Turkish, French, Spanish, and Portuguese.

Data analyses

Internal consistency at entry and discharge

Internal consistency was assessed by item-total correlations and Cronbach’s alpha. For a scale to be sufficiently reliable for use in groups of patients, an alpha value of 0.80 is considered acceptable [22].
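The two statistics named above can be sketched as follows. This is a minimal illustration rather than the SPSS procedure used in the study; the function names and the small rating matrix are hypothetical, and sample variances (n − 1 denominator) are assumed.

```python
from math import sqrt
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cronbach_alpha(rows):
    """rows: one list of k item scores per respondent."""
    k = len(rows[0])
    columns = list(zip(*rows))  # one tuple of scores per item
    sum_item_var = sum(variance(col) for col in columns)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum_item_var / total_var)


def corrected_item_total(rows):
    """Correlate each item with the sum of the remaining items."""
    k = len(rows[0])
    result = []
    for j in range(k):
        item = [r[j] for r in rows]
        rest = [sum(r) - r[j] for r in rows]  # total without item j
        result.append(pearson(item, rest))
    return result


# Hypothetical ratings: 5 respondents x 4 items, each scored 0-4.
demo_rows = [
    [0, 1, 0, 1],
    [1, 1, 2, 1],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [4, 4, 4, 3],
]
alpha = cronbach_alpha(demo_rows)
item_total = corrected_item_total(demo_rows)
```

The "corrected" correlation removes the item from the total before correlating, so an item cannot inflate its own coefficient.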

Unidimensionality

Principal component analysis of all 50 items of the discharge SFS score was used to assess the unidimensionality of the instrument. Components with eigenvalues above 1.0 were retained. A ratio of the first to the second eigenvalue of at least 3:1 is generally considered evidence for unidimensionality [23].

Concurrent validity at entry and discharge

Concurrent validity was assessed by comparing the entry and discharge SFS scores with body function, work-related activity, and personal beliefs using Spearman’s correlation coefficient. We hypothesised that the SFS scores would correlate highly with personal beliefs and work-related activity (i.e. r > 0.60) and moderately with body function (0.30–0.60).
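As an illustration of the hypothesis bands, Spearman’s coefficient can be computed as the Pearson correlation of average ranks. The sketch below uses only the standard library; the function names and the "low" label for coefficients under 0.30 are assumptions, and applying the bands to the absolute value is a choice made here so that negative relationships (for example with fear-avoidance scores) are classified by strength.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Pearson correlation of the ranks; assumes neither input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)


def classify(r):
    """Bands from the hypothesis: > 0.60 high, 0.30-0.60 moderate."""
    size = abs(r)
    if size > 0.60:
        return "high"
    if size >= 0.30:
        return "moderate"
    return "low"  # label assumed; the text does not name this band
```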

Predictive validity for work status

Receiver operating characteristic (ROC) curve analysis was used [24] to evaluate the diagnostic performance of the SFS scores at discharge for work status at 3 and 12 months of follow-up. The diagnostic properties of the proposed SFS score ranges related to the DOT categories [10] were assessed by calculating sensitivity and specificity as well as positive and negative predictive value. Within the context of work-related rehabilitation of patients with back pain, the aim is to identify those patients with a probability of not returning to work. Therefore, sensitivity refers to the proportion of patients who had not returned to work and were correctly identified by the DOT category. Another interesting aspect for clinicians is the positive predictive value of the perceived work capability for not returning to work. This was calculated as the percentage of the patients within a DOT category who had not returned to work. According to the definition of the DOT category, we hypothesised that a perception of a minimal working capacity (SFS score <100) would have a high positive predictive value identifying those patients who would not return to work at 3 and 12 months of follow-up.
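The quantities defined in this paragraph can be sketched as below, with "positive" meaning not returned to work and a low SFS score counting as a positive test. This is not the MedCalc analysis used in the study: the area under the ROC curve is computed here through its rank interpretation (the probability that a randomly chosen non-returner scores lower than a randomly chosen returner, ties counting half), and all names and scores are hypothetical.

```python
def auc_low_score_positive(pos_scores, neg_scores):
    """Area under the ROC curve when lower scores indicate the positive class."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p < n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def diagnostic_values(pos_scores, neg_scores, cutoff):
    """Test is positive when score < cutoff; assumes no empty cell totals."""
    tp = sum(1 for s in pos_scores if s < cutoff)   # non-returners flagged
    fn = len(pos_scores) - tp                       # non-returners missed
    fp = sum(1 for s in neg_scores if s < cutoff)   # returners flagged
    tn = len(neg_scores) - fp                       # returners not flagged
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical discharge SFS scores.
not_returned = [40, 85, 95, 120]
returned = [90, 130, 150, 170]

auc = auc_low_score_positive(not_returned, returned)
values = diagnostic_values(not_returned, returned, cutoff=100)
```

With these toy scores, 14 of the 16 non-returner/returner pairs are ordered as expected, and the cut-off of 100 flags three of the four non-returners and one of the four returners.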

Responsiveness from entry to discharge

Because the function-centred treatment was specifically targeted towards improving perceived ability, it was hypothesised that these patients would show a greater improvement in SFS than the patients in the pain-centred treatment group. Responsiveness was, therefore, assessed in the two treatment groups separately, using the standardised response mean, calculated by dividing the mean change score by the standard deviation of the change scores.
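A minimal sketch of the standardised response mean as defined above. The function name and the paired scores are hypothetical, and the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which form was used.

```python
from statistics import mean, stdev


def standardised_response_mean(entry, discharge):
    """Mean of the paired change scores divided by their standard deviation."""
    changes = [d - e for e, d in zip(entry, discharge)]
    return mean(changes) / stdev(changes)


# Hypothetical SFS totals for four patients.
entry_scores = [100, 110, 90, 120]
discharge_scores = [110, 115, 100, 120]
srm = standardised_response_mean(entry_scores, discharge_scores)
```

A positive value indicates a mean increase in the score and a negative value a mean decrease; the magnitude expresses the change in units of its own variability.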

Data were analysed using SPSS for Windows version 11.5. The ROC curve analysis was performed with MedCalc version 9.7.3.0. Bonferroni’s procedures were applied to reduce type I error in the assessment of concurrent validity. Adjustment for 12 comparisons at alpha = 0.05 resulted in the use of p < 0.005 as level of significance.
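The Bonferroni adjustment divides the overall alpha by the number of comparisons. As a worked check (the function name is hypothetical), the quotient for 12 comparisons is about 0.0042, which is close to the reported level of p < 0.005; the reported figure appears to be a rounding of this value.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance level under the Bonferroni procedure."""
    return alpha / n_comparisons


threshold = bonferroni_threshold(0.05, 12)
```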

Results

Descriptive statistics

Subjects

170 of the 174 subjects who participated in the randomised controlled trial fully completed the SFS at discharge, and were included in this study. The subjects’ characteristics are described in Table 2.

Table 2 Subjects’ characteristics

Internal consistency at entry and discharge

Internal consistency was high at both entry and discharge, with a Cronbach’s alpha coefficient of 0.98. The corrected item-total correlation was below 0.30 for two items at entry and for three items at discharge, with one further item at 0.32 at discharge. All of these items asked about a very high physical capacity, namely handling weights of 50 kg. Mean scores for these four items were above 4.9 with low standard deviations of 0.3, showing high agreement amongst subjects that they were unable to perform such heavy tasks. Alpha would not have changed substantially if these items had been removed. The remaining 46 items all showed an item-total correlation of >0.6.

Unidimensionality

Inspection of the correlation matrix showed that the majority of coefficients were >0.3. Principal component analysis revealed the presence of seven components with eigenvalues exceeding 1, explaining 55.6, 8.0, 4.3, 2.8, 2.5, 2.2, and 2.1% of the variance, respectively. An inspection of the scree plot revealed a break at two components. To aid the interpretation of these components, a Varimax rotation was performed. The rotated solution revealed a mixed structure with two components showing a number of strong loadings. Variables 45–48 loaded on different components. The two-factor solution explained a total of 63.6% of the variance, with component 1 contributing 46.0% of the variance and component 2 contributing 17.6%. The ratio of the first to the second eigenvalue was 6.95, suggesting reasonable evidence for unidimensionality.
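The reported figures can be checked against each other with a few lines of arithmetic. The sketch assumes the principal component analysis was run on the correlation matrix of the 50 items, so that the eigenvalues sum to 50 and each eigenvalue equals its share of variance times 50; this is an assumption, as the text does not state it. The variable names are hypothetical.

```python
N_ITEMS = 50  # SFS items entered into the analysis

# Percentage of variance explained by the seven retained components.
explained_pct = [55.6, 8.0, 4.3, 2.8, 2.5, 2.2, 2.1]

# Under the correlation-matrix assumption, eigenvalue = share * number of items.
eigenvalues = [pct / 100 * N_ITEMS for pct in explained_pct]

# Components meeting the eigenvalue > 1 criterion.
retained = sum(1 for ev in eigenvalues if ev > 1.0)

# Ratio of first to second eigenvalue (identical to the ratio of the shares).
first_to_second = eigenvalues[0] / eigenvalues[1]

# Variance explained by the first two unrotated components.
two_component_total = explained_pct[0] + explained_pct[1]
```

Under this assumption the smallest listed component has an eigenvalue of about 1.05, so all seven pass the criterion, the ratio reproduces the reported 6.95, and the first two shares sum to the reported 63.6%.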

Concurrent validity at entry and discharge

Complete measurements and SFS total scores at entry and discharge were available from 156 patients. Correlations were higher at discharge than at entry. Correlations between SFS and work activities were high (>0.6) except for manual handling from waist to crown. Back strength and pain intensity also showed high correlations with the SFS total score at discharge. Fear and avoidance beliefs for work activities showed a markedly higher correlation with the SFS score than fear and avoidance beliefs for physical activities (see Table 3).

Table 3 Spearman’s rank correlation coefficients for SFS total score and body function, work activities and personal beliefs at entry and discharge

Predictive validity of SFS at discharge for work status at 3 and 12 months of follow-up

Work status at the 3-month follow-up was known in all patients: 66 returned to work and 104 did not. At the 12-month follow-up, work status was known in 163 patients, of whom 75 returned to work and 88 did not. The area under the ROC curve was 0.760 (95% CI 0.689–0.822) for work status at 3-month follow-up (see Fig. 1) and 0.801 (95% CI 0.731–0.859) at 12-month follow-up (see Fig. 2). The difference in the area under the ROC curve at 3 and 12 months of follow-up was not statistically significant.

Fig. 1 ROC curve with 95% confidence intervals of SFS total score at discharge with work status at 3-month follow-up

Fig. 2 ROC curve with 95% confidence intervals of SFS total score at discharge with work status at 12-month follow-up

Seventy-six patients judged their working capacity as minimal, achieving an SFS score of <100. Of these patients, 62 were not at work at 3 months and 59 were not at work at 12 months of follow-up (see Table 4). The positive predictive value of a score of <100 was 81.6% for work status at 3-month follow-up and 79.9% at 12-month follow-up. The 15 patients who were falsely classified as not returning to work had an SFS score between 28 and 96. None of the patients with an SFS score of ≤28 (n = 7) ever returned to work, resulting in a positive predictive value of 100% at both follow-up times.
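The 3-month positive predictive value follows directly from the counts given above, as the short check below shows (the function name is hypothetical). The 12-month value is not recomputed here, because the text does not state how many of the 76 patients had a known work status at 12 months.

```python
def ppv_percent(not_returned_in_category, patients_in_category):
    """Share of patients in a score category who had not returned to work."""
    return 100 * not_returned_in_category / patients_in_category


# SFS score < 100: 76 patients, 62 of them not at work at 3 months.
ppv_3_months = ppv_percent(62, 76)

# SFS score <= 28: 7 patients, none of whom returned to work.
ppv_lowest_scores = ppv_percent(7, 7)
```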

Table 4 Diagnostic values of DOT-transformed SFS categories for non-return to work at 3 and 12 months of follow-up

Responsiveness

Patients in the function-centred treatment group (n = 84) showed an increase in perceived self-efficacy at discharge, with a mean change score of 5.9 (SD 32.5) and a standardised response mean of 0.18, while the patients in the pain-centred treatment group (n = 85) showed a decrease in perceived self-efficacy, with a mean change score of −7.4 (SD 24.4) and a standardised response mean of −0.31 [16].

Discussion

To our knowledge, this is the first study assessing the validity of the SFS in a European rehabilitation setting. The high follow-up rate of work status at 3 and 12 months allowed a thorough evaluation of the diagnostic accuracy of the SFS for work status. We found discriminating power of the proposed SFS-transformed DOT categories for work status. High DOT categories have a high sensitivity in diagnosing patients not returning to work, but an unacceptably high rate of false-positive findings. The proposed SFS cut-off score of 100 indicating minimal working capacity results in a high probability of not returning to work at 3 and 12 months of follow-up. Considering the specificity of 78.7% and 80.0% at 3 and 12 months, respectively, and the high prevalence of patients with NSLBP not returning to work, both factors being determinants of the positive predictive value, we can recommend using an SFS score of <100 for screening purposes in clinical practice to identify patients with a high probability of not returning to work.
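The dependence of the positive predictive value on prevalence and specificity can be made explicit with Bayes’ rule. In the sketch below the sensitivity, specificity and prevalence values are illustrative assumptions, not study results, and the function name is hypothetical.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(condition | positive test) from sensitivity, specificity, prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Illustrative values only.
SENSITIVITY, SPECIFICITY = 0.60, 0.80

ppv_high_prevalence = positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.60)
ppv_low_prevalence = positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.20)
```

With the same test characteristics, the positive predictive value is about 0.82 at a prevalence of 0.60 but only about 0.43 at a prevalence of 0.20, which is why a cut-off that screens well in a setting where non-return to work is common may perform differently elsewhere.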

The validation study of this picture-based questionnaire included patients from 10 different nations. We found higher correlations of perceived functional ability with all measurements at discharge and interpreted this as the effect of repeated testing, giving the patients a more accurate perception of their functional ability. Four out of the five comparisons of SFS scores with FCE tests showed high correlations. Interestingly, these were higher than those of other self-reported measures of low back disability as found by Reneman et al. [25]. Correlations of the SFS score with the FABQ physical activity score were lower than with the FABQ work score. These findings add further validity to this questionnaire, which specifically assesses perceived ability to perform work tasks. Because the correlations with the FABQ physical activity score were lower than those with the FABQ work score, we concluded that the patients had understood the questions and, consequently, interpreted the findings as valid, despite the use of FABQ translations that had not been cross-culturally adapted.

The SFS was able to capture changes in self-perceived ability in the two treatment groups. Contrary to expectations, the standardised response mean was smaller in absolute value for the function-centred treatment (0.18) than for the pain-centred treatment (−0.31). A possible explanation is that the information provided in the pain-centred treatment group, namely to stop activities if pain was provoked, was more powerful than the positive reinforcement provided within the function-centred treatment [16].

Factor analysis showed reasonable evidence for unidimensionality of the SFS. However, four items asking about a manual handling capacity of 50 kg had higher loadings on another dimension. These four questions also showed low corrected item-total correlations in the assessment of internal consistency. Therefore, we recommend further research using Rasch analysis to investigate unidimensionality and consequently reduce the number of questionnaire items. Owing to the multidimensionality of factors predicting work status in patients with NSLBP, we also recommend further research to assess the contribution of physical functioning, the patient’s perceived functional ability, and other psychosocial work-related factors to work status.

Conclusion

Perceived functional ability for work-related tasks can be validly assessed with the SFS in a European rehabilitation setting treating patients with NSLBP, and is predictive of future work status.