Introduction

The assessment of treatment outcome in back pain patients with standardized self-rating questionnaires has become a routine procedure for clinical investigations, quality control, or benchmarking within many spine centers in Europe. The many outcome tools now available have recently been reviewed in special issues of the European Spine Journal [5] and Spine [4]. However, patients are currently overwhelmed by the questionnaires they receive from physicians, nursing staff, hospital administration and governmental institutions. This trend often results in incomplete and unreliable data sets because patients are reluctant to fill in all these questionnaires [6]. This problem prompts the quest to simplify outcome assessment to a minimal acceptable question set, which also needs to be validated with regard to cross-cultural aspects [6, 10, 17]. To this end, attempts have recently been made to develop and test short but meaningful instruments [11, 16, 19].

Despite a plethora of articles on outcome instruments and their application [5], no quantitative baseline data are available on the specific domains that determine a good outcome, particularly from the patients’ perspective. Similarly, only sparse data are available on cross-cultural differences in Europe with regard to what comprises a good outcome.

In performing a cross-sectional survey among spine patients from different European regions and European spine surgeons, the study attempted (1) to identify the most important domains determining a good outcome, and (2) to explore regional differences in the identified domains.

Materials and methods

Study population

Surgeons

A structured interview was performed among 30 peer spinal surgeons of Eurospine, the Spine Society of Europe (representing northern, central and southern European regions) to investigate their criteria for a good treatment outcome. Surgeons were eligible if they were renowned European spine surgeons with a long track record of clinical and scientific activity in the field of spinal disorders. We chose peer European spine surgeons on the assumption that they are not only confronted with outcome questions in their daily work but also have some practical experience with the application of outcome tools. The interviews were performed by the first and the senior author during SPINEWEEK 2004 in Porto. All of the peers, who were approached during conference breaks, responded to the structured interview.

Patients

A total of 353 spine patients recruited in seven spine centers from different European regions were interviewed, using a structured interview format, to determine the patients’ perspective on a good subjective outcome. Internationally renowned centers with a sufficient case load were selected for this study. Three centers were university hospitals (Great Britain, Portugal, and Switzerland), three were large referral centers for spinal surgery (Germany, Finland, Italy) and one was a community hospital (Sweden). Although the sample was skewed toward centers with an academic interest, the potential bias in patient selection was regarded as minor. Although it would have been desirable, we were unable to recruit more centers representing additional regions due to cost and time constraints.

Each center contributed at least 49 patients. Patients selected for surgery were enrolled consecutively. The sole selection criterion was the diagnosis. Patients with disc herniation, isthmic spondylolisthesis, degenerative motion segment, or degenerative spondylolisthesis and spinal stenosis were included, while patients with tumor, trauma, infections or other consuming illnesses were excluded. In four centers, study participants were interviewed by the corresponding authors of this study, in two centers by a resident and in one center by a study nurse. In five centers patients were interviewed in the outpatient clinics and in two centers preoperatively during the hospital stay for surgery. No structured interview was excluded from the study analysis.

Structured interviews

An expert panel consisting of two spine surgeons, one rheumatologist and one clinical and organizational psychologist discussed all interview items. Interview questions were reviewed for clarity and potential overlap. About one-third of the initial questions were found to be ambiguously worded. Based on these comments, the wording was altered.

Interviewees were given the background rationale for the enquiry, including the voluntary and confidential nature of the interview, and were asked for their informed consent. They were assured that their individual comments were anonymous.

Patients were instructed to rely on their own individual attitudes and expectations, i.e., to report spontaneously what came to mind. Surgeons were instructed to give a general, prototypical view on outcome and not to rely on specific problematic cases.

In a post-hoc survey with 30 patients not included in this study but scheduled for the same types of surgery at the principal investigators’ institution, 1-week test–retest reliability was estimated. Furthermore, to test for potential interviewer bias related to interviewer status (physician or nurse), differences between repeated data assessments by a research fellow and a study nurse were explored to estimate intra- and inter-rater reliability. A second post-hoc test of translation, including forward and back translation by native speakers not involved in the study, showed that the translation was satisfactory.

Surgeons

The interviews contained mainly open questions and a few rating lists. The surgeon’s interview consisted of three main sections: (1) Surgeon’s personal definition of good and bad outcome and the parameters influencing outcome. (2) Characterization of the patients that surgeons would usually operate on in terms of back pain history, and expectations of surgery. (3) Surgeons’ opinion on currently used outcome tools and on characteristics of a good outcome tool. Beside these three main sections, the interview also included questions on the surgeon’s case load, experience and waiting lists for surgery at the surgeon’s institution (Table 1).

Table 1 Structured Interview for spine surgeons and spine patients

Patients

The interview for the patients also consisted of three main sections: (1) Current major complaints and their influence on the patient’s life. (2) Patient’s view of a good outcome after spinal surgery. (3) Patient’s expectations and estimation of the efficacy of spinal surgery. Additional questions addressed the diagnosis, planned treatment and current work status (Table 1).

Data analysis

Surgeons’ answers were analyzed descriptively only, as the small number of participants from the different regions did not allow for a statistical comparison. The information obtained in the open questions of the patients’ interviews was summarized by the first author into categories of outcome such as “pain relief”, “mobility”, “neurological deficits”, etc. For example, answers such as “better walking ability” and “be able to walk longer distances again” were categorized as “improved mobility”. Each category was then assigned a dichotomous criterion (e.g., improved mobility: yes/no), which was used for descriptive statistical analyses. One rheumatologist who was not involved in this study independently assigned the patients’ answers to the set of outcome criteria determined by the first author. Disagreements in this assignment were resolved in conference between the three principal investigators of the study. Answers were compared with regard to regions, diagnosis, work absenteeism and age groups.
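The dichotomous coding described above can be illustrated with a minimal sketch. It assumes that each patient's open answers have already been assigned to categories by a human coder; the category list, the function names and the four example patients are hypothetical and are not study data.

```python
from collections import Counter

# Hypothetical category labels; the study names "pain relief", "mobility"
# and "neurological deficits" among others.
CATEGORIES = ["pain relief", "improved mobility", "neurological deficits"]


def to_indicators(assigned):
    """Turn one patient's set of assigned categories into yes/no flags."""
    return {cat: (cat in assigned) for cat in CATEGORIES}


def yes_percentages(patients):
    """Percentage of patients with a 'yes' for each category."""
    if not patients:
        raise ValueError("at least one patient is required")
    counts = Counter()
    for assigned in patients:
        for cat, flag in to_indicators(assigned).items():
            counts[cat] += flag  # True counts as 1, False as 0
    n = len(patients)
    return {cat: 100.0 * counts[cat] / n for cat in CATEGORIES}


# Illustrative, made-up codings for four patients (not study data).
example = [
    {"pain relief", "improved mobility"},
    {"pain relief"},
    {"neurological deficits"},
    {"pain relief", "improved mobility"},
]
percentages = yes_percentages(example)
```

Because multiple answers per patient are possible, the percentages across categories need not sum to 100.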

Statistical analysis

A qualitative and descriptive approach was used to analyze the results of this survey. Analyses were done by comparing the aforementioned criteria in different groups. Differences in age were evaluated by Student’s t test; differences in gender and differences between surgeons and patients were explored by the chi-square test or Fisher’s exact test. Comparisons between groups were controlled for age, gender and diagnosis where appropriate. The reproducibility of the questionnaire in the test–retest experiment was assessed by percent agreement (dichotomous questions) and by the intraclass correlation coefficient, ICC (VAS and Likert scale). SPSS 11.5 (SPSS Inc.) was used for statistical analysis and the level of significance was set to 0.05, two-tailed.
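For readers unfamiliar with the tests applied to dichotomous criteria, the following sketch shows how a chi-square test and a two-sided Fisher's exact test can be computed for a single 2 × 2 table. It is not the SPSS procedure used in the study; the function names are hypothetical, the chi-square version omits the continuity correction, and the counts in the example are made up for illustration.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs a non-zero total")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the upper tail equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        return math.comb(col1, x) * math.comb(n - col1, row1 - x) / math.comb(n, row1)

    observed = prob(a)
    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    # Small tolerance so that ties with the observed table are included.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= observed * (1 + 1e-9))


# Made-up counts: 30 of 50 "yes" in one group versus 15 of 50 in another.
stat, p_chi = chi_square_2x2(30, 20, 15, 35)
# Made-up small table, where the exact test is the usual choice.
p_fisher = fisher_exact_2x2(3, 1, 1, 3)
```

With the two-tailed significance level of 0.05 stated above, the first made-up table would count as a significant difference and the second would not.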

Results

Questionnaire evaluation

The 1-week test–retest experiment of the questionnaire revealed a total percent agreement (all questions taken together) of at least 83% (range 83–100%) in all test patients. Agreements for dichotomous questions when both interviews were made by the same person ranged between 89 and 97% and between 78 and 93% when the two interviews were made by two different interviewers. For the two questions with Likert scales, ICCs of 0.85 and 0.96 were calculated, respectively.
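As a minimal illustration of the percent agreement figure used for dichotomous questions, the sketch below compares the answers of one hypothetical patient across two interviews. The function name and the yes/no answers are assumptions made for illustration only; the ICC used for the Likert-scale questions is not shown.

```python
def percent_agreement(first, second):
    """Percentage of paired answers that are identical in both interviews."""
    if len(first) != len(second) or not first:
        raise ValueError("both interviews need the same non-zero number of answers")
    matches = sum(x == y for x, y in zip(first, second))
    return 100.0 * matches / len(first)


# Made-up yes/no answers to ten dichotomous questions (not study data).
test_answers = [True, False, True, True, False, True, False, True, True, False]
retest_answers = [True, False, True, False, False, True, False, True, True, False]
agreement = percent_agreement(test_answers, retest_answers)
```

Percent agreement does not correct for agreement expected by chance, which should be kept in mind when reading the ranges reported above.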

Surgeons’ perspective

A total of 30 European spine surgeons from northern, central and southern European countries responded to the structured interviews. No substantial differences between the regions were found. Table 2 shows the characteristics of the surgeons and Table 3 summarizes the main findings.

Table 2 Surgeons’ characteristics
Table 3 Findings from the surgeons survey (n = 30; multiple answers possible)

Criteria for a good outcome of spinal surgery

All surgeons agreed that pain relief was the best parameter for a good outcome. However, it proved difficult to define how much improvement on a questionnaire corresponds to a clinically significant improvement. One respondent suggested that a 20% improvement on the visual analogue scale (VAS) is clinically significant. The other respondents, however, defined it less formally as an improvement to ‘a pain-level that may be accepted by the patient with or without pain medication’. The VAS was the most widely accepted instrument among the respondents. The vast majority (76.5%) responded that achievement of the preoperative expectations represents a good outcome. Six surgeons (20%) felt that it is important to satisfy the patient to achieve a good outcome. Half of the surgeons thought that either an improvement of function in activities of daily living or return to work is important. Three respondents considered a fulfilled “contract” between surgeon and patient on the expected result of the operation to be a good outcome. The expectations were thought to depend on the patient’s main complaints, but also on the information given by the surgeon about what can realistically be expected of an operation.

Surgeons’ view on patient’s expectations

All surgeons mentioned that the initial expectations of their patients are in general too high. Pain relief (94%) and complete recovery (33%) were among the most frequently cited aspects.

Criteria for a good outcome tool

Surgeons responded that a good tool for outcome assessment should be short and easily understandable (36%), cover pain, disability and quality of life (33%), reflect patients’ expectations (20%) and have a satisfactory level of validity and reliability (10%). An interesting approach was brought up by three surgeons from northern Europe, who argued that a ‘contract’ between surgeon and patient should be made preoperatively. This contract should include the surgeon’s and the patient’s expectations of the planned surgery. It should be based on a realistic estimate of both the effectiveness of the chosen surgical procedure and the patient’s potential to benefit from such a procedure (i.e. psycho-social network, compliance, etc.). If the contract was found to be fulfilled by both parties in the postoperative assessment, the outcome would be good.

Reasons for a poor outcome

Surgeons thought that the major factors for a poor outcome after surgery are poor patient selection (36%), exaggerated expectations (23%), wrong diagnosis/indication (23%) and confounding variables such as psycho-social aspects (23%).

Patients’ perspective

The characteristics of the 353 patients from seven spine centers in Europe who were interviewed are presented in Table 4. Forty-nine Finnish, 51 Swedish and 50 British patients represented Northern Europe; 50 Swiss and 50 German patients represented Central Europe and 54 Italian and 49 Portuguese patients represented Southern Europe.

Table 4 Patients demographics of study sample

Differences related to different regions

Patients younger than 65 years (still of working age) in southern Europe more often reported being affected in their working capacity than northern/central European patients (73.2 vs. 38.7% and 35.0%; p < 0.001). Consistently, they more often mentioned an improved working capacity as a parameter of a good outcome (29.6 vs. 14.4% and 6.7%; p < 0.023 and 0.001) and more often intended to improve their working capacity (67.6 vs. 39.4 vs. 20.9%; p < 0.001). Further differences between southern, central and northern European patients are presented in Table 5.

Table 5 Patients’ survey: differences related to European region (yes answers; multiple responses possible), controlled for age, gender and diagnosis

Differences related to age

Three age groups were formed for comparison (<40 years, 40–65 years, >65 years). Younger patients (<40 years) significantly more often expected complete pain relief (p < 0.011 and p < 0.033) and significantly less frequently a neurological improvement (p < 0.003 and p < 0.040). The latter finding was confirmed by a significantly lower frequency of self-reported neurological deficits (p < 0.005 and p < 0.048) in this age group. The youngest group significantly less frequently mentioned improved mobility (p < 0.011 and p < 0.006) and activities of daily living (p < 0.002 and p < 0.001) than the other two groups as the domains to be improved by surgery (Table 6).

Table 6 Patients’ survey: differences related to age (yes answers; multiple responses possible), controlled for sex and diagnosis

Differences related to diagnosis

Major complaints

Back pain (42.7–91.7%) and leg pain (28.5–86.7%) were by far the most often reported major complaints, independent of the underlying diagnosis. In the group with disc herniation, neurological deficits (17.3%) followed leg and back pain as the most disturbing complaint. In patients with degenerative spondylolisthesis or stenosis, walking problems (27.3%) took this place. In the other groups, walking problems and general functional disability played an important role (Table 7).

Table 7 Patient’s survey: differences related to diagnosis (yes answers; multiple responses possible)

Most affected parts of life

Social functioning (28.2–44.4%), daily activities (29.2–41.7%), mobility (34.0–64.5%) and working ability (25.5–54.2%) were the parts of life most affected by back problems across all diagnoses (Table 7).

Expectations

Between 48.2% (degenerative spondylolisthesis and stenosis) and 59.0% (degenerative motion segment (disc degeneration and/or facet joint osteoarthritis)) of all patients reported expecting substantial pain relief from surgery. In the group with disc herniation, 41.3% expected complete pain relief, whereas in the other groups this expectation was reported by between 21.8% (degenerative spondylolisthesis and stenosis) and 12.5% (isthmic spondylolisthesis) (Table 7). Patients with pain lasting for less than 6 months (n = 37) significantly more often suffered from disc herniation (67.6 vs. 16%; p < 0.001) and significantly less often from degenerative spondylolisthesis and stenosis (2.7 vs. 17.6%; p < 0.016) or degenerative motion segments (16.2 vs. 45.2%; p < 0.001) than those with pain lasting for more than 6 months. Accordingly, this group significantly more often expected complete pain relief from surgery (43.2 vs. 21.5%; p < 0.007).

Good outcome after surgery

In the group with disc herniation, 48% considered complete pain relief to be a good outcome, whereas 37.3% already regarded a substantial improvement as a good outcome. The same trend was found for isthmic spondylolisthesis (45.8 and 29.2%). A different pattern was observed in the groups with degenerative motion segments and with degenerative spondylolisthesis and stenosis. In these groups, substantial and complete pain relief were mentioned about equally often as a good outcome (Table 7).

Comparison of surgeons’ and patients’ perspective

Both surgeons and patients reported pain relief to be the most important parameter for a good outcome. Further important parameters mentioned by both groups were an improvement in activities of daily living, in mobility and in working capacity. Three quarters of the surgeons mentioned that a good outcome is achieved if the patient’s expectations are met. According to the interviews, this generally included improved function in activities of daily living, return to work and the avoidance of deterioration.

Discussion

Outcome assessment with a single item and few response categories (poor–fair–good–excellent) has been shown to be a valid descriptor of treatment effects in chronic low back pain [16]. However, it reflects an aggregate of many outcome domains and may be influenced by individual levels of life-satisfaction, depression, and general health status. When attempting to assess outcome with a questionnaire that is as short as possible but as comprehensive as necessary, it is of utmost importance to include those questions that are really essential for the evaluation of a therapy. However, perspectives on what is important when judging outcome after spinal surgery may differ between surgeons and patients and between different European regions. One approach to determining the most meaningful dimensions of outcome is to ask patients and surgeons directly what they consider to be a good outcome. A cross-sectional study of this kind allows interesting topics for future research in this field to be identified.

The surgeons’ perspective

The surgeons’ criteria for a good outcome concentrated on patient-related factors, e.g. meeting the patients’ expectations. Surgeons reported that pain and disability (e.g. impaired mobility, activities of daily living, social functioning) are the complaints most often raised by patients and must therefore be the targets of improvement by surgery. It was mentioned, however, that expectations and the definition of a relevant improvement may differ substantially between patients according to their major complaints and the affected parts of life. This has also been shown by Beaton et al. [1] in workers with musculoskeletal disorders of the upper limb.

Disproportionate patient expectations of what can be achieved by surgery were one of the most frequently mentioned reasons for a bad outcome, along with wrong indication, wrong patient selection and others (Table 3). A recent study by Toyone et al. [22] reported that positive expectations were associated with better satisfaction in patients treated with discectomy for lumbar disc herniation but not in patients treated for lumbar spinal stenosis. They also found that even if the clinical expectations were met, some patients remained unsatisfied. On the other hand, Gepstein et al. [14] found that high patient expectations were positively associated with satisfaction in patients older than 64 years treated for lumbar spinal stenosis and that preoperative expectations reasonably predicted their postoperative satisfaction rate. This once more highlights the great importance of proper diagnostics, patient selection and information about what can realistically be expected from surgery. The latter should of course be based on known data from the literature but is also heavily influenced by the surgeon’s experience and expectations of treatment success [9].

Long-term sick leave and receipt of disability benefit are known to be consistent risk factors for a poor outcome regarding return to work. Consequently, patients should try to keep working as long as possible, despite ongoing symptoms and plans for surgery [15]. For this reason, prevention of work disability is a primary goal in the treatment of spinal disorders, and work-related outcome measures are essential indices within evidence-based medicine [12]. In this study, working capacity was mentioned to be important for individuals’ social embedding and functioning, and an improvement of working ability was therefore found to be a desirable aim in spinal surgery. However, half of the surgeons mentioned that this aim is often not very realistic. This applied particularly to chronic back pain patients who had been off work for months. In these cases, surgeons experienced that reintegration into the working process is very difficult even though an operation can improve the patients’ health state significantly. Therefore, return to work or an improvement of working ability might be of limited value as an outcome measure, especially in chronic pain patients. This opinion among surgeons could be problematic, as it might influence treatment practice and therefore also outcome. Such an influence has been shown by Buer and Linton [8] in a population of general practitioners with regard to their fear-avoidance behavior in the treatment of back patients. Thus, surgeons must be aware of these mechanisms when treating patients with acute and chronic back pain.

Other parameters such as radiological outcome or absence of complications were less frequently mentioned as important outcome parameters by the surgeons. This makes sense, as the patient’s perception of the final result of surgery determines whether the operation was successful in the patient’s eyes, and this perception in turn largely determines whether an operation can be regarded as successful. Asking the patient whether he or she would have the same operation again under the same circumstances might therefore be a valuable question for a global outcome assessment. The drawback of such a simplification, however, is that many aspects that have nothing to do with the procedure itself may influence the answer, as shown by Fritzell et al. [13].

The patients’ perspective

Comparing the patients’ statements on what comprises a good outcome with those of the surgeons revealed high agreement. Back or leg pain were the most often reported major complaints independent of the underlying diagnosis, followed by impaired mobility, neurological deficits and functional disability. Accordingly, relief of pain and disability were reported to be most important with regard to a good outcome, with some intergroup differences.

Differences related to different European regions

A recent study compared chronic pain amongst 15 countries of the European Union and Israel [7]. It revealed that self-reports of herniated or deteriorated intervertebral discs were more common in countries of central Europe (Belgium, Austria, and Switzerland) than in northern countries such as Norway, Sweden, Finland and Denmark. Other studies also found differences in prevalence rates within countries, e.g., in the UK [23] and Germany [21]. Not surprisingly, the use of surgery for low back pain varies widely across regions and between countries. This was shown in two recent studies from the United States and Canada [3, 20]. However, the interpretation of geographical data regarding prevalence rates always remains tentative because many other differences between countries are left unconsidered. For instance, countries may differ systematically in their tradition of LBP subclassification, as recently shown by Billis et al. [2].

In this context, differences found in our study must also be regarded with caution. Nevertheless, this study revealed an interesting finding concerning working ability. Patients in the southern parts of Europe valued working ability significantly higher than those in the central and northern European countries. In Italy as well as in Portugal, the governmental unemployment insurance pays for a shorter time period (usually 180 days) than in the central and northern European regions (up to 500 days in Finland). Further payments must be covered by private insurance and therefore require more financial effort from the individual. Systems of disability payment also differ between the participating countries. All authors agreed that in general it has become more difficult to obtain disability payment due to back pain. Most of them also reported that in their countries it is becoming increasingly difficult to live solely on an invalidity pension. According to Main and Burton [18], factors that are the same for many individuals of an organization or a country, and that relate for example to the working conditions of a particular organization or the social security and health care system of a country, are so-called “black flags”. Black flags do not vary with the individual’s perception but affect all equally. They may not only lead initially to the onset of LBP but may also promote disability once the acute episode has occurred. These black flags might be a reason for the aforementioned findings concerning working ability and, as a consequence, they should be identified and taken into consideration when different countries or regions are compared.

Age-related differences

The priorities differed between individuals still of working age and those already in retirement. Younger patients more often mentioned being affected in their social life and working ability. Therefore, they more often expected an improvement in those domains. These patients probably more often have to support members of their families, e.g. children who are still in education. These individuals also still need to earn money to secure a retirement pension. On the other hand, severe back-related disability makes it difficult to participate in family life or social activities, which may lead to isolation. Older patients more often reported being affected in their mobility (p = 0.074) and activities of daily living. Therefore, they mainly intended to improve these domains by surgery. This may indicate that the fear of losing one’s independence is stronger in older individuals. This fear may also be aggravated by health problems other than back pain that already compromise their independence.

Limitations

When interpreting our data, some limitations have to be taken into account. As a cross-sectional survey, this study is mainly descriptive and explorative. Our study populations may not be representative of the different European regions. In particular, the small number of interviewed surgeons allows only for an exploratory interpretation and prohibits a statistical comparison of the different regions. Including more patients and considering more aspects is always desirable, but this was not possible because of cost and time constraints. Our data analysis did not reveal any evidence that doing so would have substantially changed the findings. However, we cannot exclude selection bias causing a potential underestimation of cultural diversity.

We are aware that having the surgeons’ interviews performed by the first and the senior author of the study may be a source of status-related interviewer bias. To avoid this, independent and ideally trained interviewers would have had to perform the interviews. Although desirable, the financial and organizational effort required seemed too extensive for a preliminary cross-sectional survey. The post-hoc test on bias, however, showed that there was no remarkable decrease in reliability in repeated interviews with physicians as first interviewer and nurses as second ones (or vice versa) compared with repeated interviews by the same nurse or repeated interviews by the same physician.

Data assessment by structured interviews with mainly open-ended questions bears the risk of interviewer bias. To estimate the influence of different interviewers, we performed the test–retest experiment, which showed good intra-rater reliability and a somewhat lower but still acceptable inter-rater reliability.

Taking into account these limitations of our study design, we have nevertheless been able to collect valuable quantitative data on what comprises a good outcome after spinal surgery from the patients’ and the surgeons’ perspective. To the best of our knowledge, no other report has covered this important issue so far.

Conclusions

There is considerable agreement on the individual conceptions of a good outcome among surgeons and patients. The main parameters determining good outcome are pain relief, improvement of disability, social reintegration, and meeting the patients’ expectations. Overall, the regional variations were relatively small. However, differences in national social security and health care systems (“black flags”) have an impact on what is considered a good outcome.