Introduction

Degenerative diseases of the lumbar spine that result in compression of the neural elements are common in the aging population and are being encountered with greater frequency as the average life-expectancy increases [9]. Typical presenting complaints include weakness, numbness/tingling, radicular pain and neurogenic claudication, sometimes accompanied by pain in the lower back or buttocks [17, 55]. The consequence, in addition to much discomfort, is a restriction in mobility, and this, coupled with the natural age-associated decline in musculoskeletal function, results in a serious impediment to the performance of activities of daily living and a threat to the patient’s independence. In all but the most moderate of cases, the problem is difficult to treat without recourse to surgery [57, 20, 25], due to the largely mechanical (structural) nature of the disorder [57]. Nonetheless, the success rates of surgery are not particularly impressive: systematic reviews of the long-term outcome report an average success rate of between 45 and 72%, depending on the criterion by which ‘success’ is judged (ability to work, neurologic symptoms, leg/back pain) [16, 43, 58].

For many types of joint surgery, the quality of postoperative rehabilitation has been shown to have an impact on subsequent disability, recurrent injury, and health care use [15, 35, 45]. In the case of decompression surgery in the older patient, however, there are no guidelines describing the need for, or recommended content of post-operative rehabilitation programmes. In attempting to design an appropriate programme for these patients, it is important to identify the limitations that they most commonly encounter, and to target these accordingly. Previous studies indicate that a poor outcome after decompression surgery is strongly associated with severe low back pain [27] or with subjective difficulty in walking [27, 61]. However, these complaints appear to have little association with the presence of stenotic findings at either the operated or neighbouring segments at long-term follow-up [21, 22], suggesting that it is not simply a case of “failing to achieve the surgical goal”. In part, these post-operative complaints may be the result of physical deconditioning, which can occur in association with chronic spinal disorders [19, 36, 40, 53], and may be further exacerbated by the inactivity enforced by the surgery itself. Alternatively (or in addition), especially if the original degenerative changes are accompanied by instability of the affected motion segment or if iatrogenic instability ensues, the ongoing symptoms may be the result of an impaired mechanical integrity of the spine [24]. Conceivably, the latter could be compensated for by focused training of the active stabilising system of the spine i.e. the deep-lying trunk muscles [46]. The critical role played by the musculature in imparting stability to the spine has been confirmed by numerous biomechanical studies [48, 54, 56, 60], and partially substantiated by clinical studies on patients with radiographic instability [44].

Previous studies that have identified the usefulness of active rehabilitation programmes as an adjunct to spinal surgery have typically involved relatively young patients undergoing first-time simple discectomy [45]; even if the underlying principles of the therapy and the desired outcomes are identical, the question naturally arises as to whether older patients can be expected to achieve comparable levels of success with muscle reconditioning programmes. A further issue, highlighted as a research priority in a recent Cochrane systematic review, is whether all patients should be treated post-surgery or whether a minimal intervention with the message “return to an active lifestyle” would suffice [45].

The aim of the present study was to carry out a randomised controlled trial to compare the outcome up to 2 years after spinal decompression surgery of patients assigned to different programmes of rehabilitation, begun 2 months post-operatively: (1) no supervised therapy programme (self-management), with the simple recommendation to “keep active” by undertaking physical activities of one’s own choosing; (2) individual, supervised physiotherapy exercises designed to promote the active stabilization of the spine through improved control of the deep-lying abdominal and back muscles; (3) individual, supervised physiotherapy programme using mixed physiotherapeutic techniques.

Methods

Study admission criteria

The inclusion criteria for the study were: diagnosis of degenerative spinal disease (spinal stenosis or lumbar herniated disc) as ascertained from the medical history, clinical examination, conventional radiography and MRI/CT of the lumbar spine, with an indication for decompression surgery without fusion (if fusion was subsequently deemed necessary, intraoperatively, the patient was excluded from further analysis); failed conservative therapy; willingness to comply with any programme to which randomly assigned, attend for all necessary follow-ups, and complete postal questionnaires; a good understanding of written and spoken German; and aged over 45 years.

The exclusion criteria were: previous spinal fusion (although other spinal surgery carried out more than 6 months ago was acceptable); disorders preventing active rehabilitation.

Recruitment of patients

Patients were recruited from the authors’ hospital, a non-profit making (foundation) orthopaedic clinic. The hospital operation list for the forthcoming 2 months was examined on a weekly basis, and consecutive patients who appeared to satisfy the main admission criteria were sent a letter inviting them to participate, along with an information sheet about the study and a questionnaire booklet (see later). Patients were partly ‘blinded’, to control for expectation bias, by being informed that the study sought to compare three popular approaches to post-operative rehabilitation, the relative efficacy of which had not yet been established. All patients were contacted by phone a week later, to enquire about their interest in participating, to check the inclusion/exclusion criteria, and to address any queries in connection with the study. Those that agreed to participate were invited to attend the research department on the day of admission to clarify any remaining issues and sign the informed consent form. The patient’s details (hospital number, age and gender) were then forwarded to a secretary from a non-surgical department of the clinic, responsible for randomisation (see later).

The study was approved by the local University Ethics Committee.

Surgical procedure

Four experienced surgeons operated on the patients. The typical surgical approach comprised a midline incision with standard preparation of the involved segments of the lumbar spine. The multifidus was separated from the bony attachment by sharp dissection with a Cobb rasp. After identification of the segment to be decompressed, the ligamentum flavum was resected to visualise the dura and the nerve root. Undercutting laminotomy and medial partial resection of the facet without compromising segmental stability was performed in cases with degenerative spinal stenosis. Mobilization of the dura and nerve root with subsequent removal of the compressive disc tissue was used in cases with disc problems. Patients were allowed to leave the bed the day after surgery. No specific treatment or rehabilitation was carried out in the first 6–8 weeks, in order to promote wound healing, but the patients were encouraged to walk and move around as normal.

Assignment to the rehabilitation programmes

Before randomization, patients were pre-stratified by age (<60 years and ≥60 years) and by gender to prevent unequal distributions of these variables among the treatment groups [28]. Using a restricted randomisation procedure (blocks of 12) and a random numbers table [28] prepared in advance by the lead author, patients were assigned to one of three treatment groups (see below). Group assignment took place immediately after inclusion into the study (i.e. after informed consent was signed), but was not revealed until after the patient had completed the first post-operative check-up/assessments.
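The stratified, blocked assignment described above can be sketched in a few lines. This is only an illustration under stated assumptions: the study used a random numbers table prepared in advance, whereas the sketch uses a seeded pseudo-random generator, and the function names, the seed and the number of blocks per stratum are hypothetical.

```python
import random

GROUPS = ["CONT", "PT-StabEx", "PT-Mixed"]
BLOCK_SIZE = 12  # blocks of 12, as in the text: 4 slots per group


def make_block(rng):
    """Return one shuffled block containing equal numbers of each group."""
    block = GROUPS * (BLOCK_SIZE // len(GROUPS))
    rng.shuffle(block)
    return block


def make_allocation_lists(n_blocks, seed):
    """Prepare, in advance, one allocation list per age-by-gender stratum."""
    rng = random.Random(seed)  # hypothetical stand-in for the random numbers table
    strata = [(age, sex) for age in ("<60", ">=60") for sex in ("F", "M")]
    return {s: [g for _ in range(n_blocks) for g in make_block(rng)] for s in strata}


lists = make_allocation_lists(n_blocks=2, seed=2024)  # illustrative values
next_slot = {stratum: 0 for stratum in lists}


def assign(age, sex):
    """Give the next unused slot in the patient's stratum."""
    stratum = ("<60" if age < 60 else ">=60", sex)
    group = lists[stratum][next_slot[stratum]]
    next_slot[stratum] += 1
    return group
```

Because every block of 12 contains each treatment exactly four times, group sizes within a stratum can never differ by more than four at any point during recruitment.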

Postoperative rehabilitation programmes

The three treatments were as follows:

  1. Self-management for 12 weeks (control group; CONT): Patients were advised to keep as active as possible by doing the type of exercise/physical activities they enjoyed and documenting these in a daily exercise diary (see later). They were not given any specific exercises to do, but were told that the project manager was happy to discuss with them their individual needs/give advice if they wished.

  2. Physiotherapy with spine stabilisation exercises (PT-StabEx): 2 × 30 min sessions per week for 12 weeks (although not entirely typical of clinical practice, in which individual rates of progress are generally considered rather than fixed lengths of treatment, these limitations were set in order to establish a comparable treatment volume in each group). The treatment was administered by physiotherapists specially trained in the concept of spine stabilisation exercises/“muscle balance”. The majority of them worked at the study hospital, but others based in surrounding practices served the needs of patients living a distance away. Specific isometric exercises for the muscles surrounding the lumbar spine were carried out according to the programme described by Richardson and Jull [50, 51]. The exercises aim to improve the dynamic stability of the spine by retraining a precise co-contraction pattern of the deep trunk muscles, the transversus abdominis, internal obliques and lumbar multifidus. Once successful coordination of the action of these muscles is achieved, more demanding activities that require the additional support of the ‘global’ muscle systems (i.e. the large prime movers of the trunk) are gradually integrated into the programme. Finally, the use of these specific exercises is integrated into the performance of activities of daily living (tailored to the needs of the individual patient), especially in relation to movements where the patient has or fears pain.

  3. Physiotherapy using mixed techniques (PT-Mixed): 2 × 30 min sessions per week for 12 weeks. This treatment was administered in accordance with the professional judgment and experience of the treating physiotherapist. Patients were invited to locate the practice that was most convenient for them to attend. Each physiotherapist was allowed the freedom to adopt the treatment that they considered to be most appropriate for the given patient. No attempt was made to standardise this, as the treatment was intended to reflect “daily practice”, in all its (potential) variety.

The physiotherapists in both groups completed an ongoing treatment diary for each session and a post-treatment questionnaire enquiring about the main physiotherapeutic concepts and the specific techniques/methods they had used with the given patient. They also rated, on a 0–10 graphic rating scale, the extent to which the treatment was passive (0) or active (10), and indicated how well they perceived that the patient had adhered to the home exercises.

Patients in all three groups completed a daily exercise diary documenting whether/for how long they had done special exercises for their back problem, sport (e.g. swimming, hiking), other physical activities (gardening, housework), and cycling and walking as a means of transport.

Administration of the questionnaires

Questionnaires were completed before surgery, before and after the rehabilitation phase (approx 2 and 5 months post-surgery, respectively), and 12 and 24 months after surgery. At each time-point, the patient filled in the questionnaire at home, in order to achieve consistent conditions for completion. For those follow-ups associated with a hospital visit (pre-op, 2 and 5 months post-op), the patient was mailed the questionnaire and requested to bring it with them, completed, to their appointment; otherwise (12 and 24 months) they returned the questionnaire by mail. Questionnaires were always checked immediately and any incomplete parts were highlighted and returned to the patient for completion. Non-responders during follow-up were contacted by study personnel not involved in the patients’ care until the questionnaire was returned or the patient explicitly stated that no questionnaire would be returned.

Questionnaires

The questionnaire variables enquired about before surgery and at follow-up are shown in Table 1.

Table 1 Variables enquired about in the questionnaires administered before surgery, before and after rehabilitation, and at 12 and 24 months after surgery

Immediately after rehabilitation, to assess the acceptability of the treatments offered [45] the patients also answered questions concerning: the rehabilitation group that they would have chosen, given the chance; their initial reaction upon hearing which group they were in; their changed impressions during treatment; how much the programme had helped their back problem (5-point Likert-scale ranging from “helped a lot” to “made things worse”). Patients in PT-StabEx and PT-Mixed also rated the “active” nature of the treatment, their performance of home exercises, and various characteristics of their therapist. CONT patients answered questions about their activity habits in the last 3 months and whether they had received any other medical/physiotherapeutic treatment (co-interventions).

For all patients, re-operations (co-interventions) were enquired about in the follow-up questionnaires and also checked in the hospital information system (or by communication with the patient). At the 12 and 24-month follow-ups, in addition to the outcome questionnaires (Table 1) patients rated the global effectiveness of the treatment they had received (i.e. operation and post-op rehabilitation) on a 5-point Likert Scale (helped a lot, helped, helped only little, didn’t help, made things worse).

Functional assessments

Before surgery, and before and after post-op rehabilitation, a number of functional assessments were carried out: range of motion of the lumbar spine and hips in the sagittal plane [34]; treadmill walking capacity [14] and the ability to selectively activate the deep trunk muscles [50]. The data from these assessments and their relationship to clinical variables are the subject of a separate article.

Each stage of the study (assessment of eligibility, assignment to the treatments, provision of treatment, and functional assessments/administration of questionnaires) was carried out by different groups of professionals.

Statistics

The sample size (approx. 50 in each group) was determined so as to be able to detect a clinically relevant difference of 2.5 points [8, 26] between any two groups for the change in the Roland Morris disability score [11, 52] after rehabilitation, assuming a within-group standard deviation for the change score of 4.0, a type I error probability of 5%, a type II error probability of 20% (i.e. power of 80%), and a 10–15% dropout rate.
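The arithmetic behind this estimate can be checked with a short sketch. It assumes the standard normal-approximation formula for comparing two means, applied to a single pairwise comparison without adjustment for multiple comparisons; the text does not state which formula or software was actually used, and the function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for comparing two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - type II error
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


def inflate_for_dropout(n, dropout):
    """Number to recruit so that n patients remain after the expected dropout."""
    return ceil(n / (1 - dropout))


# Values taken from the text: difference 2.5 points, SD of change score 4.0.
n = n_per_group(delta=2.5, sd=4.0)
n_low = inflate_for_dropout(n, 0.10)
n_high = inflate_for_dropout(n, 0.15)
print(n, n_low, n_high)
```

Under these assumptions the formula gives 41 patients per group before dropout and 46 to 49 after allowing for 10–15% dropout, which is in line with the stated target of approximately 50 per group.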

As most of the continuous variables under investigation were approximately normally distributed, parametric statistics were employed for the analyses. The comparability of the groups at baseline was assessed using ANOVA. Changes in continuous variables in the three groups over the course of the study were assessed using a two-factor ANOVA with repeated measures (group x time of assessment). Global changes over time in the entire patient cohort were identified from the significance of the main effects of the repeated-measures ANOVA, and differences between subsequent time-points were identified by contrast analyses. Differences in the response over time between the three groups were revealed by a significant interaction, and the location of the significant differences (i.e. how the groups differed in their behaviour for a given variable over time) was identified by the performance of contrast analyses. For categorical variables, group differences were analysed using contingency analyses. Relationships between variables were analysed using Pearson (parametric data) or Spearman rank correlation coefficients (non-parametric).

Statistical analysis was carried out according to the ‘intention-to-treat’ principle, where patients were compared according to the group to which they had been randomly assigned, regardless of their compliance, crossover to other treatments, withdrawal, or re-operation status [23]. Missing data were dealt with as follows: for the first post-surgery assessment, at which point all patients had received the same treatment, values were imputed by using the patient’s pre-op value and assuming the same % change from pre-surgery to post-surgery as for the whole group (the same procedure was used if the pre-op value was missing but the first post-op value was available); at all other follow-ups the last value was carried forward [23]. A second analysis was performed in relation to the actual treatment received; for this analysis, patients in the physiotherapy groups who had completed fewer than 9 of the 24 sessions were considered not to have received PT treatment. Thirdly, analyses were carried out on only those patients that did not undergo further surgery during the 2-year follow-up. Only where notable differences were produced with these different analyses will this be mentioned in the text.
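The two imputation rules can be illustrated with a minimal sketch. It rests on an assumption that the text leaves open: the “% change for the whole group” is taken here as the ratio of group means among patients with both values available. The function names and the example scores are hypothetical, not study data.

```python
def impute_first_postop(pre, post):
    """Fill gaps at the first post-op assessment using the whole-group % change."""
    complete = [(a, b) for a, b in zip(pre, post) if a is not None and b is not None]
    # Assumption: whole-group change = ratio of group means among complete cases.
    ratio = sum(b for _, b in complete) / sum(a for a, _ in complete)
    new_pre, new_post = [], []
    for a, b in zip(pre, post):
        if b is None and a is not None:
            b = a * ratio      # project the pre-op value forward
        elif a is None and b is not None:
            a = b / ratio      # back-calculate the missing pre-op value
        new_pre.append(a)
        new_post.append(b)
    return new_pre, new_post


def locf(series):
    """Last observation carried forward along one patient's follow-up series."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Hypothetical disability scores for four patients (None = missing).
pre_op = [12, 16, 10, None]
first_post_op = [6, 8, None, 4]
pre_filled, post_filled = impute_first_postop(pre_op, first_post_op)
later_followups = locf([5.0, None, 4, None])
```

In this example the complete cases halve their score after surgery, so the third patient's missing post-op value becomes 5.0 and the fourth patient's missing pre-op value becomes 8.0; at later follow-ups each gap simply repeats the preceding value.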

Significance was accepted at the 5% level, but, as a result of the multiple analyses that were carried out, caution will be exercised in the interpretation of the results, especially when P values approach this 5% limit. This is considered preferable to correcting for multiple testing [47].

Results

Study sample

Figure 1 shows the formation of the final study group in detail. A total of 167 patients were randomised. However, in three of these, the decision was made during surgery to supplement the decompression with fusion; in one, an error in implementation of the eligibility criteria was identified just after surgery (patient had had prior fusion); and in four, no rehab programme was assigned (3, other medical problems; 1, administrative error) (for details, see Fig. 1). These were all considered to be valid post-randomisation exclusions, according to the literature [12]. One hundred and fifty-nine patients went on to enter the trial and to be included in the intention-to-treat analysis: N = 54 in CONT group; N = 56 in PT-StabEx; N = 49 in PT-Mixed. There were no significant differences in the baseline sociodemographic, pain/disability, general medical and psychological characteristics of the patients in each group (Table 2).

Fig. 1 Details of the formation of the final study group

Table 2 Synopsis of baseline sociodemographic, physical, clinical and psychological characteristics of the patients in each of the three treatment groups (mean ± SD, or % values)

Upon receiving their group allocation but before therapy actually began (i.e. not due to dissatisfaction with treatment), 11 patients requested a change of group from physiotherapy to self-management, mostly due to perceived time constraints (for details, see Fig. 1); 4 patients requested (or were recommended by their surgeon/GP) to move from self-management to physiotherapy; and 7 patients transferred over from PT-StabEx to PT-Mixed (either because no spine stabilisation therapists could be located in their area, or they already had a local physiotherapist with whom they wanted to stay).

Five patients left physiotherapy after just a few treatment sessions (see Fig. 1 for details): two from PT-StabEx (to undergo fusion) and three from PT-Mixed (one was unwittingly discharged early by the treating therapist; one due to increased pain; and one to undergo abdominal surgery). Compliance with treatment was similar in the two PT groups, with 21 ± 4 sessions attended by patients in PT-StabEx and 20 ± 4 in PT-Mixed (P > 0.05).

Table 3 shows the proportion of questionnaires returned at each follow-up for each of the randomized groups. Questionnaires were sometimes missed at one follow-up, but completed again at the next. After 24 months, four patients had died; of those still alive, 53/54 (98%) in the CONT group, 50/53 (94%) in PT-StabEx and 48/48 (100%) in PT-Mixed returned questionnaires. Due to this low withdrawal rate, only 4–5% of all the outcome data (from 159 patients assessed at 5 time-points) had to be imputed due to missing values.

Table 3 Proportion of questionnaires returned at each time-point up to 24 months follow-up

By 24 months, 16 patients (10%) had undergone re-operation (9 fusions, 6 further decompressions, 1 intradural drainage and debridement). 8/54 (15%) were from the CONT group, 4/56 (7%) from PT-StabEx and 4/49 (8%) from PT-Mixed (difference between groups, P = 0.37).
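The between-group comparison of re-operation rates can be reproduced from the counts given above. The sketch below is an illustration only: the text speaks of “contingency analyses” without naming the test statistic, so both the Pearson and the likelihood-ratio chi-square are computed, and the function name is hypothetical. For a 3 × 2 table there are 2 degrees of freedom, for which the chi-square tail probability is exactly exp(−x/2), so no statistics library is needed.

```python
from math import exp, log

# Re-operations by randomised group, as reported in the text.
reop = {"CONT": 8, "PT-StabEx": 4, "PT-Mixed": 4}
size = {"CONT": 54, "PT-StabEx": 56, "PT-Mixed": 49}


def contingency_tests(events, totals):
    """Pearson and likelihood-ratio chi-square P values for a groups x 2 table."""
    n_total = sum(totals.values())
    p_event = sum(events.values()) / n_total
    pearson = g_stat = 0.0
    for group, n in totals.items():
        cells = [(events[group], n * p_event),
                 (n - events[group], n * (1 - p_event))]
        for observed, expected in cells:
            pearson += (observed - expected) ** 2 / expected
            g_stat += 2 * observed * log(observed / expected)  # assumes observed > 0
    # exp(-x/2) is the chi-square tail probability only when df == 2.
    assert len(totals) - 1 == 2
    return exp(-pearson / 2), exp(-g_stat / 2)


p_pearson, p_lr = contingency_tests(reop, size)
print(f"Pearson P = {p_pearson:.2f}, likelihood-ratio P = {p_lr:.2f}")
```

With these counts the likelihood-ratio statistic gives a P value that rounds to the reported 0.37, while the Pearson statistic gives a slightly smaller value; either way the difference between the groups is far from significant.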

Comparison of the content of the two physiotherapy programmes

Physiotherapy documentation forms were received from 38/39 (97%) therapists who had administered PT-StabEx and from 49/54 (91%) PT-Mixed therapists.

The session duration (modal value, 30 min) and manner (nearly all one-to-one) of therapy were similar in both groups, as was the number of years in the profession of the treating therapists (13 ± 7 years). Home exercises were prescribed to a similar extent in both groups (in about 95% of patients) and the patient’s compliance with these was also rated similarly (average for both groups: 68% good/very good, 27% adequate, 5% poor). Table 4 shows the main physiotherapeutic concepts and techniques employed, and the extent to which these differed between the groups. Both types of physiotherapy incorporated largely active elements i.e. were truly “rehabilitation”, rather than being based on passive, pain relieving measures. As expected, spine stabilization exercises were used significantly more frequently in the PT-StabEx group, although they were also utilized some of the time in about 30% of patients in the PT-Mixed group; the latter was otherwise based mostly on Maitland, and back school/back reconditioning concepts, using strengthening and other exercises.

Table 4 The proportions of patients in each treatment group in which a given physiotherapeutic concept/method was employed as part of the treatment programme (N.B. numbers do not add up to 100%, since many methods were used on the same patient)

The patients and the physiotherapists gave almost identical ratings on the 0–10 scale as to how active the rehabilitation was at the start and the end of treatment (Table 5); the PT-StabEx was rated by both the therapists and patients as slightly but significantly more active than the PT-Mixed.

Table 5 Physiotherapist and patient ratings of the “active versus passive” nature of the therapy at the beginning and end of treatment

Patients’ assessment of the therapy received

35/39 (90%) of the patients that actually received PT-StabEx completed the “assessment of therapy” questionnaire and 35/39 (90%) a daily exercise diary; for the PT-Mixed group the corresponding figures were 53/54 (98%) and 46/54 (85%).

A similar proportion of patients in each group declared receiving home exercises (100% PT-StabEx, 92% PT-Mixed; P = 0.10). 68% in the PT-StabEx group did these daily, 29% often (several times/week), 3% occasionally (several times/month), and 0% never. The corresponding values for the PT-Mixed group were 48, 38, 10, and 4% (difference between groups, P = 0.17). The retrospective ratings showed a highly significant correlation with the data reported in the daily exercise diaries (Spearman rho 0.61; P = 0.0001).

Figure 2 shows the patients’ assessments of their physiotherapist. Overall, the ratings were slightly more favourable for PT-StabEx than PT-Mixed, but they did not differ significantly (P > 0.05): on average (for both groups), more than two-thirds of the patients gave the physiotherapist the top rating with regards to their friendliness, competence, ability to motivate, understanding, and advice (Fig. 2).

Fig. 2 Patients’ ratings of various qualities of the physiotherapist, for PT-StabEx and PT-Mixed groups

Activity in the self-management group

Forty-four patients out of 61 (72%) in the control (self-management) group returned a completed daily exercise diary and 38 (62%) provided retrospective data on their activities in the “self-management” group. Thirty-two percent of patients declared that they had been quite/very active in the preceding 3 months, 23% had rather “taken it easy”, and 45% were in between. Sixty-seven percent of them declared that they had begun new physical activities or resumed previous ones since the operation; 60% performed exercises for their back at least 2–3/week, 71% did household/garden work at least 2–3/week and 50% did sport or exercise at least 2–3/week. Forty-seven percent were more active than before the operation, 40% similarly active and 13% less active. The most common reasons for not being (even more) active were that “physical activity caused more pain” (27% agreed and 32% partially agreed with this statement) and “worried about damaging my back” (19% agreed, 28% partially agreed).

Seventy-six percent of patients in the control group undertook no additional treatment during the 3-month rehabilitation phase; the remainder had had occasional sessions of massage, acupuncture, heat treatment, medical training therapy, electrotherapy, and three had received injections.

Success of blinding and credibility of the treatment programmes

Given the choice of treatment, more patients would have chosen either self-management (35%) or PT-StabEx (36%) rather than PT-Mixed (20%) or being indifferent (9%); however, the majority of patients (76% of the control group, 78% of the PT-Mixed and 87% of PT-StabEx) still declared being satisfied/very satisfied when they heard which group they had been randomised to; only 14, 2 and 2%, respectively, expressed dissatisfaction with their group assignment (and the remainder were indifferent).

Outcome

Roland Morris Disability score

There was a significant reduction in Roland Morris disability scores after surgery, with no significant difference between the three groups in the extent of this reduction (Fig. 3a). Between the first (pre-rehab) and second (post-rehab) follow-ups there was a further significant reduction in disability, but again with no significant differences between the groups in the extent of the reduction. After this, there were no further significant changes in disability up to 24 months post-op.

Fig. 3 Change in outcome over time in each group (all patients). a Disability, b leg pain, c back pain, d all pain scores, e pain frequency, f pain medication, g psychological disturbance, h fear avoidance beliefs about physical activity. * indicates significant differences (P < 0.05) between assessment periods (for all groups, main effects of the ANOVA), or between groups across a given assessment period (interaction from ANOVA)

Pain intensity

After surgery, in each group there was a significant and clinically relevant reduction in leg pain, back pain, and average of all pain scores (Fig. 3). Leg pain stayed relatively constant over the rehabilitation phase, but showed a slight though statistically significant increase in all groups between the end of rehab and the 12-month post-op follow-up (Fig. 3b); in the last 12 months it remained stable. For all other pain measures, there were no significant changes between 2 months post-op and 24 months post-op, in any of the groups (Fig. 3b–d).

Pain frequency and pain medication

Pain frequency showed a significant reduction after surgery in all groups alike (P < 0.05), and remained relatively stable thereafter (Fig. 3e). Pain medication followed the same pattern (Fig. 3f).

Psychological characteristics

Psychological disturbance showed a significant reduction after surgery (except in the PT-Mixed group), but no further significant changes thereafter in any of the groups (Fig. 3g). The changes in fear-avoidance beliefs about physical activity tended to mirror those of disability, showing a significant reduction from pre-surgery to post-surgery, and from pre-rehab to post-rehab, but with no significant differences between the treatment groups (Fig. 3h).

Global outcome at 12 and 24 months post-surgery

The ratings of global outcome at each follow-up are shown in Table 6. There were no significant differences between the groups for the global outcome of the whole treatment package at 2, 12 and 24 months post-surgery (P > 0.05). However, a clear trend existed for more positive ratings in the PT-StabEx group (80% good) than the other two groups (63% good) for the global outcome of rehabilitation (P = 0.10), and this reached significance when the data were analysed in relation to the actual treatment received rather than the randomised group (P = 0.02).

Table 6 Global ratings of outcome at each follow-up point, dichotomised as “good” (treatment helped/helped a lot) and “poor” (treatment helped only little, didn’t help, made things worse)

Sub-group analysis of patients with a poor outcome at the first post-operative follow-up

In order to assess whether the rehabilitation was more beneficial in those patients with an initially poor operative result, the influence of therapy group on pain and disability in this sub-group [25/159; 15% (average of the poor outcomes in all treatment groups; see Table 6)] was examined. The two physiotherapy groups were merged for these purposes (due to the low numbers). No significant differences between the groups in the course of change in any aspect of pain or Roland Morris disability after the 2-month follow-up were observed (Fig. 4).

Fig. 4 Change in main outcome measures over time for the N = 25 patients with a poor surgical result at the first follow-up (see text for details). a disability, b leg pain, c back pain. * indicates significant differences (P < 0.05) between assessment periods (for both groups, main effects of the ANOVA), or between groups across a given assessment period (interaction from ANOVA)

Discussion

Main findings and methodological considerations

The present study was a randomised controlled trial (RCT) examining the effectiveness of post-operative physiotherapy (rehabilitation) after spinal decompression in older patients with degenerative spinal disorders. The main finding was that, compared with no treatment/self-management, supervised rehabilitation had no significant influence on the course of change in pain and self-rated disability up to 24 months after surgery. This is the first RCT of its type in older patients undergoing spinal decompression, and the results should have far reaching implications for the future management of these patients.

The trial was carried out as far as was practicable in accordance with established recommendations [12, 29, 39, 45], to ensure that it would be scientifically robust and that the findings would be statistically and clinically relevant. Sample size calculations were carried out [3] in order to ensure that clinically relevant changes would be detected, should these occur, and no serious deviations in the baseline assumptions were observed (e.g. the group SD for the change in Roland Morris score, or the drop-our rate) that might otherwise have suggested a type II error. Although it was not possible to blind the patients to treatment group, we attempted to blind them from any expectation bias by presenting all treatments as credible options, the relative efficacy of which was as yet unknown, which is probably the best one can hope to achieve in clinical trials of exercise/physical therapies [45]. Further probing revealed that this approach was relatively successful: although most of the patients declared having had a preference for a particular treatment group, more than 75% were nonetheless satisfied with their randomised group allocation. Blinding of the caregiver is rarely possible in trials of physical therapy [45], and the blinding of outcome was not a relevant issue, since the patients themselves independently assessed the outcome by completing questionnaires at home. In terms of the external validity of the study, it must be noted that only approximately 40% of consecutive patients considered potential study candidates actually volunteered to participate. This was mainly due to time constraints or anticipated problems with travel, or the lack of desire to even think about such things as clinical studies when just about to undergo surgery—perfectly plausible reasons. 
Although the study group comprised slightly more males and fewer patients with a main diagnosis of stenosis than the group that declined to participate, their medical history and clinical status were nonetheless comparable to those of the typical patient undergoing decompression described in the literature [4, 37]. Further, our own quality management system carried out in connection with the Spine Society of Europe Spine Tango registry, which involves the prospective evaluation of all spine surgery patients using a short patient-orientated questionnaire [32], allowed us to broadly compare the characteristics and outcomes of the patients in the present trial with those of the “typical patient” undergoing decompression in our hospital. The baseline symptoms tended to be slightly lower in the present group than in the “typical patient”, and the predominance of men was again evident; however, there were no major differences for age, or the reduction in back or leg pain, or for the global outcome at 2 months post-operatively (unpublished observations). We therefore believe that the patients who volunteered for the present study were reasonably representative of the “typical patient” with this indication for surgery.

Randomised controlled trials involving elderly patients are notoriously difficult to carry out, and the present study was no exception. The length of time required to recruit the necessary sample size, the existence or development of other co-morbidities that hindered the ability to perform active physiotherapy, the treatment crossovers needed in the interest of the patient’s general health, the repeated surgery, and deaths during follow-up, were just some of the problems that beset the current investigation. Nonetheless, these problems tended to occur with a similar prevalence in each group, and a highly respectable follow-up rate (97%) was still achieved up to two years post-surgery. As such, all but one of the secondary analyses showed similar results whether patients were analysed according to the “actual treatment received” or using the recommended intention-to-treat analyses. Since less than 5% of the outcome data was missing over the whole follow-up period, no “worst-case analysis” was considered necessary [29, 42].

Comparison between physiotherapy treatments

Having found no guidelines in the literature, we sent a request, prior to starting the present study, to various online physiotherapy interest lists enquiring as to what the “standard post-operative treatment programme” for spinal stenosis patients might entail. The few responses received suggested that there was little consensus, and this provided the impetus for the design of the present trial. We chose to include one group that would receive a potentially varied but (as a group) “representative” or “typical” range of physiotherapeutic treatments, with no constraints as to what these should be. The second group was to receive a very specific type of physiotherapy that, based on theoretical considerations [46], cadaveric studies [48, 54, 56, 60], and a clinical trial [44], might be expected to benefit the patients by improving their ability to actively stabilise the spine. Although the original working hypothesis was that this treatment might counteract potential mechanical instability of the osteoligamentous spine after decompression, two recent systematic reviews suggest that documented instability is not actually a prerequisite for the success of the treatment; it has also proven beneficial (though no better than conventional physiotherapy) in patients with chronic non-specific low back pain (LBP) [13, 49]. The findings of the present study are therefore at least in part consistent with the existing literature, in that the patients in the PT-StabEx group did not fare any better than those in the PT-mixed group. The therapists’ documentation revealed that, although the two programmes were quite different in relation to the main physiotherapeutic concept used during treatment, they also had a lot in common, and both incorporated primarily “active” elements. This was also confirmed from the patients’ perspective.
Few other differences were evident in either the characteristics of the treatment or the therapists delivering it, although the patients’ ratings of the therapist were generally more positive for the PT-StabEx group (see Fig. 2). This may explain the more favourable ratings of global outcome of physiotherapy in this group in the absence of any group differences in the longitudinally measured pain intensity and self-rated disability scores. Conceivably, the answer to the direct “global outcome” question may be influenced not only by objective changes in function arising from the actual therapy, but also by the therapy experience itself, such as how enjoyable it was, or how competent, encouraging, or supportive the therapist was. Although these factors may be important for the patients’ sense of well-being, confidence, and ability to cope with pain, the prospective measures give an “unbiased assessment of the current situation” at repeated points in time, and hence are probably better suited to the examination of long-term treatment effectiveness.

Advice to “keep active” with self-management

Ostensibly at variance with the existing systematic reviews on PT-StabEx for chronic non-specific LBP [13, 49] was our finding that neither this treatment nor PT-mixed was superior to simple advice to “keep active”. No previous studies have compared supervised versus self-management methods of post-operative rehabilitation, although such a comparison has been highlighted as a priority for further research in a recent systematic review of rehabilitation after first-time disc surgery [45]. In the present study, it is uncertain whether the “advice to keep active” approach was of comparable effectiveness because of the exercise that the patients actually carried out (their activity diaries suggested that they were indeed relatively successful in keeping active, even in the absence of any real advice or support as to what exactly they should be doing) or whether the outcome in this group simply reflected the natural history of the condition. Previous studies on a similar patient group have shown comparable patterns of change in pain (marked reduction after surgery, stable thereafter), disability (gradual reduction up to 1 year post-surgery, with the greatest change occurring between 6 weeks and 6 months) and psychological attributes (minimal change over time), though without regard to the rehabilitation process [38]. Either way, it is clear that administration of such a treatment (perhaps even including the instruction to complete a simple exercise diary, which does not necessarily need to be analysed, but may provide further motivation for the patient [41]) is associated with absolutely no costs to the healthcare provider. Further RCTs on this growing sector of the population are clearly required, in order to refine our current health-care practices to meet restricted budgets whilst still delivering effective treatment.

Concluding remarks

It was notable that the final outcome for the patients in the present study, 2 years after surgery, still suggested that they experienced (on average) moderate disability in everyday activities due to their back trouble. Whether more intensive exercise programmes or an earlier start to rehabilitation would have resulted in a more favourable course of change remains unknown; however, studies on other post-operative groups suggest that even the beneficial effects of early, intensive therapy are only short-lived [45]. Hence, this may well represent the level of disability that these patients finally have to learn to cope with, if there is no indication for further surgery. Certainly, the results of the present study suggest that targeted physiotherapy is unable to alter the course of change, or even do much for those patients with an initially poor surgical result. This highlights the fact that more effort must be invested in “getting it right in the first place”, perhaps starting with better patient selection; the emerging evidence-base on predictors of poor outcome should assist in this respect [1]. In the meantime, the provision of better education regarding the likely outcome of surgery should at least lead to better-informed, and hence more satisfied, patients [33], whilst pain management based on cognitive-behavioural principles might be considered for those with chronic persistent problems [2].