Introduction

Verbiest [30] introduced the concept of spinal stenosis and brought it to the attention of the medical world. Lumbar spinal stenosis refers to a pathologic condition resulting in compression of the contents of the canal, particularly the neural structures. If compression does not occur, the canal should be described as narrow but not stenotic. Lumbar spinal stenosis is therefore a clinical condition and not a radiologic finding or diagnosis.

When conservative treatment for lumbar spinal stenosis fails, operation to improve the quality of life may be proposed. Degenerative disc disease is by far the most common cause of lumbar spinal stenosis, and increasing numbers of patients, particularly the elderly, are undergoing surgery for lumbar stenosis. Indeed, canal stenosis is now the most common indication for lumbar spine surgery in elderly subjects. With aging of the population, the incidence of surgical decompressions will go up [5]. A meta-analysis of the literature in 1991 showed that, on average, 64% of patients treated surgically for lumbar spinal stenosis were reported to have good-to-excellent outcomes [29].

Wide decompressive laminectomy, often combined with medial facetectomy and foraminotomy, used to be the standard treatment. In recent years, however, a growing tendency towards less invasive decompressive surgery has emerged [2, 3, 4, 6, 7, 20, 21, 23, 28]. One such procedure, laminarthrectomy, refers to surgical decompression involving partial laminectomy of the vertebra above and below the stenotic level combined with a partial arthrectomy at that level [9]. As evidence-based medicine becomes the norm, it is important to try to determine predictive factors associated with particular surgical procedures.

This prospective study included a cohort of patients presenting with lumbar spinal stenosis and operated on by a single surgeon (RG). The clinical, radiographic, and biomechanical analyses combined with outcome measures for patients undergoing decompressive lumbar laminarthrectomy without fusion will be reported elsewhere. The aim of this study was to predict surgical outcomes based on clinical, radiologic, biomechanical, and psychofunctional information available prior to surgery.

Material and methods

Patient selection

Between January 1996 and January 1998, consecutive patients presenting with clinically and radiologically confirmed lumbar spinal stenosis and admitted for surgery after conservative treatment had failed for at least 1 year were entered in the study. Patients admitted for similar surgery in the same period but who had histories of spinal surgery or in whom spinal fusion was also carried out were excluded. Surgical decompressions in patients presenting with degenerative spondylolisthesis were also excluded.

Presurgery clinical features

An independent orthopedic surgeon observer (KV) performed standard clinical examination upon admission to hospital, one day before surgery. History revealed that 33 patients complained of neurogenic claudication, whereas 13 had complaints of sciatica. Other clinical information is primarily descriptive and will be presented elsewhere.

The nonorganic physical signs (NOS) according to Waddell et al. [32] were also measured: tenderness, simulation, distraction, regional disturbances, and overreaction. These were recorded in percentages, no positive sign scoring 0% and eight positive signs scoring 100%.
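The percentage scoring of the nonorganic signs can be sketched as follows. This is a minimal illustration, assuming that the score scales linearly between the two anchor points given above (0 signs = 0%, 8 signs = 100%); the function name `nos_percentage` is hypothetical and not part of the study protocol.

```python
TOTAL_SIGNS = 8  # per the text: eight positive signs score 100%


def nos_percentage(positive_signs: int) -> float:
    """Convert a count of positive nonorganic signs into a percentage.

    Assumption: linear scaling between the stated anchors (0 -> 0%, 8 -> 100%).
    """
    if not 0 <= positive_signs <= TOTAL_SIGNS:
        raise ValueError("positive_signs must be between 0 and 8")
    return 100.0 * positive_signs / TOTAL_SIGNS


if __name__ == "__main__":
    for count in (0, 2, 8):
        print(count, nos_percentage(count))
```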

Outcome assessments

Preoperatively, the patients had to complete several self-administered questionnaires. The Waddell disability index (WDI) [18, 31] and the Oswestry low back pain disability questionnaire (ODI) [8] measured physical impairment. The low-back outcome score [10] (LBOS) and a general questionnaire including personal and medical history as well as claudication and radiation of pain were administered.

Trunk dynamometer testing

Trunk function was assessed using computerized, triaxial, isoinertial dynamometer equipment (Isostation B200, Isotechnologies, Hillsborough, N.C., USA). Only results from dynamic flexion-extension velocity tests conducted at 50% of the maximum flexion torque in the sagittal plane are reported in this study. Dynamometer tests were performed within 48 h prior to surgery [19]. The validity and reliability of the trunk dynamometer equipment have been established [25, 26, 27].

Computed tomographic imaging and analysis

Prior to surgery, standard lumbar spine radiographs were taken as well as 2-mm continuous and nonoverlapping slices on CT and/or myelo-CT scan (HiQ, Siemens, Erlangen, Germany). Based on the appearance of the preoperative CT scans, the operated levels were classified as presenting congenital stenosis, acquired stenosis (due to degenerative changes), or mixed stenosis (partly congenital, partly acquired) according to guidelines published by Airaksinen et al. [1].

Simple canal measurements were made from the patients’ CT images according to the following protocol. Slices corresponding to each stenotic disc level were chosen and digitized using a 32-bit, 1,000-line frame-capture color camera. The images were converted to enlarged gray-scale images, contrast-enhanced, and printed. Using a dial caliper (0.025-mm resolution), two calibrated canal dimensions were recorded, one taking into account the medial protrusion of the facets (bony canal diameter) and a second taking into account the buckled and/or hypertrophic ligamentum flavum (soft tissue canal diameter). The presence or absence of aorta calcification on CT images was noted.

Surgery

Technique

The partial laminectomy/arthrectomy (or laminarthrectomy) surgical procedure has been previously described in detail [9, 33]. Briefly, patients are placed in ventral decubitus with a padded support at the level of the iliac crests and sternum. A very slight flexion of hips and knees assures that the subjects lie in a lordotic position simulating normal erect posture [11]. After midline posterior skin and subcutaneous tissue incision, the dissection goes through the dorsolumbar fascia approximately 5 mm left of the midline, preserving the supraspinous ligamentous attachment to the fascia. The multifidus is detached from the left side of the spinous processes and laminar attachments. An osteotomy is performed with a curved osteotome at the base of the spinous processes of the vertebrae above and below the stenotic levels, just superficially to their junction with the laminae. Flavectomies are carried out, and the superior and inferior laminae are partially resected. Partial facetectomies and foraminal decompressions are carried out under direct vision with the aid of Kerrison rongeurs and/or a power drill.

After completion of thorough decompression, the dorsolumbar fascia is resutured over a suction drain to the supraspinous ligamentous/fascial complex, with the osteotomized spinous processes resuming their initial positions over the neural arches.

Follow-up

With a minimum follow-up of 1 year after surgery (mean 1.7 years, range 1–2.6), the subjects underwent the same standard clinical evaluation by the independent orthopedic surgeon observer (KV). At no other time did the patients see this observer, who was not involved in patient care.

During the follow-up visit, the patients completed the same self-administered questionnaires as preoperatively. In addition, some questions were asked concerning their perception of the treatment. A mechanical testing protocol identical to that used before surgery was performed on the Isostation B200, and CT scans were taken at the operated levels. The bony canal dimensions described previously were measured again.

Defining successful outcome

In this study, a paradigm for successful outcome of surgical treatment of patients presenting with acquired or mixed stenosis was defined in terms of four variables. Two of these were general health outcome measures: (1) patients’ self-reported pain as measured on a single-item pain intensity visual analog scale (VAS) ranging from “no pain” to “unimaginable pain” with scores from 0 to 100, and (2) patients’ self-reported functional status as measured by the LBOS. Two other measures more specific for stenosis were defined as: (3) claudication (degree of pain while walking), and (4) leg pain. Outcomes were considered successful if the comparison of preoperative and postoperative assessments demonstrated improvement on at least three of these four measures. The data for these criteria came from the general questionnaires taken both before surgery and at follow-up. A decrease of two or more points on the VAS pain scale was considered an improvement.
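The success rule described above (improvement on at least three of the four measures) can be written as a small decision function. This is an illustrative sketch only: the class and field names are hypothetical, and each field is assumed to already hold the per-measure judgment of whether the patient improved.

```python
from dataclasses import dataclass


@dataclass
class Improvement:
    """Per-measure improvement flags (preoperative vs. follow-up)."""
    pain_vas: bool       # (1) self-reported pain on the VAS
    function_lbos: bool  # (2) functional status on the LBOS
    claudication: bool   # (3) pain while walking
    leg_pain: bool       # (4) leg pain


def is_successful(imp: Improvement, required: int = 3) -> bool:
    """Return True when at least `required` of the four measures improved."""
    improved = sum([imp.pain_vas, imp.function_lbos,
                    imp.claudication, imp.leg_pain])
    return improved >= required


if __name__ == "__main__":
    # Hypothetical patient: three of four measures improved.
    print(is_successful(Improvement(True, True, True, False)))
```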

Statistical analysis

Predicting success

Logistic regression was used to predict successful surgical outcome, wherein success was defined as a dichotomous outcome. In this study, entry and exit criteria into the logistic regression models were set using a type I error rate of 0.10. A decision-tree methodology, chi-squared automatic interaction detection (CHAID), was also used to predict success [15]. Additional details on the CHAID methodology can be found at http://sunsite.univie.ac.at/textbooks/statistics/stclatre.html.
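To give a sense of how a CHAID-type tree selects its first split, the sketch below ranks binary predictors by the chi-squared test of association with a binary outcome and keeps the most significant one if it meets the 0.10 entry level. It is a deliberate simplification and not the software used in this study: full CHAID also merges predictor categories and applies a Bonferroni adjustment, and the chi-squared approximation is unreliable with small cell counts. All function names, variable names, and data are hypothetical toy values.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        return 0.0  # an empty row or column carries no information
    return n * (a * d - b * c) ** 2 / margins


def p_value_df1(chi2):
    """Upper-tail p-value of a chi-squared variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def best_split(records, predictors, outcome="success", alpha=0.10):
    """Return (predictor, p) with the smallest p <= alpha, or None."""
    best = None
    for name in predictors:
        a = sum(1 for r in records if r[name] and r[outcome])
        b = sum(1 for r in records if r[name] and not r[outcome])
        c = sum(1 for r in records if not r[name] and r[outcome])
        d = sum(1 for r in records if not r[name] and not r[outcome])
        p = p_value_df1(chi2_2x2(a, b, c, d))
        if p <= alpha and (best is None or p < best[1]):
            best = (name, p)
    return best


def make(n, calcified, smoker, success):
    """Build n identical toy records (hypothetical data, not study data)."""
    return [{"calcified": calcified, "smoker": smoker, "success": success}
            for _ in range(n)]


toy_records = (make(3, True, True, True) + make(2, True, False, True)
               + make(1, False, False, True) + make(1, True, False, False)
               + make(3, False, True, False) + make(2, False, False, False))

if __name__ == "__main__":
    print(best_split(toy_records, ["calcified", "smoker"]))
```

In a full tree, the same selection step would be repeated within each resulting subgroup until no predictor meets the entry criterion.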

Relative risks for single predictors

The following parameters were chosen as predictors of outcome: gender, age at surgery, number of stenotic levels, stenosis classification, bony canal diameter, flexion:extension power ratio, NOS, continuous pain, calcification of aorta, smoking, and comorbidities. Relative risk greater than 1 indicates that the first reported value of the predictor has an increased likelihood of realizing a successful outcome relative to the second value and vice versa. Percentage of success, relative risk ratios, and 95% confidence intervals (CI) of the risk ratios were determined.
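The relative risk and its confidence interval can be illustrated as follows. The sketch assumes the common log-transformation (Katz) method for the interval; the text does not state which method was actually used, and the counts in the usage example are hypothetical rather than taken from Table 2.

```python
import math


def relative_risk(success_a, total_a, success_b, total_b, z=1.96):
    """Relative risk of success (group A vs. group B) with a Wald CI on the log scale.

    z = 1.96 corresponds to a 95% confidence interval.
    """
    if success_a == 0 or success_b == 0:
        raise ValueError("log-method CI is undefined when a success count is zero")
    rr = (success_a / total_a) / (success_b / total_b)
    # Standard error of log(RR) under the Katz log method.
    se = math.sqrt(1 / success_a - 1 / total_a + 1 / success_b - 1 / total_b)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: 8/10 successes for the first value, 6/12 for the second.
    rr, lower, upper = relative_risk(8, 10, 6, 12)
    print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An interval that includes 1 indicates that the data are compatible with no difference between the two values of the predictor.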

Results

Attrition

A total of 40 patients initially met the study inclusion criteria. However, several patients were later excluded. A 19-year-old female and a 71-year-old male were excluded because they underwent fusion surgery during the year following decompression. Two female patients, one 83 and the other 47 at the time of surgery, were excluded, as they did not return for the follow-up study. These two subjects were classified as unsuccessful surgical outcomes. Thus 36 patients were re-evaluated after a minimum follow-up of 1 year.

Patient personal and health demographics

There were 17 males and 19 females. Mean age at the time of surgery was 59.8±16.8 years (range 17.3–84.1). Females averaged 64.9±12.1 years (range 35.7–81.4) and males 54.0±19.7 years (range 17.3–84.1). Fifty percent of the patients reported no comorbidities, three reported diabetes, two rheumatoid arthritis, four cardiac disease, one gout, and six reported various conditions. Calcification of the aorta, identified from CT scan, was noted in 50% of the patients (18/36). There were ten active smokers in the series (four females and six males).

Clinical demographics

Of the 36 patients included, 21 (58.3%) were classified as having acquired stenosis, one (2.8%) as purely congenital, and 14 (38.9%) with a combination of acquired and congenital narrowing. For data analysis, the one congenital case was added to the mixed group. Before surgery, 70% considered their general health to be good and 30% considered it average or bad, compared to 64% and 36%, respectively, after surgery.

Table 1 summarizes patient demographics and provides statistical comparisons across relevant demographics. Overall, personal and health demographics did not seem to show much inter-relationship. Although females were significantly older than males at the time of operation (P<0.05), gender was not significantly related to any other personal or health-related demographic. Stenosis classification was significantly related to only one health-related demographic: patients classified with mixed stenosis had a higher incidence of continuous pain than patients with acquired stenosis (P<0.04).

Table 1 Patient and clinical demographics

Surgical interventions

One-level decompression was performed in ten subjects, two-level in 16, three-level in five, four-level in one, and five-level in four. The average duration of follow-up was 1.7±0.4 years (range 1.0–2.6).

Predicting successful outcome

Of the 36 patients, 14 demonstrated improvement in all four surgical success criteria and seven in three of them. Therefore, 21 of 36 outcomes (58.3%) were classified as successful. From an intent-to-treat point of view, 21 of 38 patients (55.3%) would be classified as having successful outcome. Of the 15 who did not demonstrate sufficient improvement to be labeled a success, improvement was reported by 12 in two categories and by three in one category, two in walking and one in function.
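The two success rates differ only in their denominators: the 36 re-evaluated patients versus those 36 plus the two patients lost to follow-up, who were counted as failures. A minimal check of the arithmetic, with a hypothetical helper name:

```python
def success_rate(successes: int, patients: int) -> float:
    """Percentage of patients classified as successful."""
    return 100.0 * successes / patients


# 14 patients improved on four criteria and 7 on three: 21 successes.
successes = 14 + 7

per_protocol = success_rate(successes, 36)         # re-evaluated patients only
intent_to_treat = success_rate(successes, 36 + 2)  # plus two lost to follow-up

if __name__ == "__main__":
    print(f"{per_protocol:.1f}% / {intent_to_treat:.1f}%")
```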

Relative risks for single predictors

Relative risks for selected predictors of a successful surgical outcome are summarized in Table 2. Of the predictors examined, minimal canal diameter was most effective at predicting surgical success. However, univariate logistic regression indicated that none of the variables examined reached statistical significance.

Table 2 Percentages of success and relative risks for selected predictors of successful surgical outcome. CI confidence interval

Exact logistic regression analyses

Exact logistic regression analysis was implemented to predict success from patient gender and age at the time of operation, which were forced into the model, while allowing any other predictor to enter if it reached a type I error rate of 0.10 (Table 3). In this analysis, only Waddell’s NOS demonstrated a significant odds ratio (0.648, 90% CI 0.362–0.991). This resulted in a three-predictor model that included Waddell’s NOS along with the forced variables of gender and age and indicated that increased numbers of nonorganic signs decreased the likelihood of a successful outcome.
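An odds ratio below 1 multiplies the odds of success by that factor for each one-unit increase in the predictor. The sketch below illustrates this for the reported value of 0.648, assuming the ratio applies per unit of the NOS score as entered in the model (the text does not state the unit) and holding gender and age fixed. The baseline probability is a hypothetical input, not a study result.

```python
OR_PER_UNIT = 0.648  # reported odds ratio for Waddell's NOS


def odds_multiplier(units: float, odds_ratio: float = OR_PER_UNIT) -> float:
    """Factor by which the odds of success change after `units` unit increases."""
    return odds_ratio ** units


def probability_after(baseline_probability: float, units: float,
                      odds_ratio: float = OR_PER_UNIT) -> float:
    """Convert a baseline success probability to the probability after the change."""
    if not 0.0 < baseline_probability < 1.0:
        raise ValueError("baseline_probability must lie strictly between 0 and 1")
    odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = odds * odds_multiplier(units, odds_ratio)
    return new_odds / (1.0 + new_odds)


if __name__ == "__main__":
    # Hypothetical baseline of 0.5 and a one-unit increase in the NOS score.
    print(round(probability_after(0.5, 1), 3))
```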

Table 3 Summary of logistic regression predicting success from age at time of surgery, gender, and presence of nonorganic signs. CI confidence interval

Chi-squared automatic interaction detection analyses

A CHAID-based model-building algorithm evaluating three nominal variables (gender, stenosis classification, and number of operative levels) and nine ordinal or interval-scaled variables (NOS, smoking, aorta calcification, comorbidities, continuous pain, minimum canal diameter at operative levels, flexion/extension velocity ratio, pain level on VAS, and isometric power in extension) was constructed with an entry requirement of 0.10. The resulting model, depicted in Fig. 1, was able to classify correctly 29 of 32 outcomes (90.6%). The psychometrical properties of the CHAID model are provided in Table 4.

Fig. 1 The chi-squared automatic interaction detection (CHAID) model for predicting surgical success in the subgroup of 32 patients with complete data

Table 4 Summary of the psychometric properties of the stepwise logistic regression and a CHAID-based model for predicting successful surgical outcome

Discussion

Surgery for lumbar spinal stenosis is generally accepted when conservative treatment has failed, and it is intended to improve quality of life by reducing symptoms such as neurogenic claudication, restless legs, and radiating neurogenic pain. Surgery does not reduce low back pain, even though most patients with lumbar spinal stenosis complain of such pain [16].

In this study, we report the surgical outcomes of lumbar spinal stenosis after 1–2.6-year follow-up. In a prospective long-term follow-up study of 146 lumbar spinal stenosis patients reviewed between 1 and 11 years after surgery, Javid and Hadar [14] found no statistical differences in outcome between 1-year and 11-year follow-ups. Therefore, we consider the follow-up period in this prospective study to be representative of an adequate short-term follow-up.

Predictive models were successfully developed based on clinical, radiologic, biomechanical, and psychofunctional information available prior to surgery. Discussion of the criteria for successful outcome has been published elsewhere.

Predictive model

A logistic regression was applied to the data in order to determine which parameters might be used to predict successful outcome. Logistic regression is a statistical technique in which a discrete outcome such as success/failure is modeled from discrete and/or continuous predictors. Predictors, including demographic factors such as patient gender and age at time of operation and nondemographic candidate predictors, enter into the model(s) using stepwise methods. However, when sample size is limited or data are sparse, skewed, or overly interdependent, the asymptotic conditional likelihood inferences produced by most logistic regression procedures are often inadequate. Thus, in addition to conventional logistic regression methods, an exact logistic regression method was also applied to our data. Using this approach, a predictive effect was obtained for Waddell’s NOS, for which high values were predictive of poor outcome. Somatization factors are classically described in nonspecific low back pain [22]. This suggests that behavior of the illness can play an important role in determining results of treatment, even in such a highly organic disorder as spinal stenosis.

A novel approach, at least in the spine literature, for predicting surgical results was also used in this study. The CHAID model summarized in Fig. 1 must be considered preliminary and therefore was used as a basis for model building and interpretation rather than as a practical model for predicting surgical success in this population. As such, this model suggests a number of interesting possibilities and raises some issues.

First, it is important to take into account that models for predicting surgical success and failure may be quite different between males and females. After an initial classification split associated with aorta calcification, patient gender was the prime discriminating factor in predicting success. Second, within the subgroups defined by gender and aorta calcification, different predictors were optimal for predicting successful treatment outcome. Third, once subgroups were defined, only one additional predictor was needed to predict success accurately for three of the four subgroups. This suggests that, once a subgroup is defined, relatively simple decision rules may be adequate to predict outcome accurately. Fourth, variables that do not become predictors may be informative. For example, neither stenosis classification nor number of operated levels, two variables generally thought to be clinically relevant, was a useful predictor [12, 13]. For female patients without aorta calcification, predicted success and failure were all correctly classified based on the normalized minimum canal diameter of the operated levels. In contrast, for male patients without aorta calcification, predicted success and failure were all correctly classified based on NOS scores. For females with aorta calcification, no additional discriminating predictors were identified, and all eight patients were predicted as successes, three of them incorrectly. Given the rather large set of potential predictors available in this study, the lack of any good predictors for females with aorta calcification raises the question of what other types of variables might be useful in predicting success for this subgroup. Finally, for males with aorta calcification, predicted success and failure were all correctly classified based on pretreatment patient-reported VAS pain intensity.

Contrary to other reports [24], commonly reported comorbidities were not associated with poor results. Moreover, no individual predictors were significantly related to success. However, our finding that the presence of aorta calcification linked to atherosclerotic disease [17] was the main variable influencing other success predictors is very interesting. Although preoperative clinical assessment of a patient’s lower limb vascular status revealed no signs of inadequate arterial supply in this series, underlying subclinical vascular factors related to the complaints may have been overlooked. Moreover, there may have been arterial insufficiency at the spinal level [18], for which the decompression procedure helps to relieve the venous pooling effect but does not deal with all components of the complaints. Based on the findings of this study, we advocate that all patients with lumbar stenosis undergo color echo Doppler (duplex scan) examination.

Before the CHAID decision-tree model can receive definitive credence, its prospective predictive power must be confirmed. Overall biased estimators of the CHAID model sensitivity, specificity, positive predictive value, and negative predictive value were much higher than values typically reported in the literature as support for the psychometrical characteristics of a model or diagnostic test. However, these results may simply reflect very good hindsight. The CHAID techniques benefit from efficient reverse engineering of these data, namely the ability to classify known outcomes correctly by exploiting this knowledge in defining the decision tree. Nevertheless, the unique predictor sets within the four subgroups defined by gender and aorta calcification, in combination with an overall 90.6% correct classification rate (29/32), suggest that CHAID modeling may represent a powerful method for clinicians wishing to develop good decision models to inform their treatment selection.
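The four properties named above follow directly from the confusion counts. Reading the subgroup results described earlier, the only misclassifications were three failures predicted as successes (false positives), with no successes predicted as failures (false negatives). Under that reading, sensitivity and negative predictive value equal 1 whatever the split of the 29 correct classifications between true successes and true failures, whereas specificity and positive predictive value depend on that split, which is not given in the text. The sketch below illustrates this; the function and variable names are hypothetical.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard properties of a binary classifier from its confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


FALSE_POSITIVES = 3  # failures predicted as successes (from the text)
FALSE_NEGATIVES = 0  # assumption drawn from "all correctly classified" elsewhere
CORRECT = 29         # correctly classified outcomes out of 32

# Try every possible split of the 29 correct cases into true positives
# and true negatives, since the actual split is not reported in the text.
all_splits = [
    diagnostic_metrics(tp, FALSE_POSITIVES, CORRECT - tp, FALSE_NEGATIVES)
    for tp in range(1, CORRECT)
]

if __name__ == "__main__":
    print({m["sensitivity"] for m in all_splits})
    print({m["npv"] for m in all_splits})
```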

Therefore, it remains to be seen if this particular pattern of results obtained using a decision-tree methodology will be predictive of successful outcome for lumbar stenosis surgery. Regardless of the ultimate predictive power of this model, our results suggest that multiple predictors may need to be combined in more sophisticated ways than is typically allowed by traditional linear and logistic regression models.

Conclusions

From the results of this study, we conclude that:

1. Conservative surgical decompression for lumbar stenosis can be recommended, as it demonstrated a success rate similar to those of more invasive techniques. Given the physiologic and biomechanical advantages of this method, it can be recommended as the surgical method of choice for this indication.

2. Underlying subclinical vascular factors may participate in the complaints of spinal stenosis patients. These factors should be investigated more thoroughly, as they may account for some failures of surgical relief.

3. The CHAID decision tree appears to be a novel and useful tool for predicting results of spinal stenosis surgery.