Introduction

Spine surgeons have several options when treating herniated lumbar discs. One treatment option, lumbar discectomy, involves surgical removal of the herniated disc and decompression of the nerve root [1]. This popular option, comprising over 300,000 surgeries annually in the United States [2], involves differing techniques that ultimately remove the disc that is impinging on the root. Good evidence exists for both open discectomy and microdiscectomy (versus a conservative approach) for early improvements in pain or function (two to three months) [3], as well as for long-term improvements when evaluated using an as-treated analysis [4].

A key to successful surgical outcomes is proper patient selection, and at present there are few studies available to assist surgeons [3]. A 2006 systematic review [5] outlined a number of demographic and psychosocial factors in 15 studies that were associated with higher levels of pain and disability outcomes in patients with all forms of lumbar disc surgeries (including discectomy). Its findings suggested that the presence of baseline psychosocial factors associated with poor coping, less work satisfaction and illness behaviours generally results in poorer outcomes, as do findings such as higher pre-operative pain scores, higher disability levels, and longer duration of symptoms. Others [6] analysed a number of biological, social and psychological predictors and found that workers’ compensation components, previous back surgery, time delay for care, and alcohol use were related to outcomes at two years. All associative studies used relatively small sample sizes and none involved generalizable populations such as those represented by spine repositories.

The increased use of electronic data management systems, such as those necessary for data repositories, has further refined which variables are captured during patient visits, with the reduction of patient and administrator burden as a paramount goal. This results in routine data collection that is associated with demographics, patient-reported outcome measures and medical risk factors. It also allows capture of larger data sets and follow-up that reflects findings from real-life outcomes, a form of investigation known as practice-based research [7]. The purpose of this study was to explore a prospectively collected spine database in order to identify baseline characteristics that are related to poor or favourable outcomes for patients who received lumbar discectomy (all forms) and were followed up for one full year. We were interested in baseline characteristics that are commonly used in large datasets or surgical repositories and in the identification of findings that were distinct from those commonly reported in the musculoskeletal literature, regardless of intervention type. The study has potential utility, since in the absence of guidelines that advocate patient selection criteria, prognostic studies provide direction on characteristics associated with responders and non-responders.

Methods

Study design

This was a retrospective study that used data obtained from a multi-institutional, prospective spine outcomes registry. The spine outcomes registry involved data compiled from 14 spine surgical institutions in two countries (United States and Canada), and incorporated surgical results from 40 physicians who specialized in spine surgery. At the time of the data transfer (November 2014), the prospective spine outcomes registry included baseline capture for 5,876 patients who had received spinal surgery, including interventions such as discectomy (with or without decompression), fusion, decompression-only, or decompression with fusion. Institutional review board approval was obtained from the local university health and ethics review board.

Participants

Participants for this study were patients with lumbar disorders who received a discectomy surgery (with or without decompression) between 2002 and 2012. To qualify for our analysis, individuals required a baseline capture of data and one-year outcomes for the Oswestry Disability Index (ODI) and a Visual Analog Scale for Pain (VAS) (N = 1,108). There were no restrictions on type of diagnosis, type of surgical approach (open or micro), or age.

Procedures

Data were extracted from a Microsoft Excel file into a statistical management software system (SPSS version 20.0). Data included demographic variables, surgery type, surgery levels, surgery date, diagnoses, patient-report outcomes, and complications at baseline and beyond.

Predictor variables

We endeavoured to identify predictor variables that are commonly used in clinical practice management electronic databases. Upon group consultation, the predictive variables selected included: (1) age, (2) body mass index (BMI), (3) gender, (4) previous back surgery history, (5) baseline ODI, unique baseline VAS for pain for both (6) low back and (7) leg pain, (8) baseline SF-12 Physical Component Summary (PCS) scores, (9) baseline SF-12 Mental Component Summary (MCS) scores, and (10) leg pain greater than back pain.

ODI and VAS measures have been previously validated and are used widely in the spine surgery literature, with several studies showing their relevance to actual clinical practice [8–11]. The SF-12 MCS and PCS scores reflect the sub-scales of the SF-12 Quality of Life questionnaire, which is routinely used in clinical practice for outcome assessment in spine surgery [6]. Leg pain greater than back pain is a unique measure that was created by subtracting the total back pain score from the total leg pain score at the baseline visit [12]. Values >0 were coded as leg pain greater than back pain, whereas all other values (including equal findings) were coded otherwise.
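
The derived leg-pain-greater-than-back-pain variable can be sketched as follows. This is a minimal illustration only: the function and argument names are hypothetical and do not come from the registry, and both scores are assumed to be baseline VAS values on the same 0 to 10 scale.

```python
def leg_pain_greater_than_back(leg_vas: float, back_vas: float) -> bool:
    """Return True when baseline leg pain exceeds baseline back pain.

    The difference (leg minus back) must be strictly greater than 0;
    equal scores are coded as False, as described in the text.
    """
    return (leg_vas - back_vas) > 0


# Hypothetical baseline scores, for illustration only.
print(leg_pain_greater_than_back(7, 4))  # leg pain dominant
print(leg_pain_greater_than_back(5, 5))  # equal scores are coded "otherwise"
```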

Re-coding predictor variables

Assessment of linearity and distribution normality of each variable is necessary prior to examining relationships in regression modelling. Further, dichotomous predictor variables are often easier to understand when using regression modelling (versus continuous variables), allowing more meaningful threshold values when making decisions. To assess linearity and distribution normality, we ran Kolmogorov-Smirnov (KS) tests and plotted the variables (using Q-Q plots) to identify potential curvilinear relationships. None of the continuous variables were normally distributed.

Each predictor variable was explored for clinically sensible and statistically significant cut points (thresholds) with a view toward recoding appropriately. Clinically sensible cut points are those that provide inherent value in clinical practice (e.g., heart rate of >100 when assessing potential for pulmonary emboli). When evaluating for statistically appropriate cut points, we utilized a distribution-based method that identified a dichotomous threshold within the inner 80 % of the distribution (the selection interval) [13, 14]. Once a threshold was identified using either mean or median values, we compared the two newly created groups on the outcome measure (ODI or VAS for pain) to determine whether a statistical difference was present between them [15]. Only then did we consider dichotomizing the continuous variables.

Using this strategy, statistically significant thresholds were found to dichotomize age (≥52 years), baseline ODI (>50/100), and VAS for back and leg pain (>6/10). Because we found no clinically sensible or statistically significant distribution trends, BMI, SF-12 MCS and PCS were retained as continuous variables. The variables gender, previous surgical history, and leg pain greater than back pain were already dichotomous.
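
As an illustration of the recoding, the reported thresholds could be applied to a single patient record as sketched below. The function and key names are hypothetical, and the sketch covers only the final recoding step, not the distribution-based search that produced the thresholds.

```python
def dichotomize_baseline(age: float, odi: float,
                         vas_back: float, vas_leg: float) -> dict:
    """Recode continuous baseline values using the thresholds reported above."""
    return {
        "age_ge_52": age >= 52,        # age threshold: 52 years or older
        "odi_gt_50": odi > 50,         # baseline ODI above 50/100
        "vas_back_gt_6": vas_back > 6, # baseline back pain above 6/10
        "vas_leg_gt_6": vas_leg > 6,   # baseline leg pain above 6/10
    }


# Hypothetical patient: exactly 52 years old, ODI exactly 50.
print(dichotomize_baseline(age=52, odi=50, vas_back=7, vas_leg=6))
```

Note the asymmetry in the cut points: the age threshold is inclusive (≥52), whereas the ODI and VAS thresholds are strict (>50 and >6).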

Assessment of collinearity

To assess multicollinearity in the modelling, correlation matrices were run across the independent variables. A correlational finding of r >0.7 between independent variables was used to indicate potential multicollinearity [16]. When two predictor variables demonstrate a relationship of r >0.7, either one of the two should be removed from the model or both should be combined into one variable.
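
A pairwise screen of this kind can be sketched as follows. The sketch assumes Pearson correlations and flags pairs on the absolute value of r, so that strong negative relationships are also caught; both choices are assumptions, since the source states only the r >0.7 criterion. The variable names and values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def flag_collinear_pairs(columns, threshold=0.7):
    """Return (name_a, name_b, r) for every pair with |r| above the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson_r(a, b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical toy data, for illustration only.
toy = {
    "age": [30, 60, 45, 52, 38],
    "odi": [10, 20, 30, 40, 50],
    "pcs": [50, 41, 29, 22, 10],
}
for name_a, name_b, r in flag_collinear_pairs(toy):
    print(f"{name_a} vs {name_b}: r = {r:.2f}")
```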

Control variables

Control variables used to statistically control interactions within the modelling included presence/absence of complications, levels of surgery, and diagnosis. Presence/absence of complications was calculated by identifying any form of complication during the surgical intervention (e.g., neural, bleeding, hardware, etc., serious or incidental) reported within the database. For each patient, level or levels of surgery were tabulated and the variable created was the sum of all levels surgically treated. Lastly, diagnoses included degenerative disc disease, spondylolisthesis, deformity, post-laminectomy syndrome, non-union, stenosis, instability and ‘other’ (a category that was used in the repository for all diagnoses without distinction).

Outcomes measures

For this study, two different outcome measures were used: (a) percent change in pain (VAS) [10, 11] and (b) percent change in disability (ODI) [8, 9] at one year. Percentage change was calculated separately for the VAS for pain and for the ODI by taking the difference between the baseline score and the one-year follow-up score, dividing the difference by the baseline score, and multiplying by 100. The end product was a positive or negative percentage change expressed as a whole number. Use of percent change and the inclusion of a minimum of two different outcome constructs have been recommended by the Initiative on Methods, Measurement, and Pain Assessment in Clinical Trials (IMMPACT) group. The IMMPACT work group has advocated a 30 % reduction in pain and disability from baseline as a lower threshold of success, whereas a 50 % reduction from baseline is a substantially clinically important change [17].
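
The percent change calculation and the two success thresholds can be sketched as below. The sign convention (a positive value means a reduction from baseline) and the use of "at least" for the threshold comparison are assumptions, and the function names are hypothetical. The sketch keeps the unrounded value for the comparison; rounding to a whole number would apply only for reporting.

```python
def percent_change(baseline: float, follow_up: float) -> float:
    """Percent change from baseline; positive values indicate improvement."""
    if baseline == 0:
        raise ValueError("percent change is undefined for a baseline of 0")
    return (baseline - follow_up) / baseline * 100


def meets_threshold(baseline: float, follow_up: float,
                    threshold_pct: float) -> bool:
    """True when the reduction from baseline is at least threshold_pct."""
    return percent_change(baseline, follow_up) >= threshold_pct


# Hypothetical scores: VAS falls from 8 to 4, ODI falls from 40 to 30.
print(percent_change(8, 4), meets_threshold(8, 4, 50))
print(percent_change(40, 30), meets_threshold(40, 30, 30))
```

A patient whose score worsens yields a negative percent change and fails both thresholds.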

Determining appropriate number of observations per variable

We determined the appropriate number of observations per variable by using the recommendations of Hosmer and Lemeshow [18]. For simple univariate multinomial or logistic regression, Hosmer and Lemeshow [18] have recommended a minimum observation-to-variable ratio of 10 to 1, but cautioned that a number this low will likely overfit a model. We therefore adopted their preferred observation-to-variable ratio of 20 to 1 for the multivariate modelling. Using this strategy, with 1,108 participants with full one-year outcomes, we could include up to 50 predictor variables in the multivariate model and thus were in no danger of overfitting the models.
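
The arithmetic behind this ceiling is simple integer division, sketched below with a hypothetical helper name. With 1,108 observations, a 20-to-1 ratio permits 55 predictors, so the figure of 50 quoted in the text is slightly conservative and the ten predictors used sit well inside it.

```python
def max_predictors(n_observations: int, observations_per_variable: int) -> int:
    """Largest number of predictors that respects the stated ratio."""
    return n_observations // observations_per_variable


print(max_predictors(1108, 20))  # preferred 20-to-1 ratio
print(max_predictors(1108, 10))  # minimum 10-to-1 ratio
```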

Data analysis

All analyses were performed using Statistical Package for the Social Sciences, version 20.0 (SPSS Inc., Chicago, Illinois). Baseline predictor characteristics (including control variables) were plotted by means and standard deviations or by frequencies for the 50 % outcomes thresholds (substantially clinically important change) for both VAS and ODI.

Univariate logistic regression analyses were performed for each of the independent variables for four unique models (one-year outcomes for VAS 30 %, VAS 50 %, ODI 30 % and ODI 50 %). For each univariate model, number of surgical levels, presence/absence of a complication, and diagnoses were used as control variables. For each univariate analysis, individual P values, odds ratios and 95 % confidence intervals were reported.
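
For readers less familiar with logistic regression output, the reported odds ratios and 95 % confidence intervals relate to the fitted coefficients as sketched below. The sketch assumes Wald-type intervals, which the source does not state explicitly, and the coefficient and standard error shown are hypothetical values rather than results from this study.

```python
from math import exp


def odds_ratio_with_ci(beta: float, se: float, z: float = 1.96):
    """Convert a logistic coefficient and its standard error into an
    odds ratio with a Wald-type confidence interval (z = 1.96 for 95 %)."""
    return exp(beta), exp(beta - z * se), exp(beta + z * se)


# Hypothetical coefficient and standard error, for illustration only.
or_, lower, upper = odds_ratio_with_ci(beta=0.47, se=0.12)
print(f"OR = {or_:.2f}; 95%CI = {lower:.2f}, {upper:.2f}")
```

An interval that excludes 1.00 corresponds to a predictor that is statistically significant at the matching alpha level.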

Findings in the univariate analyses that yielded P values of 0.10 or under were considered in four distinct multivariate predictive models (VAS 30 %, VAS 50 %, ODI 30 % and ODI 50 %). Correlational analyses for multicollinearity found no variables with relationships greater than 0.7; consequently, none of the ten predictor variables were excluded or recoded. For each multivariate model, number of surgical levels, presence/absence of complication, and diagnoses were used as control variables. For each multivariate analysis, individual P values, odds ratios, 95 % confidence intervals, and Nagelkerke values were reported. The Nagelkerke value is a pseudo R-square measure that indicates the usefulness of the model [19]. The value is similar in concept to the coefficient of determination (R²) in linear regression. For all models, a P value of <0.05 was considered significant.
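
The Nagelkerke value can be computed from the log-likelihoods of the intercept-only and fitted models. The sketch below uses the standard definition (the Cox and Snell R² rescaled by its maximum attainable value); the function name and the log-likelihood values are hypothetical and are not taken from the models reported here.

```python
from math import exp


def nagelkerke_r2(ll_null: float, ll_model: float, n: int) -> float:
    """Nagelkerke pseudo R-square from the null and fitted log-likelihoods."""
    # Cox and Snell R-square: 1 - (L_null / L_model) ** (2 / n)
    cox_snell = 1 - exp(2 * (ll_null - ll_model) / n)
    # Maximum attainable Cox and Snell value: 1 - L_null ** (2 / n)
    max_cox_snell = 1 - exp(2 * ll_null / n)
    return cox_snell / max_cox_snell


# Hypothetical log-likelihoods for a sample of 1,108 patients.
print(round(nagelkerke_r2(ll_null=-700.0, ll_model=-650.0, n=1108), 3))
```

A value of 0 indicates that the fitted model is no better than the intercept-only model, and values closer to 1 indicate a more useful model.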

Results

Descriptive findings

Table 1 outlines the categorized comparisons of the predictor and control variables used in the study. For the VAS for improvement in pain of 50 % from baseline (substantially clinically important change), 515 subjects (46.4 %) met the 50 % threshold, whereas 593 failed to meet this improvement. For the 50 % change of the ODI from baseline, 423 (38.1 %) met this predetermined threshold, whereas 685 failed to meet the threshold. With respect to trends in the data, it is notable that most of the subjects who met the 50 % thresholds in both categories were older, had higher SF-12 PCS and SF-12 MCS scores (suggesting better quality of life scores), and did not have a previous surgical history.

Table 1 Descriptive analyses for discectomy surgery (N = 1,108)

Univariate analysis for pain

Table 2 reflects the univariate logistic regression analyses for the ten predictor variables. Statistically significant predictors of 30 % improvement in pain from baseline included: (1) age ≥52 years (OR = 1.60; 95%CI = 1.26, 2.02), (2) baseline ODI of >50/100 (OR = 0.77; 95%CI = 0.61, 0.97), (3) baseline VAS of >6/10 for back pain (OR = 1.81; 95%CI = 1.42, 2.31), and (4) baseline SF-12 MCS scores (OR = 1.01; 95%CI = 1.00, 1.02).

Table 2 Univariate logistic regression analyses for microdiscectomy with threshold of 30 % improvement and 50 % improvement in visual analog scale for pain at one year (N = 1,108)

Statistically significant predictors of 50 % improvement in pain from baseline included: (1) age ≥52 years (OR = 1.61; 95%CI = 1.27, 2.05), (2) previous surgical history (OR = 0.66; 95%CI = 0.44, 0.99), (3) baseline ODI of >50/100 (OR = 0.67; 95%CI = 0.53, 0.84), (4) baseline VAS of >6/10 for back pain (OR = 1.34; 95%CI = 1.05, 1.71), (5) baseline SF-12 PCS scores (OR = 1.02; 95%CI = 1.01, 1.03), and (6) baseline SF-12 MCS scores (OR = 1.02; 95%CI = 1.01, 1.03). In one or both models, older age, lower levels of disability, higher levels of back pain, no previous history of surgery and higher reports of quality of life at baseline were associated with better recovery.

Univariate analysis for disability

Table 3 explores the univariate logistic regression analyses for outcomes of 30 % and 50 % improvements in ODI from baseline. Statistically significant predictors of 30 % improvement in ODI included: (1) age ≥52 years (OR = 1.31; 95%CI = 1.25, 2.23), (2) previous surgical history (OR = 0.50; 95%CI = 0.33, 0.76), (3) baseline ODI of >50/100 (OR = 0.71; 95%CI = 0.55, 0.90), (4) baseline VAS of >6/10 for back pain (OR = 0.62; 95%CI = 0.48, 0.81), (5) baseline SF-12 PCS scores (OR = 1.02; 95%CI = 1.00, 1.02), (6) SF-12 MCS scores (OR = 1.01; 95%CI = 1.00, 1.02), and (7) leg pain greater than back pain (OR = 1.98; 95%CI = 1.51, 2.61).

Table 3 Univariate logistic regression analyses for microdiscectomy with threshold of 30 % improvement and 50 % improvement in Oswestry disability score at one year (N = 1,108)

Statistically significant predictors of 50 % improvement in ODI included: (1) previous surgical history (OR = 0.62; 95%CI = 0.39, 0.97), (2) baseline ODI of >50/100 (OR = 0.66; 95%CI = 0.51, 0.85), (3) baseline VAS of >6/10 for back pain (OR = 0.56; 95%CI = 0.43, 0.73), (4) baseline SF-12 PCS scores (OR = 1.03; 95%CI = 1.01, 1.04), (5) SF-12 MCS scores (OR = 1.02; 95%CI = 1.01, 1.03), and (6) leg pain greater than back pain (OR = 2.24; 95%CI = 1.70, 2.94). In one or both models, older age, lower levels of disability, lower levels of back pain, no previous history of surgery, greater leg pain than back pain, and higher reports of quality of life at baseline were associated with better recovery.

Multivariate modeling for pain

Multivariate logistic regression modeling for 30 % and 50 % reductions in pain is presented in Table 4. Statistically significant predictors of 30 % improvement in pain included: (1) age ≥52 years (OR = 1.58; 95%CI = 1.11, 2.24), (2) baseline ODI of >50/100 (OR = 0.69; 95%CI = 0.49, 0.97), (3) baseline VAS of >6/10 for back pain (OR = 1.54; 95%CI = 1.09, 2.18), and (4) baseline SF-12 MCS scores (OR = 1.01; 95%CI = 1.00, 1.02).

Table 4 Multivariate logistic regression analyses for microdiscectomy with threshold of 30 % improvement and 50 % improvement in VAS for pain at one year

Statistically significant predictors of 50 % improvement in pain included: (1) age ≥52 years (OR = 1.60; 95%CI = 1.13, 2.28), (2) baseline VAS of >6/10 for back pain (OR = 1.57; 95%CI = 1.07, 2.29), and (3) baseline SF-12 MCS scores (OR = 1.01; 95%CI = 1.00, 1.02). Directionality reflected the univariate regression analyses: older age, higher baseline back pain, lesser reported disability and higher SF-12 MCS quality of life scores were associated with better outcomes.

Multivariate modeling for disability

Multivariate logistic regression modeling for 30 % and 50 % reductions in ODI is presented in Table 5. Statistically significant predictors of 30 % improvement in ODI included: (1) leg pain greater than back pain (OR = 1.71; 95%CI = 1.18, 2.47) and (2) previous back surgery (OR = 0.50; 95%CI = 0.33, 0.78). The only statistically significant predictor of 50 % improvement in ODI was leg pain greater than back pain (OR = 1.93; 95%CI = 1.35, 2.77). In both models, presence of leg pain greater than back pain suggested a better outcome, as did no previous surgery in the 30 % model.

Table 5 Multivariate logistic regression analyses for microdiscectomy with threshold of 30 % improvement and 50 % improvement for Oswestry disability index at one year

Discussion

The purpose of this study was to identify baseline characteristics that were related to poor or favourable outcomes for patients who were followed up for one full year after lumbar discectomy. The baseline characteristics were those commonly collected for most spine surgery patients and available in nearly all spine repositories. The large sample included 40 surgeons and multiple surgical sites, thus providing a more appropriate representation of surgeons as a whole. We elected to follow one of the many recommendations of the IMMPACT work group and report outcomes for both pain and disability, with thresholds incorporating a 30 % change from baseline (a lower threshold of success) and a 50 % change from baseline (substantial clinically important change) [17]. Our findings suggest that there may be different predictors for pain and disability outcomes. In one particular case, baseline low back pain, a higher level predicted success for one outcome (pain) but a poor result for another (disability). These findings support the IMMPACT work group's recommendation of capturing two or more outcome measures, specifically since recovery from low back pain is multi-contextual.

At present, decision making for lumbar discectomy often involves uncertainty, specifically as objective criteria to distinguish responders from non-responders are lacking [2]. Guidelines for the use of discectomy in treating disc herniation are presently limited, failing to recommend appropriate baseline characteristics for surgical selection [3]. Decision making is further challenged since many individuals with low back pain improve over time (even without treatment) and some variables that are frequently associated with substantial improvements predict improvement regardless of the intervention selected. For example, Laisne et al. [20] reported that higher levels of disability, lower quality of life and greater comorbidities are associated with poorer outcomes in all forms of musculoskeletal disorders (with or without an intervention). Others have performed research on specific conditions such as recurrent disc herniation [21], and identified increasing age, sex, and higher BMI as predictors of poorer outcomes.

Findings similar to those in the literature that explores general prognostic variables, regardless of treatment, are not typically useful when attempting to make decisions about the use of a specific surgical intervention. For example, with respect to poorer results with pain and disability as outcomes, we found comparable results with higher degrees of disability and lower reported levels of quality of life. In our study, previous surgical history was an indicator of poorer disability outcomes (30 % improvement only) at one year. Others [6] who investigated the predictive influence of prior surgery in a small sample (N = 266) of workers’ compensation recipients of lumbar discectomy also found previous surgery to be a predictor of multiple forms of outcome measures. It is our impression that these findings provide less utility since they are similar to those that predict outcome for individuals with low back pain, regardless of intervention. Simply put, one would not withhold surgery based on this finding, nor would one suggest surgery is necessary.

For other variables we found conflicting or unique findings, prognostic variables that we feel may provide decision making utility. For example, we did not find gender to be a significant predictor for outcomes, and do not think the sex of an individual should influence surgical decision making. Additionally, we found older age (≥52 years) to be related to better pain-related outcomes with discectomy. We feel this finding has potential promise since the mean age of the sample was over 54 years of age, and because of degenerative changes older individuals are commonly recipients of spine surgery. Within the data, there were notable trends toward continued improvement in outcomes for pain, even in individuals much older than 52 years of age.

Another unique finding was leg pain greater than back pain as a predictor of improved disability outcomes. It has been suggested that individuals with low-back-related leg pain differ notably from those with low back pain only [22]. Leg pain greater than back pain is most likely a disease of the spinal nerve root [23]. Those with leg pain and neurological signs have been shown to improve the most from baseline with surgery [24, 25]. In a cross-sectional study, the condition has been shown to have less affiliation with concomitant comorbidities in individuals at a tertiary spine center [12]. It is possible that the psychosocial influences that moderate recovery processes for patients with low back pain only may not be as influential for those with leg pain greater than back pain, thus allowing the surgery to demonstrate greater effectiveness [12].

One last potentially valuable aspect of this study is the manner in which the predictors were assembled. The predictors in this study were those that are commonly captured by most surgeons, a feature we think provides transferability to future decision making processes. Collecting complex patient-related outcomes measures with high administrative and patient burden, similar to those used to determine psychosocial influences, is challenging to accomplish during patients’ clinical visits [26]. It is well known that because of time and resource constraints, the majority of surgeons do not utilize standardized patient questionnaires to evaluate for psychosocial risks [27]. Use of meaningful conventional data that are collected as part of a standard visit, that is, normally collected data from repositories, provides opportunities for health services researchers to explore concepts well beyond those possible within a clinical trial [28]. We feel the findings for our large sample (the largest sample investigated to date) could assist in discriminating who would and would not improve with surgery, assisting further in classifying candidates for discectomy surgery.

Limitations

It is important to recognize that this study only looked at patients who received surgery and did not compare these prognostic findings to a control group who did not receive surgery. A comparative arm would likely assist in identifying predictive characteristics that are unique to surgery and are not general predictive findings of outcome with any kind of intervention.

Conclusion

Data analyses using the relatively large, multicentre repository identified multiple predictor variables that were associated with good and poorer outcomes for lumbar discectomy. The variables used are commonly captured in conventional clinical practice and many of the variables conflict with present prognostic literature, suggesting potential utility that is unique to lumbar discectomy. Future study is needed with an appropriate multi-arm design to investigate the predictive capacity of these variables.