Introduction

For over three decades, transforaminal lumbar interbody fusion (TLIF) has been used for a variety of degenerative lumbar disorders. With a posterolateral approach, sufficient disc space exposure can be achieved through the resection of a single facet joint. This approach reduces the retraction of the thecal sac and nerve roots and preserves the contralateral structures [1–3]. In revision cases, such as recurrent lumbar disc herniation, TLIF can be an effective procedure for patients whose midline scar adheres to neural structures [1, 4]. Moreover, a high fusion rate has been reported with this technique [2].

Although clinical studies have demonstrated the efficacy of conventional open-TLIF, there are concerns regarding lengthy hospital stays, excessive blood loss, and postoperative complications. These concerns are often associated with the stripping of paravertebral muscles [1, 2]. To address these problems, Foley et al. [5] described an alternative technique: minimally invasive TLIF (MI-TLIF). MI-TLIF was developed with the advancement of modern surgical instrumentation and optical systems [1]. Through a tubular retraction system, MI-TLIF might reduce muscular dissection. However, several disadvantages have also been reported. First, with limited visibility and working space, MI-TLIF requires good familiarity with the anatomy, and some surgeons have suggested that it could increase surgical time [6, 7]. Secondly, more X-ray exposure is needed to facilitate a minimally invasive approach [4, 7–9]. Thirdly, MI-TLIF is a technically demanding procedure with a steep learning curve, and high complication rates have been reported during the learning stage [10, 11].

In recent years, an increasing number of studies have compared the effectiveness of MI-TLIF and open-TLIF for degenerative lumbar diseases. However, only limited Class I evidence is available [4, 6–19]. The objective of the present study was to provide cumulative effect estimates of the clinical and radiological outcomes using meta-analysis and to determine which surgical technique was more beneficial.

Materials and methods

Search strategy and inclusion criteria

Because only a small number of randomized controlled trials are available in the literature, non-randomized comparative studies (prospective and retrospective) were also included. A literature search was conducted up to July 2012 using the MEDLINE database. We screened all fields by combining the term “transforaminal lumbar interbody fusion” or “TLIF” with “MIS”, “minimally invasive”, or “minimally invasive spine surgery”. Articles were limited to those published in English. In addition, the references of the retrieved articles were also searched. The following eligibility criteria were applied: (1) The study had a comparative design (MI-TLIF versus open-TLIF). (2) The study population consisted of adult patients suffering from degenerative lumbar diseases (disc herniation, spinal stenosis, or spondylolisthesis). Isthmic spondylolisthesis was not excluded. (3) At least one of the following outcomes was reported: perioperative results (operative time, blood loss, or hospital stay), X-ray exposure time, pain or disability improvement, complications, or re-operations. (4) Each group included at least ten patients. Articles were excluded if they had any of the following characteristics: (1) Patients suffering from spinal deformities, trauma, or spinal tumors. (2) Postoperative use of medication, such as steroids or chemotherapy agents, that might affect the fusion rate. (3) Biomechanical study, cadaveric study, comment, or case report. (4) Repeated studies. Two reviewers independently extracted data using a standardized form. Inconsistencies between the reviewers’ data were resolved through discussion until a consensus was reached.

Data extraction

We extracted data in the following categories. (1) Study year, country, and study design. (2) Basic study characteristics including patients’ inclusion/exclusion criteria, enrolled number, age, and sex proportion. (3) Baseline comparison information of confounding factors, such as sex, age, height, weight, BMI, diagnosis, surgical level, insurance, education, smoking status, alcohol use, workers’ compensation, and concomitant diseases. (4) Surgical information, including detailed spinal level and number of levels, instrumentation, and bone graft. (5) Perioperative outcomes such as operative time, intraoperative and postoperative blood loss, intraoperative X-ray exposure time, and hospital stay. (6) Functional outcome improvement at last follow-up including visual analogue scale (VAS) scores, Oswestry disability index (ODI), and short-form-36 (SF-36). (7) Fusion assessment method, fusion success criteria, and fusion rate at last follow-up. (8) Complication types and complication rates. Both total and specified complication rates were extracted. We referred to previously published reviews to categorize the specified complication types [2, 20].

Study quality

Because both randomized and non-randomized studies were included in the current analysis, we applied two assessment tools. For non-randomized studies, the validated MINORS instrument was used [21]. A maximum score of 24 points can be generated for each included comparative study. For prospective randomized controlled trials, the Detsky quality index was applied [22]. The total score is 20 for positive trials and 21 for negative trials. Based on previously published papers, studies scoring >75 % of the maximum MINORS or Detsky score were designated high quality. Each eligible study was independently reviewed for methodological quality by two raters (F.M.M. and X.L.Z.). All discrepancies were resolved by consensus.
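The quality threshold described above can be illustrated with a minimal sketch. The function and constant names are hypothetical and are not taken from the original analysis; the only facts used are the maximum scores (24 for MINORS, 20 or 21 for Detsky) and the >75 % cut-off stated in the text.

```python
# Illustrative sketch only: names below are hypothetical, not from the study.
MINORS_MAX = 24            # maximum MINORS score for a comparative study
DETSKY_MAX_POSITIVE = 20   # maximum Detsky score for a positive trial
DETSKY_MAX_NEGATIVE = 21   # maximum Detsky score for a negative trial


def is_high_quality(score, max_score, threshold=0.75):
    """Return True when score is strictly greater than threshold * max_score."""
    return score > threshold * max_score


# A MINORS score of 19 exceeds 75 % of 24 (= 18); a score of 18 does not.
print(is_high_quality(19, MINORS_MAX))
print(is_high_quality(18, MINORS_MAX))
```

Because the criterion is a strict inequality, a study scoring exactly 75 % of the maximum would be classed as low quality under this reading.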

Meta-analysis

Binary outcome data (total complication rate, specified complication rate, and reoperation rate) were summarized using relative risk (RR) and 95 % confidence intervals (CIs). Continuous outcomes (functional outcome, operative time, blood loss, hospital stay, X-ray exposure time) were summarized by the weighted mean difference (WMD) and 95 % CIs. Standard errors and interquartile ranges were transformed into standard deviations (SD), where necessary, according to the methods described in the Cochrane Handbook for Systematic Reviews of Interventions. The level of significance was set at P < 0.05.
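The per-study quantities named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the computation performed by Review Manager: the SD conversions use the commonly cited approximations SD = SE × √n and SD ≈ IQR / 1.35 (the latter assumes a roughly normal distribution and a reasonably large sample), the RR interval uses the usual log-scale standard error, and all function names are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95 % CI


def sd_from_se(se, n):
    """Convert the standard error of a mean to a standard deviation."""
    return se * math.sqrt(n)


def sd_from_iqr(q1, q3):
    """Approximate the SD from an interquartile range (assumes near-normal data)."""
    return (q3 - q1) / 1.35


def relative_risk(events_a, n_a, events_b, n_b):
    """Relative risk of group A versus group B with a 95 % CI on the log scale.

    Assumes at least one event in each group; zero cells would need a
    continuity correction, which is not handled here.
    """
    rr = (events_a / n_a) / (events_b / n_b)
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    low = math.exp(math.log(rr) - Z_95 * se_log)
    high = math.exp(math.log(rr) + Z_95 * se_log)
    return rr, low, high


def mean_difference(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference in means (A minus B) with a 95 % CI."""
    md = mean_a - mean_b
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return md, md - Z_95 * se, md + Z_95 * se


# Made-up numbers for illustration only; they are not data from any included study.
print(relative_risk(5, 50, 4, 50))
print(mean_difference(150.0, 30.0, 40, 400.0, 120.0, 40))
```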

Heterogeneity was evaluated using the χ² test and the I² statistic. Fixed-effect models were applied unless statistical heterogeneity was significant, in which case a random-effects model was used. Funnel plots were employed to assess the possibility of publication bias. These plots showed the intervention effect from each study against the respective standard error. A symmetrical plot suggests the absence of publication bias, whereas asymmetry would suggest its presence. Sensitivity analysis was performed to test the robustness of the pooled results by sequential omission of individual studies. The analysis was carried out using the statistical software Review Manager Version 5.0 (Cochrane Collaboration, Oxford, UK).
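The model-selection rule above can be sketched in a few lines. This is an illustration under assumptions rather than a reproduction of Review Manager: it uses generic inverse-variance pooling, Cochran's Q for the χ² test, and the DerSimonian-Laird estimate of between-study variance for the random-effects model. The heterogeneity threshold `het_alpha = 0.10` is an assumption, and the helper names are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        s, q = 1.0, math.exp(-y)                 # tail for df = 2
    else:
        s, q = 0.5, math.erfc(math.sqrt(y))      # tail for df = 1
    while s < df / 2.0 - 1e-9:                   # step df up by 2 each pass
        q += math.exp(-y) * y ** s / math.gamma(s + 1.0)
        s += 1.0
    return q


def pool(effects, std_errors, het_alpha=0.10):
    """Pool study effects; switch to random effects if Q is significant."""
    w = [1.0 / se ** 2 for se in std_errors]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    p_het = chi2_sf(q, df) if df > 0 else 1.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # DerSimonian-Laird between-study variance (truncated at zero)
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    random = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    use_random = p_het < het_alpha
    return {
        "model": "random" if use_random else "fixed",
        "estimate": random if use_random else fixed,
        "se": math.sqrt(1.0 / (sum(w_re) if use_random else sw)),
        "Q": q, "df": df, "p_het": p_het, "I2": i2, "tau2": tau2,
    }


# Made-up effects and standard errors, for illustration only.
print(pool([-200.0, -350.0, -90.0], [30.0, 45.0, 25.0]))
```

The 95 % CI of the pooled estimate would then be `estimate ± 1.96 × se`. For binary outcomes, the same function could be applied to log relative risks and their standard errors, which is a simplification of the weighting schemes available in dedicated software.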

Results

The search strategy (Fig. 1) identified eleven comparative studies that met the inclusion criteria, including one randomized controlled trial, five prospective comparative studies, and five retrospective comparative studies. Four studies were removed because they included the same patient population as, or a subset of, a previous study [16–19]. One study analyzed three groups of patients (divided by the number of operative levels) [12]. We included only the single-level group because the number of patients undergoing multilevel fusion was not large enough to meet our inclusion criteria. The search of the references in the retrieved articles did not yield any other eligible studies. The outcomes of 785 patients were examined. The basic information of the included studies is presented in Table 1.

Fig. 1 Selection of relevant publications, reasons for exclusion

Table 1 Characteristics of included studies

Study characteristics

According to the quality assessment criteria, there were six high quality and five low quality studies. The patients’ diagnoses included degenerative lumbar disease in ten studies. Two of the ten studies also enrolled patients with isthmic spondylolisthesis. One study was focused on revision surgery. Eight studies involved only single level procedures. Seven studies reported the use of intervertebral cages. Bone graft (iliac crest bone graft, local bone, or allograft) was used in eight studies. Moreover, rhBMP-2 was applied in two papers. Graft information was not available in three papers.

Baseline comparisons were performed in the ten included studies. However, the comparisons varied in these papers. Two articles analyzed three factors, four articles analyzed four factors, and three articles analyzed five factors. One paper compared seven factors between the open and MI groups. The reported baseline characteristics were statistically similar between the two groups in all studies (Table 2).

Table 2 Comparison of baseline characteristics

Clinical function improvement

The most frequently reported clinical outcomes were mean back and/or leg pain VAS improvement and mean ODI improvement. Although the mean score improvement could be extracted from the majority of these studies, none provided the corresponding SD. As a result, we summarized these indices descriptively. Three studies showed that the mean back pain VAS improvement was better in the MI group. Two papers indicated that the open group had better improvement. Improvement was similar for both groups in one study. Data for mean leg pain VAS improvement were available in only two studies. Of the six studies that reported mean ODI improvement, five showed that the score improvement was better in the MI group (Table 3).

Table 3 Improvement of functional outcomes

Operative time and X-ray exposure time

Ten studies reported operative time. Seven of them provided the mean and SD, two reported the mean and P value, and one reported the median and interquartile range. The pooled operative time did not differ significantly between the two groups (WMD = 1.63, 95 % CI −13.73 to 17.00, P = 0.83). There was statistically significant heterogeneity (I² = 87 %, P < 0.0001) (Fig. 2).

Fig. 2 Forest plot illustrating operative time, X-ray exposure, blood loss, and hospital stay of meta-analysis comparing MI-TLIF with open-TLIF

Details regarding intraoperative X-ray exposure time were available in four studies. All four studies reported significantly shorter exposure time in the open group. Overall, the weighted mean difference was 40.85 (95 % CI 31.97–49.73, P < 0.0001) in favor of the open group. Significant heterogeneity was detected among the studies (I² = 79 %, P = 0.002) (Fig. 2).

Blood loss

Intraoperative blood loss was assessed in eleven eligible studies. All studies reported lower intraoperative blood loss in the MI group, with ten of them indicating statistical significance. Overall, the weighted mean difference was statistically significant (WMD = −218.91, 95 % CI −307.63 to −130.20, P < 0.0001) in favor of the MI group. Six studies reported postoperative blood loss. The pooled estimate also showed significantly lower postoperative blood loss in the MI group (WMD = −112.7, 95 % CI −155.15 to −67.39, P < 0.0001). Statistically significant heterogeneity was detected for both intraoperative and postoperative blood loss (Fig. 2).

Hospital stay

Seven studies reported the mean length of hospital stay. All of them reported a statistically significant difference. Overall, hospital stay was 2.7 days shorter in the MI-TLIF group than in the open group (95 % CI −3.49 to −1.92, P < 0.0001). Moderate heterogeneity existed among the studies (I² = 64 %, P = 0.01) (Fig. 2).

Complication and re-operation

Data regarding complications were available in ten studies. The overall complication rate was similar between the MI and open groups (RR = 1.02, 95 % CI 0.74–1.4, P = 0.9). Statistical heterogeneity was not detected among the studies (I² = 0 %, P = 0.57) (Fig. 3).

Fig. 3 Forest plot illustrating total complication rate and re-operation rate of meta-analysis comparing MI-TLIF with open-TLIF

Five main complication types were observed in the eligible studies: graft (pedicle screw, cage, bone graft) malposition, cage migration, fusion failure, dural tear, and infection. Pooled data indicated a higher rate of graft malposition and fusion failure in the MI-TLIF group, a higher rate of dural tear and infection in the open-TLIF group, and a similar cage migration rate in both groups. However, none of these differences was statistically significant. χ² tests indicated no statistical evidence of heterogeneity (I² = 0 %, P > 0.1) (Fig. 4).

Fig. 4 Forest plot illustrating specified complication rate of meta-analysis comparing MI-TLIF with open-TLIF

Eight studies reported the re-operation rate. The pooled estimate showed that the MI group was associated with a higher re-operation rate than the open group, but the difference was not statistically significant (RR = 1.53, 95 % CI 0.69–3.42, P = 0.3). There was no evidence of significant heterogeneity (I² = 0 %, P = 0.9) (Fig. 3).

Publication bias and sensitivity analysis

The funnel plot showed a fairly symmetrical distribution of the studies that reported complication rate. All studies lay within the 95 % CI and were distributed evenly about the vertical, implying minimal publication bias (Fig. 5). Sensitivity analysis was conducted by reanalyzing our data after sequential omission of individual studies. Omitting any single study did not significantly change the pooled results.
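Sequential omission of individual studies can be sketched as a leave-one-out loop. The example below is a minimal illustration that assumes a simple inverse-variance fixed-effect pooled estimate; the function names and input values are hypothetical and do not correspond to data from the included studies.

```python
def fixed_effect(effects, std_errors):
    """Inverse-variance fixed-effect pooled estimate."""
    weights = [1.0 / se ** 2 for se in std_errors]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)


def leave_one_out(effects, std_errors):
    """Re-pool the data once per study, omitting that study each time.

    Returns a list of (omitted_index, pooled_estimate) pairs.
    """
    results = []
    for i in range(len(effects)):
        kept_effects = effects[:i] + effects[i + 1:]
        kept_errors = std_errors[:i] + std_errors[i + 1:]
        results.append((i, fixed_effect(kept_effects, kept_errors)))
    return results


# Made-up log relative risks and standard errors, for illustration only.
log_rr = [0.10, -0.05, 0.20, 0.00]
se = [0.30, 0.25, 0.40, 0.35]
for omitted, estimate in leave_one_out(log_rr, se):
    print(omitted, round(estimate, 3))
```

If the re-pooled estimates and their significance stay close to the full-data result for every omitted study, the pooled result can be regarded as robust in the sense described above.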

Fig. 5 Funnel plot of total complication rate

Discussion

Our meta-analysis suggested that MI-TLIF had significantly lower intra- and postoperative blood loss and a shorter hospital stay than the open method. Although statistically significant heterogeneity was detected among these studies, nearly all the included articles reported consistent results. For clinical outcomes, more studies reported a trend towards greater improvement with MI-TLIF. However, a precise pooled mean difference could not be calculated because no study provided the SD for the mean functional outcome improvement. The advantages associated with MI-TLIF might be attributed to less intraoperative dissection and retraction of paravertebral muscles [1, 2, 5]. Shunwu et al. [6] found that the minimally invasive group was associated with a significantly lower creatine kinase (a marker of muscle injury) level on the third postoperative day. Wang et al. [7] observed no differences in postoperative serum creatine kinase levels between the MI and open groups. However, they found significantly reduced sacrospinalis muscle injury in the minimally invasive group through MRI scanning and electrophysiological examination.

MI-TLIF significantly increased the X-ray exposure time. All four studies reported consistent results. The open technique needed only half of the X-ray exposure required for the MI procedure. Increased fluoroscopic use was needed during the placement of both the tubular retractor system and pedicle screws. Therefore, more effort should be made to reduce radiation exposure in MI-TLIF procedures. Kim et al. [23] used navigation-assisted fluoroscopy when performing MI-TLIF. Their study revealed that navigated MI-TLIF significantly reduced intraoperative radiation exposure when compared with open-TLIF using standard fluoroscopy [23]. Moreover, it has been reported that navigation could also reduce fluoroscopic time during the placement of pedicle screws [24]. In the future, navigation may be one way to reduce surgeons' excessive X-ray exposure.

Our meta-analysis revealed no significant difference between MI-TLIF and open-TLIF with regard to operative time. However, several studies have reported a trend of longer operative time for the MI-TLIF group [6–8, 11, 15]. One reason might be that MI-TLIF, which is performed in a limited space, is a more technically demanding procedure. A learning curve exists in the early stage of performing this surgery [10, 11, 13]. Lee et al. [25] found that operative time could reach an asymptote after about 30 cases. Despite the learning curve, MI-TLIF is still very safe and effective for lumbar spinal diseases [25].

For re-operation rate and complication rate, no study showed a statistically significant difference. The re-operation rates for both MI and open techniques were very low (<5 %). Reasons for reoperation were similar among the studies, including pedicle screw or intervertebral graft malposition/loosening/migration, pseudarthrosis, and epidural hematoma. However, we found that the definition of complication differed between studies. Thus, pooling of the complication data might lead to bias. The main complication types included graft malposition, cage migration, non-union, dural tear, and infection. It should be noted that, for each specific complication type, pooled results revealed no significant difference between the MI and open methods.

Our study has a number of weaknesses. First of all, both prospective and retrospective comparative studies were selected for analysis. Methodological defects have been found in some of these studies, including failure to collect data prospectively, non-consecutive enrollment of patients, inadequate baseline comparisons, and improper blinding or non-blinded evaluation. Thus, the level of evidence for this meta-analysis was not high. Secondly, statistical heterogeneity was detected among the studies, particularly when we pooled the continuous outcomes. The heterogeneity might be explained by the study design, study quality, patients’ characteristics, and the diverse technical specifications. Thirdly, the multiple assessment tools and fusion criteria used in the included studies might have confounded the combined results. Lastly, incomplete data recording was observed when we extracted clinical outcomes. Pooling of such data might lead to bias. Despite these weaknesses, our meta-analysis can still provide some value for clinical reference given the lack of high-quality randomized controlled trials. In summary, this meta-analysis demonstrated that MI-TLIF resulted in less blood loss and shorter hospital stay, but was associated with more intraoperative X-ray exposure. MI-TLIF and open-TLIF had similar operative time, complication rate, and re-operation rate. Our findings suggest that MI-TLIF is a promising procedure, but more effort should be made to reduce intraoperative radiation exposure in the future. Because patients selected for MI-TLIF or open-TLIF may differ in symptoms and disease severity, high-quality randomized controlled trials are also needed to further compare these two techniques.