Introduction

Anterior cruciate ligament (ACL) disruption is a common cause of anterior knee instability, particularly as a result of sports activities. ACL reconstruction has become a well-established procedure worldwide: more than 100,000 reconstructions are estimated to be performed in the US per year [13], with reported clinical success rates ranging from 70 to 95 % [2, 6]. Despite these high success rates, about 20 % of patients still complain of persistent knee pain and instability or insecurity, and/or develop a degenerative joint after the operation [22, 26, 31].

Conventionally, the gold standard for ACL reconstruction is the arthroscopic single-bundle (SB) technique with autologous tendons such as the patellar, hamstring, or quadriceps tendon, or with allografts [8, 12, 29]. However, the normal ACL has at least two distinct bundles, the anteromedial (AM) bundle and the posterolateral (PL) bundle; each appears to function at different angles of knee flexion, and together they are responsible for the stability of the joint [4, 9, 11]. Accordingly, single-bundle anteromedial or posterolateral reconstruction alone has been shown to be insufficient in controlling the combined rotatory load and valgus torque that simulates the pivot shift test [21, 34]. Interest in double-bundle (DB) reconstruction is therefore on the rise.

In a previous biomechanical cadaver study [36], it was reported that anatomic double-bundle ACL reconstruction produces results more similar to the physiologically intact knee and is advantageous in restoring anterior and rotational knee stability when compared with single-bundle ACL reconstruction. To reproduce a ligament more similar to the native ACL and to restore optimal rotational stability, several double-bundle ACL reconstruction procedures have been developed to date [1, 7, 25]. However, a few studies have reported no significant differences in patients’ clinical outcomes between the single-bundle and double-bundle ACL reconstruction procedures [1, 14].

Because of the disagreement regarding the real advantage of double-bundle ACL reconstruction, the objective of this study is to systematically search relevant trials and to comprehensively compare the short- and long-term clinical outcomes of DB ACL reconstruction with those of SB ACL reconstruction. The null hypothesis is that there is no difference in the short- and long-term clinical outcomes between single-bundle and double-bundle ACL reconstruction.

Materials and methods

Search strategy

An electronic search of the databases PubMed (1966–September 2011), EMBASE (1984–September 2011), and the Cochrane Controlled Trials Register (CENTRAL; 3rd Quarter, 2011) was undertaken to identify relevant studies. The search strategy consisted of a combination of keywords concerning the technical procedure (double-bundle, reconstruction) and keywords regarding the anatomic features and pathology (anterior cruciate ligament, ACL). These keywords were used as MeSH headings and free-text words. In addition, a specific search was performed using the terms “anterior cruciate ligament double-bundle” and “ACL double-bundle”. All searches were limited to humans, clinical trial (Level I, II, or III evidence), review, and meta-analysis. Reference lists of the potentially relevant papers identified by the computer-assisted search were also searched manually to identify any additional studies that may have been missed.

Selection of studies

Using a pre-defined protocol, two reviewers (ZY and ZP) independently selected studies for evaluation, with disagreements resolved by consensus. The inclusion criteria were (1) studies comparing the clinical outcomes of single-bundle versus double-bundle ACL reconstruction; (2) studies that were prospective and randomized (Level I or II evidence); (3) studies that were published in English; (4) studies with full-text availability; (5) a patient population aged over 18 years; and (6) data that were not duplicated in another manuscript. Trials with nonclinical outcomes (e.g., MRI) or without follow-up (e.g., intraoperative analysis) were excluded from this study.

Data extraction

Two reviewers independently extracted relevant data from the included studies regarding the treatment details, patient demographics (number, mean age, and sex ratio), type of graft fixation, type of knee support after surgery, and the length of follow-up. The relevant clinical outcomes pooled in this analysis include (1) knee stability of the injured leg quantitatively measured using the KT-1000 arthrometer; (2) knee stability of the injured leg manually examined through the pivot shift test and Lachman test; (3) the general status of patients and the injured leg evaluated by the International Knee Documentation Committee (IKDC) form; (4) Lysholm knee score; (5) Tegner activity score; and (6) complications.

Heterogeneity

To assess inconsistency among the study results, a test for heterogeneity (Cochran’s Q) was performed. However, because this test is sensitive to the number of trials included in the meta-analysis, we also calculated I². I², calculated directly from the Q statistic, describes the percentage of variation across the studies that is due to heterogeneity rather than chance. I² ranges from 0 to 100 %, with 0 % indicating the absence of any heterogeneity. Although there are no absolute thresholds for I², values < 50 % are considered low heterogeneity; in that case, the variation in effect is thought to be due to chance. Conversely, when I² exceeds 50 %, heterogeneity is thought to exist and the effects are treated as random.
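The relationship between the Q statistic and I² described above can be sketched in a few lines. This is only an illustrative sketch, assuming inverse-variance weights and the usual definition I² = max(0, (Q − df)/Q) × 100 %; the function name and the example numbers are hypothetical and are not taken from the included trials.

```python
def cochran_q_and_i2(effects, std_errors):
    """Return (Q, I2 in percent) for per-study effects and their standard errors.

    Assumption: inverse-variance weights w_i = 1 / SE_i^2 (illustrative only).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    # Fixed-effect (inverse-variance) pooled estimate
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I2 is the share of Q in excess of its degrees of freedom, floored at 0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2


# Hypothetical example: three studies with equal precision
q, i2 = cochran_q_and_i2([0.1, 0.3, 0.5], [0.1, 0.1, 0.1])
print(f"Q = {q:.2f}, I2 = {i2:.0f} %")
```

With identical study effects, Q is zero and I² is reported as 0 %, which corresponds to the "absence of any heterogeneity" case in the text.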

Assessment of risk of bias

Two independent investigators evaluated the risk of bias of the included studies according to the Cochrane Collaboration’s recommended tool [Chapter 8 (Section 8.5)] [16]. Briefly, the risk of bias of each study was assessed using the following methodological components: (1) randomization and generation of the allocation sequence (for selection bias); (2) allocation concealment (for selection bias); (3) patient blinding (for performance bias) and examiner blinding (for detection bias); and (4) description of the follow-up (for attrition bias). The details of each methodological item are shown in Table 2. Owing to the nature of surgical treatment, patient blinding cannot easily be achieved; therefore, trials with an adequate method of randomization and allocation concealment, as well as clear reporting of examiner blinding and follow-up, were considered to be at low risk of bias.

Statistical analysis

We conducted the meta-analysis using RevMan 5.1 (provided by the Cochrane Collaboration, Oxford, UK) for each outcome where data were available from more than one study. The analyses included all patients irrespective of compliance or follow-up, following the “intention-to-treat” principle and using the last reported observed response. Results were expressed as risk ratios (RR) and/or odds ratios (OR) with 95 % confidence intervals (CIs) for dichotomous outcomes, and as mean differences (MD) with 95 % CIs for continuous outcomes. A fixed effects model was used initially; a random effects model was used instead if there was evidence of significant heterogeneity across trials (p < 0.1; I² > 50 %). A sensitivity analysis was performed to explore the potential source of heterogeneity. In addition, we planned to use funnel plot asymmetry to assess the existence of publication bias and other biases.
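The model-selection rule above can be illustrated with a short sketch. It is not the RevMan implementation: it assumes generic inverse-variance pooling with a DerSimonian-Laird estimate of the between-study variance, and for brevity it switches models on the I² > 50 % criterion alone rather than also testing p < 0.1. The function name and the input numbers are hypothetical.

```python
import math


def pool_effects(effects, std_errors, i2_threshold=50.0):
    """Pool study effects; return (model, estimate, (ci_low, ci_high)).

    Assumptions (illustrative): inverse-variance weights, DerSimonian-Laird
    tau^2 for the random effects model, switch decided by I2 only.
    """
    w = [1.0 / se ** 2 for se in std_errors]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    model, tau2 = "fixed", 0.0
    if i2 > i2_threshold:
        # DerSimonian-Laird between-study variance
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - df) / c)
        model = "random"

    # With tau2 = 0 these weights reduce to the fixed effects weights
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    est = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return model, est, (est - 1.96 * se, est + 1.96 * se)


# Hypothetical mean differences and standard errors
print(pool_effects([0.1, 0.3, 0.5], [0.1, 0.1, 0.1]))
```

For ratio measures such as RR or OR, the same pooling would be applied to the log-transformed estimates and the result exponentiated.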

Results

Figure 1 shows the details of study identification, inclusion, and exclusion. The search of PubMed, EMBASE, and the Cochrane Library using the defined terms yielded 957 articles. After screening of titles and abstracts, 852 references were excluded as irrelevant to this topic. Of the 105 potentially relevant references, 80 were excluded and the remaining 25 articles underwent comprehensive evaluation. Finally, eighteen trials were included in this meta-analysis. Ten of the included studies [3, 5, 15, 18, 20, 27, 28, 30, 33, 35] reported results within 24 months, while the other eight studies [1, 10, 17, 19, 24, 32, 37, 38] reported results over 24 months; these were pooled as the short- and long-term groups, respectively. The main characteristics of the included studies are shown in Table 1. In total, the eighteen studies enrolled 1,229 patients who underwent ACL reconstruction: 514 patients were randomly assigned to the DB group and the other 715 patients to the SB group.

Fig. 1 Flow of study identification, inclusion, and exclusion

Table 1 Main characteristics of included studies

Among the included trials, two trials [10, 35] had two SB groups (one with anteromedial SB and one with posterolateral SB), and another two trials [17, 20] had an additional SB group that differed in the type of graft fixation. The Cochrane Handbook for systematic reviews [Chapter 16 (Section 16.5.4)] [16] recommends creating a single pair-wise comparison in a given meta-analysis, that is, combining all relevant experimental intervention groups of a study into a single group (DB group) and all relevant control intervention groups into a single control group (SB group). For data synthesis, both the sample sizes and the numbers of patients with events can be summed across groups for dichotomous outcomes, while for continuous outcomes, means and SDs can be combined using the methods described in the Cochrane Handbook [Chapter 7 (Section 7.7.3.8)] [16].
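As a rough illustration of the group-combining step, the sketch below merges two arms of a continuous outcome into one group using the standard formula for pooling two sample means and SDs, which is our understanding of the method the Handbook section describes. The function name and the example values are hypothetical.

```python
import math


def combine_groups(n1, mean1, sd1, n2, mean2, sd2):
    """Combine two groups into one; return (n, mean, sd) of the merged sample."""
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    # Within-group variation plus the variation between the two group means
    variance = (
        (n1 - 1) * sd1 ** 2
        + (n2 - 1) * sd2 ** 2
        + (n1 * n2 / n) * (mean1 - mean2) ** 2
    ) / (n - 1)
    return n, mean, math.sqrt(variance)


# Hypothetical example: an anteromedial SB arm and a posterolateral SB arm
n, mean, sd = combine_groups(20, 1.8, 1.1, 20, 2.4, 1.3)
print(n, round(mean, 2), round(sd, 2))
```

For dichotomous outcomes no formula is needed: the event counts and the sample sizes of the arms are simply added.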

The assessment of the risk of bias in all included studies is shown in Table 2. Computer-generated randomization tables or random numbers were described in five trials [3, 15, 27, 37, 38], while in the other trials, the method was either inadequate (e.g., dates, names, or admittance numbers) or unclear. A sealed-envelope technique for allocation concealment was applied in five trials [5, 17–20], while it was either not described or unclear in the other trials. In three trials [3, 15, 28], the patients remained blinded until the radiographic evaluation after operation. Examiner blinding was performed in most included trials, and the description of the follow-up was considered adequate (numbers and reasons for dropouts and withdrawals) in all included trials. As a result, all included trials had two or more methodological components that were inadequate or unclear and were therefore regarded as having a high risk of bias.

Table 2 Risk of bias of the included studies

Knee stability measurements

Knee stability was evaluated at the latest follow-up. Manual examination of stability mainly comprised the pivot shift test and the Lachman test, while the KT-1000 arthrometer was used for quantitative evaluation.

KT-1000 arthrometer measurement

The anterior knee laxity measured with the KT-1000 arthrometer at manual maximum pull was expressed as the side-to-side difference (Mean ± SD) between the injured and uninjured legs, with a smaller difference between legs representing better knee stability. There were seven trials [3, 5, 18, 20, 28, 30, 35] reporting the continuous outcomes within 24 months and another three trials [10, 19, 24] reporting over 24 months. The test for heterogeneity did not detect significant heterogeneity across the trials in the short-term group (p = 0.55, I² = 0 %) or the trials in the long-term group (p = 0.28, I² = 21 %). Overall, DB-treated patients had a significantly smaller difference between the injured and uninjured legs in a short-term follow-up (MD = −0.63, 95 % CI: −0.93 to −0.32, p < 0.0001) (Fig. 2A, a) as well as in a long-term follow-up (MD = −1.00, 95 % CI: −1.54 to −0.46, p = 0.0003) (Fig. 2A, b) compared to SB-treated patients.

Fig. 2 Forest plot of the meta-analysis for knee stability measurements. A KT-1000 arthrometer measurement; B pivot shift test; C Lachman test

Pivot shift test

Manual knee laxity evaluation based on the pivot shift test was performed for both legs, with the differences between the injured and uninjured legs graded as negative, 1+, 2+, or 3+. Overall, the negative rate was 90.3 % (159/176) in the DB group and 69.7 % (147/211) in the SB group within a follow-up of 2 years, while over 2 years, it was 87.6 % (155/177) in the DB group and 54.0 % (154/285) in the SB group. The test for heterogeneity detected no evidence of heterogeneity across the trials [3, 5, 18, 20, 28, 30, 35] in the short-term group (p = 0.81, I² = 0 %), while there was significant heterogeneity across the trials [10, 17, 19, 24, 37] in the long-term group (p = 0.0004, I² = 81 %).

Pooled results revealed that, compared to SB-treated patients, DB-treated patients had a significantly higher negative rate of the pivot shift test both in a short-term follow-up (RR = 1.29, 95 % CI: 1.16–1.44, p < 0.00001, fixed model) (Fig. 2B, a) and in a long-term follow-up (RR = 1.46, 95 % CI: 1.12–1.90, p = 0.006, random model) (Fig. 2B, b). Sensitivity analysis showed that the main contributor to heterogeneity among the trials in the long-term group was the study by Ibrahim et al. [17]. When this study was excluded, the heterogeneity was eliminated (p = 0.75, I² = 0 %), and pooled results showed that the negative rate remained significantly higher in the DB group than in the SB group (p = 0.0004).

Lachman test

Similarly, the knee laxity evaluated by the Lachman test was first expressed as an absolute grade in each leg independently, and the differences between the injured and uninjured legs were then classified as negative, 1+, 2+, or 3+. Overall, the negative rate was 93.8 % (61/65) in the DB group and 85.9 % (73/85) in the SB group within a follow-up of 24 months, while over 24 months, it was 92.9 % (78/84) in the DB group and 72.3 % (133/184) in the SB group. The test for heterogeneity did not detect significant heterogeneity across the trials [3, 5, 35] in the short-term group (p = 0.78, I² = 0 %) or the trials [17, 24] in the long-term group (p = 0.72, I² = 0 %). Pooled results showed that there was no significant difference in the negative rate between the two groups in a short-term follow-up (RR = 1.06, 95 % CI: 0.96–1.18, n.s.) (Fig. 2C, a); however, in a long-term follow-up, DB-treated patients had a significantly higher negative rate (RR = 1.26, 95 % CI: 1.13–1.40, p < 0.0001) compared to SB-treated patients (Fig. 2C, b).

Clinical outcome measurements

The clinical status of each patient was evaluated postoperatively during the follow-up. The clinical evaluation methods mainly included the International Knee Documentation Committee (IKDC), Lysholm knee score, and Tegner activity score.

International Knee Documentation Committee (IKDC)

The International Knee Documentation Committee (IKDC) form is an objective clinical evaluation form used to comprehensively evaluate the general status of patients and the injured knee. There were six trials [3, 18, 20, 28, 30, 35] reporting IKDC results (analyzed as RR) within 24 months and another five trials [17, 19, 32, 37, 38] reporting over 24 months. According to our meta-analysis, the percentage of normal knees was 63.9 % (106/166) in the DB group and 45.3 % (91/201) in the SB group within a follow-up of 2 years, while over 2 years, it was 72.5 % (132/182) in the DB group and 53.2 % (143/269) in the SB group. The test for heterogeneity revealed significant heterogeneity across the trials in the short-term group (p = 0.04, I² = 57 %), while there was no evidence of heterogeneity across the trials in the long-term group (p = 0.33, I² = 14 %).

Overall, DB-treated patients had a significantly higher percentage of normal knees both in a short-term follow-up (RR = 1.44, 95 % CI: 1.07–1.93, p = 0.02, random model) (Fig. 3A, a) and in a long-term follow-up (RR = 1.35, 95 % CI: 1.16–1.56, p < 0.0001, fixed model) (Fig. 3A, b) compared to SB-treated patients. Sensitivity analysis revealed that the study by Yagi et al. [35] influenced the pooled results of the trials in the short-term group. When this study was removed, the heterogeneity among the studies was no longer significant (p = 0.11, I² = 46 %), and pooled results of the remaining trials showed that the percentage of normal knees remained significantly higher in the DB group than in the SB group (p < 0.0001).

Fig. 3 Forest plot of the meta-analysis for clinical outcome measurements. A International Knee Documentation Committee (IKDC); B Lysholm knee score; C Tegner activity scores

Lysholm knee score

The Lysholm knee score was used as a general knee evaluation method. The subjective evaluation of the knee after surgery was expressed as a score (Mean ± SD) of the injured knee. There were six trials [5, 18, 20, 28, 30, 33] reporting the continuous outcomes within 2 years and another two trials [10, 19] reporting over 2 years. The test for heterogeneity did not detect significant heterogeneity across the trials in the short-term group (p = 0.79, I² = 0 %) or the trials in the long-term group (p = 0.15, I² = 52 %). Pooled results showed that there was no significant difference in the Lysholm knee score between the two groups in a short-term (MD: −1.55, 95 % CI: −3.36 to 0.26, n.s.) or long-term (MD: −1.52, 95 % CI: −4.44 to 1.40, n.s.) follow-up (Fig. 3B, a, b).

Tegner activity score

To assess sporting activity level, performance, and return to sporting activities, preoperative and postoperative Tegner activity scores were recorded. Data required for this meta-analysis were available from three trials [30, 33, 38]. The test for heterogeneity did not detect significant heterogeneity across the trials (p = 0.88, I² = 0 %). Pooled results revealed that there was no significant difference in the Tegner activity score between the two groups (MD: 0.38, 95 % CI: −0.04 to 0.79, n.s.), indicating that DB-treated patients do not have a better functional capability of the injured knee than SB-treated patients. Details are shown in Fig. 3C.

Complications

Complications were a composite of secondary meniscal tear or unhealed meniscal fixation requiring a second-look arthroscopy, deep infections, graft failures, Cyclops lesions, thrombosis, and pain and swelling, among others. Overall, complications were reported in most of the included trials and occurred in 20 of the 453 patients (4.4 %) in the DB group, as compared to 45 of the 604 patients (7.5 %) in the SB group. Pooled results suggested that the incidence of complications was significantly lower in the DB group than in the SB group (OR = 0.56, 95 % CI: 0.33–0.96, p = 0.03); the results were robust and there was no heterogeneity among the studies (p = 0.47, I² = 0 %) (Fig. 4).

Fig. 4 Forest plot of the meta-analysis for complications

Discussion

The most important finding of the present study was that DB ACL reconstruction yields better clinical outcomes than SB ACL reconstruction, not only for the KT-1000 arthrometer and pivot shift results but also for the Lachman test, IKDC, and complications. Although many studies have supported the view that the clinical outcomes of double-bundle ACL reconstruction are superior to those of single-bundle ACL reconstruction, some trials have not found significant differences in clinical outcomes between the patient groups. A previous meta-analysis [23] showed that DB ACL reconstruction resulted in a slightly better KT-1000 arthrometer outcome than SB ACL reconstruction; however, this difference was not considered clinically significant. For the pivot shift test, it found no significant difference between the two techniques. The prior meta-analysis therefore concluded that there was little evidence to support additional benefits of DB ACL reconstruction.

In our opinion, the conclusion of the prior meta-analysis is not convincing, for the following reasons. First, to compare the DB technique with the SB technique, all relevant DB and SB groups should be included, whereas the prior meta-analysis excluded the clinical outcomes of nonanatomic DB ACL reconstruction, which resulted in a loss of information and is not generally recommended. Second, the statistic used in the prior meta-analysis to pool the results of the pivot shift test was the odds ratio (OR). However, when events are common (for example, when the event rate exceeds 20 %), the risk ratio (RR) should be used. Because negative pivot shift results are common, the use of the OR in the prior meta-analysis will tend to overestimate the pooled effect. Finally, the prior meta-analysis did not take the length of follow-up into consideration, which is also important to the clinical outcomes of treatment. The validity of the results of the prior meta-analysis therefore needs further confirmation.
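The point about common events can be illustrated with the crude short-term pivot shift counts reported in the Results (159/176 negative in the DB group versus 147/211 in the SB group). The sketch below is only illustrative: it uses the summed counts rather than the study-level pooling performed in the meta-analysis, and the function names are hypothetical.

```python
def risk_ratio(events_a, total_a, events_b, total_b):
    """Ratio of event proportions: (a / n_a) / (b / n_b)."""
    return (events_a / total_a) / (events_b / total_b)


def odds_ratio(events_a, total_a, events_b, total_b):
    """Ratio of event odds: (a / (n_a - a)) / (b / (n_b - b))."""
    odds_a = events_a / (total_a - events_a)
    odds_b = events_b / (total_b - events_b)
    return odds_a / odds_b


# Crude short-term pivot shift counts: negative tests in DB vs SB
rr = risk_ratio(159, 176, 147, 211)
or_ = odds_ratio(159, 176, 147, 211)
print(f"crude RR = {rr:.2f}, crude OR = {or_:.2f}")
```

On these counts the crude RR is about 1.3, whereas the crude OR is about 4.1. If such an OR were read as a ratio of probabilities, the benefit would be greatly exaggerated, which is why the RR is preferred when the event is this frequent.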

In our meta-analysis, we included more trials and divided them into short-term (≤24 months) and long-term (>24 months) groups according to the length of follow-up. Our results demonstrate that, compared to the SB technique, the DB technique results in a KT-1000 arthrometer outcome 1.00 mm closer to the normal knee in a long-term follow-up; this finding is not only statistically significant (p = 0.0003) but is also considered clinically significant. In addition, our results reveal that in a long-term follow-up, DB-treated patients have a significantly higher negative rate of the pivot shift test (p = 0.006) and Lachman test (p < 0.0001) compared to SB-treated patients. From these results, we conclude that, possibly because of the PL bundle reconstruction, the DB technique yields better anterior and rotatory knee stability than the SB technique in a long-term follow-up.

As for the clinical outcome measurements, a statistically significant difference is found between SB and DB ACL reconstruction with regard to the IKDC (p = 0.02 and <0.0001 in a short- and long-term follow-up, respectively) and complications (p = 0.03), while there is no significant difference between the two groups in the mean difference of the Lysholm knee score (p = 0.09 and 0.31 in a short- and long-term follow-up, respectively) or the Tegner activity score (p = 0.08). These results suggest that although DB-treated patients have similar Lysholm knee scores and Tegner activity scores to SB-treated patients, a significantly higher rate of normal knees and a lower incidence of complications (for example, degenerative changes and reoperations) are seen with the DB technique, which means the DB technique can help to maintain patient activity level and protect the joint against degeneration after ACL injuries.

The heterogeneity across the included trials was slight, so most of the evidence from our study should be considered robust. Nevertheless, our study still has several potential limitations. One potential limitation is that the types of graft fixation and knee support for rehabilitation are not completely consistent across the included trials, which might cause bias. A second potential limitation is that most included trials are considered to be at high risk of bias because two or more methodological components were unclear or inadequate. Although we minimized selection bias by including only RCTs or quasi-RCTs, our meta-analysis is still limited by selection bias due to systematic differences between the baseline characteristics of the study groups that are compared. Because it is usually impossible to blind people regarding whether or not major surgery has been undertaken, performance bias is inevitable in the present meta-analysis. Moreover, our meta-analysis is also limited by detection bias, because blinding of the outcome assessor to the intervention was insufficiently reported in some of the included studies. A third limitation is that the sample size of all included trials is small. In addition, the number of trials in the short- and long-term groups is relatively small, a funnel plot of the pooled estimates to assess potential publication bias has not been performed, and unpublished studies with negative results cannot be identified. Therefore, there may be publication bias as well, which could result in overestimation of the effectiveness of the interventions.

Overall, the present meta-analysis reveals that DB-treated patients may have a better range of motion and a reduced glide pivot shift phenomenon after surgery, and that the DB technique can also help to maintain patient activity level and protect the joint against degeneration. Therefore, we still recommend the use of the DB technique in ACL reconstruction. However, the differences in the measured outcomes of knee stability may be attributable only to the contribution of the ACL to anterior or valgus restraint, rather than to rotational restraint. Moreover, the functional capability of patients does not seem to be further improved with the use of DB ACL reconstruction. As a result, more high-quality studies that quantitatively measure rotational laxity are needed to further confirm the clinical benefits of DB ACL reconstruction.

Conclusion

Our results do not seem to support the null hypothesis: although there is no significant difference regarding the Lysholm knee score and Tegner activity score, most clinical outcomes are significantly different when comparing the DB ACL reconstruction with SB ACL reconstruction.