Introduction

Posterior cruciate ligament (PCL) reconstruction has become more popular and shows consistent stability with recent improvements in arthroscopic techniques [3]. Several promising methods and techniques have been reported using various graft selections. Among these, single-bundle transtibial PCL reconstruction is the most popular method and shows functional outcomes comparable to those of double-bundle PCL reconstruction [11]. However, despite these theoretical developments, the reported rates of failure and degenerative change after PCL surgery remain relatively high, and there is little consensus on how to optimally reconstruct the PCL or which graft is the best choice [9, 10, 12, 13, 26].

During the selection of graft material, consideration should be given to the origin (autograft versus allograft), nature (bony fixation versus soft tissue graft), size (diameter), and length (single versus multi-strand graft) of the graft. Transtibial PCL reconstruction usually requires a longer graft length compared to that used for anterior cruciate ligament (ACL) or inlay PCL reconstruction, because tunnel length is longer than that of the ACL, and most fixations are performed at the exit portion of the tunnel [1]. Therefore, there can be additional limitations in choosing the graft material for transtibial PCL reconstruction.

Despite numerous published reports on PCL reconstruction in the past 30 years, the ideal graft source remains unclear, and few objective scientific data have been published that thoroughly evaluate the long-term outcomes according to the graft source. Furthermore, only the origin of the graft (allograft versus autograft) has been an important concern in the analysis. We, therefore, conducted a systematic review of available high-quality comparative studies that evaluated clinical and objective stability testing to compare the different graft sources for PCL reconstruction. The hypothesis of this study was that clinical and stability outcomes would be similar regardless of the graft source.

Materials and methods

Search strategy

A rigorous and systematic approach according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines was used [23]. In phase 1 of the PRISMA search process, the MEDLINE, EMBASE, and Cochrane databases were systematically searched (November 2016). Using a Boolean strategy, all field search terms included the following: Search (((posterior cruciate ligament) AND (((repair) OR augmentation) OR reconstruction)) AND graft). The citations in the included studies and the bibliographies of the relevant articles were screened and hand-checked for articles not identified in the search. In phase 2, abstracts and titles were screened for their relevance. In phase 3, the full text of the selected studies was reviewed to assess for the inclusion criteria and methodological appropriateness with a predetermined question. In phase 4, the studies underwent a systematic review process, if appropriate.
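The Boolean structure of the search expression can be illustrated with a short sketch. This is only an approximation under stated assumptions: the function name `matches_search` and the sample records are hypothetical, and plain substring matching on lower-cased text stands in for the field-aware indexing that the actual databases perform.

```python
def matches_search(text: str) -> bool:
    """Approximate the search expression
    (posterior cruciate ligament) AND (repair OR augmentation OR reconstruction) AND graft
    using plain substring matching (an assumption, not database behaviour)."""
    t = text.lower()
    has_pcl = "posterior cruciate ligament" in t
    has_procedure = any(
        term in t for term in ("repair", "augmentation", "reconstruction")
    )
    # Substring matching means "autograft" and "allograft" also satisfy "graft".
    has_graft = "graft" in t
    return has_pcl and has_procedure and has_graft


# Hypothetical records, for illustration only.
records = [
    "Hamstring autograft for posterior cruciate ligament reconstruction",
    "Anterior cruciate ligament reconstruction with allograft",
    "Posterior cruciate ligament repair without tissue transfer",
]
hits = [r for r in records if matches_search(r)]
```

Only the first record satisfies all three AND-ed clauses; the second lacks the PCL term and the third lacks any graft term.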

Eligibility criteria

The inclusion criteria were as follows: (1) articles written in English, (2) single-bundle transtibial PCL reconstruction, (3) comparison of outcomes using different graft materials as a primary objective, (4) more than 2 years of follow-up, and (5) prospective or retrospective comparative studies (PCS or RCS) (Fig. 1).

Fig. 1 PRISMA flow chart
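The five inclusion criteria can be read as a conjunction of simple checks on each candidate study. The sketch below is illustrative only: the `Study` record, its field names, and the string labels are hypothetical, and counting RCTs among the eligible comparative designs is an assumption made here because RCTs appear among the included studies.

```python
from dataclasses import dataclass


@dataclass
class Study:
    language: str            # publication language
    technique: str           # e.g. "single-bundle transtibial"
    compares_grafts: bool    # graft comparison is a primary objective
    follow_up_years: float   # follow-up duration in years
    design: str              # "RCT", "PCS", "RCS", "case series", ...


# Assumption: RCTs are treated as prospective comparative studies.
COMPARATIVE_DESIGNS = {"RCT", "PCS", "RCS"}


def is_eligible(s: Study) -> bool:
    """Apply inclusion criteria (1) to (5) as a single conjunction."""
    return (
        s.language == "English"
        and s.technique == "single-bundle transtibial"
        and s.compares_grafts
        and s.follow_up_years > 2          # "more than 2 years"
        and s.design in COMPARATIVE_DESIGNS
    )


# Hypothetical candidates, for illustration only.
included = Study("English", "single-bundle transtibial", True, 3.0, "RCS")
short_follow_up = Study("English", "single-bundle transtibial", True, 1.5, "RCS")
case_series = Study("English", "single-bundle transtibial", True, 4.0, "case series")
```

A study failing any single criterion, such as the short follow-up or the non-comparative design above, is excluded.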

Data extraction

Data were extracted for the following: study type, level of evidence, graft source (case versus control), number (case versus control), age (case versus control), sex ratio (case versus control), augment material (case versus control), fixation (case versus control), treating method for the remnant PCL, follow-up period, clinical results, stability results, conclusion of the study, and other relevant findings. The extracted data were subsequently cross-checked for accuracy.

Quality assessment

The methodological quality of the randomized controlled trials (RCTs) was assessed using risk of bias (ROB), based on the Cochrane handbook, with the following nine standard criteria: allocation sequence generation, allocation concealment, baseline outcome measurement, baseline characteristics, incomplete outcome data, knowledge of the allocated interventions, protection against contamination, selective outcome reporting, and other ROB. Each criterion was scored as “Yes (low ROB)”, “No (high ROB)”, or “Unclear”.
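As a minimal sketch of how ratings on the nine ROB criteria could be summarized per trial, the snippet below counts the “Yes”, “No”, and “Unclear” ratings. The function name `tally_rob` and the example ratings are hypothetical and are not taken from the included studies.

```python
from collections import Counter

ROB_CRITERIA = (
    "allocation sequence generation",
    "allocation concealment",
    "baseline outcome measurement",
    "baseline characteristics",
    "incomplete outcome data",
    "knowledge of the allocated interventions",
    "protection against contamination",
    "selective outcome reporting",
    "other ROB",
)
VALID_RATINGS = {"Yes", "No", "Unclear"}


def tally_rob(ratings: dict) -> Counter:
    """Count how many of the nine criteria received each rating."""
    missing = [c for c in ROB_CRITERIA if c not in ratings]
    if missing:
        raise ValueError(f"unrated criteria: {missing}")
    invalid = {c: r for c, r in ratings.items() if r not in VALID_RATINGS}
    if invalid:
        raise ValueError(f"invalid ratings: {invalid}")
    return Counter(ratings[c] for c in ROB_CRITERIA)


# Hypothetical ratings for one trial, for illustration only.
example = dict(zip(
    ROB_CRITERIA,
    ["Yes", "Yes", "Unclear", "Yes", "No", "Unclear", "Unclear", "Yes", "Unclear"],
))
summary = tally_rob(example)
```

The resulting counts give a compact per-trial summary of how many criteria fall into each rating.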

The methodological quality of the non-randomized controlled trials was assessed using the ROBINS-I tool [27], based on the Cochrane handbook. It consisted of three main domains (pre-intervention and at-intervention, post-intervention, and overall risk of bias), and each criterion was scored as “Low”, “Moderate”, “Serious”, “Critical”, or “No information”.

Grading of the quality of the evidence

Apart from describing the methodological quality of the included studies, evidence grade was determined using the guidelines of the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) working group [4]. The GRADE system uses a sequential assessment of the evidence quality that is followed by an assessment of the risk–benefit balance and a subsequent judgement on the strength of the recommendations. The evidence grades are divided into the following categories: (1) high, which indicates that further research is unlikely to alter confidence in the effect estimate; (2) moderate, which indicates that further research is likely to significantly alter confidence in the effect estimate and may change the estimate; (3) low, which indicates that further research is very likely to significantly alter confidence in the effect estimate and is likely to change the estimate; and (4) very low, which indicates that any effect estimate is uncertain. The strengths of the recommendations were based on the quality of the evidence [19].

Results

Search

Eight articles were included in the final analysis. Among these, there were two RCTs [18, 30], one PCS [5], and five RCSs [2, 20, 28, 31, 32]. There were two level II [18, 30] and six level III [2, 5, 20, 28, 31, 32] studies. Autografts included four-strand hamstring grafts (4-SHGs) [2, 5, 18, 20, 28, 30, 31, 32], 7-SHGs [32], quadriceps tendon [5, 30], and patellar tendon [20]. Allografts included Achilles tendon [2, 30] and tibialis anterior tendon [18, 28, 30]. A hybrid graft [18] (tibialis anterior allograft plus semitendinosus autograft) and a ligament advanced reinforcement system (LARS) [31] were used in one study each. Comparison was performed between autografts and allografts in three studies [2, 28, 30], between different autografts in two studies [5, 20], between autograft and LARS in one study [31], among three different grafts (autograft, hybrid graft, and allograft) in one study [18], and between 4- and 7-SHGs in one study [32]. Detailed characteristics of the studies are summarized in Table 1.

Table 1 Characteristics of included studies

Quality

Quality assessment details are presented in Table 2. Two RCTs were assessed using ROB, based on the Cochrane handbook. One study was scored as “Yes” in four categories, “Unclear” in four categories, and “No” in one category. The other RCT was scored as “Yes” in three categories, “Unclear” in two categories, and “No” in four categories. Five retrospective comparative studies and one prospective comparative study were assessed using the ROBINS-I assessment tool. In the pre-intervention and at-intervention domain, three studies [2, 20, 28] were scored as “No information” and the others were scored as “Moderate”. All six studies [2, 5, 20, 28, 31, 32] were scored as “Low” in the post-intervention domain. In the overall ROB domain, three studies [2, 20, 28] were scored as “Serious” and the other three studies [5, 31, 32] were scored as “Moderate”.

Table 2 Quality assessment of included studies

GRADE evidence quality of each outcome

The GRADE evidence quality of each outcome is presented in Table 3. Four outcomes were separately evaluated. There was one of high quality and three of low quality. Comparisons of the Tegner activity score using two RCTs and two RCSs showed moderate quality. However, the others, such as the IKDC score, Lysholm score, Telos stress radiography, and instrumented anteroposterior laxity measurement, showed low quality.

Table 3 GRADE evidence quality for each outcome

Clinical results

Surgical options are presented in Table 4, and clinical results are presented in Table 5 and Fig. 2. All eight studies reported clinical results. Among the postoperative values, the International Knee Documentation Committee (IKDC) score, Lysholm score, and Tegner activity score were reported in two or more articles. A 4-SHG was included in all eight studies, and was compared to the hybrid graft and tibialis anterior allograft in one level II study [18], an Achilles allograft and tibialis anterior allograft in one level II study [30], an Achilles allograft in one level III study [2], a tibialis anterior allograft in one level III study [28], a quadriceps autograft in one level III study [5], a patellar tendon autograft in one level III study [20], a LARS ligament in one level III study [31], and a 7-SHG in one level III study [32]. In general, most studies reported no statistically significant differences, except for one study that compared 4- and 7-SHGs.

Table 4 Surgical options of included studies
Table 5 Outcomes of included studies
Fig. 2 Diagram of the Lysholm scores in all included studies

In one level II study by Li et al. [18], the differences in clinical results, including IKDC subjective and objective, Lysholm, and Tegner activity scores, were not significant among the three groups (4-SHG, hybrid graft [tibialis anterior allograft plus semitendinosus autograft], and tibialis anterior allograft). Wang et al. [30] compared the clinical results using IKDC objective score, Lysholm score, and Tegner activity score in autografts (16 HG and 16 quadriceps) and allografts (14 Achilles and 9 tibialis anterior) in another level II study. They also found no statistically significant differences between groups.

Among the remaining six level III studies, two studies compared 4-SHGs to allografts (Achilles and tibialis anterior). Ahn et al. [2] compared 4-SHG to an Achilles allograft. The IKDC objective score was not statistically different, but Lysholm score [90 (78–100) in 4-SHG, 85 (70–95) in Achilles allograft, p < 0.01] showed statistically significant differences between groups. However, they concluded that the clinical outcome was the same for both groups. Sun et al. [28] compared 4-SHG to the tibialis anterior allograft. The IKDC objective score, Lysholm, and Tegner activity score were not statistically different between groups.

In two studies, 4-SHG was compared to another autograft. Chen et al. [5] compared 4-SHG to the quadriceps autograft. They evaluated IKDC objective score and Lysholm score, and there were no statistically significant differences between groups. Lin et al. [20] compared 4-SHG to the patellar tendon autograft. They evaluated clinical results using the same scales used by Chen et al. [5], and their results were also not statistically different. Xu et al. [31] compared 4-SHG to the LARS. The IKDC objective score, Lysholm score, and Tegner activity score were evaluated, and they were not different between groups. Zhao et al. [32] compared 4- and 7-SHGs. They found statistically significant superior results in the 7-SHG group regarding the IKDC objective score and Lysholm score.

Stability results

Stability results are presented in Table 5 and Fig. 3. All eight studies reported stability results. Two studies [2, 5] reported using a stress radiograph and six studies [18, 20, 28, 30, 31, 32] reported using an instrumented anteroposterior laxity measurement. Four studies reported the comparison between autograft and allograft. Among them, two studies [18, 28] reported that stability was superior in the autograft group, while two studies [2, 30] reported similar results between the two groups. The stability was not statistically different between different autografts or between 4-SHG and LARS. More-stranded HG showed better stability than that of less-stranded HG.

Fig. 3 Diagram of the stability results in all included studies

In one level II study by Li et al. [18], both the autograft and hybrid graft groups showed statistically significant differences when compared with the gamma-irradiated allograft group in terms of instrumented anteroposterior measurements (p = 0.006). The autograft group showed slightly superior stability compared with the hybrid group, but no statistically significant difference was found (p = 0.189). Wang et al. [30] compared the stability results using an instrumented anteroposterior laxity measurement in autografts (16 HG and 16 quadriceps) and allografts (14 Achilles and 9 tibialis anterior) in another level II study. They found no statistically significant differences between groups.

In two level III studies that compared 4-SHG to an allograft, Ahn et al. [2] reported no statistically significant differences between 4-SHG and Achilles allograft, but Sun et al. [28] reported statistically significant superior stability in the 4-SHG compared to that of the tibialis anterior allograft. The two studies that compared 4-SHG to another autograft both reported no statistically significant differences, whether between 4-SHG and quadriceps autograft or between 4-SHG and patellar tendon autograft [5, 20]. In the study by Xu et al. [31], comparison between 4-SHG and LARS also showed no statistically significant difference. However, 7-SHG showed better stability than that of 4-SHG in the study by Zhao et al. [32].

Overall conclusions and other relevant findings

Overall conclusions and relevant findings are included in Table 6. Chen et al. [5] and Wang et al. [30] evaluated muscle strength data and found no significant differences between quadriceps autograft and 4-SHG and between autograft and allograft, respectively. Proprioception was evaluated by Li et al. [18]. Threshold to detection of passive motion (TTDPM) and reproduction of passive motion (RPP) tests showed no significant differences among the three groups (p = 0.376 and 0.196, respectively). In the study by Chen et al. [5], superficial infection or irritation was more frequent in the 4-SHG group than in the quadriceps tendon group. Wang et al. [30] also reported more complications in the autograft group, including infection, donor site pain, and reflex sympathetic dystrophy. Lin et al. [20] reported several shortcomings of the patellar tendon, such as anterior knee pain, squatting pain, kneeling pain, and osteoarthritic change. Therefore, they recommended a hamstring tendon autograft as a better choice in transtibial PCL reconstruction. Sun et al. [28] reported better stability in the 4-SHG group, along with a higher incidence of numbness and dysesthesia around the incision.

Table 6 Conclusions and other relevant findings of included studies

Discussion

The principal findings of this systematic review were that (1) most studies reported no statistically significant differences in the clinical results, except for one study that compared 4-SHG and 7-SHG; (2) stability was similar or superior in a comparison between autografts and allografts, and was not statistically different between different autografts or between 4-SHG and LARS, but more-stranded HG showed better stability than that of the less-stranded HG; (3) kinematic data were not different regardless of the graft; and (4) complications were more frequent with autografts, and included superficial infection, irritation, and reflex sympathetic dystrophy in the 4-SHG, and anterior knee pain, kneeling pain, and osteoarthritic change in the patellar tendon. Therefore, our hypothesis was supported by the clinical results. However, in the stability results, a definite conclusion could not be reached, although autograft was more favorable because some studies reported superior stability with 4-SHG compared to that for tibialis or Achilles allograft. Furthermore, there were also statistically significant differences between less- and more-stranded HG.

A previous systematic review compared allograft versus autograft in PCL reconstruction, and no appreciable differences were identified [10]. The review used 2 direct comparisons, 5 allograft, and 12 autografts. Single-bundle and double-bundle reconstruction were mixed, and detailed differentiation between autografts or allografts was not performed in the analysis. Furthermore, there were too many level IV studies. The authors reported a paucity of data comparing autografts and allografts, leading to general heterogeneity of available studies. However, newly published studies directly compared autograft versus allograft, different autografts, and autograft versus artificial ligament. This enabled a more qualified analysis in our study.

Compared with PCL reconstruction, there have been relatively abundant qualified studies of ACL reconstruction comparing autograft versus allograft [8, 21, 22]. Recent analyses clearly reported that the incidence of failure after ACL reconstruction was higher in allograft groups than in autograft groups [7, 15, 24]. However, longer grafts are required when using a soft tissue graft, so graft selection is more limited in transtibial PCL reconstruction. Furthermore, the PCL has been shown to have different biomechanical requirements than the ACL [14, 25]. Therefore, the ideal graft source could be different in transtibial PCL reconstruction. Appropriate graft choice remains controversial in PCL reconstruction. The most commonly used grafts for PCL reconstruction are the patellar tendon or quadriceps with the bony portion, multiple-strand HG, and Achilles tendon grafts [6]. Soft tissue grafts including the 4-SHG and tibialis allografts are attracting more attention, and new methods of graft fixation are being developed [1, 16, 17]. However, when using a soft tissue graft, graft length is an important consideration in selecting the graft source. Therefore, the ideal graft should have adequate length, should be multi-stranded, such as a double hamstring graft or 4-SHG, and should have low donor site morbidity and strong biomechanical characteristics. Tornese et al. [29] reported that use of the 4-SHG with a possible loss of flexor strength could be a more acceptable solution than reconstruction with the patellar tendon and weakening of the extensor at the autograft source, since biomechanical considerations underscore the importance of recovery of the quadriceps after PCL reconstruction.

This systematic review included two RCTs, five RCSs, and one PCS. There were two level II and six level III studies. Two level II studies showed contradictory results for stability, although high-quality clinical results were similar. Furthermore, different graft sources were used in each study, although 4-SHG was used in all studies. Therefore, it was impossible to perform a meta-analysis by pooling of these data with high possible bias, although most studies shared similar parameters in evaluating clinical and stability results. We strived to mitigate this fact in our review process by weighting the results of each individual article based on the level of the evidence that it supplied. Results of the high-level study were reported first. Then, results of the low-level study followed, and were compared with those of the high-level study. These results also affected the quality of the GRADE evidence for each outcome. Comparisons of the clinical and stability outcomes using two RCTs only showed relatively high quality, and the others showed mostly low quality.

A strength of our study is that only comparative studies that used graft source as a primary objective were included. It would be ideal to analyze the effect of graft source on outcomes using the currently available literature. In each article included in this study, individual graft materials were used for the analysis, although most studies only compared allografts and autografts. However, there would be some differences within autografts or within allografts. Additionally, detailed quality evidence for each outcome was provided, and this made our analysis more objective. There were several limitations in this systematic review. First, only a small number of cohort studies were included, and they provided a low level of evidence. However, because PCL-based studies are relatively few, we think this was the best available basis for a systematic review at this point. Second, some studies showed superior stability for the 4-SHG compared to that of the allograft. However, the difference was only within 2 mm, and the clinical relevance of this difference was questionable. Finally, there is a possibility that the sensitivity of the evaluating parameters is inadequate to detect differences between graft sources.

Conclusion

Using a comprehensive analysis of the current literature, the authors could not identify an individual graft source with clearly superior clinical results compared with other graft sources. However, autografts, especially 4-SHGs, showed similar or superior stability to irradiated allografts. Therefore, the graft source has a minimal effect on the clinical outcome, but it could have some effect on stability in single-bundle transtibial PCL reconstruction.