Introduction

The reported incidence of posterior cruciate ligament (PCL) injuries varies greatly, from 1 to 44% of all acute knee injuries [46]. One study reported that only 3.5% of all PCL injuries were isolated [18]. With the exception of cases of bony avulsion fractures, both non-operative and operative treatments are used for isolated PCL tears. Whether operative treatment is better than non-operative treatment when it comes to grade III isolated PCL injuries is still a matter of debate [40]. There are several surgical repair/reconstruction techniques in use for PCL reconstruction. Two recent studies found no difference in results between tibial inlay and transtibial techniques for tibial fixation [31, 45]. Two other studies found no significant difference between single-bundle or double-bundle techniques [22, 55]. Unfortunately, few studies in the literature focus on isolated PCL injuries or on PCL injuries combined only with posterolateral corner (PLC) injuries. In addition, none or a limited number of these few studies are randomized controlled trials [39], and few are prospective comparative studies. This may limit the value of the reported results on treatment of isolated PCL injuries.

The purpose of the present work was to analyze studies on treatment of isolated PCL injuries, and of PCL injuries combined with posterolateral corner (PLC) or posteromedial corner injuries, with respect to their methodological quality. To assess methodological limitations, we calculated a modified Coleman methodology score [10, 23] for each of the included studies. In this system, an optimal study will have a score of 100. Our main hypothesis was that studies with a high success rate would have a low Coleman methodology score. Finally, we wanted to examine whether the methodological quality has improved over time, and whether the Coleman methodology score correlates well with the level-of-evidence rating [57].

Materials and methods

We searched Medline Ovid, Cinahl and Embase on 12.12.2005, and Cochrane on 14.12.2005. In Medline, Cinahl and Cochrane we searched for “posterior cruciate ligament/” OR “(posterolateral corner OR posterolateral complex).mp”. Embase uses different medical subject heading (MeSH) terms, so we had to use a different search strategy: “((knee ligament/ OR knee cruciate ligament/) AND (PCL OR knee posterior cruciate ligament OR ligament knee posterior cruciate OR posterior cruciate ligament OR posterior cruciate ligament knee OR ligamentum cruciatum posterius).mp.) OR (posterolateral complex OR posterolateral corner).mp”. We limited our search to articles in English that were published in the period from 1985 to 2005. The search resulted in a total of 1,312 articles.

Selection criteria: We included studies with a primary aim to report the outcome after surgery or conservative treatment of isolated PCL injuries or PCL injuries as part of injuries to the posterolateral or posteromedial corner. Combined injuries involving the PCL and ACL were excluded, as were non-clinical studies, i.e., studies on animals and cadavers, biomechanical studies and in vitro studies. In order to be qualified, an article would have to have more than ten patients included, and to have been published in peer-reviewed journals.

Using these selection criteria, we first excluded papers based on the title of the abstract. This resulted in 210 abstracts being reviewed. Full-text versions were obtained if the decision to include or exclude could not be made from the abstract. If in doubt whether an article should be included, the senior author (LE) made the decision. We finally included 40 articles. These 40 articles were reviewed for methodological quality with the use of a version of the methodology score introduced by Coleman et al. [10], subsequently modified by Jakobsen et al. [23] to assess description of the rehabilitation program as well as compliance. The papers were divided between the two junior authors who did the preliminary scoring. Then the results were discussed with the senior author. In effect, each paper was scored by at least two of the authors.

The Coleman methodology score, which was originally developed to grade clinical studies on patellar and Achilles tendinopathy, assesses methodology with use of ten criteria, giving a total score between 0 and 100. A score of 100 indicates that the study largely avoids chance, various biases and confounding factors. The subsections that make up the Coleman methodology score are based on the subsections of the CONSORT statement (for randomized controlled trials) [2] but are modified to allow for other trial designs. We also assessed the studies using the level-of-evidence ratings introduced in the American volume of The Journal of Bone and Joint Surgery in 2003 [57] and later updated.

The clinical outcome scales used in the selected papers were collected. Furthermore, we also collected the reported clinical outcomes from each paper. If the data were reported, we collected the mean Lysholm score which was validated for knee ligament injury patients [30], and the percentage of patients with a score that equaled a good or excellent clinical outcome from the use of other scales. If several clinical outcome scales were used in a study, we used the Lysholm scale if available, then the IKDC scale [21]. If a study had two groups of patients (two surgical methods), we added the outcomes and reported the average result.
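The selection rule described above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the dictionary layout, the function name `select_outcome` and the use of an unweighted mean across patient groups are hypothetical choices made for illustration, and the example values are invented.

```python
from statistics import mean

# Scale priority as stated in the text: Lysholm first, then IKDC.
SCALE_PRIORITY = ["Lysholm", "IKDC"]


def select_outcome(study):
    """Pick one outcome per study.

    `study` maps a scale name to a list of group-level results
    (hypothetical structure: one entry per patient group).
    Returns (scale, averaged_result), or None if neither
    prioritized scale was reported.
    """
    for scale in SCALE_PRIORITY:
        if scale in study:
            # Assumption: groups are combined with an unweighted mean.
            return scale, mean(study[scale])
    return None


# Invented example: a study with two surgical groups and two scales.
example = {"IKDC": [80.0, 90.0], "Lysholm": [88.0, 92.0]}
chosen = select_outcome(example)
print(chosen)
```

How studies reporting only other scales were handled is not covered by this sketch; it returns `None` for them.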

The outcome was correlated with the total Coleman methodology score to assess the impact of methodology on the reported outcomes. We also correlated the Coleman methodology score with the year of publication to investigate trends in methodology over a period of time. Finally the Coleman methodology score was correlated with the level-of-evidence rating.

Statistical methods

The SPSS software (version 14.0; SPSS, Chicago, IL) was used to analyze the data. Not all data were normally distributed (Shapiro–Wilk test), and we therefore used both parametric (mean and standard deviation) and non-parametric (median and interquartile range) descriptive statistics. For the same reason we used both parametric (Pearson) and non-parametric (Spearman) correlation methods. We also performed linear regression analyses, both weighted and unweighted with respect to the number of patients included in each study. Parametric and weighted regression analysis did not markedly change the results of non-parametric methods, and we therefore only report the non-parametric correlations here. The Mann–Whitney test was used to test whether the outcomes from different kinds of therapy differed significantly.
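For readers unfamiliar with the non-parametric correlation used here, the following Python sketch shows how Spearman's ρ can be computed as the Pearson correlation of ranks, with tied values sharing their mean rank. It is an illustration only, not the SPSS procedure used in this review. The function names and the example data are hypothetical, and no p value is computed.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented data: publication year versus methodology score.
years = [1990, 1995, 1999, 2002, 2004, 2005]
scores = [35, 41, 40, 55, 62, 70]
rho = spearman_rho(years, scores)
print(round(rho, 3))
```

Because only the ordering of the values enters the calculation, the result is unaffected by non-normal distributions, which is the reason for preferring it in this setting.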

Results

After applying the selection criteria above, we finally included 40 articles that were concerned with the treatment of PCL injuries: 31 studies of surgical treatment [1, 5–9, 11, 13, 15–17, 20, 22, 24, 26, 28, 29, 32–37, 41–44, 49, 52–55], 8 studies of conservative treatment [4, 19, 27, 38, 46–48, 51], and 1 study comparing operative and non-operative treatment [49]. Of the studies, 17 only included isolated PCL injuries, 4 studies only included PCL injuries combined with PLC injuries, 13 studies included both isolated PCL injuries and injuries combined with PLC injuries, while 3 studies did not state this clearly. Of the 40 studies there was only one randomized controlled trial, and this study compared two surgical procedures [55]. The median number of patients included in each study was 27, and the 25th–75th percentile ranged from 19.0 to 39.8 patients. The median duration of follow-up was 40 months, and the 25th–75th percentile ranged from 27 to 60 months.

The average modified Coleman methodology score was 52.1 (95% confidence interval 47.7–56.6). The following four categories had the lowest scores: (1) study size, (2) type of study, (3) diagnostic certainty, and (4) procedure for assessing outcome. The average total Coleman methodology score and the average Coleman methodology score for each criterion are given in Table 1. No studies were rated as level of evidence I; 5 studies each were rated as level II and level III; 30 studies were rated as level IV. In Table 2 the distribution of the studies is given with regard to the different types of treatment, the different types of studies, and the level-of-evidence rating.

Table 1 Coleman methodology score for studies on treatment of PCL injuries
Table 2 Distribution and mean Coleman methodology score of the studies according to type of treatment, type of study and level of evidence rating

In 36 of the 40 studies it was possible to find the results reported as a Lysholm scale score, or possible to transform the results reported with other scales into a percentage of good or excellent results. The median Lysholm scale score (17 studies) was 90.3 and the 25th–75th percentile 85.5–91.8. The median percentage of good or excellent with use of other scales (26 studies) was 80.5%, and the 25th–75th percentile 68.9–89.9%.

When outcome results (the percentage of good or excellent) were analyzed with respect to conservative or surgical treatment, no significant difference was found. The Mann–Whitney test gave U = 42.5 and p = 0.915. Figure 1 shows that there were large variations in reported outcome within each treatment modality. Since only one conservative study reported results with the Lysholm scale, we did not compare the results of surgical and conservative treatment based on this scale.
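To make the test statistic concrete, the Python sketch below computes the Mann–Whitney U statistic by direct pairwise comparison. It is an illustration under stated assumptions: the function name and the example percentages are hypothetical, the p value is not computed, and the analysis in this review was performed in SPSS.

```python
def mann_whitney_u(group_a, group_b):
    """Return the smaller of the two U statistics for two independent samples.

    Each pair (a, b) contributes 1 when a > b and 0.5 when a == b.
    """
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # The two U statistics always sum to len(group_a) * len(group_b).
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b)


# Invented percentages of good or excellent outcome.
conservative = [60.0, 75.0, 86.0, 97.0]
surgical = [55.0, 70.0, 75.0, 80.0, 88.0, 92.0]
u = mann_whitney_u(conservative, surgical)
print(u)
```

With 4 conservative and 22 surgical studies, the two U statistics sum to 4 × 22 = 88, so the reported U = 42.5 lies close to the value of 44 expected when the two groups do not differ.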

Fig. 1 Box plot showing the percentage of good or excellent results for conservative and surgical treatment. Each box with bars shows the median, the quartiles and the minimum and maximum values. A percentage of good or excellent could be found in four conservative studies; the median was 80.6% and the 25th–75th percentile ranged from 64.3 to 95.3%. A percentage of good or excellent was found in 22 surgical studies. Among these, the median was 79.2% and the 25th–75th percentile ranged from 68.9 to 89.9%

We did not find a significant correlation when analyzing the Coleman methodology score with respect to the Lysholm scale score (17 studies, Spearman’s ρ = 0.19, p = 0.44), or with respect to the percentage of good or excellent (26 studies, Spearman’s ρ = 0.25, p = 0.23; Fig. 2).

Fig. 2 Percentage of good or excellent outcome for different Coleman methodology scores. There is no significant correlation between percentage of good or excellent and Coleman methodology score (Spearman’s ρ = 0.25, p = 0.23)

The Coleman methodology score correlated positively with the publication year (Spearman ρ = 0.64, p < 0.01; Fig. 3). The Coleman methodology score also correlated with the level-of-evidence rating (Spearman ρ = −0.42, p < 0.01). Because level I denotes the strongest evidence and level IV the weakest, the negative coefficient means that the higher the level of evidence, the higher the Coleman methodology score. However, Fig. 4 shows that the variations were large, especially within levels of evidence III and IV.

Fig. 3 Coleman methodology score for publications from 1995 to 2005. There was a significant correlation between Coleman methodology score and the year of publication (Spearman ρ = 0.64, p < 0.01)

Fig. 4 Box plot of the Coleman methodology score for each level of evidence. Each box with bars shows the median, quartiles, and minimum and maximum values. If the minimum or maximum values are more than 1.5 box lengths from the upper or lower edge of the box, these are instead illustrated by an “o” (outlier). The median Coleman methodology score for level of evidence II was 73 (25th–75th percentile 40.8–59.5), the median for level III was 49 (25th–75th percentile 40.5–59.5), and the median for level IV was 47 (25th–75th percentile 40.8–59.3)

We found 12 different scales used for clinical outcome assessment. The most frequently used scale was the one introduced by Lysholm and Gillquist [30], which was used in 20 studies. The scale introduced by the International Knee Documentation Committee (IKDC) [21] was the second most frequently used and was used in 17 studies. In some of the studies several scales were reported to have been used; however, not all of the results were reported in “Results”.

Discussion

Our main hypothesis in this review was that studies on treatment of isolated PCL injuries with a high success rate had methodological limitations. Several review articles on treatment of PCL injuries have been published [12, 14, 40, 56]; however, none of them has questioned the methodological qualities of the studies reviewed. Based on the findings in the present study, there are many reasons to question the conclusions made in the majority of the available studies in this field.

Two limitations of this study are the assumption made by the Coleman methodology score and the high number of outcome scales used by different authors. The Coleman methodology score actually assesses the quality of reporting, not the quality of the study, i.e., a high-quality study that is reported poorly would receive a low score. Unless the individual authors are contacted directly, this is an inherent weakness of all methodology scores as they do not necessarily reflect the true validity of the study, but are biased by the quality of reporting. The assumption of this review is that existing guidelines on how to report a clinical trial have been followed in all articles, and that the Coleman methodology score assessed from the article therefore reflects the quality of the study.

As for the second limitation, our initial search returned a high number of abstracts, indicating that we are not likely to have missed many studies in this field. However, as we limited the search to papers published in English, we may have missed articles published in other languages. None of the papers comparing two groups of patients undergoing different surgical repair techniques reported significant differences between the groups. We therefore report the averaged outcome from these studies [9, 22, 24, 29, 33, 36, 52, 55]. One study comparing PCL reconstruction alone with combined PCL and PLC reconstruction reported a significant difference [20]. However, the authors only reported a mean Lysholm score for the two groups. One study that compared conservative and surgical management [49] was not included in our outcome analysis, as it did not use the Lysholm scale and did not report outcome in percentage terms of good or excellent. Another study compared conservative treatment with surgical treatment, but reported results that could be transformed to percentage terms of good or excellent only for the surgical group [44]. For this reason, we analyzed this study as surgical.

A generally low methodological quality was found in the included papers based on the results of the Coleman score. However, literature reviews of the surgical intervention in patellar tendinopathy [10], Achilles tendinopathy [50] and cartilage injuries [23] found even lower methodological quality. None of the papers in the present review mentioned compliance with the rehabilitation protocol, and we therefore used Jakobsen’s modified version of the Coleman methodology score. The modification led to higher scores in the category of rehabilitation protocol in comparison with the two former reviews, which may partly explain the higher mean Coleman methodology score.

Four categories within the Coleman methodology score had distinct methodological limitations. Some of these were identical to those identified in the previously mentioned reviews of methodological quality [10, 23, 50]. The category “type of study” scored particularly low. Among the included papers there was only one randomized controlled trial [55]. It compared two different surgical procedures. The randomization, however, was not well described. Furthermore, there were only three prospective cohort studies [22, 36, 52] that compared different surgical techniques. This indicates that randomized controlled trials are required, especially when comparing operative and non-operative treatment. In cases where a randomized controlled design is not feasible, the study should be prospectively constructed, taking into account as many of the features of a randomized controlled trial as possible.

The majority of studies in our sample had very few included patients. Together with the low incidence of isolated PCL injuries [18, 46], this demonstrates a need for a multicenter approach. Using this kind of approach would also make it possible to design and perform a randomized controlled trial to investigate a possible difference between surgical and conservative treatment.

The category of diagnostic certainty had limitations as well. Diagnostic uncertainty about the type and grade of injury might make the reported outcome unreliable. In order to confirm the diagnosis of an isolated PCL injury or a PLC injury, one should perform an MR and stress radiography [3, 25]. In the acute phase a diagnostic arthroscopy would add to the value of the diagnosis, whereas arthroscopy in the chronic case can be misleading.

Finally, we found limitations regarding outcome assessment. The patient’s relationship with the investigator might affect the neutrality of the patient. Outcome assessment should be done by an independent investigator to avoid observer bias, and ideally the patient should complete this in a written form without investigator assistance to minimize the risk of response bias [2].

Our study detected no difference between conservative and surgical treatment of isolated injury to the PCL. Currently, no published randomized controlled trials compare these two treatment options. One retrospective comparative study included in our review did not find a difference between operative and non-operative treatment [49]. Petrigliano et al.’s [40] systematic review from 2006 concludes that there was no difference. Our analysis therefore agrees with their statements. However, it is important to be aware that this systematic review is not a meta-analysis of well-done randomized controlled trials. Our main purpose was not to make any conclusion about the best treatment option, but to draw attention to the fact that outcomes are highly variable within both treatment modalities. It should be noted that there are several limitations to our comparison of conservative and surgical management. First, our result is based on 26 studies that report outcome transformable into percentage of good or excellent, and only 4 of these are studies on conservative treatment. Secondly, since the studies report outcome with different scoring systems, it is difficult to compare them, even though we transformed all the results into percentages of good or excellent. If a common, validated scale for clinical measurements was constructed for PCL injuries, the comparison of outcomes in different studies would be easier and more reliable. Thirdly, the two treatment groups may not be equal or comparable. For example, the conservative group might involve grade I, II and III injuries, while the surgical group only contains patients with grade III injuries. Furthermore, perhaps PCL injuries combined with PLC injuries are always treated surgically, but conservatively if they are isolated. And finally, a different timing of follow-up and results assessment could affect the comparison. 
The only way to create groups that are as equal as possible is to randomize patients to conservative or surgical management. For a more reliable comparison of the two treatment options, randomized controlled trials are strongly needed.

No significant correlation between outcome results and Coleman methodology score was detected. This is in agreement with another review that used the Coleman methodology score to assess methodological limitations [23]. On the other hand, the finding is contrary to two other corresponding reviews [10, 50]. This difference may be due to the heterogeneity of the studies, and the large diversity of outcome measurement scales used, which when combined could conceal a possible correlation.

The Coleman methodology score correlated well with the level-of-evidence rating. There was great variance in the Coleman methodology score within each level of evidence, but the variance was progressively smaller with higher levels of evidence. Hence the reader can be fairly confident that if a study receives a high level-of-evidence rating, the methodological quality of the study is good. On the other hand, the reader should still be aware that the level-of-evidence rating does not take into account all areas of sound study design. One suggestion to improve the rating would be to include a detailed methodology score in the submission process with a scoring of each subcriterion published online. This would not only enable readers to evaluate the methodology more thoroughly, but it would also serve as a guideline for authors when designing and reporting a study, and thereby hopefully increase the overall awareness regarding methodological quality.

The Coleman methodology score correlated positively with the year of publication. This implies that the methodological quality has improved. Other reviews on different areas within orthopedic treatment have reported corresponding findings [10, 23, 50].

In conclusion, the generally low methodological quality of all of the studies included in this review shows that caution is required when interpreting results after management of injury to the PCL and when recommending treatment to patients. Firm recommendations on what kind of treatment to choose cannot be given at this time on the basis of these studies. Clinicians should pay more attention to established guidelines [2] when designing, conducting and reporting trials, to improve the methodological quality. Journals could include a detailed methodology score in their submission process to encourage clinicians to focus on sound methodology.

In order to find reliable answers regarding what treatment to recommend, there is a need for more studies on management of PCL injuries, and for more patients to be included in each individual study. We believe that a multicenter approach may be needed to make it possible to construct and perform randomized controlled trials with adequate statistical power to detect differences between treatments.

We propose the following guidelines for future studies on the basis of the findings in the present review:

  1. Studies should be prospective with a clearly defined hypothesis and one clearly defined primary end point. They should be randomized controlled trials with an adequate randomization procedure and power analysis for the primary end point. Secondary end points should only be used as supportive evidence to the primary hypothesis.

  2. To improve diagnostic certainty, all patients should have an MR and stress-radiography assessment in addition to a clinical examination.

  3. Detailed rehabilitation protocols should be established and reported. Compliance should be monitored. The protocols should be applied in a standardized manner to both patient cohorts.

  4. The timing of the outcome assessment should be clearly stated. The results from various time-points after surgery should not be reported as one outcome. The assessments should be both clinical and functional. The minimum duration of follow-up should be more than 24 months.

  5. The outcome assessment should be made by a truly independent investigator. The assessment should be in a written form and ideally be completed by the patient without investigator assistance.

  6. The patient inclusion and exclusion criteria should be clearly established and reported. The recruitment rate should be reported, and attempts should be made to account for eligible patients who are not included and those who are lost to follow-up.

  7. The outcome measure should be validated for use on patients with PCL injuries.

No commercial entity paid or directed, or agreed to pay or direct, any benefits to any research fund, foundation, educational institution, or other charitable or non-profit organization with which the authors are affiliated or associated.