Introduction

Articular hyaline cartilage is an avascular tissue that covers the bone surfaces in a joint, providing low-friction motion and weight-bearing capacity. It is usually less than 3 mm thick and is prone to trauma and degeneration (Smith et al. 2005; Chiang and Jiang 2009). Joints with cartilage defects usually present with pain, articular effusion and crepitus that are hard to relieve. Because it lacks access to abundant nutrients and circulating progenitor cells, articular cartilage has a poor capacity for self-repair after injury, and defects readily progress to osteoarthritis with pain and instability (Huey et al. 2012). Cartilage defects therefore often require surgical intervention to restore function.

Surgical interventions in the knee can be classified as reparative or reconstructive. Reparative surgery includes microfracture, drilling and abrasion arthroplasty, whereas reconstructive surgery comprises allograft transplantation, osteochondral autograft transplantation (OAT) and autologous chondrocyte implantation (ACI) (Stroh et al. 2011). According to clinical trials and review articles, reparative surgery is more effective than reconstructive surgery for smaller defects (<100 mm²), while reconstructive surgery performs better for large defects (>100 mm²) (Perera et al. 2012; Smith et al. 2005). Osteochondral autograft transplantation and autologous chondrocyte implantation are two popular methods for treating large cartilage defects.

Osteochondral autograft transplantation, also known as osteochondral cylinder transplantation or mosaicplasty, was first described in the 1990s (Matsusue et al. 1993; Hangody and Karpati 1994). It uses osteochondral grafts taken from lighter-load-bearing areas of the patient’s own joint to fill focal defects (Hangody et al. 1998). Autologous chondrocyte implantation was introduced in the same period (Brittberg et al. 1994) but is an entirely different technique: chondrocytes are harvested from the patient’s joint and expanded in vitro before reimplantation (Chilelli et al. 2014). Which method gives better outcomes for large cartilage defects remains controversial. This meta-analysis therefore aimed to compare the efficacy of osteochondral autograft transplantation and autologous chondrocyte implantation in the treatment of large cartilage defects of the knee.

Materials and methods

Search strategy

The literature search was performed on November 5, 2014 in the following electronic databases: PubMed (1966 to November 2014), OVID (1974 to November 2014), The Cochrane Library (Issue 10 of 12, October 2014) and SinoMed (1978 to November 2014). The search terms and Boolean operators were as follows: [(osteochondral autograft transplantation or osteochondral autologous transplantation or mosaicplasty or osteochondral cylinder transplantation) and (autologous chondrocyte implantation or autologous chondrocyte transplantation or matrix-assisted chondrocyte implantation)]. No restrictions on language or article type were applied. The eligibility criteria are presented in Table 1.

Table 1 Eligibility criteria

Study selection and data collection

After duplicates were excluded, two reviewers independently screened the titles and abstracts of the retrieved studies to determine whether each article might meet the eligibility criteria. Studies that clearly failed to meet the criteria were excluded. The remaining studies were screened in full text with the authors, institutions, journals and results concealed. Studies on which the two reviewers disagreed at either stage were assessed by a third reviewer, who decided whether they were included.

Each included randomized controlled trial was assessed by two reviewers for study quality with respect to randomization, allocation concealment, blinding, drop-out and outcomes. For each eligible study, the following information was extracted: first author’s name, year and country of publication, study design, follow-up period, setting, mean age and defect size of patients and postoperative functional assessment. We attempted to contact the authors by email for data not presented in the papers.

Statistical analysis

The information extracted from each study was converted into dichotomous data describing the postoperative outcomes. The data were expressed as risk ratios (RR) with 95 % confidence intervals (CIs) and were combined by the Mantel-Haenszel method to test significance and homogeneity. Statistical heterogeneity was evaluated with the χ² test and the I² statistic. The pooled effect was calculated with a fixed-effect model when there was no significant heterogeneity (P > 0.10 and I² < 50 %); otherwise a random-effects model was used.
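The pooling described above can be illustrated with a minimal sketch. It assumes the standard Mantel-Haenszel risk-ratio estimator with the Greenland-Robins variance for the confidence interval, and Cochran's Q with inverse-variance weights for heterogeneity; the exact routines of the software used for this analysis are not stated in the text, so the details may differ. The function names and the example counts are hypothetical, and the sketch assumes that no study has zero events or all events in an arm.

```python
import math


def mantel_haenszel_rr(studies, z_crit=1.96):
    """Pool risk ratios with the Mantel-Haenszel method.

    studies: list of (events_1, total_1, events_2, total_2) tuples.
    Returns (pooled RR, lower CI bound, upper CI bound).
    """
    num = den = var_num = 0.0
    for a, n1, c, n2 in studies:
        n = n1 + n2
        num += a * n2 / n          # numerator of the MH risk ratio
        den += c * n1 / n          # denominator of the MH risk ratio
        # Greenland-Robins variance component for ln(RR_MH)
        var_num += (n1 * n2 * (a + c) - a * c * n) / n ** 2
    rr = num / den
    se = math.sqrt(var_num / (num * den))
    return rr, rr * math.exp(-z_crit * se), rr * math.exp(z_crit * se)


def heterogeneity(studies):
    """Return Cochran's Q, its degrees of freedom and I^2 (in percent)."""
    logs, weights = [], []
    for a, n1, c, n2 in studies:
        logs.append(math.log((a / n1) / (c / n2)))
        # Inverse of the usual large-sample variance of ln(RR)
        weights.append(1.0 / (1 / a - 1 / n1 + 1 / c - 1 / n2))
    pooled = sum(w * y for w, y in zip(weights, logs)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, logs))
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


# Hypothetical counts for illustration only (NOT data from the included trials):
# (events in arm 1, total in arm 1, events in arm 2, total in arm 2)
example = [(12, 40, 18, 42), (7, 20, 9, 20), (5, 18, 4, 22)]
rr, lower, upper = mantel_haenszel_rr(example)
q, df, i2 = heterogeneity(example)
print(f"RR = {rr:.2f} (95 % CI {lower:.2f}, {upper:.2f}); Q = {q:.2f}, df = {df}, I2 = {i2:.0f} %")
```

The sketch reports Q and I² only; the P value of the χ² test, which the model-selection rule above also requires, would be obtained by referring Q to a χ² distribution with df degrees of freedom.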

Results

Study selection (Fig. 1)

The initial literature search identified 850 potentially relevant studies, of which 108 were duplicates. After screening of titles, abstracts and full texts against the eligibility criteria, 5 studies were included: Bentley et al. (2003, 2012), Dozin et al. (2005), Horas et al. (2003), Lim et al. (2012) (Table 2). The kappa value was 0.828 for title and abstract screening and 0.876 for full-text screening. The two reviewers disagreed on Dozin et al. (2005), which was included after discussion. All the studies were randomized controlled trials with evidence level II (as declared by the journal in which each study was published). Two studies reported on the same cohort with different follow-up periods (Bentley et al. 2003, 2012).
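For readers unfamiliar with the kappa statistic reported above, the following minimal sketch shows how Cohen's kappa is computed for two reviewers making include/exclude decisions. The function name and the counts are hypothetical; the reviewers' actual screening counts are not given in the text, so the example does not reproduce the reported values.

```python
def cohens_kappa(both_include, only_a, only_b, both_exclude):
    """Cohen's kappa for two reviewers making include/exclude decisions.

    both_include: studies both reviewers included
    only_a:       studies only reviewer A included
    only_b:       studies only reviewer B included
    both_exclude: studies both reviewers excluded
    """
    n = both_include + only_a + only_b + both_exclude
    observed = (both_include + both_exclude) / n
    a_include = (both_include + only_a) / n
    b_include = (both_include + only_b) / n
    # Agreement expected by chance from each reviewer's marginal rates
    expected = a_include * b_include + (1 - a_include) * (1 - b_include)
    return (observed - expected) / (1 - expected)


# Hypothetical screening counts, for illustration only
print(round(cohens_kappa(20, 5, 10, 15), 3))
```

Kappa corrects the raw agreement rate for the agreement expected by chance, which is why it is preferred over simple percentage agreement for screening decisions.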

Fig. 1 Process of selection of eligible studies

Table 2 Characteristics of the studies included in the meta-analysis

Study characteristics

Bentley et al. (2003) enrolled 100 patients, 96 of whom had osteochondral defects on the medial or lateral femoral condyle or the patella. Clinical results at 12 months were graded as excellent (>80), good (55–79), fair (30–54) or poor (<30) on the modified Cincinnati rating system. This study also used arthroscopy to assess the results in 60 patients according to the International Cartilage Repair Society (ICRS) grading system. No patients were lost to follow-up.

Bentley et al. (2012) followed the same cohort as Bentley et al. (2003); 6 patients were lost to follow-up (5 ACI patients and 1 OAT patient). This study reported the 5-year failure rate (failure was defined as a poor result on the modified Cincinnati rating system), and 63 patients underwent arthroscopy to evaluate the outcomes.

Dozin et al. (2005) was a multicenter trial conducted in 5 orthopedic units; 47 patients were registered and 3 withdrew. Results were reported on the Lysholm Knee Scoring Scale (LKSS). Follow-up at 12 months was poor (18 of 44, 41 %), so the outcomes of 37 patients were categorized into four classes: LKSS < 60, LKSS = 69–90, LKSS > 90 and subjective improvement.

Horas et al. (2003) included 40 patients with lesions on the medial or lateral femoral condyle. Seven of the twenty ACI patients and four of the twenty OAT patients had previously undergone arthroplasty or spongialization. Postoperative results were presented as enumeration data (Lysholm, Tegner and Meyers scores) and as ranked data based on the chief complaint. No participants were lost to follow-up.

Lim et al. (2012) investigated 30 knees treated with microfracture, 22 with OAT and 18 with ACI. The groups were compared postoperatively by Lysholm score, Tegner score, Hospital for Special Surgery score, modified Outerbridge cartilage grade on MRI and International Cartilage Repair Society (ICRS) grade on arthroscopy. MRI was performed in 36 of 40 (90 %) and arthroscopy in 32 of 40 (80 %), and the results were ranked as excellent (1), good (2), fair (3) or poor (4).
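The grade bands reported for the modified Cincinnati rating system in Bentley et al. (2003) can be turned into the dichotomous counts used in a risk-ratio comparison roughly as follows. This is a sketch under stated assumptions: the function names and scores are hypothetical, and because the reported bands (>80 and 55–79) leave a score of exactly 80 unassigned, the sketch arbitrarily treats it as excellent.

```python
def cincinnati_grade(score):
    """Map a modified Cincinnati score to a grade.

    Bands as reported for Bentley et al. (2003): >80, 55-79, 30-54, <30.
    ASSUMPTION: a score of exactly 80 is treated as "excellent", because
    the reported bands do not assign it to any grade.
    """
    if score >= 80:
        return "excellent"
    if score >= 55:
        return "good"
    if score >= 30:
        return "fair"
    return "poor"


def dichotomize(scores):
    """Count the two dichotomous outcomes for one treatment arm."""
    grades = [cincinnati_grade(s) for s in scores]
    return {
        "excellent_or_good": sum(g in ("excellent", "good") for g in grades),
        "poor": sum(g == "poor" for g in grades),
        "total": len(grades),
    }


# Hypothetical scores for one treatment arm, for illustration only
print(dichotomize([85, 60, 40, 20, 10]))
```

Each arm then contributes an event count and a total to the comparison, once for excellent or good results and once for poor results.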

Meta-analysis

The outcome measures differed among the studies and we received no satisfactory responses from the authors by email, so we categorized the results into four grades: excellent, good, fair and poor (Tables 3 and 4). Four studies (Bentley et al. 2003; Dozin et al. 2005; Horas et al. 2003; Lim et al. 2012) were first included in the comparison of excellent or good results (Table 3 lists the outcome assessments used). Figure 2 shows that the overall effect was not significant (P = 0.06). Figure 3 shows the same comparison with Dozin et al. (2005) excluded because of its poor follow-up and unsatisfactory outcome assessments; the pooled RR was 0.85 with a 95 % confidence interval of (0.73, 1.00). In these two comparisons, neither ACI nor OAT produced significantly more satisfactory outcomes at about one year. We then included five studies (Bentley et al. 2003, 2012; Dozin et al. 2005; Horas et al. 2003; Lim et al. 2012) in the comparison of postoperative poor results between the two therapies (Table 4 lists the outcome assessments used). Bentley et al. (2003) and Bentley et al. (2012) were never entered into the same comparison because they reported on the same cohort at different follow-up periods. Figures 4 and 5, which included Bentley et al. (2003) with and without Dozin et al. (2005), showed no significant effect (P = 0.12 and 0.19, respectively). Lastly, we replaced Bentley et al. (2003) (1-year follow-up) with Bentley et al. (2012) (5-year follow-up) (Figs. 6, 7), and the overall effect was significant when Dozin et al. (2005) was excluded. In Fig. 7, the overall RR was 2.57 with a 95 % confidence interval of (1.09, 6.07), indicating that OAT had statistically poorer outcomes than ACI.
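As a rough consistency check on the Fig. 7 result, the standard error, z statistic and two-sided P value can be back-calculated from the reported RR and confidence interval. The sketch below assumes the interval is symmetric on the log scale and based on a normal approximation; the function name is hypothetical, and because the inputs are rounded to two decimals the output is only approximate.

```python
import math


def p_from_rr_ci(rr, lower, upper, z_crit=1.96):
    """Back-calculate SE of ln(RR), z and two-sided P from an RR and its 95 % CI.

    ASSUMPTION: the CI is exp(ln(RR) +/- z_crit * SE), i.e. symmetric
    on the log scale under a normal approximation.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(rr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    return se, z, p


# Values reported for Fig. 7: RR 2.57, 95 % CI (1.09, 6.07)
se, z, p = p_from_rr_ci(2.57, 1.09, 6.07)
print(f"SE = {se:.3f}, z = {z:.2f}, P = {p:.3f}")
```

With these rounded inputs the implied P value is roughly 0.03, which agrees with a confidence interval that excludes 1.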

Table 3 Characteristics of the studies for the comparison of excellent or good results

Table 4 Characteristics of the studies for the comparison of poor results

Fig. 2 Comparison of excellent or good results

Fig. 3 Comparison of excellent or good results (without Dozin et al. 2005)

Fig. 4 Comparison of poor results (with Bentley et al. 2003)

Fig. 5 Comparison of poor results (with Bentley et al. 2003 and without Dozin et al. 2005)

Fig. 6 Comparison of poor results (with Bentley et al. 2012)

Fig. 7 Comparison of poor results (with Bentley et al. 2012 and without Dozin et al. 2005)

Discussion

OAT and ACI are two effective surgical treatments for large cartilage defects, which are usually larger than 100 mm² (Perera et al. 2012).

Technically, OAT requires multiple cylindrical autogenous osteochondral plugs, usually harvested from a less weight-bearing area of the joint, which are arranged as a mosaic to fill the lesion and resurface the area. Because the supply of autograft is limited, it is usually used for defects smaller than 400 mm², and complications related to the cylindrical cutting devices may appear years later. It also has limitations such as the absence of fill between the plugs and differing orientations of the plug surfaces (Bedi et al. 2010; Robert 2011; Bekkers et al. 2009). On arthroscopic and histological examination, a smooth surface with hyaline or hyaline-like cartilage could be seen in most OAT cases, but some patients had a circular gap between the transplanted and the surrounding resident cartilage (Bentley et al. 2003; Horas et al. 2003).

ACI is a more complicated procedure than OAT because it requires at least two operations. First, the joint is evaluated by arthroscopy to assess the injury, and a full-thickness cartilage biopsy is taken for chondrocyte culture and expansion (Smith et al. 2005). The harvesting injury is much smaller than in OAT, but ACI takes more time and costs more. Because the chondrocytes are expanded in culture, ACI is usually used for defects over 400 mm² (Bekkers et al. 2009). Various implantation techniques exist, such as periosteum-covered ACI, collagen-covered ACI, matrix-induced ACI and ACI within a 3D scaffold (Harris et al. 2011; Brittberg 2010; Marlovits et al. 2006). Among the included studies, three used periosteum-covered ACI, while the other two (Bentley et al. 2003, 2012) used ACI covered with a porcine collagen membrane or periosteum. In most ACI cases, the repair tissue consisted of fibrous tissue or fibrocartilage with monodirectional collagen bundles, the surface was rippled or rough, and in some cases the tissue had overgrown the level of the surrounding cartilage. Only a few patients had hyaline or hyaline-like cartilage (Horas et al. 2003; Bentley et al. 2003; Roberts et al. 2009; Huey et al. 2012).

In recent years, several systematic reviews have discussed the outcomes of these surgeries, but because of the small number of studies and high heterogeneity, no firm conclusion was drawn (Goyal et al. 2014; Bekkers et al. 2009; Vasiliadis and Wasiak 2010). This meta-analysis included five randomized controlled trials of OAT and ACI, two of which reported on the same cohort, and one more study than the previous reviews (Bentley et al. 2012). Across these five studies, various outcome assessments (Lysholm score, Meyers score, Tegner activity score, Hospital for Special Surgery score, International Knee Documentation Committee Scale, modified Outerbridge cartilage grades using MRI, International Cartilage Repair Society repair grade using arthroscopy) and types of data (quantitative and qualitative) were used, and no unified criteria or data were available. We attempted to contact the authors for detailed data but received no positive responses. Consequently, we used the ranked data extracted from the original articles. Bentley et al. (2003) used the modified Cincinnati rating system in 100 patients (100 %) and the International Cartilage Repair Society (ICRS) grading system in 60 patients (60 %), so we preferred the former for the comparison. The same was true of Bentley et al. (2012), in which 94 patients (94 %) were assessed by the Cincinnati rating system but only 63 patients (63 %) by ICRS. In Lim et al. (2012), the available outcomes were presented by MRI (36 of 40, 90 %) and arthroscopy (32 of 40, 80 %); we considered the arthroscopic results more precise and credible. Dozin et al. (2005) and Horas et al. (2003) offered no other ranked data for comparison. In the quality assessment of the included studies, we found that Dozin et al. (2005) was a multicenter trial with different operators and techniques, poor follow-up, unsatisfactory outcome assessments and a smaller mean defect size (192.5 mm²). We therefore repeated the meta-analysis without Dozin et al. (2005) to reduce its influence.

The outcomes pooled from each study were graded as excellent, good, fair or poor. In the comparisons with 1–2 years of follow-up, whether of good or of poor results, there was no significant difference between OAT and ACI. However, when we replaced Bentley et al. (2003) with Bentley et al. (2012), which had a longer follow-up, OAT showed significantly poorer results. In the short term, therefore, we cannot tell which technique has the better outcomes. Both techniques can cover large cartilage defects and relieve symptoms markedly, but their disadvantages emerge over time. In OAT, donor-site injury from autograft harvesting, absence of fill and differences in plug orientation may affect later outcomes, and patients cannot readily undergo another OAT operation. In contrast to OAT, ACI has great potential to develop into a more effective technique as tissue engineering advances, and it can be performed repeatedly in the same patient.

From the six comparisons of excellent or good results and of poor results, we may reach the preliminary conclusion that outcomes do not differ significantly between ACI and OAT at short-term follow-up, but that patients treated with OAT may be more likely than those treated with ACI to have worse outcomes in the long term.

This study has some limitations. Although all the included studies were RCTs, their quality was not satisfactory (level II) and the results might have been influenced by placebo effects. Moreover, the outcome measures differed among the studies, and we had to categorize the quantitative data into crude grades for comparison. Differences in patient selection, operative technique and rehabilitation program might also have contributed to heterogeneity. This heterogeneity made it difficult to reach a strong conclusion, so more high-quality randomized controlled trials and other clinical trials with unified criteria, standardized surgical techniques and long-term follow-up are urgently needed.