Introduction

Hodgkin lymphoma (HL) is a malignancy of the lymphatic system that can be cured, even at an advanced stage [1]. Major advances have been achieved in recent years in the treatment of patients with advanced HL [2].

For a long time, alternating polychemotherapy with cyclophosphamide, vincristine, procarbazine, and prednisone (COPP) and the combination of doxorubicin, bleomycin, vinblastine, and dacarbazine (ABVD) was considered the standard treatment for advanced-stage HL patients [3]. Several investigators subsequently sought more effective treatments, and a more aggressive regimen was developed based on the combination of bleomycin, etoposide, doxorubicin, cyclophosphamide, vincristine, procarbazine, and prednisone (BEACOPP) [4]. Because of its better response rate and higher efficacy, many groups have adopted BEACOPP as the new standard for advanced HL [5, 6]. Recent reports have shown comparable survival rates and lower tumor development for HL patients who underwent BEACOPP [7, 8]. Furthermore, the study by Bauer et al. provided an evidence-based assessment of its advantages and disadvantages [9]. Skoetz and colleagues proposed that six cycles of escalated BEACOPP were the best initial treatment strategy, based on a 10 % advantage over ABVD in overall survival (OS) [10]. To identify the best treatment for HL patients, long-term clinical outcomes and other indexes, including complete remission (CR) and progression-free survival (PFS), should be taken into consideration.

Therefore, this study was designed to compare ABVD and BEACOPP in terms of short- and long-term clinical outcomes by systematically pooling long-term survival benefits, in order to identify the better treatment for HL. Indexes associated with clinical efficacy, including CR rate and PFS beyond 5 years, were included in the meta-analysis of ABVD versus BEACOPP.

Materials and methods

Search strategy

Electronic publications were searched from inception to December 2015 in databases including Medline, PubMed, and Embase. The following keywords were used for the search: “ABVD” AND “BEACOPP” AND (“Hodgkin lymphoma” OR “HL”). Two authors independently screened the articles identified from these sources for potentially relevant studies.

Selection criteria

Randomized controlled trials (RCTs) on patients with previously untreated and histologically confirmed early unfavorable or advanced-stage HL were included. Moreover, included studies had to meet the following criteria: (1) participants were newly diagnosed with early unfavorable or advanced-stage Hodgkin lymphoma; (2) participants in the control group were treated with COPP/ABVD, and participants in the experimental group were treated with BEACOPP; (3) OS and PFS were reported in the article.

Non-original articles, such as reviews, reports, and letters, were excluded. Original articles that did not provide sufficient data were also excluded.

Data extraction and quality assessment

Data were extracted from the studies by two review authors using a pre-designed extraction form. The following information was extracted: the first author and year of publication, location, sample size, length of follow-up, recruitment period, intervention, and outcomes, including OS and PFS. Disagreements were resolved through discussion with a third reviewer until agreement was reached.

Two reviewers independently assessed literature quality. The Jadad scale was applied to evaluate the selected literature with a 5-point scoring system: random assignment (+1 point), indication of random method (+1 point), double blind (+1 point), indication of double-blind method (+1 point), and mention of subjects who quit or were lost (+1 point) [11]. In this analysis, studies were classified as “low quality” if the score was 0–2, and as “high quality” if the score was 3–5.
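
The scoring rule above can be sketched in a few lines of Python. This is only an illustration of the additive five-item scheme as described in this section; the class and function names are hypothetical, and the example trial is invented rather than taken from the enrolled studies.

```python
from dataclasses import dataclass


@dataclass
class JadadItems:
    """The five yes/no items of the scoring scheme described above."""
    randomized: bool             # random assignment
    randomization_method: bool   # random method indicated
    double_blind: bool           # double blind
    blinding_method: bool        # double-blind method indicated
    withdrawals_reported: bool   # subjects who quit or were lost are mentioned


def jadad_score(items: JadadItems) -> int:
    """Add one point per satisfied item, giving a score from 0 to 5."""
    return sum([
        items.randomized,
        items.randomization_method,
        items.double_blind,
        items.blinding_method,
        items.withdrawals_reported,
    ])


def quality_label(score: int) -> str:
    """Classify as in this analysis: 0-2 is low quality, 3-5 is high quality."""
    if not 0 <= score <= 5:
        raise ValueError("Jadad score must be between 0 and 5")
    return "high quality" if score >= 3 else "low quality"


# Hypothetical open-label randomized trial (not one of the enrolled studies).
example = JadadItems(True, True, False, False, True)
print(jadad_score(example), quality_label(jadad_score(example)))
```

Under this scheme, an open-label trial can reach at most 3 points, which is exactly the threshold for "high quality".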

Statistical analysis

Odds ratio (OR) with the corresponding 95 % confidence interval (CI) was chosen as the effect size to compare ABVD with BEACOPP treatment. Heterogeneity among individual studies was assessed with Cochran’s Q statistic and the I² statistic [12]. If the P value was <0.05 (Q statistic) and/or I² was >50 %, heterogeneity was considered significant and the random effects model was selected. Otherwise, the fixed effects model was used.
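
As a rough illustration of this decision rule, the following Python sketch pools odds ratios from 2x2 tables, computes Cochran's Q and I², and switches to a random effects model when P < 0.05 or I² > 50 %. It is not the RevMan computation used in this study: inverse-variance weighting and the DerSimonian-Laird estimator are assumptions made for the sketch, the function names are hypothetical, and the counts are invented.

```python
import math


def log_odds_ratio(events_a, total_a, events_b, total_b):
    """Log OR (arm A vs arm B) and its variance; 0.5 is added if any cell is zero."""
    a, b = events_a, total_a - events_a
    c, d = events_b, total_b - events_b
    if 0 in (a, b, c, d):
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def chi2_sf(x, df):
    """Upper tail of the chi-square distribution for integer df >= 1,
    built from Q(a+1, t) = Q(a, t) + t^a e^-t / Gamma(a+1) with t = x/2."""
    if x <= 0:
        return 1.0
    t = x / 2
    if df % 2 == 0:
        a, q = 1.0, math.exp(-t)               # Q(1, t) = e^-t
    else:
        a, q = 0.5, math.erfc(math.sqrt(t))    # Q(1/2, t) = erfc(sqrt(t))
    while a < df / 2:
        q += t ** a * math.exp(-t) / math.gamma(a + 1)
        a += 1
    return q


def pool_odds_ratios(tables, alpha_q=0.05, i2_cutoff=50.0):
    """Pool 2x2 tables; choose fixed or random effects from Q and I²."""
    if len(tables) < 2:
        raise ValueError("need at least two studies")
    effects = [log_odds_ratio(*t) for t in tables]
    y = [e[0] for e in effects]
    v = [e[1] for e in effects]
    w = [1.0 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1
    p_q = chi2_sf(q, df)
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    model = "fixed"
    if p_q < alpha_q or i2 > i2_cutoff:
        # DerSimonian-Laird between-study variance (an assumption of this sketch)
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - df) / c)
        w = [1.0 / (vi + tau2) for vi in v]
        model = "random"
    est = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    se = math.sqrt(1.0 / sum(w))
    return {
        "model": model, "Q": q, "P": p_q, "I2": i2,
        "OR": math.exp(est),
        "CI": (math.exp(est - 1.96 * se), math.exp(est + 1.96 * se)),
    }


# Invented counts: (events in arm A, n in arm A, events in arm B, n in arm B)
hypothetical_tables = [(60, 100, 75, 100), (150, 200, 170, 200), (40, 80, 60, 80)]
print(pool_odds_ratios(hypothetical_tables))
```

Which arm is "A" and what counts as an event determine the direction of the OR, so both have to be fixed before pooling; the sketch leaves that choice to the caller.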

Patients in the HD9 trial received eight cycles of either baseline BEACOPP or escalated BEACOPP. In contrast, patients in the other three trials received short courses of escalated BEACOPP combined with standard BEACOPP. Thus, a subgroup analysis excluding the study by Engert et al. was also performed [13]. Finally, a sensitivity analysis was performed to confirm the robustness of the results by omitting one study at a time. All statistical analyses were carried out using RevMan 5.2.
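
The leave-one-out idea can be sketched as follows. This is a simplified illustration, not the RevMan procedure used here: it assumes inverse-variance fixed-effect pooling, recovers each study's standard error from its reported 95 % CI, and uses invented study labels and values.

```python
import math


def summarize(studies):
    """Pooled OR and I² (%) for rows of (label, OR, ci_low, ci_high)."""
    y = [math.log(s[1]) for s in studies]
    # Standard error on the log scale, recovered from the width of the 95 % CI
    se = [(math.log(s[3]) - math.log(s[2])) / (2 * 1.96) for s in studies]
    w = [1.0 / e ** 2 for e in se]
    pooled = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, y))
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return math.exp(pooled), i2


def leave_one_out(studies):
    """Recompute the pooled OR and I² with each study omitted in turn."""
    results = []
    for i, omitted in enumerate(studies):
        remaining = studies[:i] + studies[i + 1:]
        pooled_or, i2 = summarize(remaining)
        results.append((omitted[0], pooled_or, i2))
    return results


# Invented values for illustration only; "Study D" is a deliberate outlier.
hypothetical_studies = [
    ("Study A", 0.70, 0.45, 1.09),
    ("Study B", 0.65, 0.40, 1.06),
    ("Study C", 0.75, 0.50, 1.13),
    ("Study D", 0.30, 0.20, 0.45),
]

for label, pooled_or, i2 in leave_one_out(hypothetical_studies):
    print(f"omit {label}: OR = {pooled_or:.2f}, I2 = {i2:.0f} %")
```

A study whose omission sharply lowers I² while leaving the direction of the pooled OR unchanged is a likely source of heterogeneity rather than a driver of the conclusion.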

In addition, because the definitions of treatment-related toxicity differed among studies, toxicity associated with the treatment strategies could not be meta-analyzed. Instead, we reviewed the adverse events in patients treated with BEACOPP versus ABVD.

Results

Characteristics of the studies

The study selection process is shown as a flow chart in Fig. 1. A total of 796 articles were identified in the initial retrieval: 155 from PubMed, 393 from Embase, and 248 from Wiley. Then, 568 duplicate articles and 181 irrelevant papers were excluded after review of titles and abstracts. Of the remaining 47 articles, 28 were excluded because they did not compare ABVD with BEACOPP, were non-original, or were not RCTs. Finally, 7 articles reporting 4 trials were included in the meta-analysis [4, 7, 8, 13–16].

Fig. 1 Flow diagram of study selection process

The general characteristics of each study are provided in Table 1. All enrolled studies were carried out in Europe: three studies accounting for two trials were conducted in Italy, one in France, and three in Germany. Enrolled patients had clinical stage IIB, III, or IV disease and were older than 15 years. The median follow-up for the entire group of patients ranged from 61 months to 10 years. Moreover, the sample size of the individual studies varied widely, from 150 to 1195.

Table 1 Characteristics of enrolled studies

The quality evaluation showed that all 7 articles were of high quality, as their scores were greater than 3 points, suggesting that these studies were suitable for meta-analysis.

Meta-analysis of CR rate for ABVD versus BEACOPP

CR rate was chosen because this end point directly measures the response to therapy. Significant heterogeneity among the enrolled studies evaluating CR was observed (P = 0.02, I² = 65 %), and thus, the random effects model was used for analysis. The results, as shown in Fig. 2a, indicate that patients assigned to BEACOPP therapy had a better CR rate compared with patients assigned to ABVD therapy (OR = 0.55, 95 % CI 0.35, 0.87). After we removed the study by Engert et al. [14], heterogeneity among the individual studies decreased substantially (P = 0.77, I² = 0 %). Figure 2b shows that BEACOPP remained significantly better than ABVD in terms of CR (OR = 0.66, 95 % CI 0.45, 0.97).

Fig. 2 Forest plot of complete remission for ABVD versus BEACOPP. a Complete remission for ABVD versus BEACOPP. b Complete remission for ABVD versus short courses of escalated BEACOPP combined with standard BEACOPP. Enger A 2009b represents data of BEACOPP baseline group from Enger A 2009; Enger A 2009e represents data of escalated BEACOPP group from Enger A 2009

Meta-analysis of PFS and OS for ABVD versus BEACOPP

PFS and OS rates were chosen as the main long-term clinical outcomes for ABVD versus BEACOPP (Fig. 3). Figure 3a shows significant heterogeneity among the individual studies evaluating PFS (P = 0.02, I² = 66 %), so the random effects model was chosen to pool the effect size. The risk of progression or death was significantly lower with BEACOPP than with ABVD (OR = 0.56, 95 % CI 0.38, 0.81). Heterogeneity decreased after the study by Engert et al. was removed (P = 0.15, I² = 48 %), and BEACOPP remained significantly better than ABVD in terms of PFS (Fig. 3b; OR = 0.59, 95 % CI 0.41, 0.85).

Fig. 3 Forest plot of progression-free survival and overall survival for ABVD versus BEACOPP. a Progression-free survival for ABVD versus BEACOPP. b Progression-free survival for ABVD versus short courses of escalated BEACOPP combined with standard BEACOPP. c Overall survival for ABVD versus BEACOPP. d Overall survival for ABVD versus short courses of escalated BEACOPP combined with standard BEACOPP. Enger A 2009b represents data of BEACOPP baseline group from Enger A 2009; Enger A 2009e represents data of escalated BEACOPP group from Enger A 2009

Because of the significant heterogeneity, a sensitivity analysis was performed. As shown in Table 2, heterogeneity decreased substantially after the study by Mounier et al. [8] or the study by Engert et al. [14] was removed.

Table 2 Outcomes for sensitivity analysis for included studies evaluating progression-free survival

For OS, heterogeneity among the studies was not significant (P = 0.20, I² = 33 %), and thus, the fixed effects model was used for analysis. The results, as shown in Fig. 3c, indicated that patients assigned to BEACOPP therapy had a better OS rate compared with patients assigned to ABVD therapy (OR = 0.64, 95 % CI 0.51, 0.81). After we removed the study by Engert et al. [14], no significant difference in OS was observed between BEACOPP and ABVD (OR = 0.72, 95 % CI 0.45, 1.15).
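
To make the contrast between these two OS results concrete, the short Python sketch below back-calculates an approximate z statistic and two-sided P value from a reported OR and its 95 % CI. It assumes a Wald-type interval that is symmetric on the log scale, which is an approximation and not how RevMan reports its test of overall effect; the function name is hypothetical, and only the OR and CI values come from the text.

```python
import math


def z_and_p_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Approximate z and two-sided P from an OR and its 95 % CI.

    Assumes the interval is log(OR) +/- z_crit * SE (a Wald interval).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided P for a standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Values reported above: all trials, then the subgroup without Engert et al.
z_all, p_all = z_and_p_from_ci(0.64, 0.51, 0.81)
z_sub, p_sub = z_and_p_from_ci(0.72, 0.45, 1.15)
print(f"all trials: z = {z_all:.2f}, P = {p_all:.4f}")
print(f"subgroup:   z = {z_sub:.2f}, P = {p_sub:.4f}")
```

The first interval excludes 1 and the second includes it, which is why the same direction of effect is significant in the full analysis but not in the subgroup. The wider subgroup interval reflects the loss of information when the largest trial is removed.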

Treatment-related toxicity comparison for ABVD versus BEACOPP

Adverse events were reviewed in all enrolled studies. Engert et al. reported acute hematologic toxicities in 71 % of patients receiving COPP/ABVD, 74 % of patients receiving baseline BEACOPP, and 98 % of patients receiving escalated BEACOPP [14]. Data from another study showed that hematologic toxicities occurred in 54 % of patients treated with BEACOPP and 34 % of patients treated with ABVD [7]. Mounier and colleagues reported 20 severe adverse events in the ABVD arm versus 62 in the BEACOPP arm [8]. Three deaths from toxic effects in the ABVD group (7 %) and three deaths from toxic effects in the BEACOPP group (15 %) were observed in the study by Viviani et al. [16]. Although the reported treatment-related toxicities differed among studies, toxicity was higher with BEACOPP than with ABVD, especially hematologic toxicity.

Discussion

In clinical practice, two different international standards, ABVD and BEACOPP, are used for the treatment of early unfavorable and advanced-stage Hodgkin lymphoma. This meta-analysis demonstrated that patients assigned to BEACOPP therapy had a better CR rate and better long-term clinical outcomes, including PFS and OS, but OS did not differ significantly when we focused on baseline BEACOPP combined with escalated BEACOPP as compared with ABVD. Moreover, toxicity was higher with BEACOPP than with ABVD, especially hematologic toxicity.

Although increased-dose BEACOPP confers a long-term survival benefit, the toxic effects of the treatment and the results of salvage programs in the event of treatment failure should be considered when deciding on initial therapy. Previous data indicated that BEACOPP was accompanied by several adverse events, including acute hematologic and non-hematologic toxicities, infertility, and secondary neoplasias [17–19]. Hematologic toxicities were significantly higher in patients receiving either baseline BEACOPP or escalated BEACOPP than in those receiving ABVD [17]. Moreover, long-term clinical outcomes also depend on the whole treatment strategy, such as subsequent radiotherapy, and not just on the chemotherapy combination [9]. The meta-analysis supported a benefit from BEACOPP and dose escalation. Consequently, we suggest that patients who have blood dyscrasias might avoid escalated BEACOPP. In addition, Johnson et al. proposed one possible direction for treatment choice, namely the use of interim positron-emission tomography–computed tomography (PET–CT) to guide de-escalation of therapy; the overall results of scan-guided treatment were favorable as compared with full-course ABVD and more consolidation radiotherapy [20]. This technology could therefore benefit the quality of life of patients with Hodgkin lymphoma by reducing long-term toxic effects. Thus, we would recommend that BEACOPP treatment be applied after early response has been measured using PET–CT.

After the study by Engert et al. was removed, no significant heterogeneity was observed among the individual studies evaluating CR, suggesting that the backgrounds of the remaining studies were well balanced. Nevertheless, patients in these studies received different combinations of baseline BEACOPP and escalated BEACOPP. For example, in the study by Merli et al., patients received four cycles of BEACOPP in the escalated regimen, followed by two cycles of BEACOPP in the standard regimen [15]. In contrast, patients enrolled in the study by Viviani et al. received escalated BEACOPP for four cycles and baseline BEACOPP for four cycles [16]. Although this study supported the superiority of BEACOPP treatment over ABVD in terms of CR, the search for the best choice for HL patients should focus further on the combination strategy for BEACOPP.

Our study demonstrated that BEACOPP was the more effective regimen, whereas ABVD was better tolerated but less effective. Therefore, in clinical practice, strategies that improve the efficacy of ABVD or reduce the toxicity of BEACOPP would be more acceptable. A new option, the antibody–drug conjugate brentuximab vedotin, which showed high efficacy and good tolerability, has been investigated in a pivotal phase II study [21]. To reduce the toxicity of BEACOPP, two modified BEACOPP variants incorporating brentuximab vedotin were designed [22]. In the future, we believe that a new generation of drugs will increasingly replace chemotherapy and radiotherapy.

Several limitations of this meta-analysis should be noted. First, significant heterogeneity was observed among the individual studies evaluating PFS rate, although heterogeneity decreased substantially after the study by Mounier et al. [8] was removed. Nevertheless, the conclusion was not reversed, suggesting that it is robust. Second, although the main baseline characteristics of the enrolled studies were adjusted for, other covariates, such as physician proficiency and supportive therapy, could not be balanced in this study. Third, the strength of the conclusion might be influenced by the limited sample size.

In summary, this meta-analysis showed that BEACOPP treatment provided a stable, significant improvement in CR rate and long-term PFS, but there was no significant difference in OS rate when the analysis was restricted to escalated BEACOPP combined with baseline BEACOPP. Additional considerations, such as balancing treatment-related toxicity and the choice of subsequent treatment strategy, may help in deciding between ABVD and BEACOPP.