Introduction

Critically ill patients are susceptible to anemia and vulnerable to its adverse consequences. The prevalence of anemia among patients admitted to the intensive care unit (ICU) for three or more days is up to 95 % [1]. The most common treatment for anemia in ICU patients is blood transfusion; almost half of these patients receive at least one allogeneic red blood cell unit [2]. However, blood transfusions are associated with an increased risk of morbidity and mortality [3]. Because inappropriate endogenous production of the hematopoietic growth factor erythropoietin is observed in most ICU patients, the administration of recombinant human erythropoietin (rHuEPO) and other erythropoiesis stimulating agents (ESAs) has emerged as a therapeutic option [4]. The application of ESAs has been further extended to acute cardiovascular and neuronal disorders since non-hematopoietic effects of erythropoietin including anti-inflammatory, antiapoptotic, and angiogenic activities have been shown in preclinical and small clinical studies [5].

ESAs are approved for treatment of anemia caused by end-stage renal disease, anemia associated with human immunodeficiency virus infection and anemia due to concomitantly administered chemotherapy in patients with non-myeloid cancers. Further, they are licensed for transfusion reduction in patients scheduled for major surgery, except heart surgery [6]. Therefore, administration of ESAs in critically ill patients is outside the license of these agents and is considered an off-label indication.

The most commonly prescribed ESAs in critical illness are epoetin alfa and darbepoetin [7]. Exposure to ESA treatment was estimated from records of more than 500 hospitals across the United States in 3 years: 72,903 patients in the ICU setting [8], 25,645 inpatients with cancer and 66,822 with chronic kidney disease [9]. On the basis of the overall cumulative dose per ICU stay, the cost of treatment with epoetin alfa and darbepoetin was estimated to be $576 and $841, respectively [8].

Systematic reviews of on-label indications of ESAs have raised concern about their safety and a possible increase in mortality [10–13]. A meta-analysis of 27 randomized, controlled trials (RCTs) involving 10,452 patients with chronic kidney disease concluded that targeting higher hemoglobin concentration increased risks for fatal and nonfatal stroke, hypertension, and vascular access thrombosis compared with targeting lower hemoglobin concentration [14]. Another meta-analysis, including 52 RCTs (n = 12,006), found that ESAs increased risk of thrombotic events by 70 % and serious adverse events by more than 15 % in patients with cancer-related anemia. Accordingly, ESAs are not advised as routine treatment in patients with cancer-related anemia as an alternative to blood transfusion [13]. Moreover, the US Food and Drug Administration restricted the use of ESAs and their prescription under a risk management program following studies showing an increased risk of tumor growth and shortened survival in patients with cancer receiving ESAs [15]. The safety profile of ESAs for the treatment of anemia related to chronic renal failure and chemotherapy has been studied extensively [10–14].

However, the patterns of adverse events (AEs) associated with off-label indication of ESAs in the treatment of critical illness remain unclear. Our objective was to assess the effects of ESAs compared to either placebo, no treatment, or an alternative active treatment regimen on safety and mortality in critically ill patients when administered off-label. We further aimed at exploring heterogeneity and assessing the influence of bias on the robustness of our effect estimates. An abstract of this review was presented at the 28th International Conference on Pharmacoepidemiology and Therapeutic Risk Management [16].

Materials and methods

Search methods

On 23 April 2012, we updated our search on OvidSP EMBASE, OvidSP MEDLINE, OvidSP PASCAL, OvidSP All EBM Reviews, OvidSP International Pharmaceutical Abstracts, OvidSP PsycINFO, CINAHL, BIOSIS Previews, Science Citation Index Expanded, Conference Proceedings Citation Index-Science and TOXLINE (BM) [search strategy is provided in Appendix A in the Electronic Supplementary Material (ESM)]. To increase the objectivity of our search strategy, we analyzed the content of a 2007 efficacy meta-analysis on ESAs in critically ill patients [4] and its nine included RCTs using the AntConc freeware concordance program (BM) (http://www.antlab.sci.waseda.ac.jp/). Moreover, we searched for ongoing controlled trials on http://www.controlledtrials.com using the multiple database search option (BM) (metaRegister of Controlled Trials). We also contacted the four main manufacturers of ESAs (Amgen, Roche, Janssen-Cilag, Ortho Biotech). Further, we tracked citations of all relevant studies using SciVerse Scopus (BM).

Selection criteria and outcome measures

We included RCTs and controlled observational studies (cohort or case–control) investigating the effect of ESAs in critical illness, unless the indication was approved by either the European Medicines Agency or the FDA. We included studies in acutely and critically ill adult patients in which the intervention was scheduled systemic administration of ESAs versus placebo, no treatment or any alternative active treatment. Because there is no stringent definition of "critical illness", an expert in intensive care medicine (HH) assessed the setting of each study following group discussions.

We were exclusively interested in patient safety outcomes, including death. If mortality was assessed at several time points in a study, we used data closest to the follow-up period of 30 days.

Data collection and analysis

First, two authors (BM and BH, MS or CK) independently screened the retrieved studies by title and then by abstract for exclusion. They assessed the full text of the possible relevant studies for inclusion and exclusion criteria. Differences in opinion were settled by either consensus or by involving a third author (HH). Two authors (BM, BH, MS, CK) then extracted the data of the selected studies separately onto predesigned forms. The forms were compared and discrepancies in data extraction were resolved by discussion or if necessary by involving a third reviewer (HH). Data were then added to an MS Access database—specifically designed for this review—and analyzed in RevMan 5.1 [17].

Included studies were appraised for their risk of bias by two independent authors (BM, BH, MS, CK) using the Cochrane Collaboration’s tool [18] for assessing risk of bias in RCTs and the Newcastle-Ottawa Scale [19] for assessing risk of bias in observational studies. The results were compared and disagreements were resolved by discussion or if necessary by a third reviewer (HH). We further evaluated quality of harms assessment and reporting in included studies using the McMaster Quality Assessment Scale of Harms [20].

We assessed reporting bias and small study effects by creating funnel plots of standard errors versus effect estimates if 10 or more studies were available for each outcome using RevMan 5.1 and R package meta [21]. Asymmetry was evaluated by visual inspection and formally tested using the arcsine test [22] for data from studies where valid n/N data were available and the Egger test [23] where effect estimates with their standard errors were available.
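The regression test for funnel plot asymmetry mentioned above can be sketched in a few lines. This is a minimal illustration of the classical Egger approach (standardized effect regressed on precision, with the intercept as the asymmetry measure), not the implementation in RevMan or the R package meta; the function name `egger_test` and the input values are hypothetical.

```python
import math

def egger_test(effects, std_errors):
    """Egger regression: regress effect/SE on 1/SE and inspect the intercept.

    Returns (intercept, t_statistic, degrees_of_freedom). An intercept far
    from zero suggests funnel plot asymmetry (small-study effects).
    """
    k = len(effects)
    if k < 3:
        raise ValueError("need at least three studies")
    x = [1.0 / s for s in std_errors]                    # precision
    y = [e / s for e, s in zip(effects, std_errors)]     # standardized effect
    mean_x, mean_y = sum(x) / k, sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance and standard error of the intercept (ordinary least squares)
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    df = k - 2
    se_intercept = math.sqrt(rss / df * (1.0 / k + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, t_stat, df

# Hypothetical log relative risks and standard errors, for illustration only
log_rr = [-0.30, -0.10, 0.05, -0.45, -0.20, 0.10]
se = [0.10, 0.15, 0.25, 0.30, 0.20, 0.35]
intercept, t_stat, df = egger_test(log_rr, se)
print(f"intercept = {intercept:.2f}, t = {t_stat:.2f}, df = {df}")
```

The t statistic would be compared against a t distribution with k − 2 degrees of freedom to obtain a P value; that step is omitted here to keep the sketch free of external dependencies.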

Data synthesis was deemed appropriate if clinical and methodological heterogeneity were negligible. Clinical heterogeneity was assessed by judgement based on exploration of the table of characteristics of included studies. Generally, we used fixed or random effects models, depending on statistical heterogeneity between studies, to calculate summary estimates. Statistical heterogeneity was quantified by the I² statistic [18]. Ideally, observational studies and randomized studies should not differ if confounding is handled appropriately. However, confounding could not be excluded in the observational studies. Therefore, we considered study design a relevant source of methodological heterogeneity. Meta-analyses were fitted in a frequentist as well as in a Bayesian setting.
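As an illustration of how the I² statistic and the fixed and random effects summaries relate, the following sketch pools log relative risks by inverse variance. It assumes the DerSimonian-Laird estimator for the between-study variance, which the text does not name; the function name `pool_log_rr` and the input values are hypothetical.

```python
def pool_log_rr(log_rr, se):
    """Inverse-variance pooling of log relative risks.

    Returns the fixed effect estimate, Cochran's Q, I² (in percent),
    the DerSimonian-Laird tau² and the random effects estimate.
    """
    w = [1.0 / s ** 2 for s in se]                        # fixed effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, log_rr)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_rr))
    df = len(log_rr) - 1
    # I²: share of total variability attributed to heterogeneity rather than chance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # DerSimonian-Laird between-study variance (assumed estimator)
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    w_re = [1.0 / (s ** 2 + tau2) for s in se]            # random effects weights
    random = sum(wi * yi for wi, yi in zip(w_re, log_rr)) / sum(w_re)
    return {"fixed": fixed, "q": q, "i2": i2, "tau2": tau2, "random": random}

# Hypothetical study-level data, for illustration only
result = pool_log_rr([-0.5, 0.0, 0.6], [0.1, 0.1, 0.1])
print(f"I2 = {result['i2']:.1f} %, tau2 = {result['tau2']:.3f}")
```

When I² is close to zero the two summaries coincide; as heterogeneity grows, the random effects weights become more equal across studies and the two estimates can diverge.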

In order to combine data from RCTs and observational studies, we fitted a three-level hierarchical Bayesian model [24–26]. This approach allows for between study variability (in the same way as a classical random effects model does) together with between design variability. In this way, the overall estimate makes use of all the information available [27]. Acknowledging that observational evidence is of a different nature, a sensitivity analysis down-weighting this in the synthesis was also carried out. This was done explicitly using a specified parameter, representing the weight given to observational evidence modeled as a multiplicative factor to the observational effect estimate precision. Different weights to inflate the variance have been applied in a sensitivity analysis. We performed subgroup analyses according to the type of erythropoietin, its dosage, non-hematopoietic biological effects and baseline anemia using the test for subgroup differences [18]. We also assessed the robustness of our estimates by comparing the effects from models that included all studies (possibly biased but higher precision due to the utilization of all individuals) to the effects from models that excluded studies with high risk of bias or low quality (potentially lower risk of bias but also lower precision due to the exclusion of studies).
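The down-weighting idea can be illustrated with a deliberately simplified sketch: the precision of the observational estimate is multiplied by a weight between 0 and 1 before it is combined with the RCT estimate. This is a plain inverse-variance combination of two summary estimates, not the three-level hierarchical Bayesian model itself; the function name `downweighted_pool` and all numbers are hypothetical.

```python
import math

def downweighted_pool(est_rct, se_rct, est_obs, se_obs, weight):
    """Combine an RCT and an observational estimate (log scale).

    `weight` multiplies the precision of the observational estimate,
    which is equivalent to inflating its variance by 1 / weight.
    """
    if not 0 < weight <= 1:
        raise ValueError("weight must be in (0, 1]")
    prec_rct = 1.0 / se_rct ** 2
    prec_obs = weight / se_obs ** 2          # down-weighted precision
    pooled = (prec_rct * est_rct + prec_obs * est_obs) / (prec_rct + prec_obs)
    pooled_se = math.sqrt(1.0 / (prec_rct + prec_obs))
    return pooled, pooled_se

# Hypothetical log relative risks: RCTs -0.20 (SE 0.08), observational -0.60 (SE 0.05)
for weight in (1.0, 0.5, 0.1):
    pooled, pooled_se = downweighted_pool(-0.20, 0.08, -0.60, 0.05, weight)
    print(f"weight = {weight:.1f}: pooled log RR = {pooled:.3f} (SE {pooled_se:.3f})")
```

As the weight decreases, the pooled estimate moves toward the RCT estimate and its standard error grows, which is the qualitative behavior a sensitivity analysis of this kind is meant to expose.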

Results

Study selection

The search in electronic databases, citation tracking and trial registers resulted in 12,888 hits, or 7,735 hits after removing duplicates (Fig. 1). The response from manufacturers did not include any additional relevant data related to safety aspects. We found one ongoing trial [28] and 11 relevant articles [29–39] through tracking reference lists and citations of 53 potentially relevant records. By using Google search, we found journal publications resulting from two relevant trials [40, 41] and two full text papers [42, 43] of potentially relevant abstracts [44, 45]. Moreover, we received further details on four studies (two RCTs and two observational studies) by contacting the authors [46–50]. In total, we analyzed 48 studies out of 89 relevant documents: 34 RCTs and 14 observational studies (Fig. 1).

Fig. 1 Flow chart of study selection for inclusion in the systematic review

Study description

The 48 studies involved a total of 944,856 participants (6,332 in RCTs and 938,524 in observational studies) from 21 countries (see details on the characteristics of 34 RCTs and 14 observational studies in Tables E1, E2 and Appendix B in ESM).

Randomized controlled studies

Twelve RCTs were designed as prospective, open-label studies [29, 32, 34, 43, 51–58]; endpoints were blinded in three [54, 56, 58]. All RCTs were reported in English except for one, which was written in Russian [48]. The clinical settings were as follows: critically ill patients in ICUs [43, 52, 57, 59–65]; myocardial infarction with ST-segment elevation [29, 31, 34, 40, 41, 54–56, 58, 66, 67]; myocardial damage with non-ST-segment elevation acute coronary syndrome [33]; surgical revascularization of the heart [32]; acute ischemic stroke [39, 68]; aneurysmal subarachnoid hemorrhage [36, 38]; trauma [48, 53, 69]; burn [70, 71]; spinal cord injury [46] and multiple organ dysfunction syndrome [51].

Patients in the experimental group received intravenous epoetin alfa [32, 33, 36, 46, 48, 51, 55, 58, 59, 66, 68, 71], subcutaneous epoetin alfa [43, 52, 57, 60–62, 64, 65, 69, 70], intravenous epoetin beta [29, 38, 41, 54, 56, 63, 67], subcutaneous epoetin beta [39] or intravenous darbepoetin [34, 40]. Two studies did not clearly specify the route of epoetin alfa administration, one possibly subcutaneous [53] and the other intravenous [31]. The manufacturer of the ESAs (mostly alongside their brand names) was reported in all but five studies [31, 43, 53, 59, 63].

RCTs were conducted from July 1990 to September 2010 across 20 countries; 14 studies were conducted at multiple centers [46, 52, 54, 56, 58, 60–67, 71].

Observational studies

All observational studies were designed as cohort studies: five studies used local registries or data repositories [72–76], four of them used specific databases [47, 77–79], and five studies recruited cases before ESAs intervention concurrently with non-exposed subjects and followed them over a particular time period or compared them with non-exposed population from past records [30, 49, 80–82]. All studies were reported in English language except one, which was written in Chinese [30]. Clinical settings included: critically ill patients in ICUs [47, 73–76]; trauma [49, 72, 77–79]; burn [82]; out-of-hospital cardiac arrest [80, 81]; and anemic septic patients [30].

Observational studies investigated rHuEPO compared to non-rHuEPO [49, 73, 74, 76, 77, 82], ESAs versus non-ESAs [47, 72, 79], epoetin beta versus placebo [81], darbepoetin versus non-darbepoetin [75], ESAs plus either unfractionated heparin or enoxaparin versus unfractionated heparin or enoxaparin alone [78] and also epoetin alfa additional to routine care versus routine care alone [80]. Four observational studies reported the manufacturer of the ESAs [49, 79–81], while two also reported the specific brand names [49, 79].

Observational studies were performed from January 1996 to March 2010, mostly in the U.S.

Risk of bias and quality assessment

Risk of bias

Generally, the risk of bias in included studies for safety outcomes (including death) was moderate. All but seven RCTs [32, 34, 43, 52, 53, 55, 56] were identified as having low to moderate risk of bias. Observational studies had low to moderate risk of bias, except for four studies [30, 72, 75, 76] that had high risk of bias (see details in Figure E1, Tables E3 and E4 in ESM). Only one of the published observational studies provided adjusted estimates for the pre-specified outcome variables [73]; confounding therefore has to be assumed in the others. Notably, Brophy et al. [47] provided us with unpublished adjusted mortality estimates from a subset of their study, which we used for sensitivity analyses.

Quality of harm assessment and reporting

Harm assessment and reporting were of medium to low quality in RCTs and of low quality in observational studies overall (see details in Figure E2, Tables E5 and E6 in ESM).

Effects of ESAs on adverse events

In total, 95 different types of adverse events (including combinations of events) were identified in 37 studies; 62 AEs were reported in one study only, 11 AEs in two studies, 6 AEs in three studies and 16 AEs in at least four studies (details provided in Tables E1 and E2 in ESM). The meta-analysis of the AEs reported in RCTs and observational studies investigating ESA treatment in critical illness is presented in Table 1. Overall, ESA treatment did not significantly increase the risk of any AE (Fig. 2), any serious AE (Figure E4 in ESM), venous thromboembolism (VTE, Fig. 3), deep venous thrombosis, pulmonary embolism or any other frequently reported AE in critically ill patients. However, ESAs increased the risk of VTE in the RCTs when frequentist analyses were used, but had no effect on VTE in the observational studies or when Bayesian methods were applied. In contrast, ESAs decreased the relative risk of central and peripheral nervous system disorders by 63 % (P = 0.03) and of respiratory distress by 32 % (P = 0.02).

Table 1 Risk of adverse events associated with ESA use in RCTs and observational studies of critically ill patients

Fig. 2 Forest plot of studies comparing ESAs versus control for the outcome ‘Any adverse event’

Fig. 3 Forest plot of studies comparing ESAs versus control for the outcome ‘Venous thromboembolism’

There was no effect on AEs reported in the single studies except for two: clinically relevant thrombotic vascular event (120/728 vs. 83/720; RR = 1.43, 95 % CI 1.10–1.85) [60] and metabolic disturbances including acidosis and alkalosis (11/84 vs. 1/78; RR = 10.2, 95 % CI 1.4–77.3) [63] were more frequent with ESAs.
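The relative risks and confidence intervals quoted above can be reproduced from the event counts. The sketch below assumes a Wald interval on the log scale, which is the standard approach for a single 2×2 table, although the text does not state which method was used; the function name `relative_risk` is hypothetical.

```python
import math

def relative_risk(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Relative risk with a Wald confidence interval computed on the log scale."""
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    # Standard error of log(RR) for two independent binomial samples
    se_log_rr = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Clinically relevant thrombotic vascular events: 120/728 with ESAs vs. 83/720 with control
rr, lower, upper = relative_risk(120, 728, 83, 720)
print(f"RR = {rr:.2f}, 95 % CI {lower:.2f}-{upper:.2f}")
```

With a single event in one arm, as in the metabolic disturbances comparison (11/84 vs. 1/78), the same formula yields a very wide interval, which is why that estimate is so imprecise.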

Effects of ESAs on mortality

Mortality was reported in 38 studies (28 RCTs and 10 observational studies) (details provided in Tables E1 and E2 in ESM). Overall 67,980 deaths (669 in RCTs and 67,311 in observational studies) were observed in 931,053 participants (6,110 in RCTs and 924,943 in observational studies). We found no statistically significant difference in the mortality risk from treatment with ESAs compared to non-ESAs (RR = 0.82, 95 % CI 0.65–1.01; combining both RCTs and observational studies, Bayesian estimates) (Fig. 4).

Fig. 4 Forest plot of studies comparing ESAs versus control for mortality

Sensitivity analysis using the Bayesian approach was consistent with the main analysis. Though the estimates are similar, the estimated treatment effect based on RCT evidence is larger in the Bayesian model than in the classical analysis (RR = 0.80, 95 % CrI 0.63–0.97 versus RR = 0.87, 95 % CI 0.75–1.01). Differences may occur because the Bayesian approach models the binomial outcome data directly rather than the summary statistics [83]. Combining evidence from RCTs and observational studies in a hierarchical model accounts for heterogeneity between study designs. Brophy et al. [47] is the main driver of heterogeneity among the observational studies. When Brophy et al. was excluded, the heterogeneity shrank and the use of ESAs was associated with a significant reduction in mortality (RR = 0.80, 95 % CrI 0.63–0.99); the same held when adjusted estimates from a subset of Brophy et al. were included (RR = 0.80, 95 % CrI 0.62–0.99) (Table E7 in ESM).

The results of the sensitivity analysis excluding studies at high risk of bias were consistent with those of the main analysis for all AEs and mortality.

Subgroup analysis

The effect of ESAs on AEs and mortality did not differ significantly across the four pre-specified subgroup factors: type of erythropoietin, its dosage, non-hematopoietic biological effects and baseline anemia. Post hoc analysis of trauma patients in five studies [48, 60, 61, 77, 79, 84] revealed a significantly reduced mortality in this population (RR = 0.51, 95 % CI 0.39–0.68; P value for subgroup differences = 0.002, trauma versus non-trauma study effects).

Reporting bias

We performed a formal test for funnel plot asymmetry for VTE and mortality. For the outcome mortality there was no indication of funnel plot asymmetry at visual inspection, both for RCTs and RCTs with observational studies, except for the one disproportionally large database study resulting in very small standard errors [47]. Formally, there was no funnel plot asymmetry (arcsine transformation regression, t = −0.7, df = 20, P value = 0.48, for RCTs only and P = 0.25 in the Egger’s test for small-study effects for RCTs and observational studies) suggesting no small study effect or reporting bias. The funnel plot of VTE indicated some asymmetry at visual inspection, concordant with borderline significance at formal testing for funnel plot asymmetry (arcsine transformation regression, t = −2.1, df = 13, P value = 0.05) (See Fig. E4 in ESM for funnel plots).

Discussion

This systematic review provides evidence that ESAs in critically ill patients do not increase the risk of frequently reported AEs or mortality, based on data from more than 900,000 patients included in 34 RCTs and 14 cohort studies. Even though 48 studies met the inclusion criteria, the majority of different AEs (56.2 %) were reported in only one RCT [63]. Studies were performed in a wide range of conditions in several countries and comprised RCTs, cohort studies and ‘daily life’ observational cohort studies. Given these characteristics and the average mortality in the included population of between 7 and 12 %, the results may be applicable to most circumstances of critical care medicine. Our meta-analysis substantially updates the 2007 paper by Zarychanski [4], which concluded from nine RCTs that insufficient evidence precluded a recommendation on routine use of ESAs in critically ill patients.

The results, however, should be viewed with caution for several reasons. Among the 27 RCTs and ten cohort studies classified as ‘low to moderate risk of bias’, only three RCTs [41, 67, 68] and one cohort study [79] fulfilled all criteria of low risk for AE outcomes (Figure E1, Tables E3 and E4 in ESM). Specifically, confounding may have distorted the effects in observational studies, because adjusted effects were lacking for most studies. For ICU mortality, the adjusted effect was reduced by 25 % towards the null compared to the crude effect (data not shown). This may be taken as an indicator of the amount of confounding. As a consequence, we used a Bayesian approach to put less weight on the observational studies whilst combining the information from all available study designs. When the weight was reduced stepwise from 100 to 10 %, the overall estimates shifted closer towards the results based on RCTs only. However, the shift was small, since the overall estimates are dominated by RCT evidence owing to stronger RCT evidence and larger heterogeneity in the observational data.

None of the included studies fulfilled all criteria in the quality of harm assessment and reporting evaluation (Figure E2, Tables E5 and E6 in ESM). Only five RCTs monitored AEs through an independent data and safety monitoring board [58, 60, 61, 66, 68], although the ability of the observer to accurately and consistently assess patients and detect AEs is crucial in safety studies. The majority of studies did not define harms in their reports. Variability and overlap of the terms used to describe each AE across studies were apparent. This was also an issue for classifying AEs as serious. Only one study graded the severity of AEs by using a coding system [63].

The result of our review is consistent with a Cochrane review on rHuEPO therapy of pre-dialysis patients with renal anemia (15 trials, 461 participants) which found no significant increase in AEs and no mortality benefit [85]. Likewise a Cochrane review on ESAs in chronic heart failure patients with anemia (11 trials, 794 patients) found no increase in AEs but lower all-cause mortality by 39 % [86].

Treatment of chronic kidney disease with ESAs targeting high hemoglobin levels was associated with higher risks of hypertension (by 67 %), stroke (by 51 %) and vascular access thrombosis (by 33 %) compared with a lower hemoglobin target. Nonetheless, this meta-analysis of 27 RCTs (10,452 patients) did not find statistically significant increases in the risks of mortality and serious AEs [14], and the hemoglobin target might explain the effects rather than ESAs as such.

However, the results of our review differ from the effects of ESAs in patients with cancer. There was an increased risk of serious AEs in a meta-analysis assessing the harms of ESAs in adults with anemia related to cancer (RR = 1.16, 95 % CI 1.08–1.25, I² = 0 %, 21 trials, n = 5,891) and mortality was increased by 15 % (28 trials of 31 comparisons, n = 6,525) [13]. Another systematic review on the prevention or treatment of anemia in cancer patients with ESAs demonstrated a 17 % increased risk of death in the ESA group (individual patient data meta-analysis of 13,933 patients from 53 RCTs, HR = 1.17, 95 % CI 1.06–1.30) [11]. A possible explanation for this discrepancy might be a different mode of action in an oncologic setting, as EPO is suspected to induce tumor growth [87]. We explicitly excluded studies containing patients with cancer, which might explain this discrepancy.

The survival benefit of epoetin alfa in critically ill trauma patients has been demonstrated in post-trial analysis of two RCTs by Corwin et al. [60, 61, 84] but with uncertainty about its true cost-effectiveness [88]. Our post hoc subgroup analysis in trauma patients was in agreement but substantially driven by Corwin’s studies. Given the post hoc nature of this specific analysis we consider this finding with particular caution.

The manufacturer and brand names of ESAs were not described in some of the included studies (Tables E1 and E2 in ESM). We found four biosimilars (generics of biotechnological products), which raises the concern of different safety profiles for the biosimilar epoetins, because they cannot be entirely identical to their originator products [89, 90]. However, given the small number of patients in the few studies exposing individuals to biosimilars, no formal subgroup analysis was possible. Only one study had estimates of the effect on mortality and did not show any evidence of harm [48].

Some included studies disclosed research support from one of the rHuEPO manufacturing companies and we are uncertain whether the safety profiles reported in these studies were biased. A systematic review which assessed the reporting of adverse effects and potential association with source of funding indicated that industry funding may not be a major threat to bias in the reporting of raw adverse effects data [91]. However, a new study from the US assessed disclosures made in 404 articles published by authors identified from whistleblower complaints as being involved in the promotion of off-label drug use and found that 85 % of papers had inadequate disclosures [92].

Conclusion

There was no statistically significant increase in the risk of AEs in general, serious AEs, the most frequently reported AEs or death in critically ill patients treated with ESAs. These results were robust to within-study risk of bias and to the analysis method. There is some uncertain evidence that ESAs might increase the risk of VTE, and there is evidence that ESAs increase the risk of clinically relevant thrombotic vascular events. However, because the included studies had low quality of harm assessment and reporting, the results must be interpreted with caution.