Background

The thalassemias are a group of inherited disorders of hemoglobin [1], the most severe form of which is beta thalassemia major. Improvements in the management of patients with thalassemia major in the past four decades have resulted in "one of the most dramatic alterations in morbidity and mortality associated with a genetic disease" [2]. Regular red blood cell transfusions extend survival, eliminate complications of anemia, inhibit bone marrow hyperactivity, and support normal growth and development in patients with thalassemia major [3]. Unfortunately, regular transfusions also lead to the accumulation of tissue iron, loading the body's organs to the point of dysfunction and finally death in the second or third decade of life if left untreated [4].

To alleviate iron loading, chelating agents such as desferrioxamine (Desferal®) and, most recently, deferiprone (Ferriprox®) are available. As a complement to frequent blood transfusions, proper iron chelation therapy further improves the patient's quality of life and extends survival by impeding the complications of iron loading.

Successful management of thalassemia major relies upon accurate assessment of body iron burden. Several indirect assessment methods are available, among which are measurement of serum ferritin levels, urinary iron excretion, and hepatic iron quantification. While no one method is superior for all clinical scenarios, until recently measurement of hepatic iron stores – through liver biopsy or magnetic susceptometry – has provided the most quantitative, specific and sensitive method for determining the body iron burden in patients with thalassemia major and was considered the reference method for comparison with other techniques [2, 4]. In the future, the newer advanced magnetic-resonance techniques, which allow for the assessment of both liver and cardiac iron, might provide an even more accurate assessment of total body iron [5]. Measurement of the magnitude of iron loading is useful in evaluating the effectiveness of the chelating agent, calibrating patient-specific treatment, and, in clinical research, as a determinant of clinical outcome.

The literature on desferrioxamine and deferiprone lacks consensus on their comparative effectiveness and even on the methods for quantifying it. The current evidence consists of many small non-comparative studies that evaluate the efficacy of a chelator in the short or, more rarely, long term. Moreover, a significant impediment to comparing study outcomes is variation in the method employed to measure iron burden. Several systematic reviews of the literature have been published [2, 6–8]. None of them is a quantitative comparison of the efficacy of the primary chelators. We undertook a quantitative review of the literature to estimate the effectiveness of desferrioxamine and deferiprone in decreasing hepatic iron concentrations (HIC) in thalassemia major.

Methods

Search strategy and data extraction

All studies of desferrioxamine and deferiprone usage in thalassemia major patients – whether randomized, blinded, comparative, case series, or cross-over – irrespective of language were considered eligible for inclusion in the analysis. Although preference is usually given to randomized controlled trials – as they provide the strongest, least biased evidence of efficacy – the scant information available for thalassemia forced us to consider all study types. Indeed, to our knowledge, only one randomized controlled trial comparing desferrioxamine and deferiprone has been conducted, and results for that study are incomplete as the trial was terminated prematurely and has not been reported in full [9–11].

The National Library of Medicine's computerized bibliographic database (Medline 1966 to December 1999) was searched using a combination of the following keywords: thalassemia, serum ferritin, urinary iron excretion, hepatic iron, liver, chelation therapy, iron chelation, iron chelating agents, desferal, desferrioxamine, deferoxamine, L1 oral chelate, deferiprone and 1,2-dimethyl-3-hydroxypyrid-4-one. The Medline search was augmented by manually searching the reference lists of retrieved studies and reviews and by reviewing abstracts from conference proceedings. The studies identified were carefully evaluated for eligibility; they were included if they (i) enrolled subjects with thalassemia major – irrespective of age at diagnosis, treatment initiation or study start, and of treatment history in terms of transfusion regimen or iron chelation; (ii) followed patients treated with either desferrioxamine administered subcutaneously or intravenously or oral deferiprone; and (iii) measured hepatic iron concentrations to evaluate treatment efficacy. Studies aimed at comparing the relative performance of different measurement techniques (e.g., liver biopsy versus MRI) were excluded. Abstracts that provided sufficient information on the endpoint under consideration were retained for the analysis. Duplicate reports were identified and only the original or, if pertinent, most extensive report was included. No exclusions were made on the basis of sample size, study quality or study duration.

Details on study design, length of follow-up, number of patients included, patient age, presence of iron overload-related complications at study start, prior and current iron chelation (drug, route, dose), use of concomitant medication and outcome measures were extracted independently by two investigators using a standardized electronic data collection form. Investigators were not blinded to journal, author, institution or treatment. All differences in extracted data were resolved by consensus between the two extractors prior to locking the database.

Statistical analysis

The main analyses were conducted using individual patient level data, as the majority of studies reported this level of detail. For the purpose of these analyses, all HIC had to be converted from the original value to a common measurement unit. We opted for mg/g dry liver weight. For each patient, we calculated the absolute change in HIC over the study period and a "responder" was defined as showing any improvement. The relative change was calculated as the absolute change divided by the baseline HIC.
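The per-patient quantities defined above can be sketched as follows. This is a minimal illustration, not the original SAS code: the names `HicPair`, `absolute_change`, `relative_change` and `is_responder` are hypothetical, "any improvement" is assumed to mean a final HIC strictly below baseline, and the example patient is invented.

```python
from dataclasses import dataclass


@dataclass
class HicPair:
    """Baseline and final hepatic iron concentration, in mg/g dry liver weight."""
    baseline: float
    final: float


def absolute_change(p: HicPair) -> float:
    # Negative values mean the iron concentration fell over the study period.
    return p.final - p.baseline


def relative_change(p: HicPair) -> float:
    # Absolute change divided by the baseline HIC, as defined in the text.
    return absolute_change(p) / p.baseline


def is_responder(p: HicPair) -> bool:
    # Assumption: "any improvement" is read as a final HIC strictly below baseline.
    return p.final < p.baseline


# Hypothetical patient, not taken from any included study.
example = HicPair(baseline=20.0, final=15.0)
```

For the invented patient, the absolute change is a fall of 5.0 mg/g, which corresponds to a relative change of one quarter of the baseline value.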

The mean HIC at study end was calculated for each treatment. As initial hepatic iron load was greater, on average, in patients receiving desferrioxamine, the comparison of the means needed to take this into account. This was done by carrying out an ANCOVA, controlling for HIC at baseline. A second analysis was conducted to evaluate the proportion of responders, and the χ2-test was used to test for differences in these proportions. The odds ratio of improvement and its 95% confidence interval were also calculated. Next, for the subset of patients who showed improvement over time, the mean relative change in HIC was compared.
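The baseline adjustment can be illustrated with a small sketch of adjusted means under a common-slope ANCOVA model. It assumes a single slope of final on baseline HIC shared by all groups, computes only the adjusted means (no significance test), and uses an invented data set; the function name `adjusted_means` is hypothetical and the code is not the SAS procedure used for the analysis.

```python
from statistics import mean


def adjusted_means(groups):
    """Baseline-adjusted group means under a common-slope ANCOVA model.

    `groups` maps a treatment label to a list of (baseline, final) HIC pairs.
    """
    all_baselines = [x for pairs in groups.values() for x, _ in pairs]
    grand_baseline = mean(all_baselines)

    # Pooled within-group slope of final HIC on baseline HIC.
    sxy = 0.0
    sxx = 0.0
    centres = {}
    for label, pairs in groups.items():
        xbar = mean(x for x, _ in pairs)
        ybar = mean(y for _, y in pairs)
        centres[label] = (xbar, ybar)
        sxy += sum((x - xbar) * (y - ybar) for x, y in pairs)
        sxx += sum((x - xbar) ** 2 for x, _ in pairs)
    slope = sxy / sxx

    # Shift each group's raw mean to the overall mean baseline.
    return {
        label: ybar - slope * (xbar - grand_baseline)
        for label, (xbar, ybar) in centres.items()
    }


# Invented data: group A starts with a higher iron load than group B.
toy = {
    "A": [(10, 6), (20, 11), (30, 16)],
    "B": [(5, 6), (10, 8.5), (15, 11)],
}
result = adjusted_means(toy)
```

In this toy example the raw final mean is higher in group A (11 versus 8.5), yet after adjustment to a common baseline group A is lower (8.5 versus 11.0), which is the kind of reversal that makes controlling for baseline necessary.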

Although no fixed dose of either treatment is optimal for all patients [12], an attempt was made to differentiate what would generally be considered a suboptimal dose for each treatment in line with current treatment recommendations. For desferrioxamine, this was defined as less than 40 mg/kg/day. For deferiprone, the corresponding threshold was 75 mg/kg/day. It should be noted that information on the specific dose used by individual patients was not provided for the majority of the studies, where the dose varied over a certain range. Rather than ignore the potential impact of dose on the outcome measure, we assigned the mean dose reported for the entire study population to each individual within a particular study. Analyses were carried out by dose.
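The dose classification described above can be sketched as follows, assuming that a dose at or above the stated threshold counts as optimal and that every patient inherits the mean dose reported for the study. The names `DOSE_THRESHOLDS`, `dose_category` and `assign_categories`, and the patient identifiers, are hypothetical.

```python
# Thresholds in mg/kg/day, as given in the text.
DOSE_THRESHOLDS = {"desferrioxamine": 40.0, "deferiprone": 75.0}


def dose_category(drug: str, mean_study_dose: float) -> str:
    # Assumption: a dose equal to the threshold is classified as optimal.
    return "optimal" if mean_study_dose >= DOSE_THRESHOLDS[drug] else "low"


def assign_categories(drug: str, mean_study_dose: float, patient_ids):
    # Every patient in a study receives the category of the study's mean dose,
    # because individual doses were not reported for most studies.
    category = dose_category(drug, mean_study_dose)
    return {pid: category for pid in patient_ids}


# Hypothetical study: three deferiprone patients, mean dose 70 mg/kg/day.
assigned = assign_categories("deferiprone", 70.0, ["p1", "p2", "p3"])
```

A consequence of this design is that patients within one study can never fall into different dose categories, so any within-study dose variation is lost.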

Finally, additional analyses were conducted to explore the impact on the odds of improvement of including the few studies that provided summary information at the study level only.

We considered a p-value less than 0.05 to be significant for all statistical tests. No correction for multiple comparisons was made. Analyses were carried out with SAS version 6.12 for Windows.

Role of funding source

The funding source had no role in the collection, analysis, or interpretation of the data.

Results

After the first screening of titles and abstracts, 167 potentially relevant articles were identified (a list of all references is available from the authors). An in-depth review of the full-text articles led us to exclude 106 articles for the following reasons: review article (n = 18), study sample did not include patients with thalassemia major (n = 21), wrong treatment mode (e.g., desferrioxamine intramuscular or bolus injection, n = 12), wrong endpoints (e.g., pharmacokinetic studies, n = 32), study presented insufficient data to be extracted (n = 18), and other (n = 5). Eleven of the remaining 61 articles contained information on HIC and are therefore the subject of this analysis (Table 1) [11, 13–22].

Table 1 Characteristics of the 11 studies containing information on hepatic iron concentrations in patients with thalassemia major

Eight of the 11 studies provided data at the individual patient level, relating to 98 patients in total, of whom 30 and 68 were treated with desferrioxamine and deferiprone, respectively (the individual patient data are available from the authors). Two of the additional reports [11, 18] provide results in abstract form only. Moreover, results for one of these remain partial due to study discontinuation [11].

Substantial variation in study design and execution was detected. Moreover, patients differed considerably with respect to disease and treatment history. For example, for several of the deferiprone studies [17–22] it was specified that patients had been exposed previously to desferrioxamine but had failed, usually due to non-compliance. Although the length of the observation period varied notably among studies (range: 13 to 96 months), it is fairly balanced between treatments, with a mean duration of about 45 months for both groups (2.9 years for desferrioxamine and 3.3 years for deferiprone). In many cases, the data reported related to a small subset of the investigator's patient population who continued treatment for a prolonged period of time and for whom long-term information on changes in iron load was available. In that respect, the information reported in the literature might very well be biased. As there is little information in the papers as to how the patients with serial hepatic iron information were selected, the impact of this potential selection bias cannot be identified.

Clinical characteristics of the study patients are summarized in Table 2. Although all studies included predominantly children and young adults with thalassemia major, patients treated with desferrioxamine were significantly (p = 0.03) younger (mean: 13 yrs) than those treated with deferiprone (mean: 21 yrs). The severity of the hepatic iron overload at study entry differed greatly between patients, ranging from 0.5 to 115.0 mg/g dry weight (mean: 21.4 mg/g, SD: 17.6 mg/g), and was also significantly (p < 0.0001) higher for the desferrioxamine-treated patients (mean: 36.5 mg/g, SD: 21.3 mg/g) compared to deferiprone-treated patients (mean: 14.8 mg/g, SD: 10.2 mg/g). All 30 patients treated with desferrioxamine received an optimal dose (that is, 40 mg/kg/day or higher), compared with about two thirds of patients treated with deferiprone (48 out of 68), for whom the corresponding threshold was 75 mg/kg/day.

Table 2 Clinical characteristics of patients in studies included in the meta-analysis

Hepatic iron concentrations at endpoint

The mean HIC at endpoint, considering all patients, was 14.4 mg/g. Based on clinical experience with hereditary hemochromatosis, the optimal range for HIC in patients with thalassemia major is considered to be between 3.2 and 7.0 mg/g dry weight, above which patients run an increased risk of complications, including hepatic fibrosis, diabetes mellitus, and – at concentrations above 15 mg/g – of cardiac disease and early death [2]. At the end of the respective observation periods, 67.3% of all treated patients still had an HIC over 7.0 mg/g, compared to 85.7% of patients at the start.

Figure 1 clearly illustrates the anticipated correlation between initial and final HIC and, therefore, the importance of controlling for initial hepatic iron load in comparative analyses. Hepatic iron load at study end – controlling for initial hepatic iron load – was found to be significantly lower in desferrioxamine-treated patients (adjusted mean: 6.4 mg/g), compared to patients treated with either optimal dose (15.3 mg/g) or low dose (24.3 mg/g) deferiprone (overall p < 0.0001; p = 0.06 and p = 0.0002 for pairwise comparisons of desferrioxamine vs optimal and low dose deferiprone, respectively).

Figure 1

Initial and final hepatic iron concentrations for the 98 patients included in the main analyses by treatment and dose category. Hepatic iron concentrations are expressed in mg/g dry liver weight. The dotted lines indicate the cut-off (7.0 mg/g) above which patients run an increased risk for complications due to hepatic iron overload.

Changes in hepatic iron concentration over time

Overall, 65 of the 98 patients showed an improvement in HIC. Patients treated with desferrioxamine were more likely to improve than patients treated with optimal dose (OR: 19.0; 95%CI: 2.4–151.4) or low dose (OR: 53.9; 95%CI: 6.0–483.7) deferiprone within the study observation periods (Figure 2). Controlling for hepatic iron load at baseline did not affect these results.
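An odds ratio with a 95% confidence interval of this kind can be computed from a 2×2 table, sketched here with the Wald (Woolf) interval on the log scale. Whether the original analysis used this interval is an assumption. The responder counts per group are not stated in the text; the counts below are assumed for illustration and were chosen only to be consistent with the reported group sizes (30, 48 and 20 patients) and the 65 responders overall. The function name `odds_ratio_ci` is hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.959964):
    """Odds ratio and Wald (Woolf) confidence interval for a 2x2 table.

    a, b: responders and non-responders in the first group
    c, d: responders and non-responders in the second group
    z:    normal quantile (default gives a 95% interval)
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Assumed counts (not reported in the text): 29 of 30 desferrioxamine patients,
# 29 of 48 optimal-dose and 7 of 20 low-dose deferiprone patients improving.
vs_optimal = odds_ratio_ci(29, 1, 29, 19)
vs_low = odds_ratio_ci(29, 1, 7, 13)
```

With these assumed counts the sketch gives odds ratios of about 19.0 and 53.9, with intervals close to those reported. The very wide intervals follow from the single non-responder assumed in the desferrioxamine group, which dominates the standard error.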

Figure 2

Odds ratios for improvement in hepatic iron concentrations over time, presented on a logarithmic scale. The odds ratios for the main analysis are provided for each dose category of deferiprone (L1) separately and combined. The odds ratios for the sensitivity analysis are presented for inclusion of each of the additional studies one by one and combined.

Among the 65 patients who improved over time, the hepatic iron load decreased significantly more in desferrioxamine-treated patients: by 60.2% (an average of 15.8 mg/g), compared with 45.3% (12.4 mg/g) in deferiprone optimal dose and 33.5% (10.7 mg/g) in deferiprone low dose (overall p < 0.01; findings confirmed in pairwise comparisons).

Inclusion of additional studies

We explored the impact of the three studies that did not provide individual patient data by including them in additional analyses. To allow for this analysis, "improvement" had to be defined for each study, as results were reported in slightly different ways depending on the study. Patients in the study conducted by Diav-Citrin [22] were all treated with deferiprone and we defined improvement as HIC that were reduced or maintained at less than 7 mg/g of dry weight liver tissue. A threshold of 15 mg/g tissue iron had to be used instead for the comparative study conducted by Olivieri and colleagues due to how these researchers summarize the results [11]. For the study by Longo and colleagues [18], patients receiving deferiprone who were in a negative or stable iron balance at the end of the observation period were considered to have improved.

The odds ratios presented in Figure 2 demonstrate that the findings remain largely in favor of desferrioxamine in all scenarios examined; that is, the 95% CI do not include 1.

Discussion

Based on analyses of individual patient data from eight studies reporting on changes in HIC over time, desferrioxamine seems to be more effective than deferiprone in lowering hepatic iron in patients with thalassemia major. Indeed, the analyses indicate that desferrioxamine not only increases the likelihood of lowering hepatic iron load, but also tends to induce larger reductions in hepatic iron among responders, even after controlling for the imbalance in HIC at study initiation. These results remain when including data from three additional studies that provided summary information only.

There are several important qualifications to these results. First, the doses of deferiprone – even in the optimal dose group – are relatively low, compared to the desferrioxamine doses, which in most studies are well above the recommended 40 mg/kg/day. This is important in light of the strong connection between the dose of the iron chelator and the amount of iron excreted. On the other hand, the results from the only randomized trial suggest that even at relatively low doses, desferrioxamine outperforms deferiprone [11]. Moreover, the toxic:therapeutic ratio of deferiprone is reportedly low; doses of 100 mg/kg/day have resulted in bone marrow toxicity in animals and humans. Second, liver iron concentration is only one of the outcome parameters considered in the studies included in our analyses and in several cases is available only for a small subset of the entire study population [13–16, 21]. As it is impossible to determine how or why these particular patients were selected for repeated liver biopsy, the magnitude and direction of the potential bias this may have caused cannot be ascertained. Finally, as several of the patients included in the deferiprone studies had previously failed desferrioxamine treatment, due to non-compliance or other reasons, it could be postulated that they are more likely to do badly on deferiprone as well. Despite the fact that compliance is improved on deferiprone, this highlights the dangers of assuming that this drug, with potential toxicity greater than that of desferrioxamine, should be implemented in patients struggling with desferrioxamine, as proposed by some clinicians.

Comparison with specific findings in the literature is difficult because of the vast differences in study objectives, patient populations studied, parameters evaluated, and analytical approach. Broadly speaking, however, our findings are in accord with other published data. Several systematic, qualitative reviews [2, 8, 23, 24] highlight the clinical benefits of desferrioxamine treatment and, with respect to iron loading, acknowledge its ability to maintain harmless hepatic iron levels in properly chelated patients. The most recent of these reviews also corroborate our finding that deferiprone-treated patients do not experience the same degree of improvement in hepatic iron levels as desferrioxamine-treated patients [2, 6] and some cast further doubt on the long-term implications of deferiprone treatment [8].

Measurement of HIC has been established in the field as the reference method for assessing total body iron burden in thalassemia major, and is therefore preferred for research purposes. Nevertheless, the methods of measurement (namely liver biopsy and the superconducting quantum interference device, or SQUID) are often infeasible in everyday practice. Liver biopsies are neither a convenient [23] nor patient-preferred [2, 19, 25] means of monitoring efficacy in iron chelation therapy. Even less viable an alternative is the new, noninvasive SQUID, which is both prohibitively expensive and restricted to only three specially equipped sites: one in Germany, one in Italy and one in the United States [2, 8]. Such constraints lie at the root of the limited number of published studies that use HIC as the efficacy criterion, despite its importance to clinical investigation. It should also be noted that the newer advanced magnetic-resonance techniques might provide a more accurate assessment of total body iron than liver biopsy or SQUID because iron pools in the heart and liver may be separate and this newer technique allows for the simultaneous assessment of both hepatic and myocardial iron concentrations [5].

The diversity in study design and execution as well as the reporting of the results posed many challenges for this analysis. The literature is dominated by observational studies and nonrandomized clinical trials performed on small selected patient cohorts, thereby making a traditional meta-analysis impossible to complete. The content of the studies is noteworthy as well. Many studies were carried out without treatment protocols while others reported results only for subgroups – usually undefined. Still other studies reported outcomes only graphically, that is, without providing the precise values. A further challenge to evaluating hepatic iron as reported in the published data is the significant variation in measurement units and assessment values (i.e., dry or wet), as Table 2 illustrates. For example, the wet-to-dry weight liver iron conversions require an assumption of liver water content, which ranges from 60 to 75 percent in the literature [26]. Due to incomplete information, our analyses do not take into account some potentially important patient characteristics, such as duration of treatment administration or presence of iron-induced complications, nor the patients' transfusion regimens, all of which are used to calibrate appropriate dose prescriptions and may influence efficacy outcomes [12, 21, 25, 27].
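The sensitivity of the wet-to-dry conversion to the assumed water content can be sketched as follows. The sketch assumes the usual relation in which the dry-weight concentration equals the wet-weight concentration divided by the dry fraction of the tissue; the function name `wet_to_dry` and the example value are hypothetical, and the water content actually assumed in the analysis is not stated in the text.

```python
def wet_to_dry(hic_wet_mg_per_g: float, water_fraction: float) -> float:
    """Convert HIC from mg/g wet weight to mg/g dry weight.

    `water_fraction` is the assumed liver water content as a fraction (0 to <1).
    """
    if not 0.0 <= water_fraction < 1.0:
        raise ValueError("water_fraction must be in the range [0, 1)")
    # Removing the water leaves (1 - water_fraction) of the tissue mass,
    # so the same amount of iron is spread over less material.
    return hic_wet_mg_per_g / (1.0 - water_fraction)


# Hypothetical wet-weight value converted at both ends of the literature range.
low_estimate = wet_to_dry(3.0, 0.60)
high_estimate = wet_to_dry(3.0, 0.75)
```

For the hypothetical wet-weight value of 3.0 mg/g, the dry-weight result is roughly 7.5 mg/g at 60 percent water and 12.0 mg/g at 75 percent, so the choice of water content alone can move a patient across the 7.0 mg/g threshold discussed earlier.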

Given the methodological caveats and the heterogeneity of study characteristics, one could legitimately question the appropriateness of pooling the results on hepatic iron concentration across studies. Physicians must still select the optimal treatment for their patients, however, regardless of the quality of the evidence. In the absence of large randomized well-controlled trials, it was felt that this analysis, despite its limitations, would summarize the evidence in a useful way. To avoid any misleading conclusions we have sought to be completely transparent about the underlying assumptions and caveats of the analysis.

To many, deferiprone may not appear as a first-line chelator but rather as an alternative to desferrioxamine should the latter not be usable. From that point of view, comparative effectiveness may not be an issue. The analysis presented here casts doubt, however, even on the premise that deferiprone offers a useful alternative to desferrioxamine in patients who have difficulties with the administration of a parenteral drug.

Since this review was completed, two important new studies have become available. In the first study, 144 patients with thalassemia major and relatively low serum ferritin concentrations (1,500–3,000 ng/mL) were randomized to either deferiprone (n = 71) or desferrioxamine (n = 73) and followed for one year [28]. Although the primary efficacy measure was the reduction of serum ferritin, HIC was assessed in a small subgroup of patients willing to undergo repeat liver biopsy: 21 in the deferiprone and 15 in the desferrioxamine group. No significant difference in the reduction of HIC or the presence of liver fibrosis was detected in this subgroup of patients, leading the authors to conclude that both treatments have a similar chelating effect over a relatively short time period. This apparent comparability in reduction of HIC is in conflict, however, with the results of another recently published study. A case-control study in 15 patients treated with deferiprone (cases) and 30 patients treated with desferrioxamine (controls), matched for age and serum ferritin levels, demonstrated that deferiprone appears to be more effective than desferrioxamine in removal of myocardial iron despite significantly higher HIC in the deferiprone group [5]. Although a larger prospective trial is needed to confirm the results, these findings would suggest that a combination of both drugs may be most beneficial in order to reduce both hepatic and myocardial concentrations of iron.

Conclusions

It has been said that a meta-analysis is only useful until the next good trial comes along [29]. In the case of iron chelation therapy for patients with thalassemia major, the addition of any good prospective randomized controlled trial documenting the impact of treatment on body iron (hepatic and/or myocardial) would contribute measurably to the evaluation of the long-term implications of treatment with deferiprone, compared to desferrioxamine. Until such a study becomes available, we present this comprehensive and quantitative review of the evidence to aid patients with thalassemia major and their physicians in clinical judgment and treatment decisions.