Introduction

Sodium-glucose cotransporter 2 (SGLT2) inhibitors are a novel class of anti-hyperglycemic drugs [1]. SGLT2 inhibitors work by blocking glucose reabsorption at the proximal renal tubule, thereby increasing urinary glucose excretion and lowering blood glucose in diabetic patients. Beyond glycemic control, clinical trials have demonstrated cardiovascular [2,3,4], metabolic [5,6,7], and renal benefits [4, 8] in diabetic patients treated with SGLT2 inhibitors compared to placebo. The efficacy of SGLT2 inhibitors is reflected in the 2019 European Society of Cardiology guideline, which lists them as a first-line therapy for patients with type 2 diabetes mellitus and established cardiovascular disease [9].

Heart failure is a prevalent public health problem, affecting more than 37 million individuals globally [10]. It is associated with significant morbidity and mortality and places a large financial burden on the healthcare system [11]. Cost-effective pharmacological therapy is therefore highly sought after to reduce this burden.

Recent clinical trials of SGLT2 inhibitors have demonstrated similar cardiovascular, metabolic, and renal benefits in heart failure patients, regardless of diabetic status. Specifically, the EMPEROR-Reduced (Empagliflozin Outcome Trial in Patients with Chronic Heart Failure With Reduced Ejection Fraction study) [12], DAPA-HF (Dapagliflozin and Prevention of Adverse Outcomes in Heart Failure trial) [13], and CANVAS (Canagliflozin Cardiovascular Assessment Study) [4] trials demonstrated that SGLT2 inhibitors improve cardiovascular, metabolic, and renal outcomes in heart failure patients. However, the recently published VERTIS CV trial (Evaluation of Ertugliflozin Efficacy and Safety Cardiovascular Outcomes Trial) [14], which examined the use of ertugliflozin in diabetic patients with atherosclerotic disease, showed relatively inferior treatment effects compared to other SGLT2 inhibitors. This has generated controversy over whether a class effect exists and whether such a class effect applies across patient populations, such as patients with heart failure [15,16,17,18].

A 2019 meta-analysis suggested that dapagliflozin, empagliflozin, and canagliflozin might have class effects on cardiorenal outcomes [19]. However, that meta-analysis compared SGLT2 inhibitors in diabetic patients, rather than heart failure patients. Furthermore, the study was based on only four clinical trials, and no network meta-analysis was performed [2, 4, 20, 21]. Since that publication, multiple randomized controlled trials (RCTs) and relevant subgroup analyses of previous trials have been published [12, 14, 22,23,24]. To the best of our knowledge, there has not been any meta-analysis examining the differences in cardiovascular, metabolic, and renal outcomes (henceforth clinical outcomes) across different SGLT2 inhibitors in heart failure patients. Hence, we conducted a systematic review and meta-analysis to compare the clinical outcomes across different SGLT2 inhibitors in heart failure patients. We hypothesized that in heart failure patients, there are no significant differences in treatment effects between SGLT2 inhibitors across clinical outcomes.

Methods

The meta-analysis was reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [25]. Searches of four databases (PubMed, Embase, Cochrane, and SCOPUS) were conducted on 13 September 2020 for articles published from 1 January 2000 up to 13 September 2020. The literature search used the following terms in combination: (“empagliflozin” OR “canagliflozin” OR “dapagliflozin” OR “Ertugliflozin”) AND (“trial”).

Randomized controlled trials evaluating the clinical outcomes of SGLT2 inhibitors in heart failure patients were included. Clinical outcomes were classified into cardiovascular, renal, and metabolic outcomes. Cardiovascular outcomes included heart failure hospitalization, cardiovascular deaths, a composite of heart failure hospitalization and cardiovascular deaths, all-cause mortality, and a composite of non-fatal myocardial infarction, non-fatal stroke, and cardiovascular deaths. Renal outcome (henceforth defined as worsening renal function) was a composite of a 40% reduction in estimated glomerular filtration rate, need for renal replacement therapy, or death from renal causes. Randomized controlled trials that substituted a reduction in estimated glomerular filtration rate with doubling of serum creatinine or substituted the need for renal replacement therapy with end-stage renal failure in their composite of worsening renal function were included. Metabolic outcomes included mean weight change, mean change in hemoglobin A1c (HbA1c), and mean change in systolic blood pressure. We included all randomized controlled trials, according to the PICOS inclusion and exclusion criteria (Supplemental Table 1). We excluded all randomized controlled trials which did not report cardiovascular, renal, or metabolic outcomes in heart failure patients. Four reviewers independently performed the literature search and data extraction, and all disagreements were resolved by mutual consensus.

Table 1 Outcome characteristics

Apart from cardiovascular, renal, and metabolic outcomes, baseline information of heart failure patients was collected for age, sex, smoking status, diabetes mellitus, hypertension, hyperlipidemia, atrial fibrillation, previous stroke, and previous myocardial infarction. Baseline information regarding the use of heart failure medications was also collected, including angiotensin-converting enzyme inhibitors, angiotensin II receptor blockers, angiotensin receptor-neprilysin inhibitors, and beta-blockers. For the SGLT2 inhibitor regimens, we collected data on the drug name, drug dosage, drug frequency, control group, length of intervention, and mean length of follow-up, as shown in Supplemental Table 2. Quality control was performed by two independent reviewers with the Cochrane risk of bias tool [26], as shown in Supplemental Fig. 1. The quality of pooled evidence was evaluated using the modified Grading of Recommendations Assessment, Development and Evaluation (GRADE) system for network meta-analysis, which accounts for statistical inconsistency, publication bias, risk of bias, indirectness, and statistical imprecision, as shown in Table 1 [27]. A PRISMA checklist for reporting of systematic reviews incorporating network meta-analysis [28] is included in Supplemental Fig. 2.

Table 2 Participant baseline characteristics

Fig. 1 PRISMA flow diagram of study selection

Fig. 2 a Heart failure hospitalization, b cardiovascular deaths, c heart failure hospitalization/cardiovascular deaths, d all-cause mortality, e cardiovascular deaths/non-fatal MI/non-fatal stroke

Statistical analysis

Prior to meta-analyses, missing data were imputed using approaches laid out by the Cochrane Handbook [29]. In studies without standard deviations, p-values or confidence intervals were converted to standard deviations [29]. In studies without standard deviations, p-values, or confidence intervals, the square-root of weighted mean variance of all other studies was used to estimate the standard deviation [30]. For panel data or longitudinal outcomes, pre-intervention baseline imbalances were corrected using the simple analysis of change scores method [29].
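The imputation steps above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, a normal approximation (z = 1.96 for a 95% confidence interval) is assumed, and weighting the pooled variance by degrees of freedom is an assumption, since the text does not state the weights used.

```python
import math
from statistics import NormalDist


def sd_from_ci(lower, upper, n1, n2, z=1.96):
    """Back-calculate a common within-group SD from the CI of a mean difference."""
    se = (upper - lower) / (2 * z)          # CI width -> standard error
    return se / math.sqrt(1 / n1 + 1 / n2)  # standard error -> SD


def sd_from_p(mean_diff, p_value, n1, n2):
    """Back-calculate a common within-group SD from a two-sided p-value."""
    z = NormalDist().inv_cdf(1 - p_value / 2)  # p-value -> z statistic
    se = abs(mean_diff) / z
    return se / math.sqrt(1 / n1 + 1 / n2)


def pooled_sd(sds, ns):
    """Square root of the weighted mean variance of the other studies.

    Weights are (n - 1); this choice is an assumption for illustration.
    """
    weights = [n - 1 for n in ns]
    mean_var = sum(w * sd ** 2 for w, sd in zip(weights, sds)) / sum(weights)
    return math.sqrt(mean_var)
```

For example, a mean difference with a 95% confidence interval of -2.0 to -0.04 in two arms of 100 patients each gives `sd_from_ci(-2.0, -0.04, 100, 100)`, roughly 3.54.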

Frequentist network meta-analysis of aggregate data was adopted to compare the four different SGLT2 inhibitors using Stata 16.0 (StataCorp, TX, USA). Network meta-analysis is a method for comparing three or more interventions simultaneously by combining both direct and indirect evidence across a network of studies [31]. This produces estimates of the relative effects between any pair of interventions in the network, which are usually more precise than a single direct or indirect estimate [31]. The assumption of transitivity was evaluated using a global Wald test of consistency. When there was little evidence of inconsistency (P > 0.10 from the Wald test), consistency models were fitted by restricted maximum likelihood, assuming a common heterogeneity variance τ² for all treatment contrasts for each clinical outcome. The inconsistency model was utilized for clinical outcomes with evidence of inconsistency (P < 0.10 from the Wald test). Treatment estimates were reported as hazard ratios and mean differences for time-to-event and continuous outcomes, respectively. Comparison-adjusted funnel plots of treatment estimates were visually inspected, and observation of asymmetry or points lying outside 95% pseudo-confidence limits was interpreted as publication bias. The geometry of each network plot was also visually and numerically inspected for potential biases. The relative ranking probability of the four treatments was estimated from 1000 draws, and the hierarchy of treatments was analyzed using surface under the cumulative ranking (SUCRA) curves. Higher SUCRA values correspond to greater efficacy. Interpretations regarding the relative efficacy of treatments were based on inspection of 95% confidence intervals in interval plots and supported by analysis of ranking probabilities.
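To make the ranking step concrete, the sketch below shows one way ranking probabilities and SUCRA values can be derived from repeated draws. It is an illustration under stated assumptions, not the Stata procedure used in the study: the function names are hypothetical, each treatment's log hazard ratio is drawn independently from a normal distribution (a full analysis would draw from the joint distribution of the network estimates), and a lower log hazard ratio is taken to mean a better rank.

```python
import random


def rank_probabilities(estimates, draws=1000, seed=0):
    """Estimate P(treatment holds each rank) from simulated draws.

    estimates maps a treatment name to (mean log HR, standard error).
    Rank 1 (index 0) is the lowest, i.e. most favourable, log HR.
    """
    rng = random.Random(seed)
    names = list(estimates)
    counts = {name: [0] * len(names) for name in names}
    for _ in range(draws):
        sample = {name: rng.gauss(mu, se) for name, (mu, se) in estimates.items()}
        for rank, name in enumerate(sorted(names, key=sample.get)):
            counts[name][rank] += 1
    return {name: [c / draws for c in counts[name]] for name in names}


def sucra(rank_probs):
    """Mean of the cumulative ranking probabilities over the first a - 1 ranks."""
    cumulative, total = 0.0, 0.0
    for p in rank_probs[:-1]:
        cumulative += p
        total += cumulative
    return total / (len(rank_probs) - 1)
```

A treatment that is always ranked first has a SUCRA of 1, one that is always ranked last has a SUCRA of 0, and the SUCRA values of all treatments in a network average 0.5.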

As each treatment arm in the dataset is regarded as an independent treatment for comparison against each other, interval plots generated did not place placebo as the reference category. In such scenarios, to derive the effect estimate of SGLT2i compared to placebo, we took the reciprocal of the effect measure and confidence intervals for hazard ratios and risk ratios. A summary of the effect measures of the different types of SGLT2i against placebo is presented in Supplemental Table 3.
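The reciprocal conversion described above can be sketched as follows. The function name and the numbers in the example are hypothetical. Note that inverting a ratio also swaps the confidence limits: the reciprocal of the upper limit becomes the new lower limit.

```python
def invert_ratio(estimate, lower, upper):
    """Convert a ratio for 'placebo vs drug' into 'drug vs placebo'.

    Returns (estimate, lower limit, upper limit) of the inverted ratio.
    """
    return 1 / estimate, 1 / upper, 1 / lower


# Hypothetical example: a placebo-vs-drug hazard ratio of 1.25 (1.00 to 2.00)
# corresponds to a drug-vs-placebo hazard ratio of 0.80 (0.50 to 1.00).
hr, lo, hi = invert_ratio(1.25, 1.00, 2.00)
```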

A sensitivity analysis was performed for patients with heart failure with reduced ejection fraction and heart failure with preserved ejection fraction for outcomes with sufficient observations (three or more randomized controlled trials). Sensitivity analyses were also performed for trials that recruited patients with chronic heart failure and trials with follow-up durations ≥ 1 year. A further sensitivity analysis excluded trials that had missing data imputed using Cochrane’s approaches.

Results

The PRISMA flowchart is presented in Fig. 1. A literature search of the four databases (PubMed, Embase, Cochrane, SCOPUS) retrieved 6626 results, and hand search uncovered 1 additional relevant study. A total of 2432 duplicates were removed. Title and abstract screening excluded a further 4138 articles as they did not include heart failure patients, did not focus on SGLT2 inhibitor use, or were not a randomized controlled trial. Full-text screening excluded 44 articles. Finally, 13 articles were included for the meta-analysis.
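The study-selection counts reported above can be checked with simple arithmetic. The variable names below are illustrative; the numbers are those stated in the text.

```python
retrieved = 6626                # records from the four databases
hand_search = 1                 # additional study found by hand search
duplicates = 2432               # duplicates removed
excluded_title_abstract = 4138  # excluded at title and abstract screening
excluded_full_text = 44         # excluded at full-text screening

screened = retrieved + hand_search - duplicates
full_text = screened - excluded_title_abstract
included = full_text - excluded_full_text

# The result agrees with the 13 articles reported as included.
print(screened, full_text, included)
```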

Baseline characteristics

Out of the 13 included articles, 4 [32,33,34,35] were secondary analyses of the EMPA-REG OUTCOME trial [36]. Thus, a total of 10 randomized controlled trials were included, comprising a combined cohort of 15,373 patients. The participant baseline characteristics of the included trials are shown in Table 2. Although multiple publications reporting data from the EMPA-REG OUTCOME trial were included in this meta-analysis, each outcome was analyzed only once using the relevant reported trial data. Therefore, there was no overrepresentation of any patient cohort.

Across the 10 randomized controlled trials, the SGLT2 inhibitor drug name, dosage, frequency, control group, length of intervention, and length of follow-up are summarized in Supplemental Table 2. Empagliflozin, dapagliflozin, canagliflozin, and ertugliflozin were the SGLT2 inhibitors used in three, four, two, and one trials respectively. Empagliflozin was administered at a dosage of 10 mg or 25 mg, and dapagliflozin was administered at a dosage of 10 mg throughout the randomized controlled trials. All regimens were given once daily and compared to a control group receiving a placebo. The length of follow-up ranged from 13 weeks to 4.2 years.

Comparison of cardiovascular outcomes across SGLT2 inhibitors

The comparison of cardiovascular outcomes is presented in Fig. 2. Frequentist network meta-analysis demonstrated that there was no significant difference in treatment effect for cardiovascular outcomes across the four SGLT2 inhibitors. Although the difference was not statistically significant, empagliflozin showed the highest efficacy in reducing the hazard rate of heart failure hospitalization, compared to other SGLT2 inhibitors (6 RCTs, 11,556 patients) (Fig. 2a). Similarly, although the differences were not statistically significant, canagliflozin showed the highest efficacy, compared to other SGLT2 inhibitors, in reducing the hazard rate of cardiovascular deaths (4 RCTs, 10,641 patients) (Fig. 2b), the composite of cardiovascular deaths and heart failure hospitalizations (7 RCTs, 14,975 patients) (Fig. 2c), all-cause mortality (5 RCTs, 10,666 patients) (Fig. 2d), and the composite of cardiovascular deaths, non-fatal myocardial infarction, and non-fatal stroke (4 RCTs, 5795 patients) (Fig. 2e).

Comparison of worsening renal function across SGLT2 inhibitors

The comparison of worsening renal function is presented in Fig. 3. The three SGLT2 inhibitors that were analyzed are empagliflozin, dapagliflozin, and canagliflozin. The frequentist model demonstrated that there was no significant difference across the three SGLT2 inhibitors in reducing the hazard rate of worsening renal function (5 RCTs, 11,293 patients).

Fig. 3 Worsening renal function

Comparison of metabolic outcomes across SGLT2 inhibitors

The comparison of metabolic outcomes is presented in Fig. 4. The two SGLT2 inhibitors analyzed were empagliflozin and dapagliflozin. In heart failure patients, the frequentist model demonstrated that there was no significant difference between the two SGLT2 inhibitors in the weighted mean difference for weight/kg (3 RCTs, 8530 patients) (Fig. 4a), HbA1c/mmol/mol (3 RCTs, 8530 patients) (Fig. 4b), and systolic blood pressure/mmHg (3 RCTs, 8530 patients) (Fig. 4c).

Fig. 4 a Mean weight change/kg (~ 1 year), b mean change in HbA1c/mmol/mol (~ 1 year), c mean change in systolic blood pressure/mmHg (> 8 months)

Sensitivity analysis of SGLT2 inhibitors in heart failure reduced ejection fraction

In a sensitivity analysis of SGLT2 inhibitors in patients with heart failure with reduced ejection fraction, empagliflozin and dapagliflozin did not have a significant demonstrable difference in treatment effect for heart failure hospitalization (3 RCTs, 8737 patients) (Supplemental Fig. 3). There were insufficient studies (fewer than 3 RCTs) to analyze any other outcome.

Sensitivity analysis of SGLT2 inhibitors in chronic heart failure

In a sensitivity analysis of SGLT2 inhibitors in patients with chronic heart failure, Damman 2020 [24] (EMPA-RESPONSE-AHF trial) was excluded as it studied only acute decompensated heart failure patients. Canagliflozin, dapagliflozin, and empagliflozin did not have a significant demonstrable difference in treatment effect for all-cause mortality (4 RCTs, 10,587 patients) (Supplemental Fig. 4). We analyzed all other outcomes using data from trials involving only chronic heart failure patients.

Sensitivity analysis of SGLT2 inhibitors in trials with follow-up durations ≥ 1 year

In a sensitivity analysis of SGLT2 inhibitors in trials with follow-up durations of ≥ 1 year, Nassif 2019 [37] (DEFINE-HF trial) and Damman 2020 [24] (EMPA-RESPONSE-AHF trial) were excluded as their follow-up durations were 13 weeks and 60 days respectively. Canagliflozin, dapagliflozin, and empagliflozin did not have a significant difference in treatment effect for heart failure hospitalization (5 RCTs, 11,293 patients) (Supplemental Fig. 5) and all-cause mortality (4 RCTs, 10,587 patients) (Supplemental Fig. 4). We analyzed all other outcomes using data from trials reporting a follow-up duration ≥ 1 year.

Sensitivity analysis excluding trials that had missing data imputed using Cochrane’s approaches

In Damman 2020 [24], all-cause mortality was reported as a risk ratio rather than a hazard ratio; as Damman 2020 [24] had a short follow-up duration (60 days) and a low number of patients lost to follow-up (n = 1), the risk ratio was taken as an approximation of the hazard ratio. Hence, as a sensitivity analysis, we excluded Damman 2020 [24] from the analysis of all-cause mortality (4 RCTs, 10,587 patients) (Supplemental Fig. 6). Canagliflozin, dapagliflozin, and empagliflozin did not have a significant demonstrable difference in treatment effect for all-cause mortality.

Discussion

In this frequentist network meta-analysis of 10 randomized controlled trials in heart failure patients, there was no significant demonstrable difference in treatment effect among SGLT2 inhibitors for cardiovascular, renal, and metabolic outcomes.

Patients with heart failure have an increased risk of heart failure hospitalizations and mortality [38]. In 2014, in the US, there were over 900,000 heart failure hospitalizations and more than 80,000 deaths with primary heart failure [11]. This conferred a significant healthcare burden, with heart failure hospitalizations accounting for an estimated $11 billion in healthcare costs [11]. Hence, there is growing interest in identifying cost-effective pharmacological therapies to reduce the heart failure burden.

The 2019 European Society of Cardiology guideline recommends SGLT2 inhibitors for use in diabetic patients with heart failure [9]. With the completion of recent trials demonstrating consistent clinical benefits of SGLT2 inhibitors in all heart failure patients independent of diabetic status [12, 14, 22,23,24] and a recent network meta-analysis which ranked SGLT2 inhibitors as the most effective therapy for heart failure with reduced ejection fraction when compared with sacubitril/valsartan and vericiguat [39], the inclusion of SGLT2 inhibitors as part of the first-line therapy in heart failure appears imminent. However, different SGLT2 inhibitors were employed across clinical trials, and it remained unknown whether there were significant differences in clinical outcomes across SGLT2 inhibitors. In this meta-analysis, we showed that there was no significant demonstrable difference in cardiovascular, renal, and metabolic outcomes across SGLT2 inhibitors. Hence, the choice of SGLT2 inhibitor for the treatment of heart failure may currently depend on other factors, such as the cost and availability of the individual drug. While a previous network meta-analysis compared the costs of different SGLT2 inhibitor therapies in diabetic patients [40], there are currently no cost comparisons of different SGLT2 inhibitor therapies in heart failure patients.

There is a lack of randomized controlled trials examining the effects of SGLT2 inhibitors in heart failure patients stratified according to reduced ejection fraction and preserved ejection fraction subgroups. In our meta-analysis of 10 randomized controlled trials, most did not stratify the type of heart failure according to ejection fraction, except for the following three clinical trials: the EMPEROR-Reduced (Empagliflozin Outcome Trial in Patients with Chronic Heart Failure With Reduced Ejection Fraction study) [12], DAPA-HF (Dapagliflozin and Prevention of Adverse Outcomes in Heart Failure trial) [13], and DEFINE-HF (Dapagliflozin Effects on Biomarkers, Symptoms, and Functional Status in Patients With Heart Failure With Reduced Ejection Fraction) trials [41]. Of the outcomes reported by the three trials, heart failure hospitalization was the only one common to all. Hence, due to the lack of randomized controlled trials, we could not perform a sensitivity analysis for any other outcome in patients with heart failure with reduced ejection fraction, nor any sensitivity analysis for patients with heart failure with preserved ejection fraction. Future trials of SGLT2 inhibitors in heart failure patients should additionally focus on capturing the ejection fraction of heart failure patients, to further ascertain if the efficacy of SGLT2 inhibitors applies to both subgroups of patients.

In this study, we did not demonstrate a significant difference in treatment effects of different SGLT2 inhibitors across cardiovascular, renal, and metabolic outcomes in heart failure patients. While there is no universal definition of class effect, the definition of “class labeling” utilized by the United States Food and Drug Administration (FDA) states that “all products within a class are closely related in chemical structure, pharmacology, therapeutic activity, and adverse reactions” [42]. Due to the similar molecular structure, mechanism of action [1, 43,44,45], and clinical outcomes of SGLT2 inhibitors, some have postulated a potential class effect between the SGLT2 inhibitors, although our study could not conclude this. The number of studies included in this systematic review was small, and the power of comparison was low, as demonstrated by the wide confidence intervals of the outcomes. The inability to demonstrate differences between the SGLT2 inhibitors studied could also be due to the small sample size or heterogeneity between the studies. A class effect should be claimed with caution, as demonstrated previously with other drugs. The Carvedilol or Metoprolol European Trial (COMET) compared the effects of two beta-blockers, carvedilol and metoprolol tartrate, on clinical outcomes in chronic heart failure patients [46]. The study demonstrated that carvedilol increased survival as compared with metoprolol tartrate and that there was no class effect between these two drugs. Adequately powered, head-to-head randomized controlled trials comparing SGLT2 inhibitors should be considered to explore the hypothesis of a class effect.

Limitations

Our study should be interpreted in light of its limitations. First, there was a difference in the representation of individual SGLT2 inhibitors in clinical trials. Empagliflozin, dapagliflozin, canagliflozin, and ertugliflozin were the SGLT2 inhibitors used in three, four, two, and one trials respectively. Hence, we do not know if the results of the renal outcome apply to ertugliflozin and if those of the metabolic outcomes apply to ertugliflozin and canagliflozin. Second, many trials did not report the baseline characteristics of heart failure patients, such as the ejection fraction of patients. Hence, we were unable to examine if differences in patient characteristics affected study outcomes. Furthermore, we do not know if the treatment effect of SGLT2 inhibitors differs between reduced ejection fraction and preserved ejection fraction patients, although this can be the focus of future studies.

Conclusion

In our frequentist network meta-analysis of 10 randomized controlled trials in heart failure patients, we did not demonstrate a difference in treatment effect among SGLT2 inhibitors for cardiovascular, renal, and metabolic outcomes. Future research of different SGLT2 inhibitors in head-to-head trials, and in different subgroups of heart failure patients with reduced and preserved ejection fraction, is warranted.