Key Points

Dabigatran, rivaroxaban, and apixaban are associated with similar risks of ischemic stroke in patients with atrial fibrillation (AF).

Rivaroxaban is associated with an increased risk of major bleeding compared with dabigatran in patients with AF.

Apixaban is associated with a decreased risk of major bleeding compared with either dabigatran or rivaroxaban in patients with AF.

1 Introduction

Atrial fibrillation (AF) is a common cardiac arrhythmia that increases the risk of ischemic stroke five-fold [1]. While vitamin K antagonists (VKAs) have long been the primary oral anticoagulants for stroke prevention in AF, they are prone to drug–drug interactions and need frequent monitoring [2]. Direct oral anticoagulants (DOACs), including the thrombin inhibitor dabigatran and the factor Xa inhibitors rivaroxaban, apixaban, and edoxaban, recently expanded our pharmacologic arsenal. They were found to be either non-inferior or superior to the VKA warfarin for stroke prevention in large randomized controlled trials and have several advantages over VKAs, including more rapid onset of anticoagulation and decreased need for monitoring [3]. Consequently, treatment guidelines now recommend DOACs as first-line oral anticoagulation among patients with AF [4,5,6].

To date, there are no large, head-to-head trials comparing different DOACs in patients with AF. Moreover, there is a need to assess the comparative effectiveness and safety of DOACs in real-world settings. Although four publications have systematically reviewed and meta-analyzed the available real-world data [7,8,9,10], one used outdated tools to assess the risk of bias [7] and others omitted bias assessment altogether [8, 9]. In addition, numerous recently published studies reporting head-to-head comparisons among DOACs were not included in these earlier works [11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27].

Thus, the objective of this systematic review and meta-analysis of observational studies was to provide an up-to-date synthesis of the available real-world evidence on DOAC comparative effectiveness and safety in patients with AF, while thoroughly assessing the risk of bias of the included studies.

2 Methods

This systematic review and meta-analysis was conducted according to a pre-specified protocol and is reported following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [28] and the Meta-Analysis of Observational Studies in Epidemiology (MOOSE) checklist [29].

2.1 Search Strategy

MEDLINE and EMBASE were systematically searched from inception to February 28, 2019 for observational studies published in English in the peer-reviewed literature and comparing DOACs to each other in patients with AF. The search strategy was tailored to each database and included index terms (MeSH [Medical Subject Headings] and Emtree) and text words related to AF and DOACs (see Electronic Supplementary Material eTable 1). We also scanned the bibliographies of the included articles and relevant reviews for further references.

2.2 Inclusion and Exclusion Criteria

Randomized controlled trials, cross-sectional studies, letters to the editor, commentaries/editorials, and previous reviews and meta-analyses were excluded. Conference abstracts were also excluded as their results are often preliminary and they contain insufficient information to adequately assess risk of bias. To minimize the potential effects of publication bias, we excluded studies with fewer than 1000 DOAC users. Studies examining DOAC use in AF patients undergoing ablation were also excluded, as their results are not generalizable to AF patients in general.

Studies eligible for inclusion were cohort or case-control studies comparing DOACs (apixaban, dabigatran, rivaroxaban, or edoxaban) to each other in patients with AF. The primary effectiveness outcome was ischemic stroke, while the primary safety outcome was major bleeding. Secondary effectiveness outcomes were all-cause mortality, myocardial infarction, and systemic embolism. Secondary safety outcomes included intracranial hemorrhage, hemorrhagic stroke, gastrointestinal bleeding, and other bleeding events.

2.3 Study Selection

Two independent reviewers (either CMD/SY or AD/SY) performed study selection. Titles and abstracts were screened to identify potentially relevant studies and duplicates; all studies identified as potentially relevant by either reviewer proceeded to full-text review. Full-text review established the final set of included studies, with discrepancies resolved by consensus.

2.4 Data Extraction

Two independent reviewers (either CMD/SY or AD/SY) extracted data using a pilot-tested form, with discrepancies resolved by consensus (see Electronic Supplementary Material eTable 2). Study characteristics included study design, location, data source, study period, sample size (overall and by exposure group), follow-up duration, patient characteristics (age, sex, CHADS2 [congestive heart failure, hypertension, age ≥ 75 years, diabetes mellitus, prior stroke or transient ischemic attack] score [30] or CHA2DS2-VASc [congestive heart failure, hypertension, age ≥ 75 years, diabetes mellitus, prior stroke or transient ischemic attack, vascular disease, age 65–74 years, female sex] score [31] or their components, and HAS-BLED [hypertension, abnormal renal/liver function, prior stroke, bleeding history or predisposition, labile international normalized ratio, age > 65 years, drugs] score [32] and its components), and study outcomes. Other items extracted to describe the methodological approach and assess risk of bias included use of a new-user design, exposure definition (e.g., intention-to-treat, as-treated, or time-dependent), and handling of treatment switch or discontinuation. The main summary measures of interest were hazard ratios (HRs) or odds ratios (ORs) with 95% confidence intervals (CIs). Effect estimates were presented for the comparisons rivaroxaban versus dabigatran, apixaban versus dabigatran, and apixaban versus rivaroxaban. For articles reporting effect estimates with the comparator reversed (e.g., dabigatran vs. rivaroxaban), the comparator was switched and the reciprocals of the effect estimate and its confidence limits were calculated.
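The comparator switch described above can be illustrated with a minimal Python sketch. It assumes the reported measure is a ratio (HR or OR) with a two-sided confidence interval. The function name and the input values are hypothetical and do not come from any included study.

```python
def invert_estimate(ratio, ci_low, ci_high):
    """Re-express a ratio estimate (HR or OR) with the comparator swapped.

    Swapping the two treatment groups inverts the ratio. The confidence
    limits are inverted as well and trade places, because 1/x is a
    decreasing function for x > 0.
    """
    if min(ratio, ci_low, ci_high) <= 0:
        raise ValueError("ratio measures and their limits must be positive")
    return 1.0 / ratio, 1.0 / ci_high, 1.0 / ci_low


# Hypothetical estimate reported as dabigatran vs. rivaroxaban.
hr, low, high = invert_estimate(0.80, 0.64, 1.00)
print(f"rivaroxaban vs. dabigatran: HR {hr:.2f} (95% CI {low:.2f}-{high:.2f})")
```

Note that the original upper limit becomes the new lower limit, so an interval that excluded 1 before the switch still excludes 1 afterwards.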

2.5 Assessment of Risk of Bias

Two independent reviewers (AD/SY) assessed the risk of bias using the Risk Of Bias In Non-randomized Studies of Interventions (ROBINS-I) tool [33]. Seven domains were assessed: bias due to confounding; bias in the selection of study participants; bias in the classification of interventions; bias due to departure from intended interventions; bias due to missing data; bias in the measurement of outcomes; and bias in the selection of the reported results. Based on the assessment of each domain, an overall risk of bias was assigned as low, moderate, serious, or critical, with the overall risk determined by the highest risk assigned in any individual domain [33]. Given the potential for confounding inherent in observational studies, the highest-quality studies were those with an overall moderate risk of bias. A moderate risk of confounding bias was ascribed to studies considering at least the following covariates in their design or analysis: age, sex, prior use of warfarin, use of antiplatelets, previous stroke (for stroke outcomes), CHADS2 or CHA2DS2-VASc score or their components (for stroke outcomes), previous bleeding (for bleeding outcomes), and HAS-BLED score or its components (for bleeding outcomes).
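The rule that the overall judgement equals the most severe domain judgement can be sketched in a few lines of Python. The domain labels below are shortened for illustration and the judgements are invented. Only the four levels named in the text are modelled.

```python
# Judgement levels ordered from least to most severe.
LEVELS = ["low", "moderate", "serious", "critical"]


def overall_risk(domain_judgements):
    """Return the most severe judgement across the assessed domains."""
    if not domain_judgements:
        raise ValueError("at least one domain judgement is required")
    # LEVELS.index raises ValueError for a label outside the four levels.
    return max(domain_judgements.values(), key=LEVELS.index)


# Hypothetical study; these judgements are not taken from eTable 4.
example = {
    "confounding": "moderate",
    "selection": "serious",
    "classification": "low",
    "deviations": "low",
    "missing_data": "low",
    "outcome_measurement": "low",
    "reported_results": "moderate",
}
print(overall_risk(example))
```

Under this rule a single serious domain is enough to move a study out of the moderate-risk group, regardless of how the other six domains are judged.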

2.6 Data Analysis

Data were pooled across studies using DerSimonian and Laird random-effects models with Mantel-Haenszel weighting for each outcome reported by at least three studies at moderate risk of bias. Meta-analytic results are presented as pooled adjusted HRs with 95% CIs. Heterogeneity was quantified using the I2 statistic. All analyses were conducted using R version 3.2.2.
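For readers unfamiliar with the pooling step, the following Python sketch shows one common form of DerSimonian and Laird random-effects pooling together with the I2 statistic. It is not the code used for this review, which was run in R. As a simplifying assumption it uses inverse-variance weights on the log HR scale, with standard errors recovered from the reported 95% CIs, rather than the Mantel-Haenszel weighting named above. Function names and input values are hypothetical.

```python
import math

Z_975 = 1.959964  # standard normal quantile for a two-sided 95% CI


def log_hr_and_se(hr, ci_low, ci_high):
    """Log hazard ratio and its standard error recovered from a 95% CI."""
    return math.log(hr), (math.log(ci_high) - math.log(ci_low)) / (2 * Z_975)


def dersimonian_laird(estimates):
    """Pool (hr, ci_low, ci_high) tuples with a random-effects model.

    Returns the pooled HR, its 95% CI limits, and I2 in percent.
    """
    if len(estimates) < 2:
        raise ValueError("at least two studies are needed for pooling")
    ys, ses = zip(*(log_hr_and_se(*e) for e in estimates))
    w = [1 / se**2 for se in ses]                  # inverse-variance weights
    y_fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, ys))  # Cochran's Q
    df = len(ys) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                  # between-study variance
    w_re = [1 / (se**2 + tau2) for se in ses]      # random-effects weights
    y_re = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    se_re = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return (math.exp(y_re), math.exp(y_re - Z_975 * se_re),
            math.exp(y_re + Z_975 * se_re), i2)


# Hypothetical study-level estimates (HR, lower limit, upper limit).
studies = [(1.20, 1.00, 1.44), (1.40, 1.10, 1.78), (1.30, 1.05, 1.61)]
pooled, low, high, i2 = dersimonian_laird(studies)
print(f"pooled HR {pooled:.2f} (95% CI {low:.2f}-{high:.2f}), I2 {i2:.0f}%")
```

When the between-study variance is estimated as zero, the random-effects weights reduce to the fixed-effect weights and the two models give the same pooled estimate.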

During the literature search, we observed that some studies used the same data sources. Thus, to avoid the duplicate inclusion of participants in the meta-analysis, we decided that, in cases of chronologically overlapping studies using the same data sources and assessing the same outcome, only the most recent one would be included. Moreover, given that one study combined five different data sources, resulting in overlaps with several other studies, we decided to exclude it from the meta-analysis [25]. However, the results of this study for the two primary outcomes were included in sensitivity analyses where the overlapping studies were excluded instead.
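The rule for overlapping studies can be sketched as follows. This is an illustration only: it assumes that "most recent" is judged by publication year and that chronological overlap means intersecting study periods. The record fields, study labels, and data-source names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StudyResult:
    study: str
    data_source: str
    outcome: str
    publication_year: int
    period: tuple  # (start_year, end_year) of the study period


def periods_overlap(a, b):
    """True when two (start, end) year ranges share at least one year."""
    return a[0] <= b[1] and b[0] <= a[1]


def drop_overlapping(results):
    """Keep the most recent result among overlapping ones that share
    the same data source and outcome."""
    kept = []
    # Visit newer publications first so they take precedence.
    for r in sorted(results, key=lambda r: r.publication_year, reverse=True):
        clash = any(
            k.data_source == r.data_source
            and k.outcome == r.outcome
            and periods_overlap(k.period, r.period)
            for k in kept
        )
        if not clash:
            kept.append(r)
    return kept


# Hypothetical records, not the included studies.
records = [
    StudyResult("A", "source X", "ischemic stroke", 2017, (2010, 2014)),
    StudyResult("B", "source X", "ischemic stroke", 2018, (2012, 2016)),
    StudyResult("C", "source Y", "ischemic stroke", 2018, (2011, 2015)),
    StudyResult("D", "source X", "major bleeding", 2016, (2010, 2013)),
]
print([r.study for r in drop_overlapping(records)])
```

Because the check is made per outcome, a study dropped for one outcome can still contribute to another outcome for which no newer overlapping study exists.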

3 Results

3.1 Search Results

The search yielded 9512 studies, of which 9316 were excluded during title/abstract screening (see Electronic Supplementary Material eFigure 1). The remaining 196 studies underwent full-text review, and 25 of those were included in the systematic review [11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27, 34,35,36,37,38,39,40,41].

3.2 Study Characteristics

All 25 included studies were cohort studies published between 2016 and 2019. They included a total of 1,079,565 patients (380,682 treated with dabigatran, 452,611 with rivaroxaban, and 246,272 with apixaban). The follow-up durations ranged from 89 to 422 days (Table 1). Overall, 15 studies were conducted in North America [11, 12, 16, 17, 19, 20, 23,24,25,26, 34, 36, 38, 39, 41], seven in Europe [14, 18, 21, 22, 27, 37, 40], and three in Asia [13, 15, 35]. Eighteen studies compared dabigatran with rivaroxaban [11,12,13, 15, 17, 18, 21,22,23, 25, 26, 34, 35, 37,38,39,40,41], while 17 also considered apixaban (Table 1) [11, 12, 14, 16, 18,19,20,21, 23,24,25,26,27, 34, 36, 40, 41]. No studies examined edoxaban. One study used two different databases and reported separate estimates for each [36]. While all 25 studies included patients with AF, 18 considered patients initiating oral anticoagulation with DOACs (i.e., new users of DOACs without previous VKA use) [12, 14, 16,17,18,19,20,21,22,23,24,25,26,27, 34, 37, 38, 40], four considered new users of DOACs with previous VKA use [11, 13, 39, 41], one considered new users of dabigatran or rivaroxaban with previous use of VKAs or other DOACs [35], and two considered both new and prevalent users of DOACs [15, 36] (Table 1). In nine studies there were separate analyses for standard-dose and low-dose treatment regimens [18,19,20, 22, 24, 25, 37,38,39].

Table 1 Characteristics of observational studies on effectiveness and safety of direct oral anticoagulants among patients with atrial fibrillation

Patient characteristics including age, heart failure, renal disease, and previous stroke or bleeding differed across studies (see Electronic Supplementary Material eTable 3). CHA2DS2-VASc scores ranged from 1.6 to 4.7, while HAS-BLED scores ranged from 1.2 to 3.7. In 19 studies exposure was defined in an as-treated fashion, where patients were considered continuously exposed until drug discontinuation [11,12,13, 16, 18,19,20,21,22,23,24,25,26,27, 34, 38,39,40,41], five studies used an intention-to-treat approach, where exposure was defined by treatment at cohort entry [15, 17, 35,36,37], and one used a time-dependent exposure definition (censoring follow-up upon discontinuation of oral anticoagulation) [14] in their main analyses. Five studies used alternative exposure definitions in sensitivity analyses [20, 24, 35, 37, 41]. Among the seven studies not explicitly excluding patients with previous VKA use [11, 13, 15, 35, 36, 39, 41], three accounted for it at the stage of statistical analysis [11, 13, 41], while the other four did not [15, 35, 36, 39].

3.3 Assessment of Risk of Bias

Based on ROBINS-I, 19 studies were assigned a moderate risk of bias [11, 12, 16,17,18,19,20,21,22,23,24,25,26,27, 34, 37, 38, 40, 41], four were assigned a serious risk of bias [13, 14, 35, 39], and two were assigned a critical risk of bias [15, 36] (see Electronic Supplementary Material eTable 4). As one of the studies at moderate risk of bias reported only absolute risk differences [18], its results are presented in the tables but not included in qualitative or quantitative data synthesis. One domain leading to a major increase in the risk of bias was ‘risk of bias due to confounding’, resulting from confounding by indication, contraindication, and/or severity associated with previous use of VKAs [15, 35, 36, 39], time-varying confounding due to VKA use during follow-up [14], or from residual confounding due to failure to adjust for important confounders [13]. Eighteen studies used propensity score-based approaches in their analyses to control for confounding [11, 13, 16, 17, 19,20,21,22,23,24,25,26, 34, 37,38,39,40,41]. A propensity score is defined as the probability of getting exposed to a medication, given a set of covariates [42]. As this score summarizes all patient characteristics into a single covariate, it reduces the potential for overfitting. However, the possibility of confounding due to unmeasured covariates cannot be excluded.
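To make the definition of a propensity score concrete, the sketch below estimates it with a logistic regression fitted by plain gradient ascent. This is an assumption made for illustration: the text does not describe the models used by the included studies. The two covariates (a centred, scaled age and a prior-bleeding indicator), the data, and the function names are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_propensity_model(covariates, treated, lr=0.1, epochs=2000):
    """Fit a logistic regression by gradient ascent on the log-likelihood.

    Returns the coefficients as [intercept, coef_1, ..., coef_p].
    """
    n, p = len(covariates), len(covariates[0])
    beta = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for x, t in zip(covariates, treated):
            row = [1.0] + list(x)
            resid = t - sigmoid(sum(b * v for b, v in zip(beta, row)))
            for j, v in enumerate(row):
                grad[j] += resid * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def propensity_score(beta, x):
    """Estimated probability of receiving the treatment given covariates x."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))


# Hypothetical patients: (scaled age, prior bleeding) and treatment received.
covariates = [(-1.0, 0), (-0.5, 0), (0.0, 0), (0.5, 0), (1.0, 0),
              (-1.0, 1), (0.0, 1), (0.5, 1), (1.0, 1), (1.5, 1)]
treated = [0, 0, 1, 0, 1, 0, 1, 1, 1, 1]

beta = fit_propensity_model(covariates, treated)
scores = [propensity_score(beta, x) for x in covariates]
print([round(s, 2) for s in scores])
```

Each patient's covariates are reduced to one number between 0 and 1, which can then be used for matching, weighting, or adjustment. The score can only balance the covariates that enter the model, which is why unmeasured confounding remains possible.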

Another domain responsible for an increased risk of bias was ‘bias in selection of participants into the study’, resulting from the inclusion of previous users of VKAs [35, 39] or DOACs [15, 36], as well as from potential informative censoring in the setting of an as-treated exposure definition [11,12,13, 16, 18, 19, 21,22,23, 25,26,27, 34, 38,39,40,41]. Of note, no study using an as-treated definition included statistical approaches to address informative censoring (e.g., inverse probability of censoring weights). However, three studies using both as-treated and intention-to-treat definitions (in sensitivity analyses) while not having other sources of selection bias were ascribed a low risk in this respect given the complementary nature of these analyses [20, 24, 37]. Moreover, considering the short follow-up of the included studies (< 1 year) and the resulting low risk of exposure misclassification, studies using an intention-to-treat approach were ascribed a low risk of ‘bias in classification of interventions’. Finally, ‘bias in selection of reported results’ due to the absence of a prespecified study protocol also affected the quality of most of the included studies (see Electronic Supplementary Material eTable 4).

3.4 Direct Oral Anticoagulants (DOACs) and Ischemic Stroke

The results for ischemic stroke were heterogeneous for all three comparisons (see Electronic Supplementary Material eTable 5). Fifteen studies compared rivaroxaban with dabigatran, with HRs ranging from 0.73 to 1.92 [12, 13, 15, 17, 18, 21,22,23, 25, 26, 35, 37,38,39, 41]. Nine studies compared apixaban with dabigatran, with HRs ranging from 0.40 to 1.22 [12, 18, 19, 21, 23, 25,26,27, 41]. Finally, eight studies compared apixaban with rivaroxaban, with HRs ranging from 0.67 to 1.27 [12, 18, 19, 21, 23, 25, 27, 41].

3.5 DOACs and Major Bleeding

Ten studies compared the risk of major bleeding between rivaroxaban and dabigatran, showing either a trend towards an increased risk or a significantly increased risk with rivaroxaban, with HRs ranging from 1.05 to 1.69 (see Electronic Supplementary Material eTable 6) [21, 22, 25, 26, 34, 35, 39,40,41]. Fourteen studies compared apixaban with dabigatran, showing either a trend towards a decreased risk or a significantly decreased risk with apixaban (HR range 0.50–0.94) [14, 16, 18,19,20,21, 24,25,26,27, 34, 36, 40, 41]. Finally, 13 studies compared apixaban with rivaroxaban, showing either a trend towards a decreased risk or a significantly decreased risk with apixaban (HR range 0.39–0.88) [14, 16, 18,19,20,21, 24, 25, 27, 34, 36, 40, 41].

3.6 DOACs and Secondary Effectiveness Outcomes

Eight studies compared the risk of all-cause mortality between rivaroxaban and dabigatran, with most of them showing either a trend towards an increased risk or a significantly increased risk for rivaroxaban, with HRs ranging from 0.99 to 1.52 (see Electronic Supplementary Material eTable 7) [13, 22, 23, 26, 35, 37,38,39]. Moreover, three studies compared apixaban with dabigatran, showing no statistically significant difference (HR range 0.91–1.14) [23, 26, 27]. Two studies compared apixaban with rivaroxaban, showing either a trend towards a decreased risk or a significantly decreased risk with apixaban (HR range 0.81–0.94) [23, 27].

Six studies compared the risk of myocardial infarction between rivaroxaban and dabigatran, yielding heterogeneous results, with HRs ranging from 0.62 to 1.11 (see Electronic Supplementary Material eTable 8) [13, 17, 22, 26, 35, 38]. Moreover, one study compared apixaban with dabigatran, showing a strongly decreased risk with apixaban (HR 0.37; 95% CI 0.16–0.84) [26].

Five studies compared the risk of systemic embolism between rivaroxaban and dabigatran, showing either a trend towards an increased risk or a significantly increased risk with rivaroxaban, with HRs ranging from 1.09 to 1.47 (see Electronic Supplementary Material eTable 9) [13, 21, 22, 25, 39]. Two studies compared apixaban with dabigatran, showing a trend towards a decreased risk with apixaban (HR range 0.37–0.76) [19, 25]. Three studies compared apixaban with rivaroxaban, also showing a trend towards a decreased risk with apixaban (HR range 0.49–0.56) [19, 21, 25].

3.7 DOACs and Secondary Safety Outcomes

The results for intracranial hemorrhage were heterogeneous for all three comparisons (see Electronic Supplementary Material eTable 10). Fourteen studies compared rivaroxaban with dabigatran, with HRs ranging from 0.73 to 3.45 [12, 13, 17, 18, 21,22,23, 25, 26, 34, 35, 38, 39, 41]. Ten studies compared apixaban with dabigatran, with HRs ranging from 0.65 to 1.43 [12, 18, 19, 21, 23, 25,26,27, 34, 41]. Finally, nine studies compared apixaban with rivaroxaban, with HRs ranging from 0.51 to 1.39 [12, 18, 19, 21, 23, 25, 27, 34, 41].

Four studies compared the risk of hemorrhagic stroke between rivaroxaban and dabigatran, showing either a trend towards an increased risk or a significantly increased risk with rivaroxaban, with HRs ranging from 1.70 to 4.55 (see Electronic Supplementary Material eTable 11) [21, 25, 26, 41]. Four studies compared apixaban with dabigatran, showing no statistically significant difference (HR range 0.72–1.08) [19, 21, 25, 41]. Finally, four studies compared apixaban with rivaroxaban, yielding heterogeneous results, with HRs ranging from 0.32 to 1.49 [19, 21, 25, 41].

Fourteen studies compared the risk of gastrointestinal bleeding between rivaroxaban and dabigatran (see Electronic Supplementary Material eTable 12) [11,12,13, 17, 18, 21,22,23, 25, 26, 34, 35, 38, 39]. Except for one study showing a trend towards a decreased risk with rivaroxaban (HR 0.85; 95% CI 0.72–1.01) [34], the other studies showed either a trend towards an increased risk or a significantly increased risk with rivaroxaban, with HRs ranging from 1.12 to 1.60 [11,12,13, 17, 18, 21,22,23, 25, 26, 35, 38, 39]. Ten studies compared apixaban with dabigatran, showing either a trend towards a decreased risk or a significantly decreased risk with apixaban (HR range 0.39–0.86) [11, 12, 18, 19, 21, 23, 25,26,27, 34]. Finally, nine studies compared apixaban with rivaroxaban, showing either a trend towards a decreased risk or a significantly decreased risk with apixaban (HR range 0.33–0.94) [11, 12, 18, 19, 21, 23, 25, 27, 34].

Several studies assessed the risk of further bleeding outcomes, including any bleeding [12, 37, 39], major extracranial bleeding [23, 26, 38], hospitalized extracranial bleeding [38], clinically relevant bleeding [22], and urogenital bleeding [22, 27]. The results are shown in Electronic Supplementary Material eTable 13.

The results on DOAC comparative effectiveness and safety did not considerably change when comparing low-dose regimens (see Electronic Supplementary Material eTable 14) or using alternative exposure definitions (see Electronic Supplementary Material eTable 15).

3.8 DOAC Effectiveness and Safety in Higher-Quality Studies

When considering only the 19 studies at moderate risk of bias and only outcomes assessed by more than one study, qualitative data synthesis remained inconclusive regarding the risk of ischemic stroke (HR range for rivaroxaban vs. dabigatran: 0.73–1.12; HR range for apixaban vs. dabigatran: 0.40–1.22; HR range for apixaban vs. rivaroxaban: 0.67–1.27). Data suggested an increased risk of major bleeding for rivaroxaban versus dabigatran (HR range 1.05–1.69), and decreased risks for apixaban versus either dabigatran (HR range 0.50–0.94) or rivaroxaban (HR range 0.39–0.88).

Regarding all-cause mortality, we found a trend towards an increased risk for rivaroxaban versus dabigatran (HR range 0.99–1.52), a similar risk for apixaban versus dabigatran (HR range 0.91–1.14), and a trend towards a decreased risk for apixaban versus rivaroxaban (HR range 0.81–0.94). There was also a similar risk of myocardial infarction for rivaroxaban versus dabigatran (HR range 0.88–1.11). Moreover, data suggested an increased risk of systemic embolism for rivaroxaban versus dabigatran (HR range 1.09–1.39) and a trend towards decreased risks for apixaban versus either dabigatran (HR range 0.37–0.76) or rivaroxaban (HR range 0.49–0.56), albeit all studies had wide 95% CIs.

Regarding intracranial hemorrhage, data suggested an increased risk for rivaroxaban versus dabigatran (HR range 1.05–1.81), but data on apixaban were heterogeneous (HR range vs. dabigatran: 0.65–1.75; HR range vs. rivaroxaban: 0.51–1.39). There was also a trend towards an increased risk of hemorrhagic stroke for rivaroxaban versus dabigatran (HR range 1.70–4.55), a similar risk for apixaban versus dabigatran (HR range 0.72–1.08), and heterogeneous results for apixaban versus rivaroxaban (HR range 0.32–1.49). Finally, regarding gastrointestinal bleeding, the results were heterogeneous for rivaroxaban versus dabigatran (HR range 0.85–1.52) but suggested decreased risks for apixaban versus either dabigatran (HR range 0.39–0.86) or rivaroxaban (HR range 0.33–0.94).

3.9 Meta-Analysis of Higher-Quality Studies

There was a similar risk of ischemic stroke for rivaroxaban versus dabigatran (six studies; HR 0.93; 95% CI 0.83–1.04; I2: 0%), apixaban versus dabigatran (five studies; HR 0.94; 95% CI 0.82–1.09; I2: 0%), and apixaban versus rivaroxaban (four studies; HR 1.07; 95% CI 0.93–1.23; I2: 0%) (Table 2, Fig. 1). Regarding major bleeding, there was an increased risk for rivaroxaban versus dabigatran (six studies; HR 1.33; 95% CI 1.20–1.47; I2: 22%) and decreased risks for apixaban versus either dabigatran (eight studies; HR 0.71; 95% CI 0.64–0.78; I2: 0%) or rivaroxaban (eight studies; HR 0.56; 95% CI 0.48–0.65; I2: 69%) (Table 2, Fig. 2).

Table 2 Results of meta-analyses for the comparative effectiveness and safety of direct oral anticoagulants among patients with atrial fibrillation

Fig. 1 Forest plots demonstrating individual and pooled relative risks of ischemic stroke for the comparison rivaroxaban versus dabigatran in patients with atrial fibrillation. CI confidence interval, HR hazard ratio

Fig. 2 Forest plots demonstrating individual and pooled relative risks of major bleeding for head-to-head comparisons among different direct oral anticoagulants in patients with atrial fibrillation. CI confidence interval, HR hazard ratio

There was a borderline increased risk of all-cause mortality for rivaroxaban versus dabigatran (four studies; HR 1.13; 95% CI 1.00–1.28; I2: 38%) and a similar risk for apixaban versus dabigatran (three studies; HR 1.00; 95% CI 0.85–1.19; I2: 60%) (Table 2; see also Electronic Supplementary Material eFigure 2). There was also a similar risk of myocardial infarction for rivaroxaban versus dabigatran (four studies; HR 0.98; 95% CI 0.86–1.12; I2: 0%) (Table 2; see also Electronic Supplementary Material eFigure 3) and of systemic embolism for the same comparison (three studies; HR 1.19; 95% CI 0.77–1.82; I2: 0%) (Table 2; see also Electronic Supplementary Material eFigure 4).

Regarding intracranial hemorrhage, there was an increased risk for rivaroxaban versus dabigatran (seven studies; HR 1.71; 95% CI 1.46–2.01; I2: 0%) but a similar risk for apixaban versus either dabigatran (six studies; HR 1.27; 95% CI 0.98–1.63; I2: 10%) or rivaroxaban (five studies; HR 0.80; 95% CI 0.59–1.08; I2: 37%) (Table 2; see also Electronic Supplementary Material eFigure 5). The studies assessing hemorrhagic stroke observed similar estimates (Table 2; see also Electronic Supplementary Material eFigure 6). Regarding gastrointestinal bleeding, there was an increased risk for rivaroxaban versus dabigatran (seven studies; HR 1.17; 95% CI 1.02–1.33; I2: 69%) and decreased risks for apixaban versus either dabigatran (six studies; HR 0.59; 95% CI 0.46–0.75; I2: 72%) or rivaroxaban (five studies; HR 0.56; 95% CI 0.36–0.86; I2: 92%) (Table 2; see also Electronic Supplementary Material eFigure 7). Finally, the results for the two primary outcomes did not change when including the study by Lip et al. [25] (see Electronic Supplementary Material eFigures 8, 9).

4 Discussion

The objective of our study was to synthesize the available real-world evidence on the comparative effectiveness and safety of DOACs. Overall, we identified 25 studies that met our inclusion criteria. Considering only 19 higher-quality studies, our meta-analyses suggest a similar risk of ischemic stroke for rivaroxaban versus dabigatran (HR 0.93; 95% CI 0.83–1.04), apixaban versus dabigatran (HR 0.94; 95% CI 0.82–1.09), and apixaban versus rivaroxaban (HR 1.07; 95% CI 0.93–1.23). Moreover, we observed an increased risk of major bleeding for rivaroxaban versus dabigatran (HR 1.33; 95% CI 1.20–1.47) and decreased risks for apixaban versus either dabigatran (HR 0.71; 95% CI 0.64–0.78) or rivaroxaban (HR 0.56; 95% CI 0.48–0.65).

Some studies included in this systematic review had several limitations that warrant consideration. Using the ROBINS-I tool, we found that 19 studies were assigned a moderate risk of bias [11, 12, 16,17,18,19,20,21,22,23,24,25,26,27, 34, 37, 38, 40, 41], while six studies were assigned a serious or critical risk of bias [13,14,15, 35, 36, 39]. A potential limitation observed in most studies with a serious or critical risk of bias was confounding by indication, contraindication, and/or severity related to previous use of VKAs. The remaining studies considered previous VKA use in their design, either by matching on propensity scores that included previous VKA use as a variable or by excluding previous VKA users. While the first approach does not eliminate the possibility of residual confounding since aspects such as duration of previous VKA use are not taken into consideration, the second approach may yield findings of decreased generalizability as many DOAC users are previous VKA users [43]. The prevalent new-user study design, a newly developed approach incorporating both new users and switchers from previous medications that considers the duration of previous treatment, could offer an alternative in this setting [44]. Another major limitation was the indiscriminate inclusion of prevalent users [15, 36], which may result in under-ascertainment of early adverse events, depletion of susceptibles, and adjustment for covariates that are on the causal pathway [45, 46].

Our findings of a similar risk of ischemic stroke among the DOACs as well as the decreased risk of major bleeding with apixaban compared with either rivaroxaban or dabigatran are congruent with those of a recent systematic review of network meta-analyses of randomized controlled trials [47]. Moreover, our findings that rivaroxaban could be associated with an increased risk of major bleeding and all-cause mortality compared with dabigatran are congruent with those of the meta-analysis by Bai et al. [7]. However, while Bai et al. [7] reported no differences between rivaroxaban and dabigatran regarding intracranial hemorrhage (HR 1.22; 95% CI 0.85–1.59), our pooled estimate suggested a 71% increased risk for rivaroxaban. A possible explanation for this discrepancy is that Bai et al. [7] also included two studies that suggested a decreased risk for rivaroxaban which were assigned a serious risk of bias in our quality assessment [35, 39].

The higher risks for different types of bleeding observed with rivaroxaban compared with dabigatran or apixaban could be a result of the dosing regimens. Indeed, while DOACs have similar plasma half-lives [48], rivaroxaban is given once daily whereas dabigatran and apixaban are given twice daily. It is conceivable that once-daily regimens could lead to higher peak concentrations and to an increased risk of bleeding. However, to our knowledge, a correlation between rivaroxaban plasma concentrations and bleeding events has yet to be shown.

Our study has several strengths. First, it provides an up-to-date synthesis of the available literature in a dynamically evolving field, including several recent studies not captured in previous systematic reviews and considering more than one million DOAC users overall. Second, this study presents robust data on the comparative effectiveness and safety of apixaban, a relatively recently approved DOAC. Finally, we used ROBINS-I to evaluate the quality of the included studies, a tool that enables a robust assessment of the risk of different biases such as confounding or selection bias, and restricted meta-analysis to higher-quality studies.

Our study also has some limitations. First, our review is affected by the limitations of the included studies, such as residual confounding due to clinical data not typically captured by administrative databases (e.g., smoking, diet). Second, while the exclusion of studies with < 1000 DOAC users provides an objective, pre-specified threshold based on underlying event rates, there is a possibility that some underpowered but potentially eligible studies could have been excluded. Finally, as the included studies were conducted using computerized healthcare databases from different jurisdictions, confounding due to jurisdiction-specific factors such as formulary restrictions cannot be excluded.

5 Conclusions

Our systematic review and meta-analysis suggest no major differences in the risk of ischemic stroke, all-cause mortality, myocardial infarction, or systemic embolism between dabigatran, rivaroxaban, and apixaban in patients with AF. However, rivaroxaban is associated with an increased risk of bleeding compared with dabigatran, while apixaban is associated with a decreased risk of bleeding compared with either dabigatran or rivaroxaban. Thus, current observational evidence supports the notion that while differences among DOACs regarding effectiveness appear to be small, apixaban should be preferred in AF patients at higher risk of bleeding.