Introduction

Stroke is a leading cause of disability worldwide and impacts multiple aspects of health-related quality-of-life (HRQoL) [1]. Approximately 15–20% of strokes are attributable to atrial fibrillation [2, 3]. Among patients with ischemic stroke, those with atrial fibrillation have worse functional outcomes than those without atrial fibrillation [4].

Currently, guidelines recommend using the CHA2DS2-VASc score to guide anticoagulation use in patients with atrial fibrillation, with the goal of preventing stroke [5]. The proportion of strokes attributable to atrial fibrillation increases with age [6]. As such, age is weighted heavily in the CHA2DS2-VASc score and the subsequent decision to initiate anticoagulation [7]. Yet, there is an increasing focus on alternative endpoints like HRQoL, treatment satisfaction, and mortality rather than solely stroke incidence [8, 9]. These endpoints are influenced by multiple time-dependent factors, so there is a greater need to account for competing risks and differential attrition over time. For example, while atrial fibrillation accounts for more than one-third of all strokes in adults over 80 years, the effectiveness of anticoagulation in preventing death decreases significantly with age after accounting for the competing risk of other causes of death [10].

As has been shown with analyses of the association between digoxin use and mortality in Atrial Fibrillation Follow-up Investigation of Rhythm Management (AFFIRM) [11,12,13], variation in longitudinal models of time-varying factors across the life course and selection bias from competing risks can lead to contradictory results. Failure to account for death as a competing risk may lead to an overestimation of stroke incidence in those with atrial fibrillation [14]. Notably, it is unknown how competing risk affects the impact of anticoagulation on HRQoL over time. The decision to initiate or discontinue anticoagulation likely is both affected by and affects other determinants of HRQoL. If not properly accounted for, this time-varying interplay between anticoagulation and other determinants of HRQoL could lead to biased effect estimates [15].

The purpose of this study is not to directly impact clinical practice today as management of atrial fibrillation has significantly changed since the publication of the AFFIRM trial. Rather, we intend to demonstrate methodological considerations in conducting longitudinal research on anticoagulation and patient-reported outcomes, especially among the elderly. Accordingly, this study aims to demonstrate potential biases in the longitudinal study of patients with atrial fibrillation and how to account for them. Using causal inference methods, we investigate if anticoagulation improves HRQoL over time in patients with atrial fibrillation after accounting for time-varying covariates and selection bias from attrition in a post-hoc analysis of the AFFIRM Quality of Life Substudy. We also examine if the aforementioned relationship differs as a function of age and if stroke seems to mediate the association between anticoagulation and HRQoL over time. Ultimately, we hope this study provides a framework for future analytic approaches to longitudinal research on patient-reported outcomes in individuals with atrial fibrillation.

Methods

Study population

The AFFIRM study is a randomized clinical trial, published in 2002, that compared all-cause mortality between rhythm and rate control in patients with atrial fibrillation [16]. Details of the study design have been previously published [16]. Of the original sites, 25% were randomly selected to participate in the Quality of Life substudy, which assessed quality of life differences between those randomized to rate versus rhythm control. Participants at these sites were then asked to participate in the substudy. A total of 716 individuals chose to participate, representing 17.6% of the 4060 participants in the original AFFIRM trial [17]. Participants in the substudy were given a battery of questionnaires at baseline and at each follow-up visit on various aspects of quality of life, including perceived health-related quality of life. This measure is a non-specific, generic measure of quality of life. Follow-up visits occurred at 2 months after randomization, 1 year, and annually for up to 6 years. Patients’ data were administratively censored at the time of last contact or withdrawal from the study. We limited the analytic sample to those with more than 1 visit. Primary results of the AFFIRM HRQoL substudy have been previously published and demonstrated no significant difference in HRQoL between those on rate vs. rhythm control [17].

Exposure

Time-varying warfarin use, dichotomized into use versus no use, was measured at each follow-up visit. Self-reported warfarin use was assessed at each visit by the yes–no question “Current anticoagulation with warfarin?”. In AFFIRM, continuous anticoagulation with warfarin was encouraged but could be stopped at the physician's discretion in the rhythm control group. In the rate-control group, continuous anticoagulation was mandated in the original study. However, in both treatment arms, there was a deviation from the study’s protocol and at various time points participants were not taking warfarin. Up to 15% of patients in the rate-control arm were not taking warfarin at each time point. Approximately 30% of patients were not taking warfarin throughout the trial duration in the rhythm-control arm. Reasons for discontinuation of warfarin included interval history of bleeding, physician discretion, patient discretion, frailty or fall risk, upcoming surgery, or ‘other’ (which was not further delineated in the original study).

As a sensitivity analysis, we used INR as our exposure rather than warfarin use. The variable was dichotomized to therapeutic (INR 2–3), versus not therapeutic (INR < 2 or INR > 3) or not taking warfarin (see Supplemental Table 1 for a percentage of patients within each exposure group at the beginning and end of study). As with our primary exposure measure, this measurement was time-varying at each time point in the study. We used this measure at each time point rather than the summative composite measure of time in therapeutic range (TTR), as TTR does not account for time-varying covariates impacting INR at each discrete time point throughout the duration of the study.
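The dichotomized INR exposure can be sketched as a small function. This is only an illustration, not the authors' Stata code: the function name and arguments are hypothetical, treating the bounds 2 and 3 as inclusive is an assumption, and so is placing visits with no recorded INR in the comparison group.

```python
from typing import Optional


def therapeutic_inr(on_warfarin: bool, inr: Optional[float]) -> int:
    """Return 1 if the visit counts as therapeutic (INR 2-3), else 0.

    Assumptions: the bounds 2 and 3 are inclusive, and visits off warfarin
    or with no INR recorded fall in the comparison group (0).
    """
    if not on_warfarin or inr is None:
        return 0
    return 1 if 2.0 <= inr <= 3.0 else 0


# Hypothetical visits for one participant: (on warfarin?, INR value)
visits = [(True, 2.4), (True, 1.7), (True, 3.6), (False, None)]
exposure = [therapeutic_inr(w, v) for w, v in visits]
print(exposure)
```

Because the function is applied visit by visit, the exposure stays time-varying, in contrast to a single summary such as TTR.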

Outcome

Our primary outcome was perceived health-related quality of life (HRQoL), a time-varying variable ascertained at each visit. Participants were asked, “In general, would you say your health is...” and instructed to select one of the following five possible responses: “excellent,” “very good,” “good,” “fair,” or “poor.” In our primary analysis, we treated the variable as an ordinal variable using the original 5-point scale. Lower scores represent better quality of life. Death was treated as the worst possible HRQoL score (“poor,” a 5 on the ordinal Likert scale) in the above measure and at all subsequent visits following death to minimize possible selection bias [10].

An additional sensitivity analysis was performed dichotomizing the outcome to “fair” or “poor” vs. “good”, “very good”, or “excellent” to help with comparability with the original quality of life substudy [15].
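The outcome coding described above (the 5-point ordinal scale, death assigned the worst score at all later visits, and the dichotomized version) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the visit indexing, and the example participant are hypothetical and are not taken from the study's code.

```python
from typing import List, Optional

# 5-point ordinal scale from the questionnaire; lower scores are better.
HRQOL_SCORE = {"excellent": 1, "very good": 2, "good": 3, "fair": 4, "poor": 5}
WORST_SCORE = 5


def code_outcome(responses: List[Optional[str]],
                 death_visit: Optional[int]) -> List[Optional[int]]:
    """Map responses to scores, assigning the worst score from death onward.

    responses[i] is the answer at visit i (None if not collected);
    death_visit is the index of the first scheduled visit after death,
    or None if the participant did not die.
    """
    scores: List[Optional[int]] = []
    for i, answer in enumerate(responses):
        if death_visit is not None and i >= death_visit:
            scores.append(WORST_SCORE)
        elif answer is None:
            scores.append(None)
        else:
            scores.append(HRQOL_SCORE[answer])
    return scores


def dichotomize(score: Optional[int]) -> Optional[int]:
    """1 for 'fair' or 'poor' (score >= 4), 0 for 'good' or better."""
    return None if score is None else int(score >= 4)


# Hypothetical participant who died before the fourth scheduled visit.
ordinal = code_outcome(["good", "very good", "fair", None, None], death_visit=3)
binary = [dichotomize(s) for s in ordinal]
print(ordinal, binary)
```

Coding death as a score keeps deceased participants in the outcome series rather than dropping them, which is the selection-bias concern the text raises.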

Covariates

Participant characteristics were collected by self-reported responses to questionnaires at the initiation of the study prior to randomization, as well as at the follow-up visits through the duration of the study. Demographic variables included gender, race (white vs. not-white), and age. Socioeconomic status was captured via education level (less than completion of high school, completion of high school or GED, completion of college or more), and employment status (part-time or full-time employment vs. not employed).

Variables collected on medical conditions prior to randomization included the history of the following conditions: coronary artery disease, angina pectoris, myocardial infarction, congestive heart failure, hypertension, cardiomyopathy, valvular heart disease, congenital heart disease, symptomatic bradycardia or heart block, stroke or TIA, peripheral vascular disease, diabetes, hepatic or renal disease, pulmonary disease, and smoking within 2 years. To capture multimorbidity, these conditions were summed into a composite number of comorbidities (range 0–9) as has been done in prior studies [18]. Information was also collected for the occurrence of chest pain, dizziness, dyspnea, edema, fatigue, palpitations, panic, or syncope while in atrial fibrillation. These symptoms were summed to create a composite number of atrial fibrillation symptoms (range 0–10) as is commonly done in atrial fibrillation studies [19].

Variables collected at repeated follow-up visits included aspirin use, discontinuation of rate or rhythm-controlling medication since the last visit, use of an anti-arrhythmic medication, blood pressure class (class 1: SBP < 130 and DBP < 80, class 2: SBP 130–139 or DBP 80–89, and class 3: SBP ≥ 140 or DBP ≥ 90), angina status per the Canadian Cardiovascular Society (dichotomous presence of angina) [20], congestive heart failure status per the NYHA classification system (Class I–IV, higher number being worse) [21], number of non-arrhythmia medications (range 0–8), electrical cardioversion since last visit (yes/no), hospitalization since last visit (yes/no), minor bleeding since last hospitalization (yes/no), use of anti-arrhythmia medication since last visit (yes/no), and atrial fibrillation or flutter documented by EKG since last visit.

As incident stroke may occur temporally after anticoagulation, we did not adjust for it in our primary analyses to avoid the bias that may occur when adjusting for a mediator [22]. In our non-causal exploratory mediation analysis, we included the variable ischemic stroke since last visit in our models to see if associations were attenuated after accounting for stroke. This variable was collected at every visit.

Statistical analysis

Descriptive statistics were calculated for covariates by baseline warfarin use at timepoint 0 (study initiation), as well as at the last time-point for each individual prior to censoring or the end of the study period. Differences in categorical covariates were assessed using the chi-square test. Differences in continuous variables were assessed using ANOVA for normally distributed covariates or the Kruskal–Wallis test for non-normally distributed covariates. Normality was determined with visual inspection of Q-Q plots and confirmed empirically using the Shapiro–Wilk test.

The directed acyclic graphs (Supplemental Figs. 1, 2) demonstrate the potential for bias both from time-varying covariates and differential attrition. From these, we operationalized our statistical approach. To assess the impact of these potential biases on the association between warfarin use and HRQoL, we modelled the association in various ways. First, we examined the unadjusted association. Next, we used a generalized mixed-effects ordinal regression (a generalized mixed model, abbreviated henceforth as a ‘GMM’) to show the association after adjusting for a priori confounders. We then fit a confounder-adjusted generalized estimating equation (GEE), an alternative longitudinal analysis method. GMMs support individual-level inference by including random effects for a given individual’s repeated measures. Conversely, GEEs support population-averaged inference by employing a quasi-likelihood function. Both GMMs and GEEs are often used in longitudinal repeated-measure analyses.

To examine the effect of time-varying confounders, we then repeated the analysis using inverse probability of treatment weights (IPTW). As prior warfarin use and adjusted covariates can both mediate and confound each other over time, IPTW creates an idealized pseudopopulation at each timepoint in which covariates are balanced between the two exposure groups. The difference in estimates between the adjusted and weighted models reflects the bias introduced when not accounting for the time-varying nature of our covariates and warfarin use [23].
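One common way to build such weights is as a cumulative product of visit-level probability ratios. The sketch below assumes stabilized weights, which the text does not specify, and takes the fitted probabilities as given inputs (in practice they would come from treatment models such as logistic regressions). The function name and example values are hypothetical.

```python
from typing import List


def stabilized_iptw(p_num: List[float], p_den: List[float]) -> List[float]:
    """Cumulative stabilized treatment weights for one participant.

    p_num[t]: probability of the observed treatment at visit t given
              treatment history (and baseline covariates) only.
    p_den[t]: probability of the observed treatment at visit t given
              treatment history and time-varying covariates.
    """
    if len(p_num) != len(p_den):
        raise ValueError("numerator and denominator must have equal length")
    weights: List[float] = []
    running = 1.0
    for num, den in zip(p_num, p_den):
        if not 0.0 < den <= 1.0:
            raise ValueError("denominator probabilities must lie in (0, 1]")
        running *= num / den  # weight at visit t is the product up to t
        weights.append(running)
    return weights


# Hypothetical fitted probabilities at three visits.
w = stabilized_iptw([0.9, 0.9, 0.8], [0.9, 0.6, 0.8])
print([round(x, 3) for x in w])
```

In this example the second visit is up-weighted because the observed treatment was less likely once time-varying covariates were considered; participants whose treatment is well predicted by history alone keep weights near 1.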

Variables used to create the weights were based on a priori theoretical selection (see supplemental material for a list of a priori selected variables), with a stepwise backward regression used to optimize model fit and minimize overadjustment and collinearity. After this selection step, the final weights were based on blood pressure, comorbidity score, number of atrial fibrillation symptoms, age, gender, and angina status. We additionally incorporated amiodarone use, other anti-arrhythmia medication use, and atrial fibrillation status into the adjusted and weighted models given their high a priori potential to impact warfarin use, quality of life, or drop-out from the study. Both graphical assessment and weighted standardized differences before and after weighting were used to ensure the balance of the covariates by warfarin use in the weighted models [23].
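A balance check based on standardized differences can be sketched as below. The formula used here (difference in weighted means divided by the square root of the average of the two weighted variances) is one common definition and is an assumption, as is the simple weighted variance; the source does not state which variant was used. All names and values are hypothetical.

```python
from math import sqrt
from typing import List, Tuple


def weighted_mean_var(x: List[float], w: List[float]) -> Tuple[float, float]:
    """Weighted mean and simple (population-style) weighted variance."""
    total = sum(w)
    mean = sum(wi * xi for xi, wi in zip(x, w)) / total
    var = sum(wi * (xi - mean) ** 2 for xi, wi in zip(x, w)) / total
    return mean, var


def standardized_difference(x1: List[float], w1: List[float],
                            x0: List[float], w0: List[float]) -> float:
    """(mean1 - mean0) / sqrt((var1 + var0) / 2), each group weighted."""
    m1, v1 = weighted_mean_var(x1, w1)
    m0, v0 = weighted_mean_var(x0, w0)
    return (m1 - m0) / sqrt((v1 + v0) / 2.0)


# Hypothetical ages in the warfarin (1) and no-warfarin (0) groups.
age_1, age_0 = [70.0, 80.0], [60.0, 70.0]
before = standardized_difference(age_1, [1.0, 1.0], age_0, [1.0, 1.0])
after = standardized_difference(age_1, [3.0, 1.0], age_0, [1.0, 3.0])
print(round(before, 3), round(after, 3))
```

Setting every weight to 1 gives the unweighted difference, so the same function covers the before and after comparison; a smaller absolute value after weighting indicates improved balance on that covariate.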

We created inverse probability of censoring weights (IPCW) using the same covariates used to create the IPTW to ensure consistency between the treatment and censoring weights. These weights were then applied to demonstrate the bias occurring when not accounting for differential attrition secondary to competing events or other causes of differential loss-to-follow-up. In the final model, both IPCW and IPTW were applied to create a causal model to estimate the association of warfarin use with HRQoL over time, after accounting for both time-varying covariates and differential attrition. As a secondary exploratory analysis, we repeated the final model but adjusted for incident ischemic stroke to explore whether the relation between anticoagulation and HRQoL may be mediated by ischemic stroke. For all variables, fewer than 5% of values were missing. To handle this missingness, we carried the last known value forward.
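Two of the steps in this paragraph are mechanical enough to sketch: carrying the last known value forward, and combining the treatment and censoring weights by multiplication at each visit. The function names and example values are hypothetical, and leaving leading gaps unfilled is an assumption about how missing baseline values would be handled.

```python
from typing import List, Optional


def locf(values: List[Optional[float]]) -> List[Optional[float]]:
    """Carry the last known value forward; leading gaps stay missing."""
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def combined_weights(iptw: List[float], ipcw: List[float]) -> List[float]:
    """Visit-level product of treatment and censoring weights."""
    if len(iptw) != len(ipcw):
        raise ValueError("weight series must cover the same visits")
    return [t * c for t, c in zip(iptw, ipcw)]


# Hypothetical systolic blood pressure series with gaps, and weight series.
sbp = locf([128.0, None, 135.0, None, None])
final_w = combined_weights([1.0, 1.5, 1.5], [1.0, 1.25, 2.0])
print(sbp, final_w)
```

The product weight is then used in the outcome model so that each observed visit stands in for similar visits that were lost to censoring or that received the other treatment.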

The above analyses were then repeated in groups stratified by age at enrollment (< 70 vs ≥ 70 years) to examine the differential association between anticoagulation and HRQoL by age. Age was dichotomized at 70 years to account for non-linearity between age and HRQoL as well as to preserve the sample sizes in both age groups. We then estimated the change in probability of each HRQoL level if an individual stays on warfarin the entire time up to a given time point versus not being on warfarin up to that point.
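To illustrate how an ordinal model turns into a probability for each HRQoL level, the sketch below assumes a proportional-odds (cumulative logit) form and compares a linear predictor with and without a warfarin term. The cutpoints and the log odds ratio are invented for illustration and are not estimates from the study; the study's models also include time and interaction terms that are omitted here.

```python
from math import exp
from typing import List


def category_probabilities(cutpoints: List[float], xb: float) -> List[float]:
    """Category probabilities under a proportional-odds model.

    Assumes P(Y <= j) = 1 / (1 + exp(-(cutpoints[j] - xb))) with increasing
    cutpoints, so a larger linear predictor xb shifts probability toward
    higher (worse) scores.
    """
    cumulative = [1.0 / (1.0 + exp(-(c - xb))) for c in cutpoints] + [1.0]
    probs: List[float] = []
    previous = 0.0
    for cum in cumulative:
        probs.append(cum - previous)
        previous = cum
    return probs


# Hypothetical cutpoints for the 5-level outcome and a hypothetical log odds
# ratio for warfarin; neither is taken from the study's fitted models.
cutpoints = [-2.0, -0.5, 1.0, 2.5]
log_or_warfarin = -0.5

p_off = category_probabilities(cutpoints, xb=0.0)
p_on = category_probabilities(cutpoints, xb=log_or_warfarin)
change = [on - off for on, off in zip(p_on, p_off)]
print([round(c, 3) for c in change])
```

With an odds ratio below 1, the change is positive for "excellent" and negative for "poor", and the changes across the five levels sum to zero because each set of probabilities sums to one.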

Sensitivity analyses

Multiple sensitivity analyses were conducted to ensure the consistency of our results across different exposure and outcome classifications. First, rather than using a carry-forward method to address our missing data, we repeated our primary analyses using multiple imputation by chained equations (MICE). Five imputed datasets were used to estimate the missingness models. We then completed a sensitivity analysis using a dichotomized outcome rather than an ordinal outcome to assist with comparability with other studies. We next used the exposure of INR 2–3 vs. not 2–3 or not taking warfarin to help with assessing the idealized therapeutic exposure. While there may be heterogeneity of effect between an INR > 3 and INR < 2, we kept the sensitivity analysis as a dichotomized exposure as it better represents the clinical decision heuristic of a patient being within the therapeutic range or not. Lastly, we repeated our main analyses but removed those with baseline ischemic strokes from the analytic sample to see whether results changed when limited to stroke-free participants.

Data access and analysis

Alen Delic had full access to all the data in the study and takes responsibility for its integrity and data analysis. The analysis was completed in Stata v17 [24]. We have provided our statistical code in the supplemental material (see Supplemental appendix).

Results

Descriptive statistics

From the original AFFIRM quality of life study of 716 individuals, 60 were removed for having only a baseline visit, yielding a final analytic sample of 656 individuals. The mean number of visits, including baseline visit, per individual was 4.6 visits with a standard deviation of 1.0 [range 3–7, median: 5, interquartile range (IQR) 4–5]. The mean number of follow-up years per individual was 2.99 with a standard deviation of 0.97 (range 0.36–5.01, median 3.02 years, IQR 2.04–3.98 years). At the baseline visit, 601 individuals were taking warfarin while 55 were not taking warfarin. At the final visit, 519 individuals were taking warfarin while 137 were not. Supplemental Fig. 3 shows the number of individuals in the analytic sample at each time point based on their initial warfarin use status. Complete descriptive statistics, including baseline covariate characteristics and time-varying follow-up variables stratified by baseline warfarin use, are available in Table 1. Those not on warfarin at baseline were more likely to be younger, less likely to be in atrial fibrillation or atrial flutter at baseline, more likely to be taking fewer non-arrhythmia medications, more likely to die, and less likely to be censored for non-death reasons.

Table 1 Descriptive statistics by baseline warfarin use at baseline and final visit

Association of warfarin with HRQoL over time

Table 2 displays the association of warfarin use over time with the ordinal outcome of HRQoL via the unadjusted model, fully adjusted generalized mixed model, generalized estimating equation model, IPTW model, IPCW model, and IPCW*IPTW cross-product weighted model. In the unadjusted, univariate model, warfarin was marginally associated with improved HRQoL though not at a statistically significant level (OR 0.71, 95% CI 0.44–1.14). Following adjustment, the association between warfarin and HRQoL strengthened, reaching statistical significance (OR 0.61, 95% CI 0.38–0.97). In both the IPCW and IPTW models, the association strengthened further. In the final cross-product model, warfarin use had a significant association with better HRQoL (OR 0.30, 95% CI 0.14–0.55).

Table 2 Association of warfarin use, time, and warfarin use over time with perceived health-related quality of life using different modeling strategies

With respect to the interaction of years*warfarin (i.e. the effect of time on the association between warfarin and HRQoL) the OR magnitude varied based on the model and at times was not statistically significant (Table 2), but remained < 1.0 in all models, suggesting warfarin’s association with better HRQoL may improve over time. Figure 1 displays the probability of each HRQoL score over time for a given individual on warfarin.

Fig. 1 The estimated change in marginal probability over time of achieving each HRQoL score while on warfarin

Figure 2 displays the same information, though now stratified into participants 70 years and older vs. participants younger than 70 years. Upon stratification by age, effect modification by time on the association was statistically significant only in those 70 years or older (Fig. 2). Among those 70 years or older, there was an estimated roughly 30% decrease in the probability of reporting “poor” HRQoL if one took warfarin for 1500 days from study initiation compared to someone who did not take warfarin during that time. Inclusion of new incident ischemic stroke did not significantly affect the association in our primary model (results not shown).

Fig. 2 The estimated change in marginal probability over time of achieving each HRQoL score while on warfarin, stratified by age

Sensitivity analyses

When employing MICE rather than a carry-forward method to deal with missing values, the direction of effect of warfarin remained consistent in our final causal IPTW*IPCW model, though the association of warfarin over time (warfarin*time interaction term) was attenuated (see Supplemental Table 2 for full results when using MICE). When using MICE, the age*warfarin interaction found in our IPTW*IPCW model persisted (OR 0.03, 95% CI 0.003–0.24). In sensitivity analyses using a dichotomous outcome, warfarin use was associated with an odds ratio of 0.37 (95% CI 0.12–1.12) for fair or poor HRQoL in our final weighted model. Using the therapeutic INR range of 2–3 did not significantly change the association magnitudes or direction of effect. Removal of participants with a history of stroke prior to study enrollment also did not significantly alter the results.

Discussion

The current study examined the association of warfarin with HRQoL over time in participants with atrial fibrillation using data from the AFFIRM trial. Over the trial duration, warfarin use was associated with improved HRQoL in adjusted models. There was no statistically significant univariate relationship, but the association strengthened with standard adjustment methods and further strengthened when accounting for time-varying covariates and differential censoring. The effect estimate was similar when using a dichotomous outcome rather than the original ordinal scale, when using INR 2–3 as the exposure, and when using multiple imputation by chained equations to deal with missing values. The association strengthened over time in patients 70 years or older at study initiation. Surprisingly, ischemic stroke did not appear to mediate the association, suggesting possible unmeasured confounding by indication, a key limitation of the study that warrants replication using our analytic framework in a population-based cohort study. Nonetheless, our results suggest that accounting for time-varying covariates and differential attrition may alter the effect estimates of warfarin use over time with HRQoL in those with atrial fibrillation, particularly in those 70 years or older.

While we acknowledge that the treatment paradigm of atrial fibrillation has significantly changed since the publication of the AFFIRM trial, there are two key methodological findings of the current study that add to the current understanding of how anticoagulation impacts quality of life in those with atrial fibrillation: (1) demonstrating the potential biasing impact of time-varying covariates and differential attrition and (2) demonstrating anticoagulation use over time may have differential effects on HRQoL by age.

Past research has shown competing risks may inflate estimates of stroke risk in traditional analyses [14]. Additionally, a decision-analysis study determined the net clinical benefit of both warfarin and apixaban decreases with age, with the decreasing association being driven largely by the competing risk of death [10]. Our study builds upon these findings, highlighting the importance of different modelling strategies in assessing the impact of warfarin use on HRQoL in adults with atrial fibrillation. First, in comparison to prior studies [10, 14], we included death in our composite outcome as the worst possible outcome rather than treating it as a competing risk. Further, we accounted for the fact that the decision to initiate or discontinue warfarin is likely both impacted by, and itself impacts, other covariates that influence HRQoL. The differences between the traditionally adjusted models and the marginal structural approach highlight the importance of potential biasing effects by time-varying covariates [25]. Similarly, we highlight how differential attrition, from competing events or otherwise, can also lead to biased estimates [26]. Unlike the decision-model-based study, we used primary participant responses to determine HRQoL rather than standardized quality-adjusted life years (QALYs). This is relevant as the standardized QALY-based method may not account for variability in reporting based on other factors [27]. Additionally, there is a wide variability of HRQoL for various conditions that may be hard to capture using surrogate standardized methods rather than direct, survey-based methods [28]. This theoretically could lead to measurement error or information bias if non-differential in nature. Cumulatively, if not accounted for, these methodological considerations may lead to underestimation of the influence of anticoagulation on HRQoL.

Nonetheless, this study has many important limitations. The most prominent limitation of this study is that the decision to initiate or discontinue warfarin (or anticoagulation more generally) may not be representative of the general population, but rather influenced by the original trial protocol. The stringent study protocol may be particularly important in relation to this study, as only 55 individuals were not on warfarin at the onset of the study. As such, we urge caution in overinterpreting our results and suggest they be used as a framework for conducting future longitudinal atrial fibrillation and anticoagulation population studies rather than being interpreted as causal. Additionally, direct oral anticoagulants (DOACs) have become the standard of treatment for most cases of nonvalvular atrial fibrillation, and our methodological framework needs to be replicated with individuals on DOACs. Modern-day, prospective population-based cohort studies would provide a more accurate representation of this association in a real-world clinical population not bounded by trial protocols, especially if examining current standards of treatment for atrial fibrillation rather than warfarin use.

The primary analytic sample in this study may be healthier than the general population of adults with atrial fibrillation due to the enrollment and exclusion criteria of the original AFFIRM trial. A similar methodologic framework could also be implemented to build on prior quality-of-life research following ablation [29, 30].

As with all observational research, there may be unmeasured confounding or unmeasured covariates that we did not account for in our weights or adjustment-based analyses, particularly given the inability to ascertain comorbidities at subsequent visits. These unmeasured variables may explain why adjusting for ischemic stroke did not influence the effect estimates. Furthermore, the lack of attenuation of effect estimates when adjusting for stroke does suggest a high likelihood of unmeasured confounding by indication. Especially in the context of a clinical trial where the decision to come on or off of anticoagulation is dictated by strict clinical trial protocols, there is likely some unmeasured indication to be off of anticoagulation that subsequently improved quality of life in these individuals independent of stroke risk, such as higher bleeding risk while on anticoagulation or another contraindication. We urge replication using our study as a framework for prospective, population-based cohorts to avoid the constraints of clinical trial protocols and attain more accurate effect estimates. Additionally, we could not test the association of anticoagulation with HRQoL in individuals without atrial fibrillation; future studies should also consider repeating this analysis in individuals without atrial fibrillation to act as a negative control to improve the robustness of our findings.

There may be heterogeneity of effect within the analytic sample limited to those 70 years and older, especially with respect to the oldest old. There may be outcome misclassification bias secondary to our use of a generic HRQoL measure in place of an anticoagulation or atrial fibrillation-specific HRQoL measure that would potentially be more sensitive to the effects of anticoagulation (or lack thereof) on HRQoL. However, we would expect this bias to be non-differential by our exposure, and therefore would more likely bias the association towards the null [31]. Despite these limitations, we believe the current analysis and methodology can serve as a framework for consideration when conducting future observational research on optimal anticoagulation regimens in atrial fibrillation. Additionally, the consistent direction of effect in our various statistical models using different exposure and outcome measurements suggests the association is likely internally valid and robust. Nonetheless, we urge caution against the overinterpretation of our findings and reiterate this study be used as an analytic framework for future studies.

Conclusion

In this secondary analysis of the AFFIRM Trial, we found that warfarin use is associated with improved HRQoL over time in adjusted models and that the association strengthened after accounting for time-varying covariates and potential differential attrition from competing events, though we urge replication using prospective data. The effect of anticoagulation on HRQoL in patients with atrial fibrillation may be heterogeneous, particularly in understudied populations such as elderly patients and those with multiple comorbidities. We encourage prospective analyses to expand upon the influence of time-varying covariates and differential attrition on anticoagulation’s impact on HRQoL in these populations, accounting for the complex causal pathways between anticoagulation and endpoints like mortality and HRQoL.