Introduction

It has been reported that over half of the general heart failure population has heart failure with preserved ejection fraction (HFpEF), and this prevalence has been rising in the aging population [1,2,3]. Some studies have even concluded that the morbidity associated with HFpEF (mainly the rate of hospitalization) is similar to that associated with systolic heart failure (SHF) [4]. Echocardiography is the practical gold standard for grading diastolic dysfunction (DD) and is recommended by major echocardiographic societies [5, 6]. Conventional diastolic function parameters, including the mitral inflow ratio, mitral inflow deceleration time, left atrium volume index (LAVI), and the ratio of early transmitral velocity to tissue Doppler mitral annular early diastolic velocity (E/e’), have been applied to the grading of DD, and their prognostic importance has been demonstrated in a variety of populations [7, 8]. More recent studies have emphasized the influence of right ventricular (RV) dysfunction, which is caused by both RV contractile impairment and afterload mismatch from pulmonary hypertension. Echocardiographic RV dysfunction is also considered to be the strongest predictor of mortality [9]. The latest recommendations for the evaluation of left ventricular (LV) diastolic function from the American Society of Echocardiography (ASE) and the European Association of Cardiovascular Imaging (EACVI) suggest assessing the LV diastolic function grade using the aforementioned parameters plus the peak velocity of the tricuspid regurgitation (TR) jet [10]. Whether these classifications are associated with long-term outcomes remains unknown. In addition, no study has applied the newest recommendations to grade DD severity and investigate its prognostic value in an Asian cohort. In the current study, we followed an Asian cohort of patients with HFpEF and analyzed the long-term prognostic factors, including medication and other comorbidities.
Additionally, we classified the patients into three groups according to their diastolic dysfunction grade on the basis of the 2009 and 2016 guideline recommendations [5, 10] and examined the prognostic value of these two different grading systems for the long-term survival and identification of major cardiovascular risks in patients with HFpEF.

Methods

Study subjects

Subjects in this study were enrolled from registrants of the Taiwan Diastolic Heart Failure Registry (TDHFR) who were added from January 2008 to October 2016. Patients with a diagnosis of HFpEF (as defined in previous reports and by the consensus statement of the European Society of Cardiology) were eligible [6]. Details of the inclusion and exclusion criteria of the TDHFR have been reported previously [11]. Patients who had renal failure, significant hepatic disease, secondary hypertension, pericardial disease, severe valvular heart disease, cancer, chronic obstructive pulmonary disease, and/or chronic atrial fibrillation were excluded. To exclude subjects who were critically ill or had end-stage heart failure, and to ensure stable outpatient follow-up, individuals who died or experienced cardiovascular events within 60 days after enrollment were also excluded. Finally, 451 patients from the TDHFR were enrolled in the current study. The patient selection process is shown in Fig. 1.

Fig. 1

Patient flow diagram. Asterisk indicates renal failure, significant hepatic disease, secondary hypertension, pericardial disease, severe valvular heart disease, cancer, chronic obstructive pulmonary disease, and/or chronic atrial fibrillation. E, early mitral inflow velocity; e’, early diastolic mitral annular velocity; LAVI, left atrium volume index; TR, tricuspid regurgitation

Demographic data were collected from the patients’ medical chart records. Hypertension was defined as a systolic blood pressure of ≥ 140 mmHg, a diastolic blood pressure of ≥ 90 mmHg, or the use of at least one class of antihypertensive agents. Non-insulin-dependent diabetes mellitus was defined as a fasting blood glucose concentration > 126 mg/dL and/or the use of at least one oral antihyperglycemic agent. Information regarding medications, such as the use of angiotensin-converting enzyme inhibitors (ACEI) and/or angiotensin II receptor blockers (ARB), calcium channel blockers (CCB), diuretics, nitrates, and/or beta-blockers, was also recorded.

Echocardiographic diastolic dysfunction grade

Subjects were divided into a normal diastolic function group and three DD groups according to their grade, as newly proposed by the ASE and EACVI in 2016 [10]. Owing to the lack of invasive hemodynamic investigation, indeterminate diastolic function based on the 2016 algorithm was considered normal. Therefore, normal diastolic function was defined as meeting no more than two of the following four criteria: (1) average E/e’ > 14, (2) septal e’ velocity < 7 cm/s or lateral e’ velocity < 10 cm/s, (3) TR velocity > 2.8 m/s, and (4) LAVI > 34 mL/m² (Fig. 1). The 2016 DD grade was evaluated using several parameters, including the ratio of early mitral inflow velocity to the peak velocity of late filling (E/A), peak E velocity, peak velocity of the TR jet, medial and lateral e’, E/e’ ratio, and LAVI.
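The normal-versus-abnormal rule described above can be sketched as a count of positive criteria. This is only an illustration: the function and argument names are hypothetical, and treating a subject with no positive criterion as normal is an assumption of the sketch, in keeping with indeterminate cases (two of four) being counted as normal.

```python
def count_positive_criteria(avg_e_over_e_prime, septal_e_prime_cm_s,
                            lateral_e_prime_cm_s, tr_velocity_m_s, lavi_ml_m2):
    """Count how many of the four 2016 criteria listed in the text are positive."""
    criteria = [
        avg_e_over_e_prime > 14,                               # (1) average E/e'
        septal_e_prime_cm_s < 7 or lateral_e_prime_cm_s < 10,  # (2) reduced e' velocity
        tr_velocity_m_s > 2.8,                                 # (3) TR velocity
        lavi_ml_m2 > 34,                                       # (4) LAVI
    ]
    return sum(criteria)


def is_normal_diastolic_function(n_positive):
    """Study rule: two or fewer positive criteria is treated as normal.

    Assumption of this sketch: zero positive criteria is also normal.
    """
    return n_positive <= 2


# Hypothetical subject: only the e' criterion and the TR criterion are positive.
n = count_positive_criteria(12.0, 6.0, 9.0, 3.0, 30.0)
print(n, is_normal_diastolic_function(n))
```

Subjects with three or four positive criteria would then proceed to grading (I to III), which relies on the further parameters listed above.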

Likewise, subjects with septal e’ ≥ 8 cm/s or lateral e’ ≥ 10 cm/s and LAVI < 34 mL/m² were classified into the normal diastolic function group according to the 2009 DD grading recommendation [5]. The remaining subjects were divided into three groups with different grades of DD (Fig. 1). Additional echocardiographic parameters were evaluated for this grading, including deceleration time (DT), the difference between the duration of atrial reversal velocity and the mitral A-wave (Ar-A), and isovolumetric relaxation time (IVRT).

Endpoints

The primary outcome of this study was defined as all-cause mortality and hospitalization for heart failure.

Follow-up

The follow-up period ended on December 31, 2017. Patients visited our outpatient clinic at least every 3 months; those who did not were interviewed by telephone annually. Information regarding the primary and secondary study outcomes was documented in chart records and/or via telephone interviews. For each patient, the time to death or cardiovascular event(s) was calculated from the initial date of diagnosis of HFpEF to the date on which the primary or secondary outcome occurred.

Statistical analysis

Data are expressed either as mean ± SD or as frequencies and/or percentages. To compare baseline characteristics among the groups with different diastolic dysfunction grades, we performed one-way analysis of variance or the Kruskal–Wallis test for continuous variables and the χ² test or Fisher’s exact test for categorical variables, as appropriate. For pairwise comparisons, post hoc pairwise t tests were used with Bonferroni correction for multiple testing. We first performed a univariate Cox regression analysis to examine the factors associated with all-cause mortality and HF hospitalization. Predictors in the multiple Cox model were selected from the set of variables that reached statistical significance in the univariate analysis via a forward selection procedure, with the significance limit to enter the model set at 0.05. The survival time was defined as the duration between enrollment and the occurrence of an event (defined as either a primary or secondary endpoint). Survival curves were estimated using the Kaplan–Meier method, and the log-rank test was used to compare survival differences. Using the grade I group as the reference, multivariate Cox proportional hazards regression analyses were performed to derive the adjusted hazard ratios (HRs) for the risk of outcomes in the different groups. We adjusted for age, sex, comorbidities (hypertension, diabetes, hyperlipidemia, coronary artery disease [CAD], and renal failure), medication usage, left ventricle mass index (LVMI), and LAVI. The incremental discriminatory ability of the 2016 guideline over the 2009 guideline for predicting mortality and HF hospitalization was evaluated with the net reclassification index, together with its 95% confidence interval (CI) [12].
A receiver operating characteristic (ROC) curve and Harrell’s C statistic were used to assess the prognostic accuracy of the 2016 and 2009 DD grading algorithms. In addition, the 95% CIs of these C statistics were calculated using the “somersd” package in Stata [13]. Statistical analysis was performed using IBM SPSS Statistics version 21.0 (IBM) and Stata version 14 (StataCorp LP). Two-sided p values < 0.05 were considered statistically significant.
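To make the concordance measure concrete, the following is a minimal sketch of Harrell’s C statistic for right-censored data with a single risk score (for example, the DD grade). It is not the Stata routine used in the study: the function name and the toy data are hypothetical, and pairs with tied follow-up times are simply skipped as a simplification.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's C for right-censored data.

    A pair is usable when the subject with the shorter follow-up had an event.
    It is concordant when that subject also has the higher risk score;
    tied risk scores count as half. Pairs with tied times are skipped
    (a simplification made in this sketch).
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Hypothetical toy data: follow-up (years), event flag (1 = died), DD grade.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
grades = [3, 2, 2, 1]
print(harrell_c(times, events, grades))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking of subjects by risk.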

Results

Baseline characteristics

Overall, 451 patients with HFpEF were enrolled in the present study. The median follow-up period was approximately 8 years (median 2976 days, mean 3012 ± 512 days). Patients were classified into a normal diastolic function group and three DD groups according to the 2016 DD grades, and their baseline characteristics and echocardiographic and clinical data are presented in Table 1. The most common comorbidity was hypertension (65%), which is consistent with previous studies on diastolic heart failure. Patients with diastolic dysfunction grade II (n = 308) or grade III (n = 37) were significantly older and more often female than those with grade I (n = 66). Furthermore, patients with more advanced DD were more likely to have other cardiovascular risk factors, including hypertension, diabetes, hyperlipidemia, and CAD. Renal function was normal, with no difference among the four groups. Although pro-BNP data were missing in 107 (23%) patients, the overall pro-BNP level was high in our cohort (3653 ± 161 pg/mL). Subjects with grade III DD had significantly higher pro-BNP than those with grade I or II. The most commonly used medications were diuretics (52%); 35% of patients had been prescribed an ACEI or ARB, and approximately 46% were taking beta-blockers or CCB. Notably, up to 80% of patients with DD grade III were taking diuretic agents, reflecting elevated left atrial pressure (LAP) and a greater frequency of congestive symptoms. In general, patients with diastolic dysfunction had an intact LV ejection fraction of > 60%. With regard to other echocardiographic findings, patients with advanced DD showed significantly higher early mitral inflow velocity (E), E/e’ and E/A ratios, and tricuspid regurgitation pressure gradient (TRPG). In terms of cardiac size, there was no difference in left ventricular mass index, but a larger LA size was noted in patients with grade III DD (Table 1).

Table 1 Baseline characteristics among the different diastolic dysfunction grade groups in accordance with the 2016 ASE and EACVI guideline

2009 and 2016 DD grades and other clinical risk factors as predictors of outcomes

Table 2 summarizes the factors associated with mortality and HF hospitalization, as determined via univariate and multivariate analyses. The 2016 DD grade was an independent predictor of both mortality (p = 0.038) and HF hospitalization (p = 0.006), whereas the 2009 DD grade was not. Additional parameters associated with mortality were older age, hypertension, and LAVI. Likewise, those parameters plus diabetes were associated with HF hospitalization.

Table 2 Predictors of mortality and major cardiovascular events

DD grade and outcomes

Over a median follow-up of 2976 days, 119 patients (26.4%) died, with an incidence of 29 events per 1000 patient-years, and 93 patients (20.6%) were hospitalized for HF, with an incidence of 27 events per 1000 patient-years (Table 3). Compared with the 2009 classification of DD grades, a greater number of patients were classified into DD grade II according to the 2016 recommendation.
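As a reminder of how an incidence per 1000 patient-years is obtained, the sketch below divides the number of events by the total follow-up time. The function name and the follow-up values are hypothetical and are not the registry data; how person-time was accumulated in the study (for example, whether follow-up was truncated at the first event) is not stated, so this only illustrates the general formula.

```python
def incidence_per_1000_patient_years(n_events, followup_days):
    """Events per 1000 patient-years, given each patient's follow-up in days."""
    person_years = sum(followup_days) / 365.25
    if person_years <= 0:
        raise ValueError("total follow-up must be positive")
    return 1000.0 * n_events / person_years


# Hypothetical example: 4 patients followed for 10 years each, 2 events.
followup = [3652.5, 3652.5, 3652.5, 3652.5]
print(incidence_per_1000_patient_years(2, followup))
```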

Table 3 Incidence and hazard ratio (95% CI) of mortality and hospitalization for heart failure with grade I as the reference group

After controlling for the influence of age, sex, comorbidities (hypertension, diabetes, hyperlipidemia, CAD, and renal failure), medications, LVMI, and LAVI, multivariate Cox analysis demonstrated that, in comparison with patients with DD grade I, patients with DD grade III had a higher risk of mortality (hazard ratio [HR] = 1.806, 95% CI = 1.554–2.982) and HF hospitalization (HR = 2.103, 95% CI = 1.099–3.982) when classified according to the 2009 DD grade recommendation. Likewise, patients with DD grade III had a higher risk of mortality (HR = 2.209, 95% CI = 1.144–4.266) and HF hospitalization (HR = 2.047, 95% CI = 1.348–3.870) when reclassified according to the 2016 DD grade recommendation (Table 3). When patients with DD grades I and II were compared, no difference was found in the risk of HF hospitalization regardless of whether the 2009 or 2016 recommendation was applied. However, according to the 2016 recommendation, patients with DD grade II had a higher risk of mortality (HR = 1.538, 95% CI = 1.313–1.924) than those with grade I, whereas there was no difference in mortality risk between the two according to the 2009 recommendation (HR = 1.109, 95% CI = 0.627–1.963) (Table 3). When subjects with normal diastolic function were treated as the reference group, those with grade II and III DD had an increased risk of mortality based on the 2016 recommendations, whereas only subjects with grade III DD did based on the 2009 recommendations. In addition, the risk of HF hospitalization was significantly higher among subjects with grade I to grade III DD based on the 2016 recommendations, and among subjects with grade III DD based on the 2009 recommendations, than among those with normal diastolic function (Supplemental Table 1).

In the Kaplan–Meier analysis, patients with DD grade III showed higher mortality (log-rank p < 0.001) (Fig. 2a) as compared with those in DD grade I according to both the 2016 and 2009 algorithms. Notably, according to the 2016 DD grade, patients with grade II DD still experienced higher mortality than those with grade I, while there was no difference according to the 2009 DD grade (Fig. 2b).

Fig. 2

Kaplan–Meier analysis of mortality according to: 2016 DD grade (a) and 2009 DD grade (b). DD, diastolic dysfunction
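For readers less familiar with the method behind these curves, the product-limit (Kaplan–Meier) estimator can be sketched in a few lines. The function name and the toy data are hypothetical; the study itself used SPSS and Stata, and this sketch does not include the log-rank test.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times  : follow-up time for each subject
    events : 1 if the subject died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored subjects both leave the risk set after t.
        n_at_risk -= removed
    return curve


# Hypothetical toy data (years of follow-up, event flag).
for t, s in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 0, 1]):
    print(t, round(s, 3))
```

Each step multiplies the running survival probability by the fraction of the at-risk subjects who survived that event time; censored subjects contribute to the risk set until they leave follow-up.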

Net reclassification index and receiver operating characteristic curve

We treated the classification of DD grade based on the 2009 guideline as the reference and reclassified our cohort in accordance with the 2016 guideline. The resultant net reclassification index was significant for mortality (index = 0.106, 95% CI = 0.057–0.192, p = 0.006) but not for HF hospitalization (index = 0.029, 95% CI = 0.014–0.091, p = 0.24). A complete overview is shown in Table 4. For the ROC curves based on the 2009 and 2016 DD grades in predicting mortality, the difference between the areas under the curves (AUC) reached statistical significance (2016 DD grade AUC = 0.645 vs. 2009 DD grade AUC = 0.573, p = 0.02). In contrast, there was no difference between the AUCs in the prediction of HF hospitalization (2016 DD grade AUC = 0.573 vs. 2009 DD grade AUC = 0.558, p = 0.22). In multivariate analyses, the application of the latest 2016 DD grading algorithm resulted in incremental improvement in the predictive performance for mortality (Harrell’s C statistic, 0.667 vs. 0.714; p = 0.012 for 2009 algorithm vs. 2016 algorithm) but not for HF hospitalization (Harrell’s C statistic, 0.674 vs. 0.658; p = 0.012 for 2009 algorithm vs. 2016 algorithm) (Supplemental Table 2).

Table 4 Reclassification of diastolic dysfunction grade among patients with events and control based on 2009 and 2016 guidelines
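The categorical net reclassification index underlying such a table can be sketched as follows: among patients with events, upward reclassification counts as an improvement, and among patients without events, downward reclassification counts as an improvement. The function name and the toy grades are hypothetical and do not reproduce the study data or its confidence interval calculation.

```python
def categorical_nri(old_grades, new_grades, events):
    """Categorical NRI of a new grading relative to an old one.

    NRI = [P(up | event) - P(down | event)]
        + [P(down | no event) - P(up | no event)]
    """
    up_e = down_e = n_e = 0
    up_n = down_n = n_n = 0
    for old, new, event in zip(old_grades, new_grades, events):
        if event:
            n_e += 1
            up_e += new > old
            down_e += new < old
        else:
            n_n += 1
            up_n += new > old
            down_n += new < old
    if n_e == 0 or n_n == 0:
        raise ValueError("need at least one event and one non-event")
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n


# Hypothetical toy data: 2009 grade, 2016 grade, death during follow-up.
old = [1, 1, 2, 3, 2, 1]
new = [2, 1, 3, 2, 1, 1]
died = [1, 1, 1, 0, 0, 0]
print(round(categorical_nri(old, new, died), 3))
```

A positive index indicates that, on balance, the new grading moves patients with events to higher grades and patients without events to lower grades.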

Discussion

In the present study, the prognostic value of the 2009 and 2016 DD grading recommendations was compared, and the 2016 recommendation was validated independently and externally with respect to the relation between DD grade and outcomes. The main findings are that the 2016 grading algorithm reclassified nearly half of the subjects with grade III DD under the 2009 algorithm to a lower grade, and that it better identified subjects with grade II DD, who were independently at higher risk of mortality. To the best of our knowledge, this is the first study to verify and compare the usefulness of the 2009 and 2016 DD grading systems in predicting mortality and HF hospitalization in patients with HFpEF after adjustment for simple clinical, demographic, and echocardiographic variables [14].

Prior studies have varied in characterizing the degree of risk of mortality according to the stage of DD. In the I-PRESERVE trial, there was no significant association between diastolic function and outcomes when compared with normal diastolic function [15]. In contrast, a number of cohorts demonstrated an increased risk of mortality even with mild DD [7, 16]. Nevertheless, the majority of studies reported that moderate and severe DD independently conferred a higher mortality risk than normal or mild DD [8, 17, 18]. In those studies, the severity of DD was identified based on the 2009 algorithm or previous classification criteria, in which the commonly used echocardiographic variables were the E/A and E/e’ ratios. In our analysis, subjects with grade III DD had a significantly higher risk of mortality than those with grade I according to both DD grading algorithms, which is consistent with the association of the restrictive pattern of DD with poor outcomes [8, 16]. The prognostic markers identified in our study, including age, male gender, hypertension, and diabetes, are consistent with the previously published literature [17, 19]. With respect to echocardiographic indices, our analyses showed that LAVI, but not the 2009 DD grade, was independently associated with poor outcomes, as was also reported in the subanalysis of the I-PRESERVE trial [15]. The presence of moderate or severe DD based on the 2016 grading algorithm was associated with an increased risk of mortality and major adverse cardiovascular events (MACE) in our analysis. Reclassifying the DD grade with the 2016 recommendation rather than the 2009 recommendation improved two metrics of predictive ability for mortality: the net reclassification index and the AUC of the ROC curve. The AUC showed a higher predictive value for the 2016 recommendation than for the 2009 DD grade, though it was modest overall for both grading systems.
While the differences in the AUC are marginal, applying the 2016 recommendation yielded a net reclassification index of approximately 10% for mortality. Based on the 2016 recommendation, the survival analysis showed that subjects with grade II DD had significantly higher mortality than those with grade I DD, whereas there was no significantly increased risk based on the 2009 recommendation. Therefore, our results indicate that the newly proposed algorithm could provide superior predictive ability for mortality.

In our analysis, some subjects with grade I DD based on the 2009 recommendation were reclassified into grade II DD according to the 2016 recommendation. A possible explanation is the addition of TRPG evaluation in the new grading algorithm. In previous evaluations of DD, including the 2009 DD grading system, the severity of DD was assessed only by the pattern of mitral inflow and tissue Doppler imaging. In the 2016 algorithm, when the E/A ratio is < 0.8 with a peak E velocity of > 50 cm/s, or when the E/A ratio is > 0.8 but < 2, grade II DD is determined by three additional criteria: peak velocity of the TR jet, E/e’, and LAVI. Elevation of these three parameters suggests grade II DD with increased left atrial pressure. In particular, increased peak velocity of the TR jet indicates elevation of TRPG, pulmonary artery systolic pressure (PASP), pulmonary capillary wedge pressure, and, in turn, left ventricular filling pressure [20]. Elevated TRPG and PASP also suggest pulmonary hypertension, which is common and associated with mortality in HFpEF [21].
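The grading step discussed above can be sketched as a small decision function. Several points are assumptions of this sketch rather than statements from the text: the function and argument names are hypothetical; the boundary handling (E/A ≤ 0.8 and E/A ≥ 2) and the rule that at least two of the three criteria must be positive for grade II follow one reading of the ASE/EACVI flowchart; the cut-offs (E/e’ > 14, TR velocity > 2.8 m/s, LAVI > 34 mL/m²) are those listed in the Methods; and all three parameters are assumed to be available, so indeterminate cases are not handled.

```python
def dd_grade_2016(e_over_a, e_velocity_cm_s, avg_e_over_e_prime,
                  tr_velocity_m_s, lavi_ml_m2):
    """Return DD grade 1, 2, or 3 for a subject already judged to have DD.

    Sketch only: assumes all three secondary criteria were measured.
    """
    if e_over_a >= 2:
        return 3  # restrictive filling pattern (assumed boundary)
    if e_over_a <= 0.8 and e_velocity_cm_s <= 50:
        return 1  # impaired relaxation with presumed normal LAP
    # Intermediate mitral inflow pattern: count the three secondary criteria.
    n_positive = sum([
        avg_e_over_e_prime > 14,
        tr_velocity_m_s > 2.8,
        lavi_ml_m2 > 34,
    ])
    return 2 if n_positive >= 2 else 1


# Hypothetical subject: E/A 1.2, raised E/e' and TR velocity, normal LAVI.
print(dd_grade_2016(1.2, 80.0, 16.0, 3.0, 30.0))
```

In this sketch, a subject with an intermediate mitral inflow pattern is moved from grade I to grade II purely by the secondary criteria, which is where TR velocity can change the classification relative to the 2009 system.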

Emerging evidence has shed light on the prognostic importance of right ventricular dysfunction or afterload in patients with HFpEF, which is also assessed via right ventricular catheterization. The presence of right ventricular dysfunction is associated with increased mortality and heart failure hospitalization rates [9, 22]. Abnormal coupling between right ventricular contraction and the pulmonary circulation indicates worse outcomes [23]. Furthermore, several studies demonstrated that pulmonary capillary wedge pressure (PCWP), but not LV end-diastolic pressure, was associated with mortality in HFpEF [24]. The 2009 and 2016 DD grading algorithms both emphasize the measurement of mitral inflow velocities in order to estimate LAP. In the 2016 algorithm, the addition of the peak velocity of the TR jet, which reflects right ventricular afterload and PCWP, could refine risk stratification. Grading DD by echocardiography with the new algorithm is non-invasive and provides prognostic information in most patients with HFpEF.

With respect to HF hospitalization, there was no difference in risk between patients with grade II or III DD and those with grade I DD, irrespective of whether the 2009 or 2016 algorithm was used. Diuretics are the major decongestive therapy for rapidly attaining a stable euvolemic status in patients with acute decompensation. Our cohort showed significantly more prevalent use of diuretics in patients with grade III DD. Consequently, an increase in diuretic dose by physicians in response to echocardiographic results may have prevented further HF hospitalization. However, the risk of HF hospitalization was significantly increased when compared with the normal group (Supplemental Table 1). Of note, even patients with grade I DD according to the 2016 algorithm had a higher risk of HF hospitalization than those with normal diastolic function, whereas only patients with grade III DD had a higher risk when the 2009 DD grading algorithm was applied. Compared with the 2009 algorithm, the latest 2016 algorithm could more efficiently and reliably estimate increased LV filling pressure, which usually precedes clinical congestion [25, 26]. Therefore, physicians were able to increase the dose of diuretics in response to a more advanced DD grade and severe congestive symptoms. In addition, applying the 2016 grading algorithm could more accurately classify and diagnose patients with advanced DD, especially those previously classified as grade I DD according to the 2009 algorithm [27, 28]. Our results showed higher pro-BNP values in subjects with advanced DD and a significantly increased net reclassification index (NRI), which is consistent with the aforementioned studies and previous observations [29,30,31]. Furthermore, our findings demonstrate that the latest 2016 DD grading algorithm possesses better prognostic ability for mortality, which is important in long-term follow-up.

Study limitations

Our study had several limitations. First, a certain proportion of data was missing for some patients during the decade of follow-up. Some limitations also arose from the registry-derived nature of the data that we analyzed. In addition, the lack of invasive hemodynamic investigation made the classification of DD difficult when echocardiographic indices were incomplete or ambiguous. Moreover, the comparison between different DD evaluation algorithms is limited by the small number of outcomes. Second, echocardiography was not systematically performed on the index date. Although tissue Doppler imaging, LAVI, and peak TR velocity were recorded, some indices were not fully evaluated, including regional wall motion abnormality, right heart function, and LV strain. Third, in line with the current recommendations, we adopted an ejection fraction (EF) ≥ 50% as the cut-off value to diagnose HFpEF, but the laboratory data were lacking [24]. We also excluded patients with HF and borderline ejection fraction (40–50%). Fourth, the referral bias of a hospital-based study cannot be eliminated because our hospital is a tertiary center. Restricting the study to hospitalized patients might have introduced a bias, since the results from this population may not reflect larger trends in disease prevalence in the community.

Conclusion

The recommendations for the 2016 DD grading system are based on expert consensus and have not yet been validated [10]. Our study is the first to evaluate the prognostic value of the algorithm with respect to outcomes among Asian patients with preserved ejection fraction. The present study shows that the 2016 DD grading algorithm significantly improves the classification of DD patients at higher risk of mortality. The non-invasive assessment of LAP is the premise of the algorithm and it correlated well with clinical symptoms and outcomes. Hence, the echocardiographic indices of the new algorithm should be obtained and applied to effectively evaluate DD.