Introduction

Echocardiography is widely used to evaluate cardiac structure and function, and the non-invasive evaluation of diastolic dysfunction (DD) has long been a cornerstone of Doppler echocardiography. A progressively better understanding of how DD pathophysiology is expressed on echocardiography has expanded its application, not only to elucidate the etiology and repercussions of cardiac symptoms, but also as a biomarker to prognosticate the risk of heart failure (HF) and cardiovascular events.

Standardization of DD criteria is paramount, as its prevalence may vary from 12 to 84% in the same sample, depending on how echocardiographic parameters are weighted [1, 2]. This led the ASE/EACVI to jointly publish recommendations for assessing diastolic function in 2009 and 2016 [3, 4]. The 2009 recommendation was considered laborious and complex, and led to a large proportion of patients being diagnosed with mild diastolic dysfunction, particularly among the elderly [5,6,7]. This is a matter of concern, as the independent prognostic value of mild DD is less consistent than that of more advanced grades [8, 9]. The 2016 guidelines assume abnormal diastolic function in the presence of depressed systolic function, myocardial disease, and left ventricular (LV) structural abnormalities, and also reorganize Doppler parameters, prioritizing LV filling pressure estimates.

A large proportion of patients have pre-clinical heart failure (AHA Stage B HF) [10], identified by the presence of cardiac structural abnormalities that largely overlap with DD algorithms. On the other hand, approximately half of symptomatic HF patients have preserved ejection fraction (HFPEF) [11, 12], with their symptoms attributed to DD with elevated LV filling pressures [13]. The balance between structural and functional abnormalities is particularly noteworthy in the elderly, where HFPEF and DD are more prevalent, reaching 4.9% and 36%, respectively [7]. Considering that therapeutic studies in HFPEF populations were developed before the establishment of DD guidelines, understanding the evolution of its diagnostic criteria is still relevant.

The purpose of our study was to compare the concordance of diastolic dysfunction between the 2009 and 2016 ASE/EACVI guidelines classifications in an elderly population and to investigate the impact of the inclusion of left ventricular structural abnormalities in the reclassification.

Methods

Study population

We consecutively included the first clinically indicated transthoracic echocardiogram performed between January and February of 2017 in individuals older than 60 years at a tertiary general teaching hospital, which also provides ancillary exams to outpatients referred from community clinics. We excluded patients with conditions that could interfere with DD assessment (arrhythmia, mitral valve prosthesis, mitral stenosis, pacemaker, E and A wave fusion), or with incomplete echocardiographic image datasets.

Demographic and clinical data were retrospectively extracted from medical and echocardiographic reports. The presence of obesity, hypertension, diabetes, and coronary artery disease was based on clinical history, medication use, and physician-reported diagnosis. Patients with these conditions were classified as at risk of developing heart failure (AHA Stage A HF) [10].

This study was conducted in accordance with the Declaration of Helsinki standards and was approved by the Institutional Review Board.

Echocardiographic analysis

Echocardiographic images were acquired with commercially available equipment (EPIQ7 and IE33, Philips Healthcare; Vivid7, GE Healthcare; and Aplio XG, Canon Medical Systems) and digitally archived in the hospital imaging system (IMPAX, AGFA HealthCare). A single investigator (VLG) reviewed all archived images of patients fulfilling enrollment criteria in a dedicated workstation (QLab3.3.2; Philips Healthcare, USA). Measurements necessary for diastolic function evaluation were performed and extracted by the investigator, blinded to the original echocardiographic clinical report. These measurements were tabulated and later classified using the 2009 and 2016 guidelines algorithms detailed below. We defined LV structural abnormalities (LVSA) as abnormal LV ejection fraction, regional wall motion, or LV geometry [14]. These are the same criteria we used to classify patients as AHA Stage B heart failure [10].

Figure 1 describes the criteria used for the detection of DD. According to the 2009 guidelines classification, DD was present if any of the following was found: septal e’ velocity < 8 cm/s, lateral e’ velocity < 10 cm/s, or left atrial volume index ≥ 34 ml/m². The remaining cases were classified as normal diastolic function. According to the 2016 recommendation, DD was present if the patient had any LVSA or at least 3 abnormal parameters of the flowchart. If only the septal or the lateral E/e’ ratio was available, cut points of 15 and 13 were considered, respectively. Patients with only one abnormal parameter and without LVSA were classified as normal diastolic function. Cases without LVSA and with 2 normal and 2 abnormal parameters were considered as indeterminate diastolic function.
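The decision rules in the paragraph above can be written compactly. The following is a minimal sketch rather than the study's actual software: the function names are hypothetical, the 2009 cutoffs are those quoted in the text, and the 2016 function assumes that all four flowchart parameters were measured and have already been labeled normal or abnormal elsewhere.

```python
def dd_2009(septal_e_cm_s: float, lateral_e_cm_s: float, la_volume_index: float) -> str:
    """2009 reading used in this study: any single abnormal parameter defines DD."""
    abnormal = (
        septal_e_cm_s < 8          # septal e' velocity, cm/s
        or lateral_e_cm_s < 10     # lateral e' velocity, cm/s
        or la_volume_index >= 34   # left atrial volume index, ml/m2
    )
    return "DD" if abnormal else "normal"


def dd_2016(has_lvsa: bool, n_abnormal: int) -> str:
    """2016 reading used in this study.

    Assumption: all four flowchart parameters are available, so
    n_abnormal ranges from 0 to 4.
    """
    if not 0 <= n_abnormal <= 4:
        raise ValueError("n_abnormal must be between 0 and 4")
    if has_lvsa or n_abnormal >= 3:
        return "DD"
    if n_abnormal == 2:            # 2 normal and 2 abnormal, no LVSA
        return "indeterminate"
    return "normal"                # at most one abnormal parameter, no LVSA
```

In this sketch a patient with a septal e’ of 7.5 cm/s and no other finding is labeled DD by `dd_2009`, whereas `dd_2016(False, 1)` returns normal, which illustrates how the same patient can change category between guidelines.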

Fig. 1

Criteria for diastolic dysfunction diagnosis according to the 2009 and 2016 guidelines. LV, left ventricular; LA, left atrial; TR, tricuspid regurgitation; * Left ventricular structural abnormalities (LVSA): wall motion abnormalities, reduced ejection fraction (EF < 52% in men or < 54% in women), LV hypertrophy (> 115 g/m² in men or > 95 g/m² in women), or concentric remodeling (RWT > 0.42).

The grading of DD severity is shown in Fig. 2. According to the 2009 guideline, a grade was assigned if at least 2 out of 3 parameters (E/A ratio, mitral flow deceleration time (DT), and E/e’ ratio) were concordant; otherwise, the case was classified as DD that could not be graded. The 2016 guideline was applied with the following criteria: an E/A ratio greater than 2 was defined as grade III DD; an E/A ratio lower than 0.8 with an E wave velocity less than 50 cm/s was defined as grade I DD. For classification of the remaining cases we considered E/e’ ratio, left atrial volume, and tricuspid regurgitation velocity: DD grade I if at least two were normal, DD grade II if at least two were abnormal. If only two were available and divergent, the case was considered indeterminate DD. For study purposes, we aggregated the individual DD classification criteria into two separate components: the presence of LV structural abnormalities (LVSA) and Doppler/LV filling (DOP) criteria, according to the 2016 flowchart shown in Fig. 1.
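As an illustration of the 2016 grading steps described above, the sketch below encodes them in the order given. It is a hedged reading of the text, not the authors' code: the function name and argument layout are hypothetical, the three secondary criteria are passed as already-labeled flags, and returning "indeterminate" when fewer than two of them are available is an assumption not stated in the text.

```python
from typing import Optional, Sequence


def grade_dd_2016(
    e_a_ratio: float,
    e_velocity_cm_s: float,
    secondary_abnormal: Sequence[Optional[bool]],
) -> str:
    """Grade DD in a patient already classified as having DD.

    secondary_abnormal holds one flag each for E/e' ratio, left atrial
    volume, and TR velocity: True = abnormal, False = normal,
    None = not available.
    """
    if e_a_ratio > 2:
        return "grade III"
    if e_a_ratio < 0.8 and e_velocity_cm_s < 50:
        return "grade I"

    available = [flag for flag in secondary_abnormal if flag is not None]
    n_abnormal = sum(available)
    n_normal = len(available) - n_abnormal

    if n_abnormal >= 2:
        return "grade II"
    if n_normal >= 2:
        return "grade I"
    # Two available and divergent (from the text), or fewer than two
    # available (assumption of this sketch).
    return "indeterminate"
```

For example, `grade_dd_2016(1.0, 80, [True, True, False])` yields grade II, while `grade_dd_2016(1.0, 80, [False, None, True])` yields indeterminate because the two available criteria diverge.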

Fig. 2

Grading of diastolic dysfunction (DD) according to the 2009 and 2016 guidelines criteria

The estimation of left ventricular filling pressures according to 2009 recommendations follows two separate flowcharts, based initially on E/e’ ratio for preserved EF and on E/A ratio for reduced EF, considering left atrial volume or E/e’ ratio for intermediate values of E/e’ or E/A, respectively. By the 2016 definition, patients with grade II or III DD were considered to have elevated filling pressures.

Statistical analysis

Demographic and echocardiographic data were described as prevalence (%) or mean and standard deviation (SD) where appropriate. The proportion of individuals within each group was compared with McNemar’s test. Concordance between the guidelines on the prevalence of DD (globally and among its strata) and of elevated filling pressures was assessed with Cohen’s Kappa coefficients. Venn-Euler diagrams were used to compare the overlap of DD components [15]. The analyses were performed using SPSS V18 (SPSS Inc., Chicago, IL).
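For readers less familiar with these two paired-data statistics, the sketch below computes them for a 2×2 table in which the same patients are classified by both guidelines. It uses only the Python standard library and is not the SPSS procedure used in the study. The function names are hypothetical, the McNemar statistic is the uncorrected chi-square form (the text does not say which variant was applied), and the counts in the usage lines are invented for illustration only.

```python
import math


def cohen_kappa_2x2(a: int, b: int, c: int, d: int) -> float:
    """Cohen's kappa for a paired 2x2 table.

    a: positive by both classifications   b: positive by first only
    c: positive by second only            d: negative by both
    """
    n = a + b + c + d
    observed = (a + d) / n
    # Agreement expected by chance from the marginal totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return (observed - expected) / (1 - expected)


def mcnemar_test(b: int, c: int) -> tuple[float, float]:
    """Uncorrected McNemar chi-square (1 df) on the discordant pairs."""
    statistic = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square with 1 df equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts, not study data.
kappa = cohen_kappa_2x2(a=40, b=10, c=5, d=45)
statistic, p_value = mcnemar_test(b=10, c=5)
print(f"kappa={kappa:.2f}, McNemar chi2={statistic:.2f}, p={p_value:.3f}")
```

The two measures answer different questions: McNemar's test asks whether the two classifications yield different prevalences, while kappa asks how often they agree on the same patient beyond chance.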

The sample size was estimated from a study with 100 patients that found a prevalence of DD of 27% and 13% according to the 2009 and 2016 guidelines, respectively [16]. A sample of 300 patients (276 exams plus a margin of 10%), which corresponded to a two-month period, was estimated to detect a similar reduction of 50% in DD prevalence with a significance level (alpha) of 0.05 and a power of 0.8.
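The text does not state which formula produced the figure of 276. As a rough orientation only, the sketch below applies the common normal-approximation formula for comparing two independent proportions, taking 0.05 as the two-sided significance level and 0.8 as the power. The function name is hypothetical, and this simplified calculation ignores the paired design and any continuity correction, so it is not expected to reproduce the authors' number.

```python
import math
from statistics import NormalDist


def n_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Sample size per group for two independent proportions (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Prevalences reported in the reference study [16].
print(n_per_group(0.27, 0.13))
```

Different formula choices (pooled variance, continuity correction, paired-proportion methods) shift the result by tens of patients, which is one reason to state the method explicitly when reporting a sample size.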

Results

Of the 1438 consecutive echocardiograms performed during the study period, 470 were transthoracic studies in elderly patients, and 308 of these studies fulfilled the inclusion criteria (Fig. 3). The studied population was 70.4 ± 7.7 years old and consisted mostly of outpatients, females, and patients with hypertension (Table 1). Echocardiographic characteristics are detailed in Table 2. The prevalence of the echocardiographic abnormalities considered for diastolic dysfunction classification, and the distribution of each of these abnormalities in each grade of diastolic dysfunction in both the 2009 and 2016 classifications, are shown in Table 3. The most prevalent abnormalities in the whole sample were reduced annular tissue Doppler velocities (81.5%), left ventricular hypertrophy (43.8%), and enlarged left atrium (42.2%).

Fig. 3

Flowchart of participants included in the study

Table 1 Clinical characteristics of the studied population
Table 2 Echocardiographic characteristics of the studied population
Table 3 Prevalence and distribution of left ventricular structural abnormalities (LVSA) and specific findings for classification, according to 2009 and 2016 Guidelines Diastolic Dysfunction grades

The prevalence of DD according to the 2009 guidelines was 91% and was reduced to 64% when applying the 2016 criteria (p < 0.001), with poor concordance between them (Kappa = 0.21; p < 0.001). There was an increase in the proportion of echocardiographic studies with normal diastolic function and a corresponding reduction in all grades of DD when applying the 2016 criteria, with 7.5% being classified as indeterminate diastolic function. Only 101 (32.8%) patients remained in the same category in both recommendations, and the distribution along each grade of diastolic function is shown in Fig. 4. Differences in prevalence were due to disagreements in all individual grades, not only a downgrade of DD severity, resulting in an even lower concordance when the grade of DD was considered (Kappa = 0.118, p < 0.001).

Fig. 4

Prevalence N (%) of DD grades according to the 2009 guideline and its reclassification according to the 2016 recommendations. DD, diastolic dysfunction; DF, diastolic function

The Venn-Euler diagrams in Fig. 5 depict the impact of LV structural abnormalities (LVSA) and Doppler/LV filling (DOP) abnormalities on DD classification, according to the criteria of each guideline. First, the prevalence of DD (black circles) in the whole sample (grey circles) is higher by the 2009 guidelines. As expected from the 2009 definition, all cases were considered to have DD by the presence of DOP abnormalities (blue filling). The 2016 DOP component is less prevalent due to the more restrictive cutoff for septal e’ (7 cm/s instead of 8 cm/s) and the inclusion of E/e’ ratio and TR jet velocity. In addition, a large proportion of 2016 DD cases is defined solely by the presence of LVSA (red filling). If LVSA were not considered a component of DD evaluation in the 2016 guidelines, the prevalence of any grade of DD would be 32% instead of 64%. In summary, under the previous recommendations DOP was the only criterion and was less restrictive, resulting in a high proportion of DD (black circle). The recent recommendations, on the other hand, have a more restrictive DOP criterion (blue filling), resulting in a lower prevalence of DD, which was driven mostly by LVSA (red filling).
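The decomposition behind such a diagram amounts to counting, for each patient, whether DD is supported by LVSA, by DOP, or by both. The sketch below shows that bookkeeping and the "with versus without LVSA" comparison described above. The function names are hypothetical and the ten-patient cohort is invented for illustration; it is not study data.

```python
from typing import Iterable, Tuple

Patient = Tuple[bool, bool]  # (has_lvsa, has_dop)


def component_counts(patients: Iterable[Patient]) -> dict:
    """Count the regions of a two-set Venn diagram (LVSA, DOP)."""
    counts = {"LVSA only": 0, "DOP only": 0, "both": 0, "neither": 0}
    for has_lvsa, has_dop in patients:
        if has_lvsa and has_dop:
            counts["both"] += 1
        elif has_lvsa:
            counts["LVSA only"] += 1
        elif has_dop:
            counts["DOP only"] += 1
        else:
            counts["neither"] += 1
    return counts


def dd_prevalence(patients: Iterable[Patient], use_lvsa: bool = True) -> float:
    """Prevalence of DD when LVSA is (or is not) accepted as a criterion."""
    cohort = list(patients)
    positive = sum(1 for lvsa, dop in cohort if dop or (use_lvsa and lvsa))
    return positive / len(cohort)


# Hypothetical cohort of 10 patients, not study data.
toy = [(True, False)] * 4 + [(True, True)] * 2 + [(False, True)] + [(False, False)] * 3
print(component_counts(toy))
print(dd_prevalence(toy), dd_prevalence(toy, use_lvsa=False))
```

The gap between the two prevalences is exactly the share of patients in the "LVSA only" region, which is the quantity the red filling in the diagram represents.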

Fig. 5

Prevalence of Doppler (DOP) and left ventricular structural abnormalities (LVSA) components on the diagnosis of diastolic dysfunction (DD) according to 2009 and 2016 criteria

Differences in component distribution are more noticeable across DD grades, as shown in Fig. 6. Across the 2009 criteria categories, LVSA is more frequent in the most advanced degrees of DD. Conversely, across the 2016 criteria categories, the presence of LVSA was enough to identify 97.4% of grade I DD (yellow circles), with increasing importance of DOP in DD grades II and III (orange and red circles).

Fig. 6

Contribution of left ventricular structural abnormalities (LVSA) and Doppler (DOP) along DD grades. The 2009 DOP criteria resulted in a larger number of patients in all degrees of DD. LVSA are constant between the representations of the two guidelines and played a decisive role especially in grade I DD of the new recommendations, in which virtually all cases were diagnosed exclusively by these abnormalities, with DOP impacting grades II and III. Doppler abnormality cutoffs for the 2009 and 2016 criteria are detailed in Fig. 1

The prevalence of DOP abnormalities was lower in the 2016 document due to the stricter definition criteria. Elevated LV filling pressure prevalence was halved by the 2016 guideline criteria compared to the 2009 guidelines (20.9% vs. 39.2%, respectively; p < 0.001), with a greater concordance between the guidelines for this measurement (Kappa = 0.56, p < 0.001). Among all cases with elevated filling pressures established by the 2016 guideline (n = 64), only one had not been considered as such by the 2009 guideline.

Discussion

In this study we demonstrated large discrepancies between the 2009 and 2016 diastolic function classifications in a sample of elderly patients. We found that DD was almost universal in the elderly using the 2009 criteria, and this prevalence was reduced to 64% using the updated recommendations. Interestingly, disagreements between guidelines were scattered among all grades of DD. Roughly, the 2016 guidelines classified the less severe DD cases by the presence of LV structural abnormalities, while grades II and III DD were based on elevated filling pressures.

The reduction in DD prevalence using the 2016 guidelines was described in other studies, across a wide range of DD prevalences [16,17,18,19]. A systematic review found a 36% prevalence of diastolic dysfunction in the elderly [20]. The higher prevalence of DD found in our study is likely explained not only by the profile of our population (which was made up of individuals with advanced age and a high prevalence of comorbidities and LVSA), but also by the criteria adopted to define diastolic dysfunction. The 2009 guideline has been interpreted and applied with great heterogeneity. Although it has one flowchart for the diagnosis of DD and another for its classification, many studies ended up merging the available variables in a single decision tree [2]. We performed the analysis in two steps, as recommended, even though the first flowchart itself is already a potential source of inconsistencies, since the guideline does not make it clear how the 3 parameters (septal e’, lateral e’, and LA volume index) should be combined. In this regard, we chose the approach adopted by most studies in the systematic review by Selmeryd et al. (any one of the variables being sufficient, rather than a combination of all) [2]. When this interpretation was used in a population of high-risk elderly people, a prevalence of DD of 94% was found [2], which is in line with that found in our study (91%).

The 2016 guideline presents a clear advance since it specifies which variables to consider and how the resulting combinations should be interpreted both in the diagnosis and in the categorization of DD, which increases the potential reproducibility of its findings [21]. Despite this, there is a potential source of ambiguity: the guideline states that myocardial disease or even clinical comorbidities could already identify cases that should go straight to the second flowchart (degree of DD and estimation of filling pressures), without making clear which abnormalities these would be. In another work by the guideline authors’ group, the clinical criteria described are hypertensive heart disease, diabetes mellitus, chronic kidney disease, and coronary artery disease with segmental dysfunction. Had this interpretation been adopted by studies whose entire populations consisted of patients with diabetes or chronic kidney disease, and which found DD prevalences of 7.2% [22] and 32% [23], they would have had to report a prevalence of 100%. In this regard, we chose to use only abnormalities that can be clearly identified on the two-dimensional echocardiogram (reduced ejection fraction, abnormal left ventricular geometry, and regional wall motion abnormality), as per the recommendations for chamber quantification [14]. Moreover, we evolved the clinical-echocardiographic concept of hypertensive heart disease, which relies on clinical information, into the purely echocardiographic concept of abnormal left ventricular geometry, which has a prognostic impact [24, 25], has a clear definition [14], and is also part of the HFPEF diagnostic score [26].

A highlight of our study was the separate evaluation of the contribution of LV structural abnormalities and of the Doppler/LV filling components in each degree of DD classification. In our study, 95% of those classified with DD by the 2016 criteria would be so classified by the presence of LV structural abnormalities alone. Considering that LVSA are relevant for diagnosing DD according to the new recommendations and that they already carry an intrinsic cardiovascular risk [27, 28], it is not possible to know for sure whether the better performance of the new criteria in predicting events [19, 29] is really due to diastolic dysfunction or merely reflects an association with comorbidities and structural abnormalities. In this sense, new proposals have emerged that remove LVSA from the initial flowchart, which would make the evaluation simpler [30] but could increase the proportion of inconclusive diagnoses [17].

Previous studies have shown a worse prognosis only with DD grades II and III [8, 9]. An exception was the Olmsted cohort, which showed an association even in grade I [31]. However, among these DD cases, there was a high prevalence of obesity, hypertension, diabetes, cardiovascular disease, and LV dysfunction, also calling into question whether the prognosis was due to comorbidities. In another study, while only grade III DD of the previous recommendations was able to predict events independently of comorbidities, all grades of the new recommendations were able to do so [29]. In our study, all individuals with grade III diastolic dysfunction according to the 2009 criteria had LV structural abnormalities. Likewise, by the 2016 criteria, the overlap between LVSA and DOP abnormalities increases as the degree of DD worsens.

Our study is the first to our knowledge to compare the 2009 and 2016 guidelines in the elderly. These findings in DD classification are of special interest in this population, since elderly patients have a higher risk of symptomatic HF [31], particularly HFPEF [32], and a high prevalence of LVSA. As such, most elderly patients would be classified both as AHA Stage B HF and Grade I DD, demonstrating how these different classifications overlap in this population. Moreover, Doppler criteria used for the identification of elevated filling pressures in HF patients are more prevalent in older people [33] and sensitive to aging [34]. Age-adjusted Doppler criteria, instead of a single cutoff, produce a more stable distribution of DD across age groups [5]. From this perspective, we found that the 2016 guidelines halved the prevalence of elevated filling pressures compared to the 2009 criteria. Similar results were described in a multicenter study, which found a reduction from 35 to 15% [35]. In that report, the 2016 guideline criteria were more accurate compared to invasive measurements (area under the ROC curve 0.78 vs. 0.68) [35], a finding replicated in other studies [36]. This improvement in accuracy was largely due to a reduction in false-positives, as appears to be the case in our study.

It should be noted that the application of different parameters and support criteria is still a concern and may affect the distribution and concordance of DD across guidelines. The low concordance in the diagnosis of DD that we found (k = 0.21) has already been identified by different investigators (k = 0.18 [18] and k = 0.43 [17]). Even though the new recommendations proved to be more specific than the previous ones [17,18,19], our study shows that the reclassification is not a simple step down in each degree of DD, but rather that each case took a heterogeneous path through the new classification (Fig. 4).

We acknowledge that the single-center nature of the study limits the external validity of our findings, and that the cross-sectional design prevented the assessment of incident outcomes associated with the classifications. Moreover, the elevated proportion of comorbidities and structural abnormalities could limit generalization to populations at lower cardiovascular risk; however, it corresponds to the population more likely to benefit from echocardiographic information. It should be noted that all clinically requested echocardiograms were executed, recorded, and interpreted by different echocardiographers, which may explain the failure to systematically include other DD measures in our analysis, such as pulmonary vein Doppler findings, global longitudinal strain, and mitral inflow parameters under the Valsalva maneuver. On the other hand, by not including less available parameters, we make our results easier to reproduce in other clinical scenarios, increasing their external validity. Furthermore, a single investigator, blinded to the echocardiographic report, reviewed all images and measured all relevant parameters for the current analysis. This is a considerable strength of the study, as it avoided most of the variability that could be attributed to inter-reader issues and subjective judgment.

Conclusions

Our study demonstrates that DD is highly prevalent in elderly subjects irrespective of the diagnostic criteria used. However, a lower rate of diastolic dysfunction was observed when applying the 2016 updated criteria compared to the 2009 recommendations, with poor agreement in all individual DD grades. Elevated filling pressure is less prevalent by the 2016 criteria but seems to be evaluated more consistently across the two guidelines. Our results show the impact of incorporating LV structural abnormalities on discrepancies in DD categorization, especially in classifying grade I DD. Longitudinal studies are necessary to investigate the independent role of the individual components, for diagnostic and prognostic purposes. If not properly addressed, these differences may undermine the scientific background supporting the widespread use of DD as a cardiovascular prognostic tool. Finally, the set of findings from different studies may help in the elaboration of future diastolic dysfunction guidelines.