Introduction

Cardiovascular disease (CVD) is the leading cause of death for both men and women in every major ethnic group in the USA, accounting for 32 % of all deaths in 2010 and, on average, claiming more than 2150 lives each day [1]. In 2010, an estimated 83.6 million adults in the USA had CVD at a cost of $315.4 billion [1]. CVD risk scores are used in routine clinical practice to identify patients who would most benefit from intervention. Three common risk scores, the Framingham Risk Score (FRS) [2–4] and atherosclerotic cardiovascular disease (ASCVD) [5] in the USA and the QRISK [6, 7] in the UK, aid in identifying individuals at the highest level of risk. Misclassification of risk remains a serious concern, given the effective preventative strategies available [8].

The FRS was originally developed in 1998 from the exclusively white Framingham Heart Study population [2]. The algorithm used in the FRS has since been updated with the inclusion of additional risk factors and is widely used in the USA to determine risk based on age, sex, total cholesterol, high-density lipoprotein (HDL), blood pressure/antihypertensive treatment, diabetes status, and smoking status [2–4]. However, concerns about the high rates of misclassification, the overestimation of risk in contemporary populations with less CVD, underestimation in high-risk subgroups (e.g., low socioeconomic status), performance in ethnically diverse populations, and usefulness in the face of missing data prompted the creation of the QRISK score [7]. The QRISK was developed in the UK using electronic health data from the QResearch® database for risk factor information on ethnicity, family history of early coronary heart disease (CHD), chronic kidney disease (CKD), atrial fibrillation (AF), rheumatoid arthritis (RA), deprivation, and body mass index (BMI), in addition to the FRS risk factors [9]. In 2008, the QRISK was updated to QRISKII, which incorporated more relevant conditions into the algorithm, and was based on more recent population data [9].

In 2013, the ACC/AHA Task Force on Practice Guidelines developed a new score to assess the 10-year risk of ASCVD to address deficiencies in existing scores using data from several geographically diverse research cohorts comprised of blacks and whites [5]. The data from these population-based cohorts were used to create race- and sex-specific risk scores, which incorporate the same risk factors as the FRS [5]. The new risk score calculates 10-year and lifetime risk for ASCVD, which is defined by coronary death or nonfatal myocardial infarction (MI), or fatal or nonfatal stroke.

Comparisons of the performance of US-developed risk scores and QRISK/QRISKII have previously been limited to populations in Europe due in part to the lack of population-based electronic health data in the USA. However, electronic health record (EHR) systems are becoming ubiquitous due to the meaningful use standards implemented as part of the Health Information Technology for Economic and Clinical Health (HITECH) Act [10]. The secondary use of EHR data has been shown to be a robust and cost-effective strategy for epidemiologic, genomic, and translational research [11–15]. Specifically, leveraging EHR data for disease risk scores allows for additional risk factors to be incorporated and the range of outcomes increased [16].

High CVD morbidity, mortality, and associated health-care costs justify the continued effort to improve risk prediction so preventative interventions can be implemented. The impact of incorporating EHR data beyond the limited set of risk factors used in the FRS and ASCVD to improve risk prediction in the USA is unknown. Therefore, we carried out a study to compare the performance of the research cohort-based scores, ASCVD and FRS, to the electronic health data-based QRISKII by applying all three risk scores as they are used in clinical practice in a US population.

Methods

Participants

A cohort of 13,188 participants enrolled in the Mayo Clinic Biobank, recruited from the Mayo Clinic Primary Care Internal Medicine practice on the index date of November 29, 2012, aged 30–75 years, was initially selected for the study. Details and history of the Mayo Clinic Biobank have been described by Olson et al. [17]. We excluded 3546 patients who had prevalent CVD and 392 patients missing the data elements needed to calculate the risk scores. We further excluded 467 patients who were either non-white or of unknown race/ethnicity because the FRS was developed in an exclusively white population. A final sample size of 8783 patients was included in the study.

Measurements

Demographic data, including date of birth and sex, were collected from the EHR. Likewise, EHR laboratory test data for total cholesterol (mg/dL) and HDL cholesterol (mg/dL) measured between January 1, 2005, and December 31, 2012, were extracted and values closest to the index date of November 29, 2012, were used in the risk score calculations. EHR-derived measures of systolic blood pressure (mmHg) and hypertension treatment (yes/no) measured within 2010–2012 were considered, and values closest to the index date were used. A single occurrence of an ICD-9 code was used to identify persons with a history of CKD (ICD-9 codes 403–404), RA (ICD-9 code 714), and AF (ICD-9 code 427.3). Diabetes mellitus was identified by an EHR-based algorithm [18]. BMI was calculated as weight (kg)/height² (m²) for all patients.
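The selection of the measurement closest to the index date and the BMI formula can be illustrated with a minimal Python sketch. This is not the study's extraction code; the function names, the (date, value) data layout, and the tie-breaking rule (the first of two equally distant measurements is kept) are assumptions made for illustration.

```python
from datetime import date

INDEX_DATE = date(2012, 11, 29)  # index date stated in the text


def closest_to_index(measurements, window_start, window_end, index_date=INDEX_DATE):
    """Return the value measured closest to index_date within the window.

    measurements: iterable of (date, value) pairs (hypothetical layout).
    Returns None when no measurement falls inside the window.
    """
    in_window = [(d, v) for d, v in measurements if window_start <= d <= window_end]
    if not in_window:
        return None
    # Smallest absolute distance in days; ties keep the first entry (assumption).
    _, value = min(in_window, key=lambda dv: abs((dv[0] - index_date).days))
    return value


def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height squared (m^2)."""
    return weight_kg / height_m ** 2


# Made-up total cholesterol values (mg/dL) for one patient.
cholesterol = [
    (date(2004, 6, 1), 250.0),    # before the 2005-2012 window, ignored
    (date(2008, 3, 1), 210.0),
    (date(2012, 10, 15), 195.0),  # closest to the index date
]
print(closest_to_index(cholesterol, date(2005, 1, 1), date(2012, 12, 31)))
```

Because the laboratory window extends to December 31, 2012, a value recorded shortly after the index date can be selected under this rule.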

The biobank questionnaire was used to obtain risk factor information for smoking status (yes/no) and family history of heart attack in a first-degree relative (yes/no). The Rochester Epidemiology Project (REP) housing data (HOUSES) index [19] was used as a surrogate for Townsend scores of deprivation in the QRISKII calculation. Since the HOUSES index is negatively correlated with the Townsend scores, the values were multiplied by a value of −1.00 to make the data congruent. The HOUSES index is a measure of socioeconomic status with a high HOUSES index score indicating affluence and a low score indicating deprivation [19].
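The sign reversal described above can be sketched as follows. The function name and the default used when no HOUSES value is available are hypothetical; the text states only that the HOUSES values were multiplied by −1.00 and that the average deprivation risk was assigned in the full cohort analyses.

```python
def deprivation_input(houses_index, average_deprivation=0.0):
    """Map a HOUSES index value to the orientation of a Townsend-type score.

    A high HOUSES value indicates affluence, whereas a high Townsend score
    indicates deprivation, so the sign is reversed. average_deprivation is a
    placeholder for the population-average value (assumption, not from the text).
    """
    if houses_index is None:
        return average_deprivation
    return -1.0 * houses_index


# An affluent patient (positive HOUSES value) maps to a negative, i.e. less
# deprived, surrogate value; a deprived patient maps to a positive one.
print(deprivation_input(1.5), deprivation_input(-2.0), deprivation_input(None))
```

The sketch changes only the direction of the scale; it does not rescale HOUSES values to the range of Townsend scores.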

CVD Risk Scores and Events

Risk scores were applied in the same way they are used in clinical practice. Specifically, a 10-year CVD risk was calculated on the index date using the QRISKII-2008, as described by Hippisley-Cox et al. [9]; 2008 FRS, described by D’Agostino et al. [4]; and 2013 ASCVD, described by Goff et al. [5]. As described above, the QRISKII was calculated using the HOUSES index as a surrogate for the Townsend score when available in the subset analyses or by assigning the average risk for deprivation in the full cohort analyses. Risk scores were categorized as low risk (≤5 %), intermediate-low risk (>5 and ≤10 %), intermediate-high risk (>10 and ≤20 %), or high risk (>20 %) [4, 5]. Patients were followed up for the primary outcome of the first recorded diagnosis of a CVD event from November 29, 2012, through June 1, 2015. Our definition of CVD events included angina pectoris, acute MI, CHD (ICD-9 codes 410–414), chronic ischemic cerebral infarction, stroke, and transient ischemic attack (TIA) (ICD-9 codes 430–435) as these outcomes were used for two of the three risk scores (FRS and QRISKII). One or more ICD-9 codes for a patient indicated a diagnosis of a CVD event.
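The risk categories and the ICD-9 event definition given above can be written as a short Python sketch. The cut-points and code ranges are taken from the text; the function names and the handling of non-numeric codes (e.g., V codes) are assumptions.

```python
def risk_category(risk_percent):
    """Assign a 10-year risk estimate (in percent) to one of four categories."""
    if risk_percent <= 5:
        return "low"
    if risk_percent <= 10:
        return "intermediate-low"
    if risk_percent <= 20:
        return "intermediate-high"
    return "high"


def is_cvd_event_code(icd9_code):
    """True if the three-digit ICD-9 category lies in 410-414 or 430-435."""
    try:
        category = int(icd9_code.strip().split(".")[0])
    except ValueError:
        return False  # non-numeric categories such as V or E codes (assumption)
    return 410 <= category <= 414 or 430 <= category <= 435


def had_cvd_event(icd9_codes):
    """One or more qualifying codes indicates a CVD event."""
    return any(is_cvd_event_code(code) for code in icd9_codes)


print(risk_category(7.5))                 # intermediate-low
print(had_cvd_event(["427.31", "435.9"]))  # True: 435.9 falls in 430-435
```

Boundary values fall in the lower category, so a risk of exactly 10 % is intermediate-low and a risk of exactly 20 % is intermediate-high.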

Statistics

Patient characteristics were summarized using mean (SD), median, and range for continuous variables and frequency (percent) for categorical variables. FRS, QRISKII, and ASCVD risk scores were stratified into risk categories according to published guidelines [4, 5]. Cohen’s kappa statistic was used to assess the agreement between score categories [20]. Discrimination of each risk score was assessed using the area under the receiver operating characteristic (ROC) curve.
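For illustration, both statistics can be computed from first principles as in the Python sketch below. It assumes an unweighted kappa, since the text does not state whether a weighted form was used, and it obtains the AUC as the proportion of event/non-event pairs in which the patient with the event has the higher score, with ties counted as one half. The function names are hypothetical, and confidence intervals are not computed.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two categorical ratings of the same patients."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: proportion of patients placed in the same category.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the marginal proportions, summed over categories.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


def auc(scores, events):
    """Area under the ROC curve via pairwise comparison of scores."""
    with_event = [s for s, e in zip(scores, events) if e]
    without_event = [s for s, e in zip(scores, events) if not e]
    if not with_event or not without_event:
        raise ValueError("both outcome groups must be present")
    wins = sum((p > q) + 0.5 * (p == q) for p in with_event for q in without_event)
    return wins / (len(with_event) * len(without_event))


# Made-up example: two scores categorizing four patients, and one score's
# risk estimates with observed events.
print(cohens_kappa(["low", "low", "high", "high"], ["low", "high", "high", "high"]))
print(auc([5.0, 10.0, 10.0, 20.0], [0, 1, 0, 1]))
```

The pairwise AUC calculation is quadratic in the number of patients; it is chosen here for transparency rather than speed.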

Results

A total of 8783 patients (35 % men and 65 % women) aged 30–75 years were included in the study. Table 1 summarizes the baseline characteristics of the cohort by sex for cardiovascular risk factors used in the risk scores. A high prevalence was observed for several risk factors. For example, 82 % of men and 63 % of women were overweight or obese (BMI >25), 14 % of men and 9 % of women had type 2 diabetes, and approximately a third of all patients were treated for hypertension. In contrast, the prevalence of current smoking, RA, AF, and CKD was low for both sexes. During 3 years of follow-up, 246 men and 247 women experienced a CVD event.

Table 1 Baseline characteristics

Table 2 summarizes the comparison between the QRISKII and FRS by sex. The concordance between the QRISKII and FRS was 41 % in men and 56 % in women. However, the reclassification to higher risk based on the QRISKII was evident. For example, the QRISKII classified 24 % of men as high risk, whereas only 3 % were high risk according to the FRS. The kappa statistic was 0.22 (95 % confidence interval (CI) 0.20–0.24) for men and 0.23 (95 % CI 0.21–0.25) for women. Figure 1a and b compare the CVD risk score estimates for the QRISKII versus the FRS; subjects with observed events are indicated with bold symbols. The QRISKII classified persons with events as higher risk compared to the FRS. Both the FRS and QRISKII performed suboptimally in men and women, categorizing patients with events into low-risk groups. Similar results were observed when using the HOUSES index as a surrogate for the Townsend score in the calculation of the QRISKII (Supplementary Table 1).

Table 2 Comparison of the Framingham Risk Score and QRISKII in women and men
Fig. 1 Comparison of the QRISKII and Framingham Risk Score (FRS) estimates in (a) women (n = 5690) and (b) men (n = 3093). Red/bold points indicate observed events (246 in men and 247 in women)

Table 3 summarizes the comparison of the ASCVD and FRS risk categories by sex. The FRS and ASCVD classified 44 % of men and 62 % of women in the same risk categories. The ASCVD classified more men and women in the higher-risk groups. The ASCVD also identified more women as intermediate-high risk and high risk than the FRS. For men, 12 % were classified according to the FRS as intermediate-low risk, while these same men were classified according to the ASCVD as high risk (Table 3, men). The kappa statistic was 0.26 (95 % CI 0.24–0.28) for men and 0.25 (95 % CI 0.23–0.27) for women. Figure 2a and b compare the predicted CVD risk score estimates of the ASCVD and FRS.

Table 3 Comparison of the Framingham Risk Score and ASCVD in women and men
Fig. 2 Comparison of the Framingham Risk Score (FRS) and atherosclerotic cardiovascular disease (ASCVD) risk estimates in (a) women (n = 5690) and (b) men (n = 3093). Bold points indicate observed events (246 in men and 247 in women)

Table 4 summarizes the comparison of the QRISKII and ASCVD by sex. The concordance was highest for these two risk scores (62 % in men and 69 % in women). In general, the QRISKII classified more men and women in intermediate-risk groups than the ASCVD. The kappa statistic was 0.49 (95 % CI 0.47–0.52) for men and 0.51 (95 % CI 0.49–0.53) for women. Figure 3a and b show that the QRISKII and the ASCVD have similar predicted CVD risk score estimates. Similar results were observed when using the HOUSES index as a surrogate for the Townsend score in the calculation of the QRISKII (Supplementary Table 2).

Table 4 Comparison of the ASCVD and QRISKII in women and men
Fig. 3 Comparison of the QRISKII and atherosclerotic cardiovascular disease (ASCVD) risk estimates in (a) women (n = 5690) and (b) men (n = 3093). Bold points indicate observed events (246 in men and 247 in women)

Figure 4a and b show the distribution of risk categories for the ASCVD, FRS, and QRISKII by sex for those who had a CVD event. In women who experienced a CVD event during follow-up, 19 % were classified as high risk by the QRISKII and 22 % by the ASCVD. However, only 2 % of women who had an event were classified as high risk by the FRS. Furthermore, 56 % of women with an event were considered low risk by the FRS as compared to 21 % for the QRISKII and 33 % for the ASCVD. For men who experienced a CVD event, the ASCVD and QRISKII classified 41 % and 39 % as high risk, respectively, whereas only 8 % were considered high risk by the FRS. Similar patterns were observed when using the HOUSES index in the calculation of the QRISKII (Supplementary Figs. 1 and 2).

Fig. 4 Distribution of risk in (a) women and (b) men who had a cardiovascular disease event during follow-up. The Framingham Risk Score (FRS), QRISKII, and atherosclerotic cardiovascular disease (ASCVD) risk estimates were stratified into risk categories

The discrimination of the QRISKII, FRS, and ASCVD is shown in Fig. 5. The QRISKII shows higher area under the ROC curve (AUC) statistics than the FRS and ASCVD in both men and women with an AUC of 0.65 (95 % CI 0.61–0.68) and 0.71 (95 % CI 0.68–0.74), respectively. The AUC for the ASCVD was 0.63 (95 % CI 0.59–0.66) for men and 0.69 (95 % CI 0.66–0.72) for women. The FRS has the lowest AUC of 0.59 (95 % CI 0.55–0.62) for men and 0.66 (95 % CI 0.63–0.70) for women.

Fig. 5 Receiver operating characteristic (ROC) curves for the discrimination of the QRISKII, atherosclerotic cardiovascular disease (ASCVD), and Framingham Risk Score (FRS) in women (left) and men (right)

Discussion

The electronic health data-based QRISKII classified individuals who experienced adverse CVD events into higher-risk groups more often than the FRS or ASCVD. Although all three risk scores placed those with CVD events in higher risk categories more often for men than in women, the FRS, in particular, misclassified risk in women with CVD. These results demonstrate that improvements in risk prediction are needed and provide evidence that incorporating additional health data beyond the limited set of variables included in the FRS and ASCVD may improve CVD risk stratification.

The QRISKII was derived from electronic health data in the national UK database, QResearch®, which was comprised of 531 practices in England and Wales [9]. The QRISK includes more risk factors, accounts for missing data by assigning an average value, and is representative of the underlying population, thus allowing a risk score to be calculated, even when data are missing. In contrast, the FRS was derived from the Framingham Heart Study cohort and has subsequently been widely applied in the clinical setting to predict a risk of CVD and CHD. The FRS was derived from a middle-class white population and used risk factor data measured during a research study exam and thus does not account for missing data. The FRS has a good predictive value for people who fit into the original Framingham population but does not predict well for women and different ethnic groups [7]. Despite these issues, the FRS has been widely applied in clinical practices across the USA.

The QRISKII and FRS were previously compared in different UK populations. One of the first studies compared the performance of the QRISKII and the modified FRS equation recommended by NICE [21] in the national QResearch® database [9]. Another validation comparing the FRS [21] and the QRISKII was carried out using the THIN database in the UK [22]. Both comparisons found that the QRISKII performed better than the FRS. The QRISKII had improved discrimination and calibration and was more accurate in identifying high-risk populations [9, 22]. The results reported herein support the superior predictive value of the QRISKII over the FRS. Furthermore, we demonstrate that the QRISKII accurately predicts outcomes of patients in higher risk categories, while misclassification was most extensive in the FRS.

Similar to the FRS, the ASCVD was developed using data from research cohorts in the USA. The ASCVD classified more patients as higher risk and was better at predicting outcomes than the FRS. The ASCVD was designed to include more cardiovascular outcomes and was developed in a more representative population. For these reasons, the ASCVD is replacing the FRS in clinical practice today. For the classification of patients, the ASCVD performed similarly to the QRISKII; however, the ASCVD is not able to account for missing data and is specific to non-Hispanic African Americans and non-Hispanic whites [5].

Comparisons of the QRISKII, ASCVD, and FRS by sex have noted differences in classification and predictive value. One of the original FRS studies found that the FRS was a good predictor of CHD but misclassified risk more often in women than in men [2]. This was validated in a second study done by D’Agostino et al., where the FRS performed better with sex-specific CVD functions but underperformed in women [4]. QRISK studies have also shown better performance in predicting risk in men than in women [7, 9, 22, 23]. Studies of the FRS and QRISKII have shown the overestimation of low risk categories in women and decreased calibration and discrimination [2, 4, 7, 9, 22, 23]. In a recent validation, the ASCVD was found to overestimate risk in women [24]. The Reynolds Risk Score, another research-based risk score, is recommended for improved risk stratification in women [25]. The results of the current study support sex differences in risk prediction as all three risk scores performed better in men compared to women. Therefore, the use of EHR data would open new opportunities to add sex-specific information such as pregnancy, reproductive history, and other sex-specific factors to improve risk prediction in women.

Strengths of our study include the population, the availability of biobank questionnaire data, and the use of a surrogate for socioeconomic status to calculate the QRISKII. Data came from a single community in which EHR data were available for all participants and the risk scores were applied as in clinical practice. The biobank data also provided a link to family history, which is a component of the QRISKII and not widely available in clinical practice. The HOUSES index provided socioeconomic status information to use as a surrogate for the Townsend deprivation scores in the QRISKII. The correlation between these two measures of socioeconomic status and how well the HOUSES index works as a surrogate are unknown, which is a limitation of the study. Differences exist by score in regard to target age ranges, race/ethnicity, exclusion criteria, and event type and definition. Therefore, the suboptimal use of a risk score could result in spurious claims regarding accuracy; however, since we applied the risk scores as is done in clinical practice, our results reflect the performance in the real world and represent a unique comparison of common risk scores used in the USA (FRS and ASCVD) and an electronic health data-based risk score (QRISKII). Furthermore, the QRISKII, FRS, and ASCVD are based on 10-year predictions of CVD events, and because of the shorter follow-up time in this study, not all CVD events may have been captured. Although we found that the QRISKII has a higher AUC, the confidence intervals for the AUCs overlapped, which could be due to the small number of events. Finally, the potential volunteer bias within the biobank, the inability to obtain data about the age at the time of a heart attack in first-degree relatives, the exclusively white population, and the use of only ICD-9 codes to define events may limit our study. However, diagnosis codes have been shown to be specific for stroke and myocardial infarction [26, 27].

The study showed that the QRISKII, an electronic health record-based risk score, tended to classify persons who experienced an incident CVD event into higher-risk groups as compared to the FRS. Furthermore, the comparison of all three risk scores demonstrates that the risk stratification is worse in women than in men. Leveraging EHR data for the use of additional clinical risk factors to develop and implement improved sex-specific risk scores would enhance our ability to target truly high-risk groups for preventative measures.