INTRODUCTION

Of the 19 million Americans with type 2 diabetes, 6 million remained undiagnosed.1,2 Early diagnosis, lifestyle modification, and tight glycemic control reduce complications; however, these cannot occur if diabetes is undiagnosed. Indeed, diabetes may be present for up to 7 years before diagnosis.3

Although improving diabetes detection remains prominent in the nation’s public health agenda,4 there is insufficient evidence for or against routine diabetes screening.5,6 One reason may be the burden and inconvenience of fasting visits to meet the diagnostic criteria. Diabetes is usually established by fasting plasma glucose (FPG), values which require confirmation on a second visit.6

Opportunistic screening for high-risk individuals during unscheduled outpatient, urgent care, or hospital visits may improve rates of diagnosis. Hemoglobin A1c (HbA1c) offers the potential for a more practical alternative, as it is an established measure of long-term glycemia and correlates directly with subsequent development and progression of microvascular complications.7,8 In addition, HbA1c measurement does not require fasting and remains unaffected by transient hyperglycemia from acute stress or illness.9

For these reasons, investigators have examined HbA1c for diabetes screening in undifferentiated populations,10–21 but utility as a diagnostic or screening test remains controversial, primarily due to limited diagnostic accuracy.22–26 For instance, Rohlfing et al. found 63% sensitivity and 97% specificity for threshold HbA1c ≥ 6.2%, and 83% sensitivity and 84% specificity for HbA1c ≥ 5.7%, using the NHANES III dataset.11 These studies do not, however, explore diabetes risk factor data to additionally stratify populations, and many were limited by prior lack of HbA1c reliability, a problem rectified by the National Glycohemoglobin Standardization Program.27 In the present analysis, we evaluated the performance of HbA1c as a screening test for undiagnosed diabetes in a risk-stratified, nationally representative sample of the US population.

METHODS

Study Design and Participants

The National Health and Nutrition Examination Survey (NHANES) is a nationally representative probability sample of the noninstitutionalized US civilian population, with data released in two-year cycles. We received a waiver from our institutional review board to analyze the three most recent cycles of released NHANES data (i.e., data collected between January 1999 and December 2004).

Details of survey methodology are described elsewhere.28 Briefly, the sample is obtained by using a complex, stratified, multistage design, with oversampling of certain subgroups. NHANES uses a four-stage sampling strategy covering geographic primary sampling units (PSUs).

During 1999 to 2004, NHANES collected household interview data, including demographic characteristics, medical conditions, and physical activity, for 31,126 (87%) of 35,960 invited participants. The majority (29,402 [94%]) subsequently received physical and laboratory examination, including body measurements, plasma glucose, and HbA1c. NHANES randomly assigned these participants to morning, afternoon, or evening examination sessions. Of the 7,768 participants aged ≥ 18 years assigned to the morning session, 7,345 (95%) had valid FPG results; the remaining 5% had fasted <8 hours or had missing data. For the present analysis, we excluded 622 participants who reported a physician-diagnosis of diabetes. Thus, we analyzed 6,723 adult participants without known diabetes, who represent 195 million Americans.

Diabetes Risk Factors

From the household interview data, we analyzed information on self-reported age, sex, race/ethnicity, education, and income. We also examined two healthcare access variables: presence of health insurance and healthcare visit within the past 3 years. Finally, we evaluated six factors that are strongly related to diabetes risk: 1) physical activity based on moderate or vigorous activity for at least 10 minutes during the past 30 days, 2) body mass index (BMI, kilograms/meters² [kg/m²]), 3) hypertension (defined as systolic blood pressure ≥ 130 mmHg, diastolic blood pressure ≥ 85 mmHg, or current use of anti-hypertensive medications), 4) elevated waist circumference (≥102 cm for males or ≥88 cm for females), 5) elevated triglycerides (≥1.7 mmol/l [150 mg/dl]), and 6) low high-density lipoprotein (HDL) cholesterol (<1.0 mmol/l [40 mg/dl] for males or <1.3 mmol/l [50 mg/dl] for females). The latter four criteria used thresholds defined by the metabolic syndrome.29 Nearly all of the study subjects had data on these variables (>97% for all). We excluded missing values as recommended by NHANES; derivation of imputed values was not required. Family history of diabetes was missing in nearly half of the subjects and so was not included in the analysis.

Laboratory Measurements

The Diabetes Diagnostic Laboratory at the University of Missouri-Columbia measured plasma glucose using a hexokinase enzymatic method. Based on FPG results, we classified participants as normal (<5.6 mmol/l [100 mg/dl]), impaired fasting glucose (5.6 to 6.9 mmol/l [100–125 mg/dl]) or diabetes (≥7.0 mmol/l [126 mg/dl]), according to standard diagnostic criteria endorsed by the American Diabetes Association (ADA).6 Our primary outcome measure was undiagnosed diabetes, as defined by FPG results.

The same laboratory also performed the HbA1c measurement using Primus CLC 330 and Primus CLC 385 (Primus Corporation, Kansas City, MO). The instruments utilize a boronate affinity high performance liquid chromatography (HPLC) system, which has been standardized to the reference method used for the Diabetes Control and Complications Trial.7 The affinity chromatographic method has demonstrated excellent, long-term precision with coefficient of variation <3%, and is not affected by the presence of hemoglobin variants S, C, D or elevated fetal hemoglobin (HbF).

Statistical Analysis

We performed statistical analyses using Stata 9.0 (College Station, TX). We applied the recommended subsample weights to account for unequal probabilities of selection and to represent the US population based on 2000 Census data. We estimated variance based on NHANES-provided masked variance units, using the Taylor series linearization method. All p-values are two-tailed, with p < 0.05 considered statistically significant.
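As a rough illustration of how sample weights enter a prevalence estimate, the following Python sketch computes a weighted proportion. It is not the Stata procedure used in the analysis: the function name `weighted_proportion` and the toy data are hypothetical, and the sketch does not attempt the Taylor series variance estimation.

```python
def weighted_proportion(outcomes, weights):
    """Weighted share of participants with the outcome.

    outcomes: iterable of booleans (True = outcome present).
    weights:  iterable of non-negative sample weights, one per participant.
    Illustrative only; variance estimation is not covered here.
    """
    outcomes = list(outcomes)
    weights = list(weights)
    if len(outcomes) != len(weights):
        raise ValueError("outcomes and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Sum the weights of participants with the outcome, then normalize.
    return sum(w for y, w in zip(outcomes, weights) if y) / total


# Toy data: the first participant represents twice as many people as the others.
example = weighted_proportion([True, False, False], [2.0, 1.0, 1.0])
```

In this toy case the unweighted proportion would be one third, while the weighted proportion is one half, which shows why weights matter when subgroups are oversampled.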

We calculated proportions with 95% CI for diabetes risk factor data for the entire sample and stratified by FPG results: undiagnosed diabetes (FPG ≥ 7.0 mmol/l [126 mg/dl]) and impaired fasting glucose (FPG 5.6–6.9 mmol/l [100–125 mg/dl]). To improve interpretability of the analysis, we converted age and BMI from continuous to categorical variables. We determined univariate associations between predictors and FPG outcome using Pearson’s chi-square test.

Variables with p < 0.10 on univariate analysis were eligible for entry into a weighted multivariate logistic regression model for the outcome of undiagnosed diabetes. We utilized a backward elimination technique, with reference strata defined as age 18–44, non-Hispanic white, high school graduate, and BMI 20–24.9 kg/m². Only significant predictors (p < 0.05) were retained in the final model, unless a variable significantly confounded the estimated associations. No interaction terms were evaluated for this model. We presented the results of two final models, one that evaluated all variables and a second that evaluated only variables that could be obtained during a non-fasting visit (excluding triglycerides and HDL). We reported results as OR with 95% CI for significant predictors of each outcome.

Using the results of the multivariate analysis, we derived a risk score for predicting undiagnosed diabetes based on the ORs in the final model, using previously described methodology.30 We created a point score by dividing multivariate OR by 1.8 and rounding to the nearest 0.5 point. To determine categories for risk scores (low, moderate and high risk), we examined the prevalence of undiagnosed diabetes for each risk score and created categories to maximize homogeneity within each group.
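The point-score rule described above (divide the multivariate OR by 1.8, then round to the nearest 0.5 point) can be sketched in Python as follows. The function name `or_to_points` is hypothetical, the ORs in the example are made-up values rather than the estimates in Table 2, and rounding exact ties upward is an assumption, since the text does not state a tie rule.

```python
import math


def or_to_points(odds_ratio, divisor=1.8, step=0.5):
    """Convert an odds ratio to a point value on a 0.5-point grid.

    Assumption: values exactly halfway between grid points round up.
    """
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    raw = odds_ratio / divisor
    # floor(x + 0.5) rounds to the nearest grid step, ties upward.
    return math.floor(raw / step + 0.5) * step


# Hypothetical odds ratios, for illustration only.
for hypothetical_or in (2.0, 2.6, 7.0):
    print(hypothetical_or, "->", or_to_points(hypothetical_or))
```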

We measured the positive predictive value (PPV) and negative predictive value (NPV) for a range of threshold HbA1c values (5.2 to 6.4%) in the total sample and risk-stratified subgroups (≥threshold value considered abnormal, <threshold value considered normal). We recommended a positive threshold to approximate a true positive rate (i.e., PPV) of 50%, and a negative threshold to approximate a false negative rate (i.e., 1 – NPV) of 1%. Our rationale was to recommend further confirmatory FPG for patients above the positive threshold and no further testing for those below the negative threshold. We demonstrated these recommendations by creating a Standards for Reporting of Diagnostic Accuracy (STARD) flow diagram for the total sample.31
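To make the threshold convention concrete (values at or above the threshold count as abnormal), the following Python sketch computes PPV and NPV at a single HbA1c threshold. It is an unweighted illustration with invented data; the function name `predictive_values` is hypothetical, and the survey weighting used in the analysis is not reproduced.

```python
def predictive_values(hba1c_values, has_diabetes, threshold):
    """Return (PPV, NPV) for the rule 'HbA1c >= threshold is abnormal'.

    has_diabetes holds the reference-standard result (here, FPG-defined).
    Returns None for a value whose denominator is zero.
    """
    tp = fp = fn = tn = 0
    for value, diseased in zip(hba1c_values, has_diabetes):
        test_positive = value >= threshold
        if test_positive and diseased:
            tp += 1
        elif test_positive:
            fp += 1
        elif diseased:
            fn += 1
        else:
            tn += 1
    ppv = tp / (tp + fp) if (tp + fp) else None
    npv = tn / (tn + fn) if (tn + fn) else None
    return ppv, npv


# Invented toy data, not NHANES values.
toy_hba1c = [5.0, 5.3, 5.8, 6.2, 6.5, 7.0]
toy_status = [False, False, False, False, True, True]
toy_ppv, toy_npv = predictive_values(toy_hba1c, toy_status, threshold=6.1)
```

Repeating the call across thresholds from 5.2 to 6.4, within each risk group, would give the kind of grid the text describes.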

Additionally, we determined sensitivity and specificity with 95% CI, and plotted a receiver operating characteristic (ROC) curve. We calculated area under the curve (AUC) as a measure of performance and determined 95% CI by the Bamber method.32
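The AUC can be read as the probability that a randomly chosen participant with diabetes has a higher HbA1c than a randomly chosen participant without it, with ties counted as one half. The Python sketch below computes that quantity by direct pairwise comparison. It is an unweighted illustration on invented data, the function name `auc_by_pairs` is hypothetical, and it does not compute the confidence interval.

```python
def auc_by_pairs(values, has_disease):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (diseased, non-diseased) pairs in which the
    diseased participant has the higher value; ties count as 0.5.
    """
    diseased = [v for v, d in zip(values, has_disease) if d]
    healthy = [v for v, d in zip(values, has_disease) if not d]
    if not diseased or not healthy:
        raise ValueError("need at least one participant in each group")
    wins = 0.0
    for d_value in diseased:
        for h_value in healthy:
            if d_value > h_value:
                wins += 1.0
            elif d_value == h_value:
                wins += 0.5
    return wins / (len(diseased) * len(healthy))


# Invented toy data with one tie between the groups.
toy_auc = auc_by_pairs([5.0, 6.0, 6.0, 6.5], [False, False, True, True])
```

The pairwise loop is quadratic in sample size, which is fine for a sketch; a rank-based calculation would be preferable for a dataset of several thousand participants.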

RESULTS

The overall weighted prevalence was 6.3% for diagnosed diabetes, 2.8% for undiagnosed diabetes, and 27% for impaired fasting glucose. Subsequent analysis is based only on participants without known diabetes.

Risk Stratification

Table 1 shows the prevalence and univariate associations for diabetes risk factors, stratified by FPG results. Table 2 presents the results of the weighted multivariate logistic regression models and risk score development, based on the ORs in Model 3 (all factors evaluated for inclusion). We calculated risk scores for participants with all required data (306 [4%] had at least one missing element) to predict the outcome of undiagnosed diabetes.

Table 1 Prevalence of Impaired Fasting Glucose and Undiagnosed Diabetes Stratified by Risk Factors
Table 2 Significant Predictors and Risk Score Development for Undiagnosed Diabetes by Multivariate Logistic Regression

We assigned points based on the OR of significant predictors: age 45–64 (2.5 points), age ≥65 (4 points), male sex (1.5 points), black race (1 point), hypertension (1 point), elevated waist circumference (1.5 points), elevated triglycerides (1 point), and low HDL (1 point). We stratified resulting scores into low (0 to 4.5 points), moderate (5 to 6.5 points), and high (≥7.0 points) risk groups. The prevalence of undiagnosed diabetes increased across the risk groups: 0.44% for low, 4.1% for moderate, and 11.1% for high. Thus, the high risk group was 25 times more likely to have undiagnosed diabetes than the low risk group. The distribution of risk scores, stratified by risk factor data, is presented in Table 3.
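The point values and cut-offs above translate directly into a small calculator. The following Python sketch is illustrative only: the function and parameter names are hypothetical, and the score itself has not been validated in an independent population.

```python
def diabetes_risk_score(age, male, black, hypertension,
                        elevated_waist, elevated_triglycerides, low_hdl):
    """Sum the points listed in the text for one adult (age in years)."""
    score = 0.0
    if age >= 65:
        score += 4.0
    elif age >= 45:
        score += 2.5
    if male:
        score += 1.5
    if black:
        score += 1.0
    if hypertension:
        score += 1.0
    if elevated_waist:
        score += 1.5
    if elevated_triglycerides:
        score += 1.0
    if low_hdl:
        score += 1.0
    return score


def risk_group(score):
    """Map a score to the low / moderate / high categories in the text."""
    if score >= 7.0:
        return "high"
    if score >= 5.0:
        return "moderate"
    return "low"


# Example: a 50-year-old man with hypertension and no other listed factors.
example_score = diabetes_risk_score(50, male=True, black=False,
                                    hypertension=True, elevated_waist=False,
                                    elevated_triglycerides=False, low_hdl=False)
example_group = risk_group(example_score)
```

In this example the score is 2.5 + 1.5 + 1 = 5 points, which falls in the moderate group.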

Table 3 Distribution of Diabetes Risk Score by Demographic Variables and Significant Predictors of Undiagnosed Diabetes

Hemoglobin A1c

The weighted mean HbA1c was 5.32% for the overall sample, 5.19% for normal FPG, 5.46% for impaired fasting glucose, and 7.07% for undiagnosed diabetes. Table 4 shows the PPV and NPV for a range of HbA1c threshold values (5.2–6.4%) in predicting undiagnosed diabetes for the entire sample and stratified by risk groups, using FPG as the gold standard. Additionally, we present the sensitivity and specificity values in Table 4 and plot the ROC curve for threshold values of HbA1c in predicting undiagnosed diabetes in Figure 1. The AUC of 0.93 (95% CI, 0.91–0.95) revealed a high diagnostic accuracy for HbA1c.

Figure 1 Receiver operating characteristic (ROC) curve for accuracy of hemoglobin A1c in predicting undiagnosed diabetes. Based on gold standard of fasting plasma glucose ≥7.0 mmol/l (126 mg/dl) to define undiagnosed diabetes. Area under the curve = 0.93 (95% CI, 0.91–0.95).

Table 4 Predictive Validity of Hemoglobin A1c in Screening for Undiagnosed Diabetes, Based on Threshold Values and Risk Stratification

We chose a positive threshold of HbA1c ≥ 6.1%, which retained a PPV of approximately 50% for predicting undiagnosed diabetes in the entire sample. Additionally, the PPV for detecting impaired fasting glucose or undiagnosed diabetes (i.e., FPG ≥ 5.6 mmol/l [100 mg/dl]) was 91% at this threshold. This threshold also had an overall 99% NPV, 98% specificity and 68% sensitivity. We chose the negative threshold of HbA1c ≤ 5.4%, which maintained a false negative rate of less than 1% for the entire sample and had an overall 73% specificity and 91% sensitivity. The intermediate values of HbA1c (5.5 to 6.0%) had variable performance depending on pretest probability, and risk stratification was useful in interpreting results in this range.
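The dependence of predictive values on pretest probability follows from Bayes' rule. As a rough consistency check, the Python sketch below recomputes PPV and NPV from the reported sensitivity (68%) and specificity (98%) at HbA1c ≥ 6.1%. The prevalence of 2.8% is an illustrative input within the 2% to 3% range reported for undiagnosed diabetes, and the function name `predictive_values_from_prevalence` is hypothetical.

```python
def predictive_values_from_prevalence(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) implied by Bayes' rule for a given prevalence."""
    for name, p in (("sensitivity", sensitivity),
                    ("specificity", specificity),
                    ("prevalence", prevalence)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Reported sensitivity and specificity; prevalence is an illustrative input.
check_ppv, check_npv = predictive_values_from_prevalence(0.68, 0.98, 0.028)
```

With these inputs the PPV comes out close to 50% and the NPV close to 99%, in line with the values reported above. Applying the same sensitivity and specificity to a subgroup with a higher prevalence yields a higher PPV, although test characteristics within subgroups may differ from the overall values.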

Figure 2 is a STARD flow diagram for this proposed algorithm of risk stratification and HbA1c in screening for undiagnosed diabetes. This figure displays data from individual NHANES participants to demonstrate the algorithm’s potential utility in clinical practice.

Figure 2 STARD diagram for detection of undiagnosed diabetes by risk stratification and hemoglobin A1c. Abbreviations: STARD, Standards for Reporting of Diagnostic Accuracy; HbA1c, hemoglobin A1c. Risk groups based on significant predictors of undiagnosed diabetes in multivariate analysis.

DISCUSSION

We report the predictive validity of HbA1c as a screening test for undiagnosed diabetes in a large, nationally representative and risk-stratified population. Given that the overall prevalence of undiagnosed diabetes is relatively low (2% to 3%), risk stratification, although not performed in prior HbA1c screening studies,10–21 is an important tool to stratify pre-test probabilities in order to better estimate the true predictive value of HbA1c for subgroups of patients.

The overall performance of HbA1c was similar to that in earlier reports. For example, in a meta-analysis of 18 smaller studies prior to 1996, Peters et al. found a comparable 66% sensitivity and 98% specificity for HbA1c ≥ 6.1%, with PPV 63%, using the oral glucose tolerance test (OGTT) as the gold standard.10 Rohlfing et al. analyzed the NHANES III dataset (collected 1988 to 1994) using FPG as the gold standard and found an area under the ROC curve of 0.90, which was similar to but slightly lower than that in the current NHANES analysis.11 Additionally, they reported comparable 63% sensitivity and 97% specificity for threshold HbA1c ≥ 6.2%, and 83% sensitivity and 84% specificity for HbA1c ≥ 5.7%. Davidson et al. also evaluated NHANES III but with OGTT as the gold standard, and found similar results.12 These reports generated significant controversy on both sides of the debate.22–26 Although other investigators have found similar parameters for HbA1c in a variety of clinical settings,13–21 these data have been insufficient to recommend its implementation in clinical practice.6

To help address these concerns, we have presented a strategy of stratification by risk profile, before HbA1c testing, and interpretation of HbA1c based on the pre-test probability of diabetes. First, we recommend risk stratification by the proposed or other mechanisms to create risk groups of low (<0.5%), moderate (4% to 6%), and high (>10%) prevalence rates. Patients in the low risk group may generally be excluded, without further testing, based on the very low pre-test probability of diabetes. The moderate and high risk groups may receive HbA1c screening at a variety of healthcare visits, since fasting is not required.

We propose that patients in the moderate risk group receive FPG testing for HbA1c values in the positive range (≥6.1%), and the diagnosis may be reliably excluded for intermediate and negative values (≤6.0%). Patients in the high risk group may receive FPG testing if HbA1c ≥ 6.1%; the diagnosis can be reliably excluded if HbA1c ≤ 5.4%. For high-risk patients with intermediate HbA1c values (5.5% to 6.0%), clinical judgment and cost-effectiveness analysis are needed to guide decision-making, since the 6% prevalence of undiagnosed diabetes is high enough that diabetes may not be excluded, but low enough that further testing may be relatively low yield. One strategy could be to recommend diet and lifestyle modification and aggressive management of hypertension and dyslipidemia for this group and then repeat screening (either HbA1c or FPG) after 1 to 3 years.
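The proposed decision rules can be summarized as a short function. The Python sketch below is illustrative only and is not a clinical tool: the function name `screening_next_step` and the returned strings are hypothetical, and the sketch assumes HbA1c is reported to one decimal place, so that no value falls between 6.0% and 6.1%.

```python
def screening_next_step(risk_group, hba1c=None):
    """Suggested next step under the proposed algorithm.

    risk_group: "low", "moderate", or "high".
    hba1c: HbA1c in percent, assumed reported to one decimal place.
    """
    if risk_group not in ("low", "moderate", "high"):
        raise ValueError("risk_group must be 'low', 'moderate', or 'high'")
    if risk_group == "low":
        # Very low pre-test probability: generally no further testing.
        return "no further testing"
    if hba1c is None:
        return "measure HbA1c"
    if hba1c >= 6.1:
        # Positive range in both moderate and high risk groups.
        return "confirmatory FPG"
    if risk_group == "high" and hba1c >= 5.5:
        # Intermediate range (5.5% to 6.0%) in the high risk group.
        return "clinical judgment; consider repeat screening in 1 to 3 years"
    return "diabetes reliably excluded"


# Example: a high risk patient with an intermediate HbA1c value.
example_step = screening_next_step("high", 5.8)
```

The same intermediate value of 5.8% in a moderate risk patient returns "diabetes reliably excluded", which reflects the role of pre-test probability in interpreting the 5.5% to 6.0% range.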

The risk factors of age, black race, and elements of the metabolic syndrome (hypertension, waist circumference, triglycerides, and HDL cholesterol) are commonly recognized as significant risk factors for diabetes. Interestingly, male sex was an independent risk factor for undiagnosed diabetes. Although not explored further, this finding has been reported in prior NHANES analyses.33,34 While the prevalence of undiagnosed diabetes varied by sex, the overall prevalence of diabetes (diagnosed plus undiagnosed) was similar. This disparity may merely reflect that males have 50% fewer outpatient preventive care visits than females,35 but it warrants further exploration. Although the overall prevalence of diabetes is higher among Hispanic individuals, we did not find an association between Hispanic ethnicity and the rate of undiagnosed diabetes, which is consistent with prior analyses.34

While providers may choose to use different tools for risk stratification, the principle of deriving a low (<0.5%), moderate (4% to 5%) and high pre-test probability (>10%) could remain similar. Edelman et al. suggested a similar concept of the utility of HbA1c in a risk-stratified cohort from one outpatient clinic.14,17 Since fasting and well visits are not required for reliable HbA1c testing, the proposed screening algorithm would be more convenient for patients and providers and have potential for generalizability to a variety of unscheduled healthcare settings. Because of limited availability of some diabetes risk factor data in NHANES (i.e., family history, gestational diabetes, and history of IFG/IGT), we chose to risk stratify based on best available data, which included fasting lipid profile. However, the concept of risk stratification prior to ordering and interpreting HbA1c still holds, and providers should risk stratify based on data available to them.

Although HbA1c is more expensive than serum glucose, opportunistic screening with HbA1c may prevent unnecessary fasting office visits in over two-thirds of moderate and high risk patients. Moreover, these strategies might advance two important Healthy People 2010 objectives: to increase the rate that asymptomatic diabetes is diagnosed and to increase the overall proportion of adults whose condition has been diagnosed.4

Prior reports of diabetes screening in community and clinical venues have yielded mixed results, often limited by low prevalence rates and poor follow-up.36–38 As with any disease screening, patient adherence to confirmatory testing and subsequent therapy is vital to successful implementation. Additionally, the cost-effectiveness of opportunistic diabetes screening is unclear and will require further investigation.39 However, the US Preventive Services Task Force and ADA recommend diabetes screening for patients with defined diabetes risk factors, with ADA adopting broader criteria.5,6 Implementation of HbA1c as a screening test for risk-stratified populations may enhance screening efforts in appropriate high-risk patients.

The study has several potential limitations. NHANES 1999–2004 used FPG as the gold standard for diabetes, and OGTTs were not performed. Although endorsed by the ADA in the diagnostic algorithm for diabetes,40 FPG is not as accurate as OGTT for detecting glucose intolerance, especially in older adults.41 However, OGTT is more cumbersome and rarely used in US clinical practice. The algorithm developed was based on persons who had apparently escaped diabetes diagnosis and may not perform similarly if used as a primary diagnostic strategy. We derived a new risk score based on the data available in NHANES in order to risk stratify the population; this score would need to be validated in an independent population before implementation.

Additionally, several important known risk factors for diabetes, including family history, history of gestational diabetes, and history of IFG/IGT, were not available in this dataset. Nevertheless, we utilized this risk-stratification to create subgroups of different levels of risk—i.e., the results of our HbA1c analysis should apply regardless of the exact method of risk stratification. The proposed algorithm of risk stratification and HbA1c relied on pragmatic reasoning and interpretation of the data; others may suggest thresholds corresponding to different predictive values, and cost effectiveness analysis would further clarify optimal thresholds for clinical practice. Finally, this analysis provides a proposed algorithm, which, if validated, can serve as a guideline for providers, but should not substitute for sound clinical judgment for individual patients.

In summary, risk stratification improves the predictive validity of HbA1c in screening for undiagnosed diabetes in the US population. Opportunistic screening with HbA1c may be utilized in a variety of healthcare encounters and probably would improve detection of undiagnosed diabetes. Validation of our clinical algorithm and decision analysis should be performed to further understand the accuracy and optimal threshold values of HbA1c and to determine the cost-effectiveness of early detection of asymptomatic diabetes on a population level.