Introduction

Many national organizations agree that women aged 65 years or older should be screened for osteoporosis [1–5]. However, indications for osteoporosis screening in women aged 50–64 years vary by organization. The US Preventive Services Task Force (USPSTF) recommends osteoporosis screening in women between 50 and 64 years of age whose fracture risk is equal to or greater than that of a 65-year-old Caucasian woman with no additional risk factors [2]. The World Health Organization’s (WHO) fracture risk assessment tool (FRAX) was developed to calculate fracture risk from risk factors entered in an online tool [6]. The FRAX tool estimates that a 65-year-old Caucasian woman with no other risk factors has a 9.3 % 10-year risk of a major osteoporotic fracture (MOF) [7]. Under these guidelines, osteoporosis screening (most commonly with dual-energy X-ray absorptiometry, DXA) would be appropriate in women aged 50 to 64 years with a FRAX-calculated MOF risk of 9.3 % or greater.

The Choosing Wisely initiative of the American Board of Internal Medicine Foundation initially called on specialty societies to create their “Top Lists” of unnecessary tests or treatments commonly overused by their membership. It encourages patients and physicians to evaluate medical tests and procedures that may be unnecessary and, in some instances, could cause harm [8]. The goal of this initiative is to provide high-value care, where the potential health benefits of an intervention justify its harms and costs [9]. The American Academy of Family Physicians (AAFP) identified DXA as an overused test in primary care and recommended avoiding its use in women younger than 65 years of age without risk factors [8].

In this study, we addressed the sensitivity and specificity of the USPSTF criterion in identifying women with osteoporosis. Additionally, using the USPSTF criteria and several additional risk factors, we evaluated the extent of potentially inappropriate DXA use in women ages 50 to 64 years in a large primary care practice in an academic medical center.

Materials and methods

We retrospectively reviewed the records of all women between the ages of 50 and 64.5 years who underwent DXA during a 6-month period (March 1, 2012–August 31, 2012) and were enrolled in a primary care practice of the Mayo Clinic in Rochester, MN. The primary care practice sites include a clinic located on the central medical campus, two satellite community clinics, and one rural clinic. The study was classified as exempt by the Mayo Institutional Review Board because existing data were collected for the purpose of quality improvement.

Data abstracted from the medical record included the primary care practice (family medicine or internal medicine), DXA ordering provider and specialty, ordering provider type (staff physician, mid-level provider, or resident physician), any prior DXA results, and history of osteopenia (T-score ≤ −1 but > −2.5 on a previous DXA or ICD-9 billing code of 733.90) or osteoporosis (T-score ≤ −2.5 or ICD-9 billing codes 733.0-733.09). Clinical risk factors for osteoporosis that were abstracted included previous fragility fracture (defined as a hip, vertebral, or radial fracture occurring from standing height or less without major trauma), body mass index (BMI), ethnicity, hyperparathyroidism, celiac disease, commencing or currently taking an aromatase inhibitor, and history of bariatric surgery.

For women who had no history of using osteoporosis medications and were within the weight parameters of the FRAX tool, the 10-year fracture risk estimate was determined without incorporating femoral neck density data. We then calculated the sensitivity and specificity of the FRAX-calculated fracture risk for identifying women with osteoporosis (T-score ≤ −2.5 at the femoral neck or lumbar spine) and constructed receiver operating characteristic curves. Clinical risk factors and data required to calculate the FRAX estimate were obtained from a questionnaire that the women completed prior to DXA examination.

A DXA was classified as meeting our pre-specified criteria for obtaining a DXA if the subject’s estimated MOF risk by FRAX (determined without femoral neck bone density) was 9.3 % or higher, or if the subject had known osteopenia or osteoporosis, hyperparathyroidism, or celiac disease, was starting or taking an aromatase inhibitor, or had a history of a fragility fracture or bariatric surgery. If none of these criteria were met, the DXA examination was classified as not meeting our pre-specified criteria for DXA screening.
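The classification rule described above can be sketched in code. This is a minimal illustration rather than the authors' abstraction procedure: the record fields and the function name `meets_dxa_criteria` are hypothetical, the criteria are treated as alternatives (any single one suffices), and the FRAX estimate is assumed to have been computed separately and to be missing for women outside the FRAX tool's parameters.

```python
from dataclasses import dataclass
from typing import Optional

MOF_THRESHOLD = 9.3  # percent; 10-year MOF risk threshold stated in the text


@dataclass
class DxaRecord:
    # All field names are hypothetical and chosen for illustration only.
    frax_mof_risk: Optional[float] = None  # percent, without femoral neck BMD; None if not calculable
    known_osteopenia_or_osteoporosis: bool = False
    hyperparathyroidism: bool = False
    celiac_disease: bool = False
    aromatase_inhibitor: bool = False  # starting or currently taking
    fragility_fracture: bool = False
    bariatric_surgery: bool = False


def meets_dxa_criteria(record: DxaRecord) -> bool:
    """Return True if at least one of the pre-specified criteria is met."""
    frax_positive = (
        record.frax_mof_risk is not None
        and record.frax_mof_risk >= MOF_THRESHOLD
    )
    return any([
        frax_positive,
        record.known_osteopenia_or_osteoporosis,
        record.hyperparathyroidism,
        record.celiac_disease,
        record.aromatase_inhibitor,
        record.fragility_fracture,
        record.bariatric_surgery,
    ])
```

In this sketch, a record with no risk factors and a FRAX MOF risk below 9.3 % (or no calculable FRAX estimate) is classified as not meeting the criteria.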

Data were entered in Excel 2010 (Microsoft Corp., Redmond, WA), and analysis was performed with JMP 9.0.1 (SAS Institute Inc., Cary, NC). A control P-chart was constructed of the proportion of inappropriate DXA examinations by week to assess any temporal trends.
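For readers unfamiliar with P-charts, the sketch below shows the conventional calculation: a center line at the overall proportion and 3-sigma control limits that vary with each week's sample size. It is an illustration under stated assumptions, not the JMP analysis itself. The function names are hypothetical, the weekly counts are invented for demonstration and are not study data, and only the simplest special-cause rule (a point outside the limits) is checked.

```python
import math


def p_chart_limits(nonconforming, sample_sizes):
    """Return (center_line, [(lcl, ucl), ...]) using 3-sigma binomial limits."""
    p_bar = sum(nonconforming) / sum(sample_sizes)
    limits = []
    for n in sample_sizes:
        sigma = math.sqrt(p_bar * (1 - p_bar) / n)
        # Proportions cannot fall outside [0, 1], so the limits are clipped.
        limits.append((max(0.0, p_bar - 3 * sigma), min(1.0, p_bar + 3 * sigma)))
    return p_bar, limits


def out_of_control(nonconforming, sample_sizes):
    """Indices of subgroups whose proportion lies outside its control limits."""
    _, limits = p_chart_limits(nonconforming, sample_sizes)
    flagged = []
    for i, (x, n) in enumerate(zip(nonconforming, sample_sizes)):
        lcl, ucl = limits[i]
        if not lcl <= x / n <= ucl:
            flagged.append(i)
    return flagged


# Hypothetical weekly counts (NOT the study data): DXAs not meeting criteria / all DXAs.
weekly_not_meeting = [4, 3, 5, 2, 4]
weekly_total = [20, 18, 22, 15, 19]
flagged_weeks = out_of_control(weekly_not_meeting, weekly_total)
```

An empty list of flagged weeks corresponds to a chart with no points outside the control limits.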

Results

A total of 464 women between the ages of 50 and 64.5 years (mean age 57.4 years) underwent a total of 465 DXA tests during the 6-month study period and were included in the analysis (Table 1). Ethnicity was listed as Caucasian for 450 (96.7 %). A total of 278 (59.7 %) DXAs were ordered by primary care providers, and the remainder were ordered by specialty clinicians.

Table 1 Characteristics of study population

Using the criteria defined in this study, 371 (79.8 %) of the DXA tests were classified as meeting the pre-specified ordering criteria, and 94 (20.2 %) were classified as not meeting pre-specified ordering criteria. The mean age of women not meeting the pre-specified criteria for obtaining a DXA (55.4 ± 3.8 years) was significantly lower than that of women who did meet the pre-specified criteria (57.8 ± 3.8 years, p < 0.001). The proportion of DXAs not meeting our pre-specified criteria was greater in women who had never had a previous DXA (65/141 [46.1 %]) than in those with a prior DXA (29/324 [8.9 %], p < 0.001). Ordering provider type (staff/resident/mid-level provider) and BMI were not significantly associated with DXAs meeting the pre-specified criteria. A total of 22 (11.7 %) DXAs ordered by specialty clinicians were classified as not meeting pre-specified criteria for ordering compared with 72 (25.9 %) ordered by primary care clinicians (p < 0.001). A control P-chart, which is a standard statistical method in quality control, showed no temporal trend or special cause related to the proportion of DXAs not meeting pre-specified ordering criteria over the time frame evaluated.
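The two comparisons of proportions above can be checked approximately from the reported counts. The text does not state which statistical test was used, so the pooled two-proportion z-test below is an assumption made for illustration, and the function name is hypothetical. The specialty denominator of 187 is derived as 465 total DXAs minus the 278 ordered in primary care.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided pooled z-test for a difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts reported in the text: DXAs not meeting criteria / all DXAs in the group.
z_prior, p_prior = two_proportion_z_test(65, 141, 29, 324)  # no prior DXA vs. prior DXA
z_spec, p_spec = two_proportion_z_test(22, 187, 72, 278)    # specialty vs. primary care
```

Under this assumed test, both p-values fall below 0.001, in line with the values reported above.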

A total of 120 women (25.8 %) had a T-score of −2.5 or less, indicating osteoporosis at the femoral neck and/or lumbar spine. Of these, 51 and 96 had osteoporosis of the femoral neck and the lumbar spine, respectively (10.9 % and 20.6 % of the total study population). Of the 120 DXAs showing osteoporosis at the hip and/or spine based on T-score values of −2.5 or less, 13 (10.8 %) were classified as not meeting ordering criteria. Conversely, of the 94 DXAs that did not meet pre-specified ordering criteria, osteoporosis was present in 13 (13.8 %).

Of the 293 patients who met the criteria to use the FRAX tool for fracture risk prediction, 82 (27.9 %) had an estimated MOF risk of 9.3 % or greater prior to undergoing DXA (i.e., FRAX calculation performed without femoral neck density measurement). The sensitivity and specificity of a 10-year MOF risk estimate of 9.3 % or greater (calculated without including femoral neck bone density) for detecting osteoporosis of the femoral neck and/or lumbar spine were calculated for these 293 women, who met the height/weight criteria for FRAX calculation and had not taken osteoporosis medications. The overall sensitivity and specificity of a FRAX-calculated MOF risk ≥9.3 % were 37 and 74 %, respectively, for the detection of any osteoporosis of the hip or spine. The receiver operating characteristic (ROC) curve had an area under the curve of 0.58, indicating relatively poor performance of the USPSTF-recommended 9.3 % MOF risk threshold for the detection of osteoporosis by DXA (Fig. 1). In this population, lowering the FRAX-calculated MOF risk threshold to 5.5 % would increase the sensitivity of detecting osteoporosis to 80.4 % while reducing the specificity to 26.8 %.
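The trade-off described above, in which a lower risk threshold raises sensitivity at the cost of specificity, can be illustrated with a short sketch. The function name and the ten risk/outcome pairs are hypothetical and are not study data; they only show how the two measures are computed when a FRAX MOF risk at or above a threshold is treated as a positive test and a DXA T-score ≤ −2.5 as the reference standard.

```python
def sensitivity_specificity(risks, has_osteoporosis, threshold):
    """Treat risk >= threshold as test-positive and compare with DXA status.

    Assumes at least one woman with and one without osteoporosis.
    """
    tp = fn = tn = fp = 0
    for risk, diseased in zip(risks, has_osteoporosis):
        positive = risk >= threshold
        if diseased and positive:
            tp += 1
        elif diseased:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical FRAX MOF risks (percent) and DXA results, NOT study data.
risks = [4.0, 5.0, 6.0, 6.5, 7.0, 8.0, 9.5, 10.0, 12.0, 15.0]
osteoporosis = [False, False, True, False, True, False, False, True, False, True]

sens_93, spec_93 = sensitivity_specificity(risks, osteoporosis, 9.3)
sens_55, spec_55 = sensitivity_specificity(risks, osteoporosis, 5.5)
```

In this toy example, moving the threshold from 9.3 % to 5.5 % raises sensitivity from 2/4 to 4/4 and lowers specificity from 4/6 to 2/6, which mirrors the direction of the change reported for the study population.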

Fig. 1 Receiver operating characteristic curve for an estimated FRAX major osteoporotic fracture risk ≥9.3 % for discriminating between persons with and without BMD T-score ≤ −2.5. The diagonal line is tangent to the ROC curve at the point where sensitivity − (1 − specificity) is highest

Discussion

FRAX was developed to predict fracture risk in order to guide treatment decisions rather than to identify those likely to have osteoporosis by DXA, and it appears to perform poorly in this regard. Indeed, in our study population, about 11 % of women ultimately diagnosed with osteoporosis by bone density criteria did not meet our pre-specified criteria for testing, which included the 9.3 % MOF risk threshold. Although the goal of the Choosing Wisely campaign is to decrease the frequency of unnecessary testing, we found that the FRAX threshold of a 9.3 % 10-year probability of a major fracture as an indication to obtain a DXA in women ages 50–64 has a relatively low sensitivity for the detection of densitometrically defined osteoporosis. Underdiagnosis results in a missed opportunity for fracture prevention. Lowering the FRAX MOF risk threshold to 5.5 % would increase the sensitivity of detecting densitometrically defined osteoporosis in our population from 37 to 80 % while reducing the specificity from 74 to 27 %. Lowering the FRAX threshold to obtain a DXA in this age group would lead to increased DXA testing. However, in a disease like osteoporosis with clinically important morbidity and mortality, a screening test should favor sensitivity over specificity.

Other osteoporosis risk assessment instruments (Simple Calculated Osteoporosis Risk Estimation [SCORE], Osteoporosis Self-assessment Tool [OST], Osteoporosis Risk Assessment Instrument [ORAI], and Age Body Size No Estrogen [ABONE]) generally have much higher reported sensitivities (83–100 %) and lower specificities (10–47 %) in identifying osteoporosis in post-menopausal women compared to what we found using the FRAX-calculated MOF risk threshold of 9.3 % [10–13]. While the majority of these reports are not limited to women less than 65 years of age, most included women down to age 45 years [10–13].

In a comparison of FRAX, SCORE, and OST in women ages 50–64 years old, Crandall et al. found the current FRAX threshold to have a lower sensitivity than the SCORE and OST tools in this age group. When evaluating the FRAX MOF risk threshold of 9.3 % for detecting osteoporosis at the femoral neck, they found a sensitivity of 34.1 % and specificity of 85.8 % in this age group with the current threshold, which are similar to our observations. In their study population, they noted that a FRAX threshold of 5.04 % would improve the sensitivity of FRAX to detect osteoporosis in the femoral neck in this age group to 80.2 % while decreasing specificity to 40.9 % [14]. This threshold is close to the FRAX MOF risk threshold of 5.5 % in our study population that would lead to a sensitivity of 80.4 % and specificity of 26.8 % to detect osteoporosis of the femoral neck and/or lumbar spine.

In our study population, approximately one out of five DXAs was classified as not meeting our pre-defined criteria, derived in part from the USPSTF recommendations, to obtain a DXA. At our institution, we found that primary care providers were more likely than specialty clinicians to order DXAs that did not meet our pre-specified criteria for ordering. In our academic multi-specialty group practice, approximately 40 % of the DXAs in our study were ordered by non-primary care specialists. Most specialty clinicians ordered DXAs for specific indications that met our pre-specified criteria (namely after fragility fractures or to monitor women with osteopenia/osteoporosis, hyperparathyroidism, and those taking aromatase inhibitors), which may have led to this finding. Interestingly, however, approximately one in seven (13.8 %) of the DXAs that did not meet our pre-specified criteria showed osteoporosis, suggesting that providers likely are using criteria beyond the FRAX threshold to determine when to order DXA in this age group.

Our study had several limitations. It was limited to primarily Caucasian women in an academic setting, and our findings may not be generalizable to other populations. Given that Caucasian women have a greater risk of osteoporosis than women of other races, the calculated fracture risk with FRAX would generally be lower in women of other races with similar risk factors. Additionally, we did not evaluate the sensitivity or specificity of the 9.3 % FRAX MOF risk threshold to identify women with osteopenia who might still benefit from treatment (i.e., who had a 10-year probability of major osteoporotic fracture of ≥20 % or hip fracture of ≥3 % once their femoral neck bone density was incorporated into the FRAX score) [1]. Another limitation was the retrospective design, which included only women who had already undergone DXA screening. This design did not permit us to evaluate underutilization of DXA screening, which would represent a missed opportunity for fracture prevention. Interestingly, however, even in this population pre-selected by their providers, the sensitivity of the 9.3 % FRAX threshold for detection of osteoporosis was low.

An additional limitation is that we did not exclude women who had a previous DXA and, in fact, found that the proportion of inappropriate DXAs was lower in women who had had a previous DXA than in those undergoing their first DXA, suggesting that women with serial DXAs are undergoing them for definitive indications (such as known osteoporosis or osteopenia). Women in this age group with a normal bone density may be less likely to undergo subsequent DXAs prior to age 65 (i.e., a normal DXA in this age group is not followed up with repeat DXA in this age group). Last, we are cognizant that our pre-specified criteria may not include all clinical scenarios where a DXA may be indicated and appreciate that individual clinical judgment can take into account risk factors that might not have been included in our criteria nor documented in the medical record.

Conclusion

The USPSTF-recommended MOF risk threshold of 9.3 % for osteoporosis screening in this age group may be overly conservative, and a lower risk threshold could increase the sensitivity of detecting densitometrically defined osteoporosis in this age group. The role of a screening test is to detect silent disease in which early intervention can prevent a poor outcome. In this era of rising health-care costs and finite resources, primary care providers may benefit from tools and systems to identify risk factors that warrant osteoporosis screening in women younger than 65 years.