Introduction

Osteoporosis, defined operationally from bone mineral density (BMD T-score − 2.5 or lower) or clinically (from a typical low-trauma fracture), is highly prevalent among older individuals and those with predisposing health conditions [1]. BMD measured with dual-energy X-ray absorptiometry (DXA), a technology developed 30 years ago, robustly predicts major fractures, particularly those affecting the hip, and its performance is enhanced when combined with additional clinical risk factors [2, 3]. Once osteoporosis is identified, there are effective pharmacologic and non-pharmacologic interventions that can meaningfully reduce the likelihood of fractures, including hip fractures, in both primary and secondary prevention settings [4, 5].

Strategies on how to identify individuals at high fracture risk and target them for treatment have lagged behind other developments [6]. Secondary prevention is an obvious opportunity for case identification and intervention but has been a struggle to incorporate into clinical care systems. As a result, “usual care” following a low-trauma fracture (including hip and vertebral) rarely leads to evaluation and/or treatment of the osteoporosis that contributed to the fracture event [7]. There is increasing recognition that systems need to be created that facilitate case identification, investigation, and intervention [8]. Fracture liaison services (FLS) and other case management approaches have the potential to address the intractable post-fracture “care gap” [9, 10].

Although secondary prevention is an obvious priority, the “holy grail” is primary prevention at the population level. This review examines some of the opportunities and challenges related to primary prevention and screening for osteoporosis, with a focus on publications in the last 3 years. PubMed was searched from January 1, 2016, for English language papers specifically mentioning osteoporosis in conjunction with screening or primary prevention (reviews excluded), leading to the following narrative summary.

DXA vs. Risk-Based Screening

To date, most strategies have focused on targeted testing of individuals based upon clinical risk factors (most commonly age and sex), but these are relatively inefficient and have had limited uptake [11, 12•]. Although guidelines do not advocate universal DXA screening for women younger than age 65, rates of screening remain suboptimal even among older at-risk women. In an analysis of administrative claims data from 2008–2014 for over 1.6 million older women with no prior history of osteoporosis diagnosis or treatment, DXA screening rates were low: 26.5% and 12.8% among women aged 65–79 and 80+ years, respectively [12•]. In addition to the pronounced under-screening of older women, there were marked socioeconomic gradients in screening probabilities, although these decreased between 2008 and 2014.

Controversy continues regarding which approach is optimal: DXA- or risk-based screening strategies. DXA-based osteoporosis screening strategies have been the focus of cost-effectiveness modeling studies from Japan and China. Yoshimura et al. [13] performed a model-based cost-effectiveness study of osteoporosis screening and drug therapy for postmenopausal women in the Japanese health care system. Assuming a willingness to pay of $50,000 per quality-adjusted life year gained, the probability that screening was cost-effective improved with increasing age. The study concluded that screening and alendronate therapy for 5 years for persons with BMD < 70% of the young adult mean (current Japanese guideline, roughly equivalent to T-score − 2.56) would be cost-effective for postmenopausal Japanese women older than 60 years. Su et al. [14] studied the Mr. OS and Ms. OS Hong Kong cohort assuming 5 years of alendronate treatment for persons with T-score ≤ − 2.5 at the hip or spine. All screening strategies examined, including universal screening with DXA, pre-screening with FRAX, or quantitative ultrasound prior to DXA, were more cost-effective over a 10-year horizon than no screening for men and women aged ≥ 65 years.
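The willingness-to-pay comparison used in such modeling studies can be sketched in a few lines. This is a minimal illustration, not the model used in either study: the function names and the cost and QALY figures in the usage note are hypothetical, and only the $50,000 per QALY threshold comes from the text.

```python
def icer(cost_new: float, cost_ref: float, qaly_new: float, qaly_ref: float) -> float:
    """Incremental cost-effectiveness ratio: extra cost per extra QALY gained."""
    d_qaly = qaly_new - qaly_ref
    if d_qaly == 0:
        raise ValueError("ICER is undefined when the QALY difference is zero")
    return (cost_new - cost_ref) / d_qaly


def is_cost_effective(cost_new: float, cost_ref: float,
                      qaly_new: float, qaly_ref: float,
                      willingness_to_pay: float = 50_000.0) -> bool:
    """Decide via net monetary benefit, which avoids the sign ambiguity of a ratio.

    A strategy is judged cost-effective when the monetary value of the QALYs
    it gains exceeds its additional cost.
    """
    d_cost = cost_new - cost_ref
    d_qaly = qaly_new - qaly_ref
    return willingness_to_pay * d_qaly - d_cost > 0
```

For example, with hypothetical per-person values, icer(12_000, 10_000, 8.5, 8.0) gives 4,000 per QALY gained, which is_cost_effective judges acceptable at a $50,000 threshold. A strategy that costs more and yields fewer QALYs is rejected at any threshold.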

Recent data inform how often DXA testing is needed in younger postmenopausal women. Gourlay et al. [15••] performed a retrospective competing risk analysis of 54,280 postmenopausal women aged 50 to 64 years in the Women’s Health Initiative who had not taken osteoporosis therapy or experienced hip or clinical spine fracture. Among women with scores below the FRAX screening level at baseline, the study determined the time required for 10% of women to reach the USPSTF screening threshold (9.3% 10-year risk of MOF) or FRAX-based treatment thresholds (10-year risk of hip fracture 3% and/or 10-year risk of MOF 20%). Among women aged 50–54 years at baseline, so few women progressed to the USPSTF screening threshold that the time required to progress could not be calculated. Among women aged 55–59 years, 15.8 years were required to reach screening thresholds, and too few reached FRAX treatment thresholds to calculate. Even among women aged 60–64 years, time to reach the FRAX treatment threshold was considerable: 7.6 years.

Regarding risk-based strategies, two RCTs testing clinical fracture prediction tools were recently completed: the Screening for Osteoporosis in Older Women for the Prevention of Fracture (SCOOP) and Risk-Stratified Osteoporosis Strategy Evaluation Study (ROSE). Although both tested fracture risk-based screening strategies, SCOOP and ROSE had different approaches in that risk assessment was based on FRAX-predicted hip fracture risk in SCOOP versus FRAX-predicted MOF risk in ROSE.

The aim of SCOOP was to assess the effectiveness of a FRAX-based community screening program in reducing the incidence of fractures over a 5-year period [16••]. The SCOOP study design was a pragmatic, unblinded RCT of 12,483 women aged 70–85 years identified from 100 general practitioner practices in the United Kingdom. SCOOP compared an intervention (recommending BMD testing only for women with an elevated FRAX-predicted risk of hip fracture) with “standard care” (a letter documenting study participation). The definition of elevated hip fracture risk was based on age-specific thresholds ranging from 5.18% (age 70 years) to 8.39% (age 85 years). If predicted hip fracture risk was greater than the age-specific BMD testing threshold, predicted fracture risk was recalculated with BMD information, and treatment recommended if predicted hip fracture risk exceeded the treatment threshold. Overall, 14% of the intervention group met criteria for treatment. Although screening did not reduce the primary outcome (all osteoporosis-related fractures), there was a statistically and clinically significant 28% decrease in the secondary pre-specified endpoint of hip fractures (odds ratio intervention vs. control arm 0.72, 95% CI 0.59–0.89). The lack of demonstrated benefit for the primary outcome may have been influenced by healthy selection bias (low mortality rates based on the age distribution of participants). The SCOOP intervention was demonstrated to be highly cost-effective [17•]. The incremental cost per quality-adjusted life year gained for the intervention group vs. the control group was £2772, and the intervention arm prevented fractures at a cost of £4478 per osteoporosis-related fracture. The effect of the SCOOP intervention increased with baseline FRAX hip fracture probability [18•]. At the 10th percentile of baseline FRAX hip probability (i.e., 2.6%), hip fractures were not significantly reduced in the intervention vs. control groups, whereas at the 90th percentile (i.e., 16.6%), the intervention group (vs. control group) experienced a 33% reduction in hip fractures.

The ROSE trial also tested a community-based screening program based on predicted fracture risk in older women [19••]. ROSE enrolled a randomly selected sample of women aged 65–80 years in Denmark, all of whom received a questionnaire for fracture risk calculation. Ascertainment of fracture outcome and osteoporosis medication use was based on Danish Health Registry data. A two-step screening intervention program (BMD testing offered only if FRAX-predicted 10-year risk of MOF was ≥ 15%, with BMD results provided to the general practitioner and participant) was compared with control (questionnaire only, with routine care). In the pre-specified intention-to-treat analysis, there was no difference in the incidence of osteoporosis-related fractures (primary outcome) between the intervention (screening) and control groups (median follow-up 5 years). However, in a pre-specified per-protocol analysis performed among participants with FRAX MOF risk ≥ 15% in the intervention group (who received DXA scans) and control groups, there was a clinically and statistically significant reduction in MOF, hip fractures, and all fractures, with adjusted hazard ratios ranging from 0.74 to 0.89. Of note, only 48% of the intervention group and 25% of the control group actually underwent DXA measurement. Lower likelihood of participation in ROSE was associated with older age, living alone, lower education, low income, and higher comorbidity [20•]. Women with a previous fracture or a history of parental hip fracture were more likely to accept DXA screening, whereas higher alcohol consumption, older age, current smoking, and physical impairment were associated with dropping out when DXA was offered.

Comparison of Strategies for Osteoporosis Screening

Head-to-head comparisons of various osteoporosis screening strategies have recently been performed in postmenopausal women and in men. In postmenopausal women, the US and Canadian osteoporosis screening strategies were compared using data from the Women’s Health Initiative (n = 117,707 participants aged 50–79 years who provided 10-year follow-up for incident MOF) [21••]. Under the Canadian screening strategy, women aged 50–64 years are recommended for BMD testing if they have 1 or more clinical risk factors (fragility fracture after age 40, prolonged glucocorticoid use, parental hip fracture, aromatase inhibitor use, vertebral fracture, high alcohol intake, current smoking, low body weight, major weight loss, or other disorders strongly associated with osteoporosis). Under the USPSTF strategy, BMD testing is recommended for women aged 50–64 years who have FRAX-predicted risk of MOF ≥ 8.4% (using FRAX without BMD information). In women aged 50–64 years who subsequently experienced MOF, the Canadian screening strategy had higher sensitivity than the US strategy. For example, of women aged 50–54 years who experienced MOF, the Canadian screening strategy identified 54% to receive BMD testing, compared with 7% for the US (USPSTF) strategy. Sensitivity of both screening strategies increased with age. These results highlight that better screening algorithms are needed for women aged 50–64 years.
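The two referral rules compared above can be written as simple predicates, which makes the difference between them concrete. The sketch below is illustrative only: the Woman record and the function names are hypothetical, the risk-factor count stands in for the full Canadian list given in the text, and only the 8.4% threshold and the 50–64 age band come from the source.

```python
from dataclasses import dataclass

# From the text: USPSTF strategy refers women aged 50-64 with
# FRAX-predicted 10-year MOF risk (calculated without BMD) >= 8.4%.
USPSTF_MOF_THRESHOLD_PERCENT = 8.4


@dataclass
class Woman:
    age: int
    frax_mof_percent: float        # FRAX 10-year MOF risk without BMD, in percent
    n_clinical_risk_factors: int   # count of Canadian-guideline clinical risk factors


def _check_age(w: Woman) -> None:
    # This sketch covers only the 50-64 age band discussed in the text.
    if not 50 <= w.age <= 64:
        raise ValueError("rules sketched here apply to women aged 50-64 only")


def canadian_recommends_bmd(w: Woman) -> bool:
    """Canadian strategy: BMD testing if at least one clinical risk factor is present."""
    _check_age(w)
    return w.n_clinical_risk_factors >= 1


def uspstf_recommends_bmd(w: Woman) -> bool:
    """USPSTF strategy: BMD testing if FRAX MOF risk meets the 8.4% threshold."""
    _check_age(w)
    return w.frax_mof_percent >= USPSTF_MOF_THRESHOLD_PERCENT
```

A hypothetical 52-year-old current smoker with a FRAX MOF risk of 5% would be referred under the Canadian rule but not under the USPSTF rule, which is the pattern behind the sensitivity gap reported for the youngest age group.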

A comparison of the US (National Osteoporosis Foundation, NOF) and UK (National Osteoporosis Guideline Group, NOGG) treatment thresholds was recently performed [22••]. Expected (FRAX-predicted) and observed (incident) 10-year fractures were well-calibrated for both hip fractures and MOF. Femoral neck T-score and FRAX (with and without BMD) as continuous measures predicted hip fractures and MOF equally well at all ages. However, for identifying women who experienced MOF during the follow-up period, the sensitivity (positive predictive value) was 26% (24%) for femoral neck T-score ≤ − 2.5, 20% (26%) for FRAX (with BMD)-predicted 10-year MOF risk ≥ 20% (NOF threshold), 27% (22%) for FRAX-predicted 10-year MOF risk exceeding the age-dependent cutoff (NOGG threshold), 59% (19.0%) for the NOF treatment algorithm, and 29% (18%) for the NOGG treatment algorithm. Sensitivity of the various threshold-based approaches (NOF thresholds, NOGG thresholds, femoral neck BMD T-score ≤ − 2.5, FRAX-predicted MOF risk ≥ 20%, NOF algorithm, NOGG algorithm) for identifying incident MOF varied by age, ranging from 0 to 26% in women 40–49 years old and from 49 to 93% in women aged 80+. Because the sensitivity and positive predictive value of the strategies based on dichotomous cutoffs were low (especially in women aged 40–49 years), these results highlight that threshold-based approaches should be reassessed, particularly in younger postmenopausal women.
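Sensitivity and positive predictive value, reported in pairs above, answer different questions: sensitivity asks what share of women who fractured were flagged, whereas positive predictive value asks what share of flagged women went on to fracture. A minimal sketch follows; the counts in the usage note are hypothetical and were chosen only to reproduce figures of the same order as those reported for femoral neck T-score ≤ − 2.5.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of women with an incident fracture who were flagged by the rule."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no women with incident fracture in the sample")
    return true_pos / total


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Share of flagged women who went on to have an incident fracture."""
    total = true_pos + false_pos
    if total == 0:
        raise ValueError("no women were flagged by the rule")
    return true_pos / total
```

With hypothetical counts of 26 flagged women who fractured, 74 unflagged women who fractured, and 82 flagged women who did not fracture, sensitivity(26, 74) is 0.26 and positive_predictive_value(26, 82) is about 0.24. Both values are low at the same time, which is the situation described for the dichotomous cutoffs.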

Two recent studies examined fracture risk assessment tools and strategies in the Osteoporotic Fractures in Men (MrOS) study (ambulatory community-dwelling men aged ≥ 65 years not taking osteoporosis medication and without prior hip or spine fracture). The discriminative ability (area under the receiver operating characteristic curve, AUC) for incident fracture was similar for FRAX (with BMD information), the Garvan tool (with BMD information), age plus femoral neck BMD T-score, and femoral neck BMD T-score alone, with AUC values 0.72–0.77 for major osteoporotic fractures and 0.76–0.79 for hip fracture [23]. Therefore, in relatively healthy untreated older men without fragility fractures, age plus femoral neck BMD T-score accurately identifies men at risk for incident fracture. This may not apply to more complicated patient populations, such as those with a greater prevalence of risk factors. The Osteoporosis Self-Assessment Tool (OST) (AUC 0.68), which was originally constructed to detect low BMD, performed modestly better than the FRAX tool (without BMD information, AUC 0.62) in identifying older men with BMD T-score ≤ − 2.5 [24•]. Conversely, the FRAX tool, constructed to measure fracture risk, performed better than OST at identifying older men qualifying for treatment under NOF guidelines (AUC 0.79 versus 0.68, respectively). Both strategies reduced the proportion of men referred for BMD testing compared to universal screening.

Finally, a study compared the cost-effectiveness of osteoporosis screening strategies among Chinese persons aged 65 or over in Hong Kong [14]. The strategies compared were DXA screening, pre-screening prior to DXA (using FRAX thresholds or calcaneal ultrasound), and no screening over a 10-year horizon. Treatment was assumed to be 5-year treatment with alendronate in persons with hip or spine BMD T-score ≤ − 2.5. The study found that osteoporosis screening strategies based on DXA with or without pre-screening (by FRAX or calcaneal ultrasound) were each cost-effective compared with no screening.

Challenges in Osteoporosis Screening

Crandall et al. [25] examined the performance of the FRAX and Garvan fracture risk calculators in the large prospective observational cohort from the Women’s Health Initiative (63,723 postmenopausal women age 50–64 years at baseline). Incident fractures were observed over 10 years and were, as expected, quite low for this relatively young and healthy population. The area under the curve (AUC) for prediction of incident hip fracture was 0.62 (95% CI 0.59–0.65) for Garvan fracture risk calculator and 0.68 (95% CI 0.65–0.70) for the FRAX tool, both used without BMD inputs. Performance for prediction of incident MOF was even lower (Garvan 0.57, 95% CI 0.57–0.58, FRAX 0.58, 95% CI 0.57–0.59). These performance measures are significantly lower than have been obtained in older women. At a sensitivity threshold for hip fracture of 80%, specificity for the tools was < 50%. The authors concluded that for postmenopausal women age 50–64 years, neither FRAX nor the Garvan fracture risk calculator used without BMD provided good prediction of incident fractures during 10 years of follow-up, and that no useful threshold could be proposed for either tool.
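The two performance measures used above can be illustrated with a short sketch. The AUC is computed here as the probability that a randomly chosen woman who fractured has a higher predicted risk than a randomly chosen woman who did not, with ties counted as one half, and the second function finds the best specificity attainable while keeping sensitivity at or above a target such as 80%. The function names and the scores in the usage note are hypothetical, and this is not the analysis code used in the study.

```python
from typing import Sequence


def auc(case_scores: Sequence[float], noncase_scores: Sequence[float]) -> float:
    """Rank-based AUC: P(case score > non-case score), ties counted as 0.5."""
    if not case_scores or not noncase_scores:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in case_scores:
        for n in noncase_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_scores) * len(noncase_scores))


def specificity_at_sensitivity(case_scores: Sequence[float],
                               noncase_scores: Sequence[float],
                               target_sensitivity: float = 0.80) -> float:
    """Best specificity over thresholds t (positive if score >= t) with sensitivity >= target."""
    if not case_scores or not noncase_scores:
        raise ValueError("need at least one case and one non-case")
    best = 0.0
    # Sensitivity only changes at observed case scores, so those are the
    # only thresholds that need to be examined.
    for t in sorted(set(case_scores)):
        sens = sum(s >= t for s in case_scores) / len(case_scores)
        if sens >= target_sensitivity:
            spec = sum(s < t for s in noncase_scores) / len(noncase_scores)
            best = max(best, spec)
    return best
```

For hypothetical predicted risks of [0.9, 0.8, 0.4, 0.35, 0.3] among women who fractured and [0.7, 0.5, 0.2, 0.1, 0.05] among those who did not, the AUC is 0.76 and the best specificity at 80% sensitivity is 0.6. An AUC near 0.5, as reported for MOF in this cohort, means the tool ranks cases above non-cases only slightly more often than chance.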

Colón-Emeric [26] assessed the impact of BMD screening with DXA among older men (age 65–99 years) without prior fracture in a nationally representative sample from the U.S. Veterans Affairs. Men undergoing DXA screening (N = 153,311) were matched with controls (N = 390,258) using a propensity score for probability of DXA testing within the next year. During mean follow-up of 4.7 years (maximum 10 years), DXA screening was associated with higher (not lower) fracture risk in the overall cohort (HR 1.19, 95% CI 1.16–1.21), likely due to unmeasured risk factors and low rates of anti-osteoporosis medication initiation and adherence among those meeting treatment thresholds (12% of follow-up time). Among pre-specified subgroups there was evidence of lower fracture risk compared with the overall population: androgen deprivation therapy (HR 0.77, 95% CI 0.66–0.89), glucocorticoid users (HR 0.77, 95% CI 0.72–0.84), age 80 years and older (HR 0.85, 95% CI 0.81–0.90), one or more VA guideline risk factors (HR 0.91, 95% CI 0.87–0.95) and high FRAX score without BMD in the calculation (HR 0.90, 95% CI 0.86–0.95). These results can be viewed as “glass half full” since there is evidence that DXA screening was associated with reduced fracture risk in high-risk subgroups of men, and that there is an opportunity to address the overall population if treatment uptake and adherence can be improved. The “glass half empty” interpretation is obvious: treatment uptake and adherence are disappointing in the overall population, resulting in a failure of DXA screening to translate into lower fracture risk. Among screened men meeting National Osteoporosis Foundation treatment criteria, 43% did not initiate anti-osteoporosis medication during follow-up. In addition to the observational nature of the study with likely unmeasured confounders, the study was limited by cross-over among the controls, with 25,422 undergoing DXA testing and 1,617 initiating treatment. The authors rightly conclude with an appeal to administrators to create and implement system interventions to assist clinicians in identifying men at risk and to support treatment decision making and adherence.

Paradoxically, data from Manitoba, Canada, suggest that DXA testing itself may be a barrier to appropriate treatment initiation in some situations. Using a population-based BMD registry, 3735 women age 50 years and older who were not receiving anti-osteoporosis therapy and underwent BMD screening 2006–2015 were found to qualify for anti-osteoporosis treatment under the national guidelines [27•]. Treatment initiation in the subsequent year was only 50%, and was largely determined by the presence/absence of a BMD T-score in the osteoporotic range (adjusted OR for treatment 0.10, 95% CI 0.09–0.12 for osteopenic BMD, 0.02, 95% CI 0.01–0.04 for normal BMD compared with osteoporotic BMD). Disappointingly, prior major fracture was not a predictor of treatment initiation (OR 1.00, 95% CI 0.84–1.19) and there was no improvement over time despite guidelines highlighting the importance of treating high fracture risk rather than BMD alone (P = 0.294).

Similar data were recently reported from Sweden, highlighting the international nature of the failure to initiate treatment in high-risk individuals. Lorentzon et al. [28] identified a population-based cohort of older women age 75–80 years (mean age 77.8 years) living in Gothenburg, Sweden. Among the 2983 women with complete data, 1107 (37%) were eligible for treatment according to Swedish Osteoporosis Society guidelines, yet only 341 (21.8%) were actually receiving treatment. Equally low rates were seen for women qualifying for treatment under NOF guidelines (12.6%) or NOGG guidelines (15.5%). Predictors for receiving osteoporosis medication were glucocorticoid treatment (odds ratio 2.88, 95% CI 1.80–4.59) and prior fracture (2.58, 1.84–3.61).

From Missed Opportunities to Opportunistic CT

Although the primary focus of this review is on primary prevention and screening, it is important to highlight some recent developments in secondary prevention that have potential for population-wide implementation. Two recent systematic reviews and meta-analyses from Wu et al. [29, 30] examined the effectiveness and cost-effectiveness of Fracture Liaison Services, showing significant FLS-associated improvements in process of care (BMD testing increased 24%, treatment increased 20%, adherence improved 22%) and fracture outcomes (absolute risk of re-fracture reduced 5%, mortality reduced 3%), with FLS cost-effective in comparison with usual care or no treatment across a range of program designs and countries.

Unfortunately, in the absence of targeted interventions, there is still more gap than care following a fracture, a missed opportunity to intervene and prevent another fracture. This is particularly evident for individuals with vertebral fractures, most of which do not come to clinical attention, as was highlighted many years ago by Gehlbach [31]. An updated analysis from the hospital setting examined 2933 patients aged 50 years or older presenting to an Emergency Department with new vertebral fracture (2008–2014) [32]. Remarkably, 98% did not undergo DXA testing in the 2 years before or the year after fracture; 21% had been taking antiresorptive medication before the fracture but only 7% were started on antiresorptive therapy after the fracture; 38% developed a new vertebral fracture within the next 2 years.

Use of previously acquired CT imaging for opportunistic screening, first proposed in 2013, is gaining strength [33, 34]. Initial procedures were operator-dependent and required in-plane phantom calibration, but there is now the promise of a fully automated tool for prospective or retrospective opportunistic assessment that can also monitor BMD. Pickhardt et al. [35] retrospectively applied their software to non-contrast abdominal CT scans in 1603 consecutive asymptomatic adults undergoing longitudinal screening (mean interval, 5.7 years). Successful segmentation and BMD estimation (based upon L1-L2 CT attenuation values) were achieved in 99.8%, with only four failed cases. The generalizability of opportunistic CT screening to other populations was suggested in a study from China. Among 109 patients who underwent both abdominal CT and DXA within 6 months, CT attenuation ≤ 136 HU gave a positive predictive value of 81.2% with AUC 0.86 for diagnosis of osteoporosis from DXA [36]. Another group reported lower performance. Among 302 patients (mean age 57.9), diagnosis of osteoporosis from DXA using vertebral CT attenuation (L1 or CT) measured on examinations of the chest or abdomen gave a maximal AUC of 0.74 (95% CI 0.68–0.80), with corresponding sensitivity 62% (51–72%) and specificity 79% (74–84%) [37].

The feasibility of using machine learning methods to automate vertebral fracture identification from previously acquired CT images is particularly attractive. An algorithm was developed and internally validated using three processes: spinal column segmentation and sagittal patch extraction; binary classification using a Convolutional Neural Network (CNN); prediction of whether a vertebral fracture is present in the series of patches using a Recurrent Neural Network (RNN) [38]. After training, the algorithm achieved 89.1% accuracy, 83.9% sensitivity, and 93.8% specificity. A workflow has been developed whereby the algorithm can operate in the background on de-identified studies received from a hospital’s picture archiving and communication system (PACS), alerting the facility when a vertebral fracture may be present for scan review and further action as required (https://www.zebra-med.com/solutions/bone/).

Administrative Screening

As noted earlier, the SCOOP and ROSE trials provide support for the potential usefulness of a strategy based upon risk prediction tools for population-based screening with selective use of DXA [16••, 17•, 18•, 19••]. However, these approaches are still dependent upon collection of clinical risk factors at the individual level. This is problematic since non-respondents are actually individuals at higher fracture risk, and there is potential recall bias in some of the clinical information required. Therefore, the ability to use administrative data for passive screening of the population is attractive. This has been addressed in several recent studies. A study from the U.S. found that electronic records and manual risk factor collection gave comparable FRAX risk estimates [39]. Insurance data were used in Israel to compute FRAX and Garvan scores, and these were shown to be predictive of fracture outcomes, similar to conventional cohort studies [40, 41]. Similar results were seen from Germany using claims-based risk estimation [42]. A BMD registry for Manitoba, Canada showed that additional risk factors could be used to complement some of the missing FRAX data at the population level and achieve a satisfactory performance of “administrative FRAX” for risk assessment [43]. Finally, the group from Denmark has looked at the full range of ICD codes for creating a risk assessment tool called FREM (forward in Danish) [44••].

Individuals living in nursing homes are a unique and often overlooked group who are at high fracture risk for multiple reasons including age, impaired mobility, falls, dementia, and other comorbidities. Ten-year outcomes are not relevant in this population where life expectancy is limited but where serious fractures can have a major adverse effect on quality and quantity of life. Two groups have developed risk assessment methods for this population that can be applied to routinely collected Minimum Data Sets (MDS) [45, 46]. Berry et al. [46] developed the Fracture Risk Assessment in Long-term Care (FRAiL) instrument to predict the 2-year risk of hip fracture in nursing home residents using the Minimum Data Set and Part D claims (derivation cohort 419,668 residents in fee-for-service Medicare). During 1.8 years mean follow-up, 14,553 residents (3.5%) experienced a hip fracture. Characteristics in the final model associated with hip fracture included dementia severity, ability to transfer and walk independently, prior falls, wandering, and diabetes. The concordance index in the derivation sample was 0.69 in men and 0.71 in women with similar performance in internal and external validation samples. The Canadian Fracture Risk Scale (FRS) was developed to assess 1-year incident hip or all clinical fractures [45]. The overall discriminative properties of the FRS were similar between three different provinces (c-statistics from 0.644 to 0.673).

Conclusions

Despite the high prevalence and health impact of osteoporosis, screening and treatment rates remain low with fewer than 1 in 4 privately insured women age 65 years and older utilizing osteoporosis screening for primary prevention [12•]. A recent systematic assessment of the quality and content of 33 osteoporosis screening guidelines published between 2002 and 2016 identified variable quality in their recommendations [47]. Authors called for guideline developers to work together to improve the quality and consistency of recommendations in order to improve the likelihood that guidelines will be used in practice. Notably, several high-quality osteoporosis guidelines were identified in a systematic assessment of 421 clinical practice guidelines for the management of common diseases in primary care [48•].

A recent systematic review and meta-analysis identified 43 clinical trials evaluating the efficacy of osteoporosis quality improvement strategies [49•]. Most studies examined patients with recent or prior fracture, and meta-analyses identified several effective strategies for improving DXA testing and/or osteoporosis treatment. For populations that included those without prior fracture, the only quality improvement strategy for which meta-analysis demonstrated significant improvement in osteoporosis care was patient self-scheduling of DXA plus education (increased BMD testing, risk difference 13%, 95% CI 7–18%). Unfortunately, meta-analyses found no significant impact on osteoporosis treatment from the following strategies: multifaceted intervention targeting providers and patients, patient education and/or activation, or pharmacist-initiated screening. Much more work is needed to develop and validate effective primary screening and prevention strategies, and to translate these into high-quality guidelines.