Introduction

Osteoporosis (OP) and its most important clinical consequence, hip fracture, affect a disproportionate number of women compared to men, with menopause being a major risk factor [1]. Whether premature, natural, or iatrogenic, the abrupt decline in estrogen levels at menopause accelerates bone loss, such that about a quarter of women >65 years have OP at the lumbar spine or femoral neck [2]. Almost half of all postmenopausal women will have an osteoporosis-related fracture during their lifetime, making the resulting disability a public health priority.

Testing for OP by dual energy X-ray absorptiometry (DXA), and treating the women identified, can decrease fracture risk [3]. The main drawbacks of routine DXA screening are cost and accessibility [4]. Current clinical guidelines issued by USA, UK, and European health authorities [5, 6] recommend population-based DXA screening for women ≥65 years. For women aged 50 to 64 years, DXA is recommended if the Fracture Risk Assessment Tool (FRAX®) score is above a 9.3% 10-year risk for major osteoporotic fracture (MOF) or if risk factors are identified [7]. This threshold value is equivalent to that of a 65-year-old white woman with no other FRAX® clinical risk factors and thus may not be generalizable to non-Caucasian populations.

Prior to FRAX®, a number of risk assessment tools were available to identify women with low bone mineral density (BMD) and/or estimate risk of fracture: ABONE (age, body size, no estrogen); ORAI (OP risk assessment instrument); OST (OP self-assessment tool); SCORE (simple calculated OP risk estimation tool); SOF (study of OP fractures-based screening tool); and OSTA (OP-screening tool for Asians) [8]. The OSTA includes age and weight and was developed about 15 years ago to identify Asian women aged 45–89 years at risk for OP [9]. Its use to triage high-risk subjects for BMD testing is recommended in Asian countries including Singapore [10]. Although OSTA generally performed as well as, or better than, more complex instruments, with sensitivity for OP detection approaching 90%, there is always a trade-off in specificity [11]. A systematic review of studies on available risk prediction tools reported considerable heterogeneity and low methodological quality. It recommended restricting OSTA to women ≥65 years [12]. Overall, it is acknowledged that there is a need for more evidence-based clinical recommendations regarding DXA for women less than 65 years.

It has been estimated that obstetric and gynecological variables could account for up to 24% of BMD variance [13]. Besides menopausal vasomotor symptoms [14], age at menarche, time since last period, weight, pregnancy, and hysterectomy influence BMD [13]. Reproductive conditions such as hypothalamic hypogonadism increase osteoporotic risk [15]. Demographic factors affecting osteoporotic risk include ethnicity, marital status, family incomes, housing type, and educational levels [16]. The shift from traditional to contemporary dietary patterns has increased rates of obesity, diabetes, and hypertension worldwide which along with lifestyle choices, such as smoking, alcohol consumption, sedentary lifestyle, lack of physical activity, and sleep deprivation, influence an individual’s bone health [17]. There is evidence that simple assessment tools that measure physical performance such as the short physical performance battery (SPPB) can predict OP and fracture risk [18]. Despite the link between these conditions and OP, there is a paucity of studies on factors that correlate with OP risk in middle-aged Asian women.

Our study aimed firstly to identify novel correlates associated with low spinal BMD in mid-life women. To do this, we examined a large number of lifestyle, medical, and performance measurements to describe a wide range of characteristics in mid-life Singaporean women that correlate with low bone mass (T-score between −1 and −2.5) and OP (T-score ≤ −2.5). Secondly, we explored the feasibility of incorporating any new independent variables identified into a prediction model for triaging mid-life women to BMD scanning.

Methods

In order to identify novel correlates associated with low spinal BMD in mid-life women, we recruited women attending gynecology clinics at the National University Hospital (NUH), Singapore. A large number of lifestyle and medical factors, biophysical and performance measurements, were collected in order to identify independent factors correlated with low bone mass (T-score between −1 and −2.5) and OP (T-score ≤ −2.5).
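The outcome categories used throughout can be written as a small classification rule. The following is a minimal sketch in Python, not the authors' code: the function name `bmd_category` is hypothetical, and the boundary handling follows the definitions stated in the Statistical data analysis section (normal: T-score ≥ −1.0; OP: T-score ≤ −2.5).

```python
def bmd_category(t_score: float) -> str:
    """Classify a lumbar spine T-score into the three outcome groups.

    Boundaries as stated in the text: normal is T-score >= -1.0,
    osteoporosis is T-score <= -2.5, low bone mass lies in between.
    """
    if t_score <= -2.5:
        return "osteoporosis"
    if t_score < -1.0:
        return "low bone mass"
    return "normal"


if __name__ == "__main__":
    for t in (-0.4, -1.0, -1.8, -2.5, -3.1):
        print(t, bmd_category(t))
```

The two boundary values are assigned to the outer categories, so a T-score of exactly −1.0 counts as normal and exactly −2.5 counts as OP.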

Study design and subjects

The Integrated Women’s Health Program (IWHP) is a cross-sectional study of women aged 45–69 years attending gynecology clinics at NUH. Women attending well-women checks or management of non-cancer gynecological symptoms, including menopause, were recruited. Eligibility criteria included (1) female aged between 45 and 69 years, (2) willingness to follow study procedures, (3) willingness to provide a blood sample, and (4) ability to understand and sign an informed consent. Women with (1) a terminal or life-threatening condition, (2) pregnancy, or (3) low literacy were excluded. Recruitment was targeted at Chinese, Malay, and Indian women to represent the main ethnic groups in Singapore. Other ethnicities were excluded. Pre-study workshops were conducted by experienced investigators (SS, JAC) to train research personnel on the use of assessment tools. Information on ethnicity, age, and reason for refusal was requested from those who declined to participate for non-response analysis. The protocol was approved by the Domain Specific Review Board of National Healthcare Group, Singapore, and all participants gave written informed consent.

Questionnaire of validated instruments

The questionnaire was available in English and Chinese, the major languages used by Singaporeans. Literacy (on a 0–7 scale) was first assessed using a validated health literacy tool, the Rapid Estimate of Adult Literacy in Medicine-Short Form (REALM-SF) [19]. Subjects with low literacy scores (≤3) were excluded. The self-reported questionnaire totaled 281 items including demographic characteristics, reproductive history, medical history, and alcohol and smoking history adapted from the Mobility and Independent Living in Elders Study (MILES) [20]. Based on menstrual cycle characteristics, participants were classified as postmenopausal if they reported 12 consecutive months of amenorrhea. For hysterectomy with bilateral oophorectomy, the date of surgery was considered the date of menopause. For hysterectomy with unilateral oophorectomy, menopausal categorization was based on age at recruitment: those 49 years and older (the average age of menopause in Singapore [21]) were classified as postmenopausal. Internationally validated self-reported questionnaires covering all aspects of health were used and scored as per original author instructions. For biological issues, the Menopause Rating Scale identified and evaluated menopausal symptoms/complaints and severity based on the number of days experienced at the time of study visit [22]. The Pelvic Floor Distress Inventory Short Form 20 scored urinary, fecal, and pelvic organ prolapse distress over the last 3 months with the summary score proportionate to the impact of pelvic floor dysfunction on quality of life [23]. The Female Sexual Function Index (FSFI) assessed six domains (desire, arousal, lubrication, orgasm, satisfaction, and pain) over the previous 4 weeks [24]. For physical function, the WHO Disability Assessment Schedule (WHODAS 2.0) assessed health and disability over six domains (cognition, mobility, self-care, interaction, life activities, and participation) over the past 30 days [25].
The WHO Global Physical Activity Questionnaire (GPAQ) surveyed occupational, transport-related, and recreational physical activity in terms of intensity, duration, and frequency in a typical week [26]. For mental health, the Center for Epidemiologic Studies Depression Scale (CES-D 20) screened for depression and depressive disorder in the past week, using symptoms defined by the American Psychiatric Association Diagnostic and Statistical Manual [27]. The Generalized Anxiety Disorder scale (GAD-7) assessed symptoms of anxiety over the last 2 weeks [28]. The Pittsburgh Sleep Quality Index (PSQI) assessed sleep quality over 1 month [29]. Medication consumption was inventoried by asking the women to bring in all medication and supplements taken in the past 2 weeks.
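The menopausal classification rules described above can be summarized as a short decision function. This is an illustrative sketch only: the function and parameter names are hypothetical, and the case of hysterectomy with both ovaries conserved is not covered by the stated rules, so the sketch refuses to classify it rather than guess.

```python
MEAN_MENOPAUSE_AGE = 49  # years; average age of menopause in Singapore, as cited in the text


def is_postmenopausal(age_years: float,
                      months_amenorrhea: int,
                      hysterectomy: bool = False,
                      ovaries_removed: int = 0) -> bool:
    """Apply the classification rules described in the questionnaire section.

    ovaries_removed is 0, 1 (unilateral), or 2 (bilateral oophorectomy).
    """
    if ovaries_removed not in (0, 1, 2):
        raise ValueError("ovaries_removed must be 0, 1, or 2")
    if hysterectomy:
        if ovaries_removed == 2:
            # Date of surgery is taken as the date of menopause.
            return True
        if ovaries_removed == 1:
            # Categorized by age at recruitment.
            return age_years >= MEAN_MENOPAUSE_AGE
        # Both ovaries conserved: no rule is given in the text.
        raise ValueError("hysterectomy with ovarian conservation is not covered")
    # No hysterectomy: 12 consecutive months of amenorrhea.
    return months_amenorrhea >= 12


if __name__ == "__main__":
    print(is_postmenopausal(52, 14))           # natural menopause
    print(is_postmenopausal(47, 0, True, 2))   # bilateral oophorectomy
    print(is_postmenopausal(48, 0, True, 1))   # unilateral, under 49
```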

Biophysical measurements

All measures followed strict protocols (available on request). Height and body weight were measured using a SECA 769 Electronic Measuring Station. Waist and hip circumferences were measured up to four times with average values calculated. Body mass index (BMI) was computed as body weight divided by height squared (kg/m²). Arm circumference was measured to guide cuff size for blood pressure measurements. The average blood pressure was calculated from three measurements using an OMRON IntelliSense (HEM-7211).
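As a worked illustration of the BMI computation, the sketch below applies the formula to hypothetical values (a 60-kg woman who is 1.60 m tall); the function name and the example figures are assumptions for illustration and are not study data.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """BMI in kg/m^2: body weight divided by height squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


if __name__ == "__main__":
    # Hypothetical example values, not taken from the study.
    print(round(body_mass_index(60.0, 1.60), 1))
```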

Physical performance measures

The SPPB (chair stand, repeated chair stand, semi-tandem stand, tandem stand, one-leg stand, and 6-m balance walks) was performed according to standard methodologies [30]. Briefly, lower extremity strength was measured by the repeated chair stand test (the time taken to stand up from a seated position five times as quickly as possible, without using the arms). Static balance was assessed by the semi-tandem stand (the side of the heel of one foot touching the big toe of the other foot), tandem stand (the heel of one foot in front of and touching the toes of the other foot), and one-leg stand (standing on the participant’s preferred leg) tests for 30 s. To measure usual walking speed, participants were asked to walk along a 6-m course at their normal pace, with speed expressed as meters per second. To assess balance, subjects were asked to walk along the same course within two lines that were 20 cm apart. A hand dynamometer (Jamar) was used to measure grip strength, calculated as the average of two trials for each hand (right and left).

Bone mineral density measurement

Participants underwent DXA (Hologic Discovery Wi, Apex software 4.5) scanning for lumbar spine BMD. Quality control assessments were performed by standard protocols according to manufacturer’s instructions. Daily calibration of machine was performed using the Hologic spine and phantoms before commencing scanning. The reproducibility (%CV) of the phantom scans for the lumbar spine BMD was 0.35%. Our primary outcome measure was spinal BMD, chosen because longitudinal studies have reported that spinal bone loss precedes loss in the hip [19], potentially identifying an earlier site for screening in younger women.

Statistical data analysis

The wide range of variables obtained was examined for their correlation with the pre-specified outcome measures: normal [T-score ≥ −1.0], low bone mass [T-score between −1 and −2.5], and OP [T-score ≤ −2.5] at the lumbar spine using the Singaporean reference database. As rates for OP in Singaporean women were not available, the sample size was based on a rate of 6%, equivalent to the rate of osteoporosis reported in the Japanese population [31]. We chose to use the Japanese population, as it is one of the well-studied osteoporotic populations in Asia with similar rates for gross domestic product, urban living, unemployment, birth, adult obesity, educational attainment, and life expectancy [32]. Since the overall missing data rate was low (10.15%), no data imputation was performed. Demographic characteristics (age and ethnicity) of participants and non-participants were compared by two-sample t test to assess possible bias between respondents and non-respondents of the recruited sample. Comparisons of baseline characteristics across the three T-score categories (normal, low bone mass, and OP) were performed using chi-square tests and likelihood ratio tests for categorical variables and one-way ANOVA for continuous variables. To identify putative variables associated with the outcome measures, univariate multinomial logistic regression was carried out in the variable selection stage. Variables identified with a p value of ≤0.1 were further considered for inclusion in the multivariate multinomial logistic regression model in order not to miss any potentially important correlates. Multivariate stepwise multinomial regression using backward elimination was carried out to evaluate the independent significant variables for the pre-specified outcomes. The variables identified were added to FRAX® (FRAX®plus) to determine if this addition improved FRAX®’s predictive ability.
Nonparametric receiver-operating-characteristic (ROC) analyses were performed to calculate the area under the ROC curves, and nonparametric ROC analyses with contrast matrix C were applied to compare the area under curve (AUC) values using the ‘c’ statistic. Statistical significance was at p ≤ 0.05 in the final fitted models. To choose among competing models, the preferred final model was selected based on the log likelihood ratio test and clinical relevance. The effect size measurements were presented as RRR (relative risk ratio) with 95% CI per one SD change.
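For readers unfamiliar with the nonparametric AUC, it equals the probability that a randomly chosen affected subject receives a higher predicted score than a randomly chosen unaffected subject, with ties counted as one half (the Mann-Whitney formulation). The sketch below illustrates that estimator in Python with made-up scores; the analyses in this study were run in STATA, so this is not the authors' code, and the function name is hypothetical.

```python
from typing import Sequence


def auc_mann_whitney(scores_pos: Sequence[float],
                     scores_neg: Sequence[float]) -> float:
    """Nonparametric AUC: share of (positive, negative) pairs in which the
    positive subject has the higher score, counting ties as one half."""
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


if __name__ == "__main__":
    # Made-up predicted probabilities, for illustration only.
    with_op = [0.8, 0.4]
    without_op = [0.6, 0.2]
    print(auc_mann_whitney(with_op, without_op))
```

The pairwise loop is quadratic in sample size, which is acceptable for a cohort of a few hundred subjects; rank-based formulas give the same value more efficiently.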

The predicted probability for individual subjects was generated based on the fitted model using STATA post-estimation commands. An OSTA score for each subject was derived according to the OSTA formula: 0.2 × [weight (kg) − age (years)] [9]. Risk for MOF without BMD was generated for individual subjects using FRAX® in which age and weight were excluded in the model, as they are already taken into account. ROC analyses were performed to compare the AUC characteristics of our final fitted model with OSTA and FRAX®. ROC curves were constructed using FRAX® cutoff scores of 9.3 and 6.4%, representing the MOF risk of a 65-year-old woman with no other FRAX® clinical risk factors in white American and Chinese Singaporean populations, respectively. All the above analyses were carried out using STATA version 14.0 (StataCorp, College Station, Tex).
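The OSTA formula quoted above is simple enough to compute by hand or in a few lines of code. The following sketch applies it to a hypothetical subject (50 kg, 65 years); the function name and the example values are illustrative assumptions, and no risk cutoffs are applied because none are stated in this section.

```python
def osta_score(weight_kg: float, age_years: float) -> float:
    """OSTA score as given in the text: 0.2 x (weight in kg - age in years)."""
    return 0.2 * (weight_kg - age_years)


if __name__ == "__main__":
    # Hypothetical subject, not study data: 50 kg, 65 years old.
    print(round(osta_score(50.0, 65.0), 1))
```

Lower (more negative) scores arise in older, lighter women, which is the direction of risk described for age and body weight elsewhere in this paper.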

Results

In order to examine factors associated with low bone mass and osteoporosis in mid-life women, 1221 women attending the gynecological clinics were approached over a period of about 1 year, between August 2014 and September 2015 (Fig. 1). Of these, 402 declined to participate, and 307 women did not meet eligibility criteria. Two hundred and forty-two decliners volunteered to give information on their age and ethnicity. Compared to those that declined, our participants were more likely to be of Indian ethnicity, while the other two ethnic groups were equally distributed. In terms of age, our participants were similar, with a mean age ± SD of 57 ± 6.0 years compared to 58 ± 6.4 years for the non-participants. The main reasons given for refusing to participate were “not interested in study” (32%), “not interested in scan” (29%), and “time constraints” (29%). The analytic sample comprised 512 (56%) women who completed the study. Study participant characteristics, demographics, reproductive/medical history, physical assessments, and their distributions with respect to lumbar spine BMD categories are presented in Table 1, together with p values for trend.
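The recruitment figures above can be checked with simple arithmetic. In the sketch below, the 56% participation figure is reproduced under the assumption that its denominator is the women who either completed the study or declined (that is, excluding those found ineligible); this denominator is an inference from the numbers and is not stated explicitly in the text.

```python
approached = 1221
declined = 402
ineligible = 307

# Women remaining after removing decliners and those not meeting criteria.
completed = approached - declined - ineligible

# Assumed denominator: completers plus decliners (ineligible women excluded).
participation_rate = completed / (completed + declined)

if __name__ == "__main__":
    print(completed)
    print(f"{participation_rate:.1%}")
```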

Fig. 1 Participant recruitment flowchart

Table 1 Characteristics of participants, stratified according to bone mineral density at spine

Regarding menopausal status and hysterectomy, hysterectomy with bilateral oophorectomy was reported by 34 (6.6%) women, with date of menopause equivalent to date of surgery. Twenty-one subjects (4.1%) had hysterectomy with unilateral oophorectomy. All were older than 49 years at the time of assessment and grouped into the postmenopausal category. There were no cases of hysterectomy with conservation of both ovaries.

Demographic factors and relationship to BMD T-scores

The mean age ± SD was 57 ± 6.0 years. Women with low bone mass and OP were on average 2 and 6 years older, respectively, than those with normal BMD. Chinese ethnicity was associated with a trend towards low BMD: 97% of osteoporotic subjects were Chinese, compared to 86% of the overall cohort (p for trend 0.011) (Table 1). Lower educational level (secondary or below) was also associated with low BMD. No significant relationship was observed with respect to marital or pregnancy status, number of children, or household income.

Reproductive/medical history and BMD T-scores

The mean age for menopause was 49.7 ± 4.3 years, and 75% were postmenopausal. Menopause was correlated with low BMD (Table 1). The presence of moderate to severe hot flushes was significantly associated with normal BMD, while chronic joint pain (pain/stiffness not related to injury around all joints for more than a month) and knee pain for more than a month were significantly related to low BMD. Chronic joint pain was the most common symptom reported by over a third of our women (37.5%) with hot flushes and vaginal dryness reported by 20.8 and 23.0%, respectively. Other significant correlates observed were history of fainting spells, diabetes, previous fracture, and parental hip fracture. There was no association with recognized factors associated with OP such as history of liver disease, rheumatoid arthritis, current smoking, alcohol consumption, use of hormonal medications (estrogens, corticosteroids, and thyroid), or calcium supplementation.

Physical performance and BMD T-scores

Body weight and some physical assessments were significantly associated with T-scores (Table 1). Mean weight, body mass index, and waist circumference decreased significantly in parallel with decreasing T-score (Table 1). Slower walking speeds, both unconfined and restricted, were significantly associated with low BMD. Similarly, right hand grip strength declined significantly in parallel with low BMD.

Factors associated with low bone mass (T-score between −1.0 and −2.5)

The prevalence of low bone mass was 63.7%. Increasing age and postmenopausal status correlated with poorer bone mass, whereas having moderate to very severe hot flushes, higher weight, higher body mass index, greater waist circumference, faster unconfined or restricted walking speeds, and increased right hand grip strength correlated with better bone mass (Table 2). In multivariate stepwise multinomial regression analysis, body weight (adjusted RRR per SD, 0.47, 95% CI, 0.37–0.60) and postmenopausal status (adjusted RRR, 1.87, 95% CI, 1.01–3.48) were independently associated with low bone mass (Table 3).

Table 2 Variables associated with low BMD at spine, identified by univariate multinomial logistic regression analysis
Table 3 Relative risk ratios (RRR) of independent variables associated with low BMD at spine, identified by multivariate stepwise multinomial logistic regression analysis

Factors associated with OP (T-score ≤ −2.5)

The prevalence of OP at lumbar spine was 6.8%. Univariate analysis indicated that increasing age, postmenopausal status, rheumatoid arthritis, chronic joint pain, and fainting spells correlated with higher relative risk of OP when compared to reference group (normal [T-score ≥ −1.0]); whereas having higher education, moderate to very severe hot flushes, breast self-examination, higher weight, higher body mass index, greater waist circumference, and increased right hand grip strength correlated with a lower relative risk of OP (Table 2). Chronic joint pain was highly associated with all three symptoms of knee pain, rheumatoid arthritis, and osteoarthritis (p < 0.001). However, only chronic joint pain emerged as the independent predictor of OP. For some variables (ethnicity, parental hip fracture, and balance walk speed), relative risk ratio estimates were not possible because of the small subject numbers affected (Table 2). Multivariate stepwise multinomial regression analysis indicated that increasing age, postmenopausal status, chronic joint pain, lower body weight, and reduced right hand grip were independently associated with OP at the spine (Table 3).

Final multivariate model for spinal OP

For our final model comprising increasing age, postmenopausal status, chronic joint pain, lower body weight, and reduced right hand grip, the AUC was 84% (95% CI, 77.93–90.29%) for prediction of OP at the spine (Table 4). In comparison, the AUC for OSTA was 79% (95% CI, 71.81–85.31%) (‘c’ statistic, p = 0.02) (Fig. 2a). The FRAX® tool had an AUC of 58% (95% CI, 50–67%), whether using fracture risk cutoffs set at 9.3% [7] or 6.4% (Singapore Chinese aged 65 with no other FRAX® clinical risk factors) (‘c’ statistic, p = 0.0001) (Fig. 2b). Adding menopausal status, chronic joint pain, and right hand grip strength to the FRAX® tool resulted in a FRAX®plus tool with an improved AUC of 76% (95% CI, 68–84%) for a fracture risk cutoff of 9.3%, and 67% (95% CI, 58–70%) for a cutoff of 6.4% (Fig. 2b).

Table 4 Comparison of osteoporotic risk prediction models
Fig. 2 Comparison of area under curves (AUCs) of final fitted model, OSTA, and FRAX® models. a New screening tool model comprising right handgrip strength, weight, age, postmenopausal status, and the presence of chronic joint pain. b Osteoporosis self-assessment tool for Asian (OSTA) comprising age and weight. c FRAX® score plus menopausal status, chronic joint pain, and right hand grip strength. d Fracture risk assessment tool (FRAX®) score comprising age, sex, weight, height, previous fracture, parents’ hip fracture, smoking, glucocorticoid treatment, presence of rheumatoid arthritis, secondary osteoporosis, and alcohol intake of 3 or more units/day

Discussion

Whereas DXA screening of women ≥65 years for OP is accepted as best clinical practice [33], uncertainty prevails as to how to predict risk for OP in younger mid-life women. In our prospective cross-sectional study, analysis of a large number of lifestyle and medical variables and biophysical and performance measurements identified chronic joint pain and handgrip strength as novel independent correlates of risk for OP at the spine in Singaporean women. These two factors together with age, body weight, and postmenopausal status were incorporated into a prediction model for triaging mid-life women to BMD scanning. Our final fitted model’s AUC for predicting OP was significantly higher than those of OSTA and FRAX®, indicating its potential utility in younger Asian women if our results are validated in larger cohorts.

Chronic joint pain was the most commonly reported menopausal symptom in our Asian women. The clinical significance of chronic joint pain in menopause has not been universally appreciated in clinical dogma, despite numerous studies consistently reporting muscle stiffness and joint pain as the top complaint in Singaporean [21], British [34], Japanese [35], Indian [36], Bangladeshi [37], Saudi Arabian [38], Turkish [39], and Latin American [40] climacteric women. In the USA, the prevalence of chronic back pain has been reported to increase steadily after menopause, a pattern not observed in men [41]. Evidence from the Women’s Health Initiative study indicates that estrogen-alone therapy in postmenopausal women results in a modest but sustained reduction in the frequency of joint pain [42], and over the last decade, increasing links between pain neuropeptides and pathological processes in OP and bone remodeling have been reported [43]. As chronic pain can be associated with a number of medical comorbidities, these were added to the model. None were significant confounders. It is not clear whether the pain directly relates to OP or whether it is a proxy for Vitamin D deficiency, as the latter is common in mid-life Asian women [44].

Hot flushes affected 20.8% of our women and were associated with reduced risk of low bone mass and OP. This observation is consistent with a study indicating that hot flush severity was associated with higher baseline bone density [45]. Others, however, have reported the opposite [14], perhaps reflecting racial differences, which may limit adoption of recommendations based on a single ethnic cohort.

Menopausal status was confirmed to correlate with spinal OP. The majority of women were able to date their last menstrual period or surgery, if hysterectomy with bilateral oophorectomy was reported. Those reporting hysterectomy with unilateral oophorectomy were deemed postmenopausal, as all were over 49 years of age, the average age of menopause for Singaporean women in both a nationwide survey [21] and in this cohort. As there was no biochemical confirmation, a minority may have been misclassified, but this would not have affected the overall results.

To our knowledge, no OP-screening tool currently incorporates elements of physical performance. Of the physical factors and assessments that were associated with OP, grip strength emerged as the independent correlate. Grip strength, an easily measured parameter, is known to correlate with OP [46] and may be a marker for sarcopenia or Vitamin D deficiency, both of which are associated with osteoporosis [47]. In older women (mean age 68 years), physical performance measures such as gait speed, step length, sit-to-stand ability, and grip strength are known to correlate with OP [18]. In our study, increased walking speed (whether unconfined or restricted) and right handgrip strength correlated with better BMD T-scores. This is consistent with studies indicating that exercise and strength training can have small but positive effects on bone health [48].

Since handgrip strength and weight can be easily measured in outpatient settings, and age, postmenopausal status, and the presence of chronic joint pain can be easily ascertained, we combined these five indices into a final fitted model. Our model had an AUC of 84% (good accuracy) which was significantly higher than both OSTA and FRAX®, suggesting its possible utility as a screening tool for OP in mid-life women, in particular Asian women. Addition of three of our independent variables (menopausal status, chronic joint pain, and right hand grip strength) to the FRAX® tool resulted in a trend to improved accuracy for predicting OP, supporting the validity of these variables as independent correlates in an Asian cohort. A US study, comprising 72% white women with a similar mean age of 57 years and studying femoral neck OP, similarly found that only 34.1% of women with T-score ≤ −2.5 would be recommended for BMD testing, using the USPSTF (FRAX®-based) strategy. In comparison, both SCORE and OST performed better, with 74.0 and 79.8%, respectively, screening positive [49].

Regarding limitations, we acknowledge the study’s cross-sectional design, a health-seeking and largely Chinese cohort, reliance on some self-reported variables and small numbers in some variable categories. While just over half agreed to take part, a comparison between the participants and decliners found similar mean ages. While ethnic distribution was disparate in Indians, ethnicity was not a significant predictor of OP by multivariate analysis and therefore unlikely to bias our results. Having five predictors does make the tool more complex than OSTA, but we believe that the advent of apps and ubiquitous accessibility of electronic tools makes more complex instruments practical in this digital millennium. The study strengths included inclusion of mid-life Asian women, its prospective nature, the broad capture of many variables, use of validated questionnaires, and reliability of performance measurements. We performed our investigation to address the lack of studies on the optimal screening method for mid-life women aged 50–69 years. As such, we believe our screening tool has direct relevance to health-seeking women, but its utility outside the gynecological clinic has to be confirmed in larger population-based studies. Future studies should prospectively validate our model estimates using reclassification methods such as multivariate discriminant analysis, classification, and regression tree analysis [50, 51].

Finally, lessons can be learned from those that came before. In cardiovascular disease, there are over 360 prediction models. A recent systematic review of their use in the general population recommended that rather than developing new models, research focus should be on externally validating, comparing existing models head to head, tailoring existing models to local settings, and investigating extension with new predictors [52]. All of these we have attempted with this study.

In conclusion, we have identified two novel markers for spinal OP, chronic joint pain and handgrip strength. Combined with age, body weight, and menopausal status, we developed a screening tool whose AUC was significantly higher than both OSTA and FRAX® in prediction of spinal OP in mid-life Singaporean women. This new model requires validation in other populations including non-Asian cohorts. These findings inform much needed evidence-based guidelines for targeted and effective screening for OP and osteoporotic fracture prevention in mid-life women.