Introduction

As we age, fragility and morbidity increase, regardless of the considerable variation between individuals. Due to the longevity of the population, we must anticipate that the prevalence of chronic and age-related diseases such as osteoporosis will increase (Ashcroft et al. 1997; Borkan et al. 1982).

In 2010, the European prevalence of post-menopausal osteoporosis was approximately 22.0 million, i.e., 21 % of women aged 50–84 years. This prevalence is estimated to increase to 27 million in 2025 (Borkan et al. 1982). Annually, osteoporosis causes 8.9 million fractures worldwide, with a higher incidence in Scandinavia (Bulpitt et al. 2001). Due to the consequences of fractures (e.g., death, immobilization, pain, hospitalization, and impaired level of function), this disease represents not only a significant economic and societal burden but also a social burden to individuals (Borkan et al. 1982; Christensen et al. 2004; Christensen et al. 2009a). Early detection of osteoporosis is important to be able to implement treatment and prevent fractures.

Diagnostic criteria for osteoporosis are based on bone mineral density (BMD), as measured by dual-energy X-ray absorptiometry (DXA). BMDs of post-menopausal women have been reported to decline by up to 2.3 and 1.4 % per year in the spine and hip, respectively (Christensen et al. 2009b); data on pre-menopausal women, however, are inconsistent. During the transition from pre-menopause to post-menopause, bone loss initially increases at a steady, gradual rate and accelerates in late menopause (Christensen et al. 2009b).

Based upon DXA images of the lumbar spine (LS), the trabecular bone score (TBS) is calculated. The TBS is a novel surrogate measure of microarchitecture in clinical practice and is reported to decline by approximately 14 % from age 45 to 85 years (Cicchetti 1981). The TBS has been shown to contribute to fracture risk prediction, even when adjusted for BMD (Drzal-Grabiec et al. 2013; Dufour et al. 2013; Ettinger et al. 1994).

It is well known that some individuals look young or old for their age, and this has been used as an indicator of health by clinicians in many countries. Epidemiological studies have reported that high perceived age, based on either photographs or live evaluation of age, even after controlling for chronological age, is associated with mortality (Finkelstein et al. 2008; Gunn et al. 2008), poor health (Guyuron et al. 2009), and clinical indicators of chronic diseases (Hernlund et al. 2013; Iki et al. 2014).

The body mass index (BMI) and the estrogen level have positive effects on both BMD and perceived age, whereas lifestyle-related factors such as smoking have a negative effect on both (Johnell 1997; Johnell and Kanis 2006; Kanda and Watanabe 2004; Kannus et al. 1999).

A characteristic change in the body posture of aging women is often observed: a forward tilt of the whole body (Kido et al. 2012). The posture of a spine-affected osteoporotic patient is characterized by a protuberant abdomen, loss of height, and especially thoracic kyphosis due to vertebral compression fractures and possible weakness of spinal muscle stabilizers (Krueger et al. 2014; Looker et al. 2005; Mandema et al. 2014). Physicians might, based on posture, predict the severity of osteoporosis and misjudge the patient as older than their actual age. Furthermore, because both BMD and facial characteristics are influenced by estrogen level, perceived facial age might be associated with low BMD values, either alone or in combination with posture photographs.

Because osteoporosis, as reflected by BMD (TBS), is considered an age-related disease and because the posture of the spine of osteoporotic patients has a very characteristic appearance, we hypothesize that physician-assessed perceived age (PA) from facial and whole-body photographs (presented together or separately) is associated with BMD/TBS. Furthermore, we hypothesize that subjects who appear younger than their chronological age have a higher average BMD/TBS value than subjects who appear older than their chronological age.

Method

Population

This cross-sectional study was conducted at the Research Centre of Ageing and Osteoporosis, Glostrup Hospital, Copenhagen, Denmark. A total of 460 women, aged 25 to 93 years, were recruited by advertisements in newspapers and from two population-based studies carried out at the Research Centre for Prevention and Health in the Capital Region of Denmark. The study population is further described elsewhere (Mayes et al. 2010).

All participants signed informed consent, and the study was approved by the ethics committee of the Capital Region of Denmark (H-4-2009-124/28835).

Photographic presentations

The participants were photographed using a standardized setup at the research center, taking front-facing (facial) or profile (fully dressed, whole-body) photographs with a Nikon DSLR D70s/18–70-mm lens at a distance of 1.5 m. The participants were encouraged to display a neutral facial expression.

Three portable document format (PDF) presentations were prepared with slides containing only “facial photograph,” only “whole-body photograph,” and finally both photographs presented simultaneously. The three sets of photographs were presented in a threefold randomized order (sorted by ID using the SLUMP function in Microsoft® Excel version 2003); i.e., all facial photographs were presented in one randomized order, all whole-body photographs in another, and the combined presentations in a third randomized order.
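The threefold randomization described above can be sketched as follows. This is a minimal illustration only, not the spreadsheet procedure used in the study: the function name `randomized_orders`, the presentation labels, and the seed value are hypothetical, and the IDs 1 to 460 simply stand in for the participant IDs.

```python
import random

def randomized_orders(participant_ids, seed=None):
    """Return one independent random ordering per photographic presentation."""
    rng = random.Random(seed)  # seed is a hypothetical choice, for repeatability only
    ids = list(participant_ids)
    orders = {}
    for name in ("facial", "whole_body", "combined"):
        # Attach a random key to every ID and sort by it, mirroring the
        # "random number column, then sort" approach of a spreadsheet.
        keyed = [(rng.random(), pid) for pid in ids]
        orders[name] = [pid for _, pid in sorted(keyed)]
    return orders

# Hypothetical usage: IDs 1..460 stand in for the 460 participants.
orders = randomized_orders(range(1, 461), seed=2009)
```

Each presentation receives its own shuffle, so the position of a participant in one PDF file gives no information about her position in the other two.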

Assessors and age assessment

Twenty physicians were asked to give age estimates when viewing participants in the three PDF files in a purely subjective and intuitive manner. The assessors were blinded to the general characteristics of the participants and were unaware of the scope of the study. Every assessor was initially asked to estimate the ages of all 460 women from facial photographs, then from whole-body photographs, and finally from the PDF file presenting both photographs.

DXA

BMDs (g/cm2) of the posterior-anterior LS (L2–L4) and both hips (total hip (TH) and femoral neck (FN)) were measured by DXA (Hologic Discovery QDR Series scanner) as described previously (Mayes et al. 2010). All vertebrae were included except those with metal implants. TBSs (unitless), based upon the DXA images of the LS, were then calculated (TBSiNsight version 1.9.2.1, Med-Imaps). All analyses were performed by the same laboratory technician.

DXA accurately determines 2D BMD and detects patients with fragile bones who are at a heightened risk of incurring an osteoporotic fracture (Mika et al. 2005). The WHO criteria of osteoporosis rely on T-scores generated from the BMD. The TBS is a novel, readily available, non-invasive clinical technique based upon the DXA images of the LS and provides a surrogate measure of the 3D microarchitecture of the skeleton (Nielsen et al. 2015). A high correlation between the TBS and CT-assessed microarchitecture has been observed (Noordam et al. 2012). Furthermore, reported results suggest that, unlike BMD, the TBS is not influenced by the presence of osteoarthritis (Cicchetti 1981).

Statistics

Data were analyzed using SAS, version 9.4 (SAS Institute Inc., NC, USA). P < 0.05 was considered statistically significant.

Scans with metal implants were excluded (LS n = 1, left hip n = 15, and right hip n = 21). One participant had a BMI that was too high for DXA to be performed, and nine had a BMI that was too high for the TBS to be calculated.

The PA of each participant for each of the three photographic presentations was calculated using linear mixed models. PA was included as the outcome, photographic presentation as a fixed effect, and individual assessors and participants as random effects. PA could also have been calculated as the simple mean of the assessors’ ratings because the data were completely balanced. Both Pearson’s correlation coefficients and the regression estimate of PA on chronological age were estimated for each photographic presentation and for each calculation method.
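Why balance makes the simple mean a reasonable choice can be illustrated with a small sketch. It does not fit the linear mixed model used in the study; it only shows that, when every assessor rates every participant, subtracting each assessor's general tendency to rate high or low leaves the per-participant mean unchanged. The toy ratings and function names are hypothetical.

```python
from statistics import mean

def perceived_age_simple(ratings):
    """ratings[a][p] is the age estimate by assessor a for participant p."""
    n_participants = len(ratings[0])
    return [mean(row[p] for row in ratings) for p in range(n_participants)]

def perceived_age_bias_corrected(ratings):
    """Subtract each assessor's offset from the grand mean before averaging."""
    grand = mean(x for row in ratings for x in row)
    offsets = [mean(row) - grand for row in ratings]
    n_participants = len(ratings[0])
    return [
        mean(row[p] - off for row, off in zip(ratings, offsets))
        for p in range(n_participants)
    ]

# Hypothetical toy data: 3 assessors x 4 participants, fully balanced.
ratings = [
    [52, 60, 71, 45],
    [55, 66, 75, 48],
    [50, 58, 70, 41],
]
simple = perceived_age_simple(ratings)
corrected = perceived_age_bias_corrected(ratings)
```

In a balanced design the assessor offsets average to zero, so `simple` and `corrected` agree up to floating-point error. With missing ratings this would no longer hold, which is one argument for the mixed-model approach.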

Reproducibility was assessed using intraclass correlation coefficients (ICCs) and 95 % limits of agreement for each photographic presentation, based on the estimated variances from the three linear mixed models.

ICCs were calculated as follows:

$$ \mathrm{ICC}=\frac{\text{variance between participants}}{\text{variance between participants}+\text{variance between assessors}+\text{variance of residuals}} $$

The 95 % limits of agreement were calculated as follows:

$$ \text{mean}\pm 2\times \sqrt{\left(2\times \text{variance between assessors}\right)+\left(2\times \text{variance of residuals}\right)} $$
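The two formulas above translate directly into code. The sketch below assumes that the three variance components have already been estimated from a mixed model; the numeric values are hypothetical and are not the estimates from this study. The `mean_difference` argument, defaulting to zero, is likewise an assumption about what "mean" refers to in the limits-of-agreement formula.

```python
import math

def icc(var_participants, var_assessors, var_residual):
    """Share of the total variance that is due to differences between participants."""
    total = var_participants + var_assessors + var_residual
    return var_participants / total

def limits_of_agreement(var_assessors, var_residual, mean_difference=0.0):
    """Approximate 95 % limits for the difference between two assessors' ratings."""
    half_width = 2 * math.sqrt(2 * var_assessors + 2 * var_residual)
    return mean_difference - half_width, mean_difference + half_width

# Hypothetical variance components in years squared (not the study's estimates).
var_p, var_a, var_r = 150.0, 5.0, 25.0
example_icc = icc(var_p, var_a, var_r)
example_low, example_high = limits_of_agreement(var_a, var_r)
```

The factor 2 inside the square root reflects that the difference between two ratings of the same participant carries the assessor and residual variance twice, while the participant variance cancels.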

Regression of PA on chronological age indicated regression toward the mean; i.e., older participants were estimated to be younger, and younger participants older, than their actual age. This dependency was managed by using the residuals from the regression in the analyses of the association of PA with bone parameters. The residuals gave the difference between the participant’s estimated age and that predicted from their actual age, as described elsewhere (Noordam et al. 2013). High residuals indicate that participants looked older than their chronological age. The residual variable (RPACA) was initially included as a continuous variable and subsequently as a categorical variable grouping the participants into “looking old” (LO) (RPACA ≥ +1 SD) or “looking young” (LY) (RPACA ≤ −1 SD) for each photographic presentation. Participants looking their age were excluded.
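The construction of the residual variable and the LO/LY grouping can be sketched as follows. This is an illustration under stated assumptions: ordinary least squares stands in for the regression, the cut-offs are taken as one sample SD above and below zero, and the ages are hypothetical toy values.

```python
from statistics import mean, stdev

def rpaca(perceived, chronological):
    """Residuals from the least-squares regression of perceived on chronological age."""
    mx, my = mean(chronological), mean(perceived)
    sxx = sum((x - mx) ** 2 for x in chronological)
    sxy = sum((x - mx) * (y - my) for x, y in zip(chronological, perceived))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(chronological, perceived)]

def classify(residuals):
    """Label each residual LO, LY, or None (looks her age; excluded)."""
    sd = stdev(residuals)
    labels = []
    for r in residuals:
        if r >= sd:
            labels.append("LO")   # looks older than predicted from actual age
        elif r <= -sd:
            labels.append("LY")   # looks younger than predicted from actual age
        else:
            labels.append(None)
    return labels

# Hypothetical toy data (years).
chronological = [30, 45, 60, 75, 90]
perceived = [36, 44, 63, 70, 85]
residuals = rpaca(perceived, chronological)
groups = classify(residuals)
```

Because least-squares residuals are uncorrelated with chronological age, the grouping is not driven by the regression-toward-the-mean pattern that a plain difference between perceived and chronological age would carry.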

For every photographic presentation, the association of PA with BMD was tested using linear mixed models with BMD as the outcome. Fixed effects were RPACA, skeletal site (i.e., LS, right TH, right FN, left TH, and left FN), and chronological age. Interaction terms between skeletal site and chronological age and between skeletal site and RPACA were tested. Skeletal site was specified as a repeated effect. Subsequently, the categorical variable of RPACA (LY versus LO) was included. Successive adjustment was conducted for BMI (continuous variable) and the categorical variables menopause, hormone replacement therapy (HRT), and smoking habits.

The association of PA with the TBS was estimated for the three photographic presentations using multiple regression analysis with the same independent variables as described above.

Sub-analyses were conducted for participants not receiving anti-osteoporotic treatment, for participants older than 65 years (due to a large demarcation in the number of participants in that age), and for post-menopausal participants.

Due to model assumptions, the outcome variables (BMDs and TBS) were logarithmically transformed; the results were back-transformed using exponentials, leading to estimates and confidence intervals expressed as percentages.
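The back-transformation can be illustrated with a short sketch. It assumes a natural-log transform of the outcome and a normal-approximation confidence interval (z = 1.96); neither detail is stated in the text, and the coefficient and standard error below are hypothetical.

```python
import math

def percent_change(beta):
    """Percentage change in the outcome per unit increase in the predictor."""
    return (math.exp(beta) - 1.0) * 100.0

def back_transform(beta, se, z=1.96):
    """Back-transform a log-scale estimate and its confidence limits to percentages."""
    lower, upper = beta - z * se, beta + z * se
    return percent_change(beta), percent_change(lower), percent_change(upper)

# Hypothetical log-scale coefficient and standard error (not study results).
estimate, ci_low, ci_high = back_transform(beta=-0.002, se=0.001)
```

The confidence limits are computed on the log scale and then exponentiated, so the resulting interval is slightly asymmetric around the percentage estimate.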

Results

Description

Four hundred sixty women participated in this cross-sectional study. The mean chronological age was 63.9 years (range 25–93 years), and the mean perceived ages, based on the three different photographic presentations, were as follows: facial and whole-body, 63.9 years; facial, 61.8 years; and whole-body, 62.3 years. The general characteristics of the study population and assessor panel are presented in Table 1.

Table 1 Descriptive characteristics of the study participants and the assessor panel, reported as the mean ± standard deviation, median (interquartile range) (when the data were not normally distributed), or as a quantity (percentage of the total)

Correlations, regression, and reproducibility

Pearson’s correlations and regression between the chronological age and PA were statistically significant (p < 0.0001), R = 0.92, 0.91, and 0.88 and β = 0.90, 0.89, and 0.82, for combined presentations, facial presentations alone, and whole-body presentations alone, respectively. The coefficients were not altered when the PAs were calculated using the simple means instead. The slopes of the regressions of PA on chronological age were all below one, indicating a regression-to-the-mean phenomenon.

ICCs were 0.84, 0.81, and 0.76, and the 95 % limits of agreement were ±14.1, ±15.7, and ±17.2 years for combined presentations, facial presentations alone, and whole-body presentations alone, respectively.

Association of PA with BMDs

Table 2 presents the association of RPACA with BMD. The interaction term “RPACA|scan site” was not statistically significant. The interaction term “chronological age|scan site” was, in all analyses, highly statistically significant and was therefore included in the model. All estimates of RPACA were negative, consistent with the hypothesis. For example, BMD decreased by 0.21 % when RPACA increased by 1 year; i.e., if two participants had the same chronological age and scanned skeletal site, participant one would on average have 0.21 % lower BMD if evaluated as 1 year older than participant two. However, only the association based on facial photographs was statistically significant (footnote a in Table 2). After adjusting for BMI (footnote b in Table 2), menopausal status, and HRT (footnote c in Table 2), the estimates of RPACA were still negative and statistically significant or borderline significant. Smoking habits were not statistically significant.

Table 2 Estimates from the linear mixed model analyses of the association of PA, expressed as RPACA, with BMD, stepwise adjusted for the interaction between chronological age and skeletal site; BMI; menopause; and HRT

The association of PA with the TBS

The association of RPACA with the TBS is presented in Table 3. Chronological age was, in all analyses, highly statistically significant; however, only the chronological age-adjusted association of RPACA from whole-body photographs reached borderline statistical significance (p = 0.05). All estimates were negative, consistent with the hypothesis. Smoking habits did not reach statistical significance. Menopause and HRT were significant or borderline significant (p values between 0.04 and 0.06) and included in the fully adjusted model.

Table 3 Estimates from the general linear model analyses of the association of PA, expressed as RPACA, with TBS, stepwise adjusted for the chronological age; BMI; menopause; and HRT

Looking old versus looking young

Table 4 presents the difference in BMD/TBS between LO (reference) and LY. Controlled for chronological age and skeletal site, all analyses showed positive estimates; i.e., the mean values of BMD/TBS were higher in the groups where participants were estimated younger than their predicted age compared with the reference group. However, only the association of whole-body PA with TBS in the primary analyses reached borderline statistical significance. Adjusted for BMI, PAs from all photographic presentations were significantly associated with BMD. Smoking habits, menopause, and HRT did not reach statistical significance.

Table 4 Results of the regression analyses, testing whether there is a difference in BMD or TBS between looking old (reference) or looking young participants

Sub-analyses

Results of the sub-analyses are presented in Table 5. First, participants receiving anti-osteoporotic treatment were excluded to investigate whether the same pattern was present when excluding participants with the potentially most severe osteoporosis. All estimates were consistent with the hypothesis, but no statistical significance was obtained. We then focused on participants with a normally heightened risk of suffering from osteoporosis (i.e., older and post-menopausal women). PA was not statistically significantly associated with TBS, but all of the estimates showed a trend in the hypothesized direction. PA assessed from the facial photograph alone or from the combined presentations was borderline or statistically significantly associated with BMD when controlled for chronological age and skeletal site. In addition, the size of the association was greater in post-menopausal women and greatest in women over 65 years of age.

Table 5 Results from the three sub-analyses

Discussion

Previous studies have shown that high PA is associated with increased mortality as well as with glucose level and morning cortisol, markers of chronic and age-related diseases (Gunn et al. 2008; Iki et al. 2014; Olde Rikkert 1999). To our knowledge, the present study is the first to examine the association of PA with BMD and TBS.

A tendency in accordance with the hypotheses (all estimates were in the expected direction) was present; if two participants were of the same chronological age, the one with the higher PA would on average have lower BMD/TBS than the one with the lower PA. The signal was more apparent in the BMD analyses than in the TBS analyses, as facial PA was statistically significantly associated with BMD in the primary analyses and PA from all photographic presentations was associated with BMD in the fully adjusted models.

Possible causes of the findings may be that participants with a characteristic osteoporotic posture, i.e., thoracic kyphosis, may be age overestimated and concurrently have low bone mass. Furthermore, with menopause, BMD decreases and participants may be estimated as older looking because estrogen withdrawal decreases the collagen production by fibroblasts (Pothuaud et al. 2009), decreases keratinocyte formation, and increases keratinocyte apoptosis, thus accelerating epidermal atrophy (Rexbye et al. 2006; Roux et al. 2013), which leads to aging phenotypes of the skin.

The estimates were quite large. BMD declined on average 0.25 % for every 1-year increase in RPACA. As the estimated decline in hip BMD in this study population was approximately 0.4–0.5 % per year increase in age (Mayes et al. 2010), an estimate of half that magnitude seems clinically relevant. A difference in BMD of 4.37 % between LY and LO participants (assessed from facial and whole-body photographs and adjusted for chronological age and BMI) is of clinically relevant magnitude, as oral alendronate treatment has shown an increase of 2.4 % in the first year of treating post-menopausal women (Sherertz and Hess 1993).

The sub-groups were chosen to determine whether exclusion of the small group receiving anti-osteoporotic treatment and including only participants who normally are at heightened risk of having osteoporosis would alter the tendency.

Higher age-adjusted facial PA was associated with receiving anti-osteoporotic treatment (treatment vs no treatment (reference) 1.76 years; 95 % CI 0.11, 3.42; p = 0.04), and the small group receiving anti-osteoporotic treatment had a great influence on the association of PA with BMD. Excluding this group altered the statistical significance, but the magnitude and direction of the estimates remained. Furthermore, when including only older or post-menopausal participants, the estimate increased and the primary analyses regarding BMD gained statistical significance, indicating that the association of PA with BMD was more apparent in older subjects.

Statistical considerations and reproducibility

The reported ICCs between 0.76 and 0.83 indicate excellent reproducibility (Silva and Bilezikian 2014) and are in agreement with the ICCs reported by Rikkert et al. (Sinaki et al. 1993), where the evaluation panel was composed of four geriatricians. The widest 95 % limits of agreement were obtained from the whole-body photograph alone. Both the ICC and the associated 95 % limits of inter-assessor agreement showed better agreement when both photographs were presented than with either the facial or the whole-body photograph alone.

When calculating the individually perceived ages from the ratings of 20 assessors, the obvious choice is to calculate the simple mean, as suggested in previous studies (Gunn et al. 2008; Hernlund et al. 2013). Consistent with a study by Gunn et al. (Tang et al. 2010) on PA as a biomarker of aging, however, we compensated for assessors who might rate the age generally high or low compared with the assessor panel, using a linear mixed model with assessors and the individual participants as random effects. In this particular study, it would not have altered the results if we had chosen the simple mean method because the data were balanced. In our opinion though, the mixed model approach is preferred to obtain generalizable interpretations, i.e., results covering a random assessor and a random participant.

The correlation between perceived and chronological ages was statistically significant, in agreement with previously reported results, but the correlation coefficients in our study were considerably higher (R = 0.88–0.92 in our study compared with R = 0.69 and R = 0.40 for the Gunn et al. and Christensen et al. studies, respectively) (Finkelstein et al. 2008; Tang et al. 2010). The larger age range of our study population and the relatively high number of age assessors may explain this. The regression of PA on chronological age showed a regression-to-the-mean phenomenon, which is well known in age assessment, although Gunn et al. (Tang et al. 2010) reported that there was no indication that this was universal. PA minus chronological age (PA-CA) has previously been used instead of the residuals used in our study (Guyuron et al. 2009; Kanda and Watanabe 2004), but if a regression-to-the-mean phenomenon is present, the data would be biased because most of the chronologically older individuals would fall in the low PA-CA group. All analyses were also performed with PA-CA and showed the expected tendency, but statistical significance was not obtained (data not shown).

The idea of comparing LO versus LY arose from the study performed by Kido et al. (Hernlund et al. 2013), where PA was considered a biomarker of carotid atherosclerosis; in contrast, we found it more clinically relevant to exclude individuals who were age-estimated close to chronological age, notwithstanding the reduction of suitable participants.

The association between BMDs within one individual is presumably larger than between several individuals. Well aware of biological variation, we therefore chose to consider the five BMD measurements of each participant as repeated measurements and included the scan site as fixed and repeated effects in linear mixed models. Epidemiological studies routinely use separate multiple regression analyses for each skeletal site (Theodorou and Theodorou 2002).

Photographic considerations

Ideally, PA assessments should include observations of facial expression, speech, body posture and movements, and facial characteristics, as previously conducted (Guyuron et al. 2009; Noordam et al. 2013). However, in large epidemiological studies, this is not feasible. In fact, in most studies, PA assessment has been based on differently angled facial photographs (Gunn et al. 2008; Hernlund et al. 2013; Iki et al. 2014; Urano et al. 1995). We chose to present participants in both facial and whole-body photographs because osteoporotic patients have a characteristic appearance and because we believe that the presentation of both face and posture is most similar to “live evaluation of age.” Furthermore, presenting participants in separate and combined photographs might indicate which presentation to use in future studies. When evaluating the fit statistics (data not shown), no differences were observed. Because fewer significant results, lower estimates, and lower reproducibility were present when age was assessed from whole-body photographs, we would not recommend that presentation alone.

Assessor panel

Physicians were chosen because they regularly assess the age or appearance of patients in medical records. Their assessments, however, might be biased because they primarily see a population with chronic diseases in daily practice and may therefore misjudge older, healthy, and well-appearing participants as being younger than their true age. On the other hand, physicians, well aware of signs of ageing, might adjust their assessments according to those signs, even though they were asked to assess how old the participants looked and not to guess their chronological ages.

Strengths and limitations

Every participant was age-assessed by 20 assessors from three photographic presentations, yielding completely balanced data without any missing values. This, together with our use of linear mixed models to calculate individual perceived ages and to estimate the associations of PA with BMD, is one of the main strengths of the present study. A frequent challenge in epidemiological studies is the potential bias by “healthy volunteers.” However, in this study population, the mean BMD values of TH, FN, and LS were slightly lower than the mean BMDs obtained in the National Health and Nutrition Examination Survey (NHANES) study, which is frequently used as clinical reference material (van der Klift et al. 2004). Age-adjusted mean NHANES BMDs of non-Hispanic white women aged 20 years and older were 0.913, 0.799, and 1.049 g/cm2 for TH, FN, and LS (L2–L4), respectively (van der Klift et al. 2004) compared with our measurements of 0.886, 0.748, and 1.015 g/cm2 for TH, FN, and LS, respectively.

In the present study, lower BMDs of all skeletal sites were statistically significantly associated with increase in chronological age; FN declined more than TH as expected, but BMD of LS declined less than expected. One reason might be that all vertebrae were included (except the one with metal implant) in this study, whereas invalid vertebrae, i.e., vertebrae with potential condensations, in the NHANES were excluded.

Conclusions

We were able to demonstrate a trend in the association of PA with BMD and TBS. When controlled for relevant confounders, both statistically and clinically significant associations were present which were greatest in post-menopausal and older women.