Introduction

Physical exercise is a powerful fracture prevention strategy in older women [1, 2]. With respect to bone strength as a key determinant of fracture prevention, or more precisely, its surrogate bone mineral density (BMD), most exercise trials reported favorable effects after different types of interventions (review in [3,4,5,6]). However, due to the higher prevalence of osteoporosis in women compared with men [7], the vast majority of exercise studies focus on (postmenopausal) women. Nevertheless, from a clinical perspective, the BMD-fracture association is stronger in older men than in older women [8, 9]; thus bone strengthening as part of fracture prevention might be even more relevant for men. Unfortunately, only a few studies focus on the effect of exercise on BMD in men. Based on various eligibility criteria, recent systematic reviews and meta-analyses of exercise trials with men [10,11,12] identified eight [11], nine [10], or, when applying very strict criteria, three [12] exercise trials. However, not only the subject characteristics but also the type, mode, and application of the interventions vary considerably among the exercise trials. Heterogeneity among the exercise interventions is a crucial aspect, since exercise is quite a complex agent, with some types of exercise actually being counterproductive for preventing musculoskeletal problems (e.g., [13]; review in [14]). Moreover, a large number of the early exercise studies applied some kind of exercise intervention without properly respecting the basic principles of exercise application on the bone. This trial-and-error strategy, applied in a phase III setting, generated a large volume of unfavorable study results that confound the results of systematic reviews and meta-analyses in this area.

The general objective of this study was therefore to review and summarize the present literature of exercise effects on BMD at the lumbar spine (LS) and total proximal femur (TPF) regions of interest (ROI: total hip (tHip), femoral neck (FN)) in male cohorts under special consideration of their intervention (i.e., the exercise-specific need to address BMD). In detail, we aimed (1) to identify and describe all the relevant exercise trials that determine the effect of exercise on BMD in healthy men 50 years+ without pharmacological therapy with impact on the bone, (2) to rate the appropriateness of the study protocol under special consideration of the exercise intervention in order to estimate its potential to address BMD, (3) to summarize the identified studies within a quantitative (meta-) analysis under specific consideration of their methodological and intervention-specific quality and appropriateness, and (4) to derive the most effective exercise protocols for increasing BMD at LS and TPF-ROIs.

Material and methods

Data sources and search strategy

A comprehensive search of electronic databases was conducted in PubMed, Scopus, Web of Science, Cochrane, Science Direct, and ERIC for all articles on the effect of exercise on BMD among men published in English up to November 30, 2016. The search strategy utilized a population, intervention, comparison, and outcome approach. The literature search was constructed around search terms for “bone mineral density,” “exercise,” “training,” and “men.” A standard protocol for this search was developed, and controlled vocabulary (MeSH terms for MEDLINE) was used. Key words and their synonyms were used to sensitize the search by running the following query: (“Bone Density” or “Bone Mineral Content” or “Bone Loss” or “Osteoporosis” or “Osteopenia”) AND (“men” or “male”) AND (“Exercises” or “Aerobic Exercise” or “Isometric Exercise” or “Physical Exercises” or “Physical activity” or “Locomotor Activities”). Additionally, reference lists of the included studies were searched manually. We did not consider unpublished reports. Duplicate publications were identified by comparing author names, treatment comparisons, publication dates, sample sizes, and outcomes.
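The structure of the query (synonyms joined within each concept, concepts combined with AND) can be sketched in a few lines of Python. This is an illustration only: the function name `build_query` and the variable names are assumptions, the Boolean operators are written in upper case for readability, and database-specific field tags or MeSH qualifiers are not modeled.

```python
# Illustrative sketch; names are hypothetical and not part of the review protocol.

def build_query(*term_groups):
    """Join quoted terms with OR inside each group, then join the groups with AND."""
    clauses = []
    for group in term_groups:
        quoted = " OR ".join(f'"{term}"' for term in group)
        clauses.append(f"({quoted})")
    return " AND ".join(clauses)


# Term lists copied from the query reported in the text.
outcome_terms = ["Bone Density", "Bone Mineral Content", "Bone Loss",
                 "Osteoporosis", "Osteopenia"]
population_terms = ["men", "male"]
exercise_terms = ["Exercises", "Aerobic Exercise", "Isometric Exercise",
                  "Physical Exercises", "Physical activity", "Locomotor Activities"]

query = build_query(outcome_terms, population_terms, exercise_terms)
print(query)
```

Keeping each concept as its own list makes it easy to add a synonym without touching the Boolean structure.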

Review selection, data extraction

Three independent reviewers (WK, MS, and SvS) responsible for eligibility screened the titles and abstracts. Relevant articles were obtained in full and were assessed against the inclusion and exclusion criteria described below. Disagreements between the reviewers were resolved by reaching a consensus. A specialized extraction form was designed and used to list the methodological details for each study: authors, country, and year of publication; and details of the study including study design, study objectives, sample size, inclusion, and exclusion criteria for participants, participant characteristics (i.e., age, weight, height), description of intervention (i.e., frequency, intensity, duration, type of intervention), number of participants at baseline and study completion (including number of withdrawals), risk assessment, types of outcome variables assessed, and their values at baseline and study completion.

Inclusion and exclusion criteria

Only studies that applied an exercise intervention, i.e., either randomized controlled trials (RCT) or non-randomized controlled trials (NCT) that examined the effect of exercise on BMD in men, were included in the review. Studies with a sample size of at least eight per group (Footnote 1) conducted with men 50 years and older for the specified outcomes were eligible (see discussion). When there were multiple publications from a single project, the largest study was included. Review articles, observational studies, case reports, case series, editorials, conference abstracts, animal studies, and letters were excluded. Studies with an intervention duration of ≤ 6 months were likewise excluded, as were mixed-sex studies without separate results for men and women. All the studies that reported inclusion of subjects taking any type of pharmacological therapy with relevant positive or negative impact on the bone were also omitted. However, studies that provided calcium and/or Vit-D as an adjuvant supplementation were not excluded.
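Read as a screening rule, the criteria above amount to a conjunction of simple checks. The following Python sketch is one possible encoding, not the authors' screening tool: the `Study` record, its field names, and the two example records are hypothetical. Calcium or Vit-D supplementation is deliberately not treated as bone-active medication.

```python
from dataclasses import dataclass


@dataclass
class Study:
    """Hypothetical screening record; field names are illustrative assumptions."""
    design: str                    # "RCT" or "NCT"
    min_group_size: int            # smallest group at baseline
    min_age_years: int             # lower age limit of the cohort
    duration_months: float         # length of the intervention
    men_reported_separately: bool  # True for male-only or sex-stratified results
    bone_active_medication: bool   # excludes calcium/Vit-D supplementation


def is_eligible(study: Study) -> bool:
    """Apply the inclusion and exclusion criteria described in the text."""
    return (
        study.design in {"RCT", "NCT"}
        and study.min_group_size >= 8
        and study.min_age_years >= 50
        and study.duration_months > 6   # studies of <= 6 months were excluded
        and study.men_reported_separately
        and not study.bone_active_medication
    )


# Made-up examples, not studies from the review.
example_ok = Study("RCT", 12, 55, 12, True, False)
example_short = Study("RCT", 12, 55, 6, True, False)
print(is_eligible(example_ok), is_eligible(example_short))
```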

Outcome measures

The primary outcomes in our study were (areal) BMD at lumbar spine, total hip (tHip), and femoral neck (FN) region of interest (ROI) as assessed by dual-energy X-ray absorptiometry (DXA) or dual photon absorptiometry (DPA) at baseline and study end.

Quality assessment

All the articles that satisfied the predefined inclusion criteria were assessed for risk of bias by two independent raters (WK and MV) using the PEDro (Physiotherapy Evidence Database) scale [15, 16]. Differences of opinion were discussed with a third assessor (SvS) until a consensus was reached. However, this procedure was only required in two cases, with respect to allocation concealment and similarity of prognostic indicators.

The same procedure was used to determine the intervention quality and the appropriateness of the intervention (Table 3) to address the primary endpoint BMD [2]. Briefly, the Intervention Quality Score examines whether the most relevant intervention criteria were adequately reported. This includes (1) general focus of the exercise program, (2) detailed and comprehensive description of exercise type, (3) length of the exercise period, (4) exercise frequency, duration, or volume, (5) exercise intensity, (6) progression, (7) periodization, (8) supervision, and (9) monitoring of adherence/attendance/drop-out. Based on this information, WK and SvS rated the appropriateness of the exercise intervention for addressing BMD (1 = low to 3 = high appropriateness) [2], considering and applying basic principles of osteoanabolic exercise generated by animal studies and cross-sectional studies with athletes [14, 17].
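The nine reporting criteria can be tallied as a simple checklist. The sketch below assumes one point per adequately reported criterion, which is consistent with the "of 9 points" ratings mentioned in the Results but is not spelled out here; the key names and the example record are hypothetical.

```python
# Assumption: one point per adequately reported criterion (maximum 9).
CRITERIA = (
    "general_focus",
    "exercise_type",
    "period_length",
    "frequency_duration_volume",
    "intensity",
    "progression",
    "periodization",
    "supervision",
    "adherence_monitoring",
)


def intervention_quality_score(reported: dict) -> int:
    """Count how many of the nine criteria are marked as adequately reported."""
    unknown = set(reported) - set(CRITERIA)
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")
    return sum(1 for criterion in CRITERIA if reported.get(criterion, False))


# Made-up report in which only three criteria are adequately described.
example_report = {
    "general_focus": True,
    "exercise_type": True,
    "period_length": True,
    "intensity": False,
}
print(intervention_quality_score(example_report))  # 3
```

The appropriateness rating (1 to 3) is a separate expert judgment and is not derived from this count.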

In cases of missing relevant information, authors of the corresponding articles were contacted in order to ensure the completeness of the data.

Results

Study characteristics and quality assessments

In total, our search identified eight articles [18,19,20,21,22,23,24,25] that met the inclusion and exclusion criteria, as seen in Fig. 1. When considering the age-adjusted groups of McCartney [23] as a single exercise and control group, the effect of exercise on BMD among older men was evaluated in 12 participants in the exercise versus eight participants in the control group. Table 1 presents the characteristics of the included exercise trials. Of importance, we contacted six authors with missing data, and four authors responded; however, in two cases, data were not available. In summary, all but one study [25] focused on healthy Caucasian or predominantly Caucasian men. Two studies recruited both sexes, but their results were reported separately for each sex [23, 25]. Based on our criteria, studies were only included if they focused on subjects 50 years and older. Four studies [18, 19, 21, 22] applied a threshold of 50 years, while the other trials used slightly (55 years; [24]) or considerably higher (65 years; [20]) ages to include participants. Further, the upper threshold for age varied between 60 and 80 years [23, 24]. Sample size varied from 8 [20, 23] to 73 [24] participants/group, and all the included studies were RCTs. The studies were conducted in Denmark [20], Australia [19, 22, 24], Finland [21], UK [18], Canada [23], and Hong Kong [25]. All but one study [23] (DPA) used the DXA technique; furthermore, all the studies reported areal BMD changes on at least one of the study endpoints. One study additionally used quantitative computed tomography (QCT) of the lumbar spine [22] and reported changes for total and trabecular volumetric BMD (vBMD) from baseline to study end.

Fig. 1 Flow diagram of search process according to PRISMA [52]

Table 1 General characteristics of included studies (n = 8)

The methodological quality of the studies was rated using the PEDro scale. Because the exercise interventions could not be blinded (i.e., blinding of participants or instructors/personnel), the maximum attainable PEDro score is 8 out of 10. The PEDro score of the reviewed studies ranged from 4 to 7 (Table 2).

Table 2 Assessment of risk of bias for included studies

Three studies reported allocation concealment [18, 19, 25], and two studies conducted a blinding of study assessors [18, 25]. Lastly, three studies applied the intention-to-treat principle [19, 22, 24]. Level of agreement between the raters for methodological quality of the studies was 100%.

Intervention characteristics

Vitamin D and calcium supplementation

Apart from the study of Kukuljan et al. [22], which supplied fortified milk (1000 mg/day calcium, 800 IU/day Vit-D) in two of its four study arms, no study reported having provided calcium or Vit-D.

Exercise

Table 3 specifies the exercise intervention of the included studies. Most of the RCTs compared a single exercise group (EG) with a single inactive control group (CG). One study, however, determined the effect of unilateral jumping using the inactive leg as a control [18]. Two exercise trials implemented two exercise arms with different types of exercise [20, 25], and one study incorporated an additional exercise and fortified milk supplement group [22]. Two of the studies implemented an active control group that was asked to conduct two or three sessions/week of (brisk) walking [23, 24]; the detailed exercise protocol and corresponding adherence rate were not provided, however. Most of the studies [18, 19, 22,23,24,25] properly reported data on the training status of their participants (Table 3). Of importance, all of these studies excluded subjects who were specifically trained, e.g., men who already conducted the type of exercise that was being applied as the study intervention.

Table 3 Exercise characteristics of the included studies

Type and strain parameters of the exercise protocols vary considerably between the trials (Table 3). Two studies compared different types of exercise (RT and soccer [20], or Tai Chi and RT [25]) vs. control. Overall, six studies applied dynamic resistance exercise [19, 20, 22,23,24,25]; two of them further applied jumping protocols with high or very high ground reaction forces [19, 22]. One study each determined the effect of unilateral jumping [18], soccer [20], walking [21], or Tai Chi [25]. Among the studies that prescribed resistance exercise protocols, five studies focused on all or most of the main muscle groups [20, 22,23,24,25], and one study applied upper body (RT) exercises only [19]. The intervention durations ranged from 9 months [19] to 4 years [21]; all the studies prescribed an exercise frequency of at least two times per week (2×/week to daily exercise) (Table 3). Apart from the 4-year (walking) study of Huuskonen et al. [21], all the studies reported at least the attendance rate of the intervention groups, which ranged between 53 and 91%. Thus, the estimated exercise frequency varied from ≈ 1.5 sessions [20] to ≥ 6 sessions/week [18] (Footnote 2). Length of the exercise sessions varied from 15 [18] to 60–75 min [22]. Bone-specific exercise intensity varied considerably between the trials. The three studies [18, 19, 22] that prescribed an additional or isolated jumping protocol generated high strain magnitudes (and rates) with GRF peaks of up to 9.7× body weight [22]. With respect to RT, all the trials applied a multiple set protocol; four of the six trials [19, 20, 22,23,24,25] prescribed a relative exercise intensity of about 75–85% (or 8RM), while two trials used lower intensities ([19]: 60% 1RM; [25]: ≈ 40–50%, i.e., 30 reps). Two studies [20, 22] report that they focused on high strain rates during the concentric phase of the movement.
None of the studies provided information about the absolute intensity (i.e., work to failure or not [26]). With respect to aerobic exercise, the walking protocol of Huuskonen [21] prescribed a wide range of between 40 and 60% VO2max; no corresponding data were provided for the walking control groups of McCartney et al. [23] and Whiteford et al. [24].
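The estimated exercise frequency mentioned above combines the prescribed frequency with the reported attendance rate. A minimal sketch follows, assuming the estimate is simply the product of the two; this is an assumption, since the exact calculation is not given in this section. The function name and the input values are hypothetical and do not reproduce any individual trial.

```python
def estimated_weekly_sessions(prescribed_per_week: float, attendance_rate: float) -> float:
    """Estimate sessions actually performed per week (assumed: prescribed x attendance)."""
    if not 0.0 <= attendance_rate <= 1.0:
        raise ValueError("attendance_rate must be a fraction between 0 and 1")
    return prescribed_per_week * attendance_rate


# Hypothetical example: 3 prescribed sessions/week at 53% attendance.
print(round(estimated_weekly_sessions(3, 0.53), 2))  # 1.59
```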

Most of the studies progressively increased the intensity and/or volume of their exercise intervention [18,19,20, 22,23,24], although this was not always applied consistently. No progression of RT or Tai Chi was implemented in the 12-month study of Woo et al. [25], and Huuskonen et al. [21] increased the volume of their 4-year walking program only once, after 3 months. Lastly, only one study [22] used a periodized exercise protocol.

Five studies [19, 20, 22,23,24] focused on supervised group exercise with [19] or without home exercise, two trials [18, 21] focused on non-supervised exercise only, and one study did not state the corresponding setting [25].

With respect to the intervention quality score and the related appropriateness of the exercise intervention, both SvS and WK provided identical ratings. In summary, the highest score was achieved by Kukuljan et al. [22], who almost perfectly designed and reported their 18-month intervention [2, 14]. In contrast, the exercise protocol of Woo et al. [25] was rated very low (3 of 9 points) because important exercise parameters were not reported and thus probably not realized.

However, most studies reported their exercise protocol thoroughly and comprehensively. When rating the power of the different exercise protocols to address BMD, the majority of the studies (i.e., six trials [18,19,20, 22,23,24]) applied adequate types of exercise and prescribed a promising composition of exercise parameters. One study [21] used an essentially non-progressive walking protocol for 4 years. Another study [25] applied non-progressive (high repetition, i.e., low intensity) resistance and Tai Chi training. Two studies were shorter than 12 months [19, 23] and so may not have captured the total extent of exercise-induced BMD changes in this older cohort.

Of importance, with respect to exercise-induced negative side effects, one study that applied a low-volume, high-intensity unilateral jumping protocol [18] reported adverse effects of exercise (knee and hip pain, Table 4) during the intervention in three of its 50 participants (men with a mean age of 70 years). Four studies did not observe negative side effects, and three studies did not report data on adverse effects (Table 4).

Table 4 Results of exercise effects on the bone. Shaded are the articles with low osteoanabolic potential

Results of primary outcomes

As given in Table 4, with two exceptions (TPF-ROIs only [20]; LS only [23]), all the studies determined BMD at the LS and TPF ROIs. Unfortunately, within and between-group changes/differences for BMD along with the corresponding significance levels were not always given (Table 4). With respect to the significance of the within-group changes, results for EG and/or CG were not available in some studies [21, 22, 25]. Even more importantly, three studies did not report the significance of the difference between exercise and control for LS and (all) TPF ROIs at all [20, 23, 25] and three studies [21, 22, 24] listed non-significant differences without an exact p value.

In summary, no study found significant differences in the changes in lumbar spine BMD, as assessed by DXA, in older men who exercised compared with controls. Three studies reported significant exercise effects on BMD for total proximal femur ROIs [18, 19, 22] (Table 4); however, none of the studies consistently reported group differences for the tHip and FN-ROI. Surprisingly, one of the latter studies determined an effect between the two exercise groups [19], which differed, however, only by the number of jumps per session (40 vs. 80). One study reported significant within-group changes for the LS [21], and three studies [20, 21, 24] reported significant changes at least at one TPF-ROI.

Discussion

The present contribution pursued four objectives that are addressed consecutively during the discussion below.

Firstly, this systematic review aimed to identify all the reliable randomized and non-randomized controlled trials that set out to determine the effect of exercise on BMD in comparable cohorts of healthy men 50 years and older with reasonable sample sizes and study duration. The latter aspect is of particular importance since studies ≤ 6 months were considered unable to determine the full amount of bone adaptation at LS and TPF due to the decelerated bone metabolism in older adults along with the average duration of a load-driven remodeling process [27, 28]. We also looked at the description of factors that may confound the effect of exercise on BMD. This includes training status, active control groups, supervision of the exercise protocol, and attendance/adherence. Thus, we provide a complete overview of studies (Tables 1, 2, 3, and 4) that allows the reader to rank their potential and limitations for addressing BMD in the relevant cohort of healthy men 50 years+.

Appropriateness of the study interventions to address BMD

Although we were unable to consistently decide whether important exercise parameters were not reported due to space constraints or were simply not applied, both scenarios indicate that the crucial osteoanabolic relevance of exercise parameters might not have been realized by all authors. Even when considering this limitation and the fact that our rating system is based largely on bone adaptations in postmenopausal women and that sex differences in adaptive response of the bone to exercise [10] might confound our rating, we think the appropriateness of the exercise intervention for addressing BMD (Table 3) was comprehensively and reliably evaluated. In conclusion, there was considerable heterogeneity between the studies for the intervention quality (Table 3). With respect to appropriateness of the exercise protocol to address the bone, only two exercise trials [21, 25] applied exercise protocols with exercise types and/or strain parameters considered inadequate for favorably affecting BMD, at least when taking age and status of their participants into account. All the other studies used state-of-the-art exercise programs [18,19,20, 22,23,24] or protocols that also included challenging strain constellations (e.g., soccer; [20]) for the bone.

Apart from favorable effects of exercise on the bone, adverse effects should not be overlooked. However, only one study that applied a non-supervised, home-based low-volume, high-intensity unilateral jumping protocol [18] reported negative effects (Table 4) in three of their 50 participants. Considering the advanced age of their initially untrained cohort (70 ± 4 years), the conditioning phase of this study (4 weeks) may have been too short to prevent overloading of the musculoskeletal system.

Quantification of study results to provide a general conclusion

The third aim of this project was to provide a general conclusion as to whether and to what extent exercise significantly affects BMD in older men. Although biometric procedures can address common limitations of meta-analysis [15, 29, 30], we think that the most important decision in a meta-analysis remains the threshold up to which a study can still be meaningfully included. However, the essential heterogeneity of the exercise intervention, the implementation of active control groups (or not), and the borderline short duration of some studies hinder the meaningful application of a meta-analysis. Even when intending to summarize studies with comparable interventions, the corresponding trials vary considerably for (1) eligibility criteria [20, 23, 24], (2) primary study endpoint [20, 23, 24], (3) intensity of the jumping [19, 22] and/or RT-protocol [20, 23, 24], (4) active/inactive control groups [20, 23, 24], and (5) specific type of exercise (e.g., power vs. strength-type RT [20, 23, 24]) or study duration [19, 22]. This might indicate that “it does not always make sense to perform a meta-analysis” [31]. For the reader less familiar with exercise interventions, some of these aspects may appear irrelevant. However, the fact that even differences in movement velocity of the concentric phase will trigger different effects on BMD [32] underlines the need for more scientific accuracy in designing and reporting exercise trials that focus on bone strengthening. Unfortunately, several meta-analytic approaches in this area did not adequately consider presumed details of exercise programs and thus failed to provide meaningful results. Consequently, researchers have to exhibit both an understanding of the underlying methodology and expertise in the exercise domain [33].
Although there is some evidence that weight-bearing impact exercises can have a positive effect on TPF-BMD, in summary, we feel unable to provide a precise conclusion about what degree of exercise is effective for influencing both lumbar spine and proximal femur BMD in older male cohorts.

Generation of exercise recommendation for men

Finally, the desired aim of this systematic review was to identify the most effective exercise protocol(s) for increasing BMD at LS and TPF in older men. Firstly, high impact exercise (with or without intense resistance exercise) and soccer, with its intrinsic running and jumping (i.e., moderate-high GRF) demands, generated significant effects at the proximal femur ROIs (although not consistently for tHip and FN-BMD). So far, the data largely correspond to exercise trials with women of similar age and overall status. Applying recommendations derived from studies with postmenopausal females, the exercise protocol provided by Kukuljan et al. [22] is close to optimum. Indeed, the authors reported a significant effect for FN-BMD; however, they failed to generate positive effects for tHip and LS-BMD after 18 months (Footnote 3). Interestingly, group differences were much more pronounced after 12 months of exercise (p < .05 for LS-BMD, DXA) [34].

Reviewing the studies included in this review, isolated resistance exercise protocols generated no effects on BMD at the spine or TPF-ROIs, although exercises were applied with adequately high strain magnitude and/or rate [23, 24]. Further, and unexpectedly, none of the studies of older men reported significant effects on LS-BMD, even though they applied exercise protocols that generated (very) high strain magnitude and strain rates induced by very high ground (up to ≈ 10× body weight) and/or joint reaction forces.

This finding does not correspond with the much more extensive literature with female cohorts (review in [3, 6, 35,36,37,38]). Indeed, all of these meta-analyses consistently reported significant positive BMD effects on femoral neck and LS ROI in women. Moreover, one meta-analysis [39] stated the opposite effect: that “in premenopausal women high-intensity progressive resistance training was efficacious in increasing BMD at the lumbar spine (p < 0.001), but not at the femoral neck (p = 0.78).” Consequently, recognized recommendations derived from exercise studies with women might not necessarily be effective in male cohorts.

The RCTs of Woo et al. [25] and McCartney et al. [23], who conducted studies that compared BMD changes between older men and women, may provide a further insight into differences in sex-specific exercise responses. Woo et al. [25] did not determine any positive effects of Tai Chi or low-volume/low-intensity resistance exercise on BMD in men but reported significant BMD changes at the total hip after both types of exercise and borderline significant BMD changes at the LS after RT for their female peers. The authors attribute the different results to the low intensity of both types of exercise that may be inefficient in men, while their female counterparts might still adapt to these rather low mechanical stimuli. Further, it is reasonable to expect that healthy older men have to exercise with higher (absolute) intensity compared with their female counterparts. In contrast, McCartney et al. [23] applied a much more challenging RT protocol in a comparably aged male and female cohort and did not determine any LS-BMD change in the groups.

In summary, we are unable to recommend specific exercise protocols in terms of the optimal type and dose (e.g., exercise frequency, strain magnitude, strain rate) of training to increase BMD in older men.

Thus, upcoming exercise trials for bone strengthening with high methodologic and intervention quality will have to generate a definite conclusion about (1) the relevance of exercise interventions in older males and, in parallel, (2) the optimum type and composition of exercise to address BMD in this cohort.

Study limitations

Some limitations and features of this work should be addressed to allow the reader to rate the relevance of our findings and to follow our conclusions.

(1) We only included studies that applied an exercise intervention, i.e., RCTs or non-randomized CTs. This is in line with most (e.g., [5, 10, 11, 37, 40,41,42,43]), but not all (e.g., [3, 4, 12, 44]), systematic reviews and meta-analyses in the field of exercise and BMD. (2) With respect to other eligibility criteria, we focus on studies that included apparently healthy men 50 years+ without any medication that relevantly impacts the bone (either positively or negatively). We have to admit that the cut-off point of 50 years was somewhat arbitrary and was aligned to the age range typical of osteoanabolic exercise studies in postmenopausal women. Further, apart from some exceptions that focus on specific interventions (e.g., military service [45,46,47]), all the studies that determine exercise effects on BMD in healthy older male adults included people 50 years and older. Thus, our approach can be considered as a compromise so as to include all the relevant studies with older male cohorts but prevent confounding results due to varying age-dependent adaptation processes of the bone. (3) We further included only studies with intervention periods ≥ 6 months because the duration of a remodeling cycle averages 200 days for cancellous and 120 days for cortical bone (review in [48]), and thus it is not likely that any true physiological exercise-induced skeletal changes would occur prior to this period [49]. (4) We included only studies with sample sizes of ≥ eight people at baseline. This decision was based on our own data of exercise-induced BMD changes in women [50] that allow a significant effect to be determined with eight subjects per group. (5) We included studies irrespective of whether BMD was the primary study endpoint. As mentioned, one study [23] might have considered BMD as a secondary study endpoint and might thus be underpowered to address BMD changes at LS and TPF. 
However, many (older) exercise studies did not report a hierarchy within their study endpoints; thus, we did not consider this aspect as an exclusion criterion. (6) Finally, we failed to register the study (PROSPERO), because at the time of registration, we were (far) beyond the point of data extraction.

Not a study limitation, but a recommendation for further studies in the area of exercise and bone strength in people of advanced age, is to use alternative techniques to determine additional characteristics of bone strength beyond the assessment of bone mass (or BMD) by DXA. This specifically refers to QCT, which enables the isolated assessment of trabecular BMD and thus prevents confounding effects of age-related degenerative changes, which are particularly relevant at the lumbar spine site [51]. Further, applying this technique would allow the proper calculation of biomechanical bone strength indices and so provide further insight into the effect of exercise on bone strength in older cohorts.

Conclusion

This systematic review identified only limited evidence for the favorable effect of exercise on BMD in older men. Although there is some evidence that exercise can have a positive effect on TPF-BMD, the lack of positive effects on LS-BMD is discouraging. Of importance, even when focusing on the few high-quality studies that adequately address the effects of exercise on the bone, the result was unsatisfying in terms of the skeletal benefits on both lumbar spine and proximal femur BMD.

In summary, apart from the need for more well-designed studies that properly address exercise effects on BMD changes in older men, it is important to further evaluate whether there are differences between men and women in bone adaptation to exercise. Corresponding approaches would be helpful for gauging the legitimacy of transferring recognized exercise recommendations from older women to the cohort of older men.