Introduction

Osteoporosis is characterized by excessively low bone density, bone fragility, and increased risk of fracture with relatively minor trauma. It is a serious condition estimated to affect up to one-third of postmenopausal women [1]. Recent evidence indicates the incidence of osteoporosis may be increasing even more than would be expected based on the increased number of older persons, suggesting a decrease in bone quality from generation to generation [2]. The already staggering medical, social and economic costs [3] can be expected to increase unless effective prophylactic and therapeutic regimens are developed.

The etiology of osteoporosis is complex and multi-factorial. Traditionally, nutritional (calcium, vitamin D and fluoride), hormonal, and exercise interventions have been recommended, with mixed results [4]. Combined therapies may have greater efficacy. Indirect evidence from studies of bone mineral density (BMD) in amenorrheic versus eumenorrheic athletes suggests that exercise and female sex hormones have independent and possibly additive skeletal effects [5]. Animal studies showing independent and additive effects of estrogen replacement and treadmill exercise on vertebral and femoral bone mass in ovariectomized rats support this view [6]. The evidence from prospective human studies is also supportive, although available studies have been limited to small groups of women and the adequacy of the exercise stimulus in some studies is uncertain [7,8,9].

The purpose of the present study was to assess the effects of exercise and HRT on regional and total body BMD in postmenopausal women. To this end, we tested the effect of exercise on BMD in women who used HRT or did not use HRT. Since exercise and HRT are thought to mediate the skeletal response through different mechanisms, we postulated that the combination of exercise and HRT would have greater effects on BMD than HRT alone.

Materials and methods

Design

The Bone, Estrogen and Strength Training (BEST) Study was a partially randomized clinical trial of the effects of 12 months of exercise on bone mineral density (BMD) in early postmenopausal women. Women who were undergoing hormone replacement therapy (HRT, n=159) for at least one year and not more than 5.9 years and women who had not used HRT (NHRT, n=161) during the preceding year were randomized to exercise (EX) or no exercise conditions (NEX), giving four groups: HRT, EX (n=86); HRT, NEX (n=73); NHRT, EX (n=91); NHRT, NEX (n=70). All participants agreed to maintain their current HRT use status while in the study. Written informed consent was obtained from all participants prior to entering the study.

Recruitment, entry criteria, and run-in

Participants were recruited through television, radio and newspaper advertisements and flyers distributed in the community. Inclusion criteria were as follows: age (40–65 years); surgical or natural menopause (3–10.9 years); body mass index (BMI) less than 33 kg/m2; non-smoker; no history of osteoporotic fractures and initial lumbar spine and hip (trochanter and neck) BMD greater than Z-score of −3.0; undergoing HRT (1–5.9 years) or not undergoing HRT (>1 year); cancer free for the last 5 years (treatment free for last 5 years), excluding skin cancers; not using medications that alter bone mineral density, and no beta-blockers or steroids; calcium intake >300 mg/day; less than 120 min of physical activity per week, and no weightlifting or similar activity. All participants also agreed to maintain their baseline level of physical activity (if not randomized to exercise) and dietary practices for the duration of the study, accept randomization to exercise or no exercise groups, and to consume calcium supplements provided by the study. Eligible women were enrolled in an 8-week run-in phase designed to test adherence and encourage early drop out. All screening (medical history, physical exam and 12 lead ECG exercise stress test; orthopedic and postural analysis by a physical therapist) and baseline measurements were made during run-in. Follow-up measurements occurred at 12 months.

Hormone replacement therapy

Women who used HRT (n=159) followed regimens prescribed by their primary care providers. Consequently, a variety of regimens were used, although most women took oral estrogen (32%) or estrogen and progesterone (51%). Another 12% received transdermal (patch) estrogen or estrogen and progesterone. Participants were instructed to maintain the same regimen throughout the study and to report changes if they occurred. HRT use (type and regimen) was assessed at 6-month intervals to monitor potential changes. In addition, total serum estrone and estradiol levels were measured in duplicate by double-antibody radioimmunoassays (Diagnostic Systems Laboratory; Webster, Tex., USA) to compare steroid levels between groups and to monitor HRT compliance. Intra-assay and inter-assay coefficients of variation for all assays were <10%.

Calcium supplements

All participants received 800 mg per day of elemental calcium in the form of calcium citrate (Citracal®, Mission Pharmacal, San Antonio, Tex., USA) supplied by the study in blister packs in 2-month allotments. The subjects were instructed to take 2 tablets (200 mg elemental calcium/tablet), twice a day, without food, with a minimum of 4 h between doses. Compliance with calcium supplements was monitored through pill counts. Participants were considered compliant if they consumed at least 80% of the allotted pills during each period.
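The compliance rule described above reduces to a simple ratio check. The following is a minimal sketch only: the function name is hypothetical, and the 60-day length of a 2-month allotment and the example pill counts are assumptions made for illustration, not values reported by the study.

```python
def is_compliant(pills_taken: int, pills_allotted: int, threshold: float = 0.80) -> bool:
    """Return True if at least `threshold` of the allotted pills were consumed."""
    if pills_allotted <= 0:
        raise ValueError("pills_allotted must be positive")
    return pills_taken / pills_allotted >= threshold


# Assumed allotment: 2 tablets twice a day (4 tablets/day) for an assumed 60 days.
TABLETS_PER_DAY = 2 * 2
ASSUMED_DAYS_PER_ALLOTMENT = 60
allotted = TABLETS_PER_DAY * ASSUMED_DAYS_PER_ALLOTMENT

# Hypothetical pill counts for two participants in one period.
print(is_compliant(200, allotted))
print(is_compliant(180, allotted))
```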

Anthropometry

Standing height (HT) was measured to the nearest 0.1 cm during a maximal inhalation using a Schorr measuring board. Weight (WT) was measured on a calibrated digital scale (SECA, model 770, Hamburg, Germany) accurate to 0.1 kg. Body mass index (kg/m2) was calculated as WT (kg) divided by the square of HT (m).
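As an illustration of the body mass index calculation, the sketch below converts height from centimeters to meters before squaring. The function names and the example values are hypothetical; the 33 kg/m2 cutoff is the entry criterion stated earlier.

```python
def body_mass_index(weight_kg: float, height_cm: float) -> float:
    """Body mass index in kg/m2 from weight in kg and height in cm."""
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight and height must be positive")
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def meets_bmi_criterion(weight_kg: float, height_cm: float, limit: float = 33.0) -> bool:
    """Entry criterion from the study: BMI less than 33 kg/m2."""
    return body_mass_index(weight_kg, height_cm) < limit


# Hypothetical participant: 70.0 kg, 165.0 cm.
bmi = body_mass_index(70.0, 165.0)
print(round(bmi, 1), meets_bmi_criterion(70.0, 165.0))
```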

Dual energy X-ray absorptiometry

Total body and regional BMDs (g/cm2) were measured by dual energy X-ray absorptiometry (DXA) using the Lunar, Model DPX-L (software version 1.3y, extended research analysis) pencil-beam densitometer (Lunar Radiation Corp, Madison, Wisc., USA). Subject positions for total body, antero-posterior lumbar spine (L2–L4), and femur (neck and trochanter) scans were standardized according to manufacturer's recommendations. Each subject was scanned twice at each measurement period and the mean of the two measurements was used in all analyses. Scan analysis was done by one certified technician. The calibration of the densitometer was checked daily against a standard calibration block supplied by the manufacturer. In addition, a spine phantom was scanned daily over the course of the study to account for potential BMD variations due to machine error. The coefficient of variation (CV) for the spine phantom (BMD, L2–L4) was 0.6%. BMD precision, expressed as a percent of mean BMD, was ±1.8%, ±2.4%, ±2.4%, and ±0.8% for lumbar spine, femoral neck, trochanter, and total body BMD, respectively, estimated from 261 women with repeat scans at baseline.
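The text does not state how precision was computed from the repeat scans. One common approach for duplicate measurements, assumed here purely for illustration, is the root-mean-square standard deviation of the paired scans expressed as a percent of the mean. The function name and the example BMD values below are hypothetical.

```python
import math


def duplicate_scan_precision(pairs):
    """Estimate precision from duplicate scans.

    pairs: list of (scan1, scan2) BMD values in g/cm2, one pair per subject.
    Returns (rms_sd, cv_percent). Assumes the root-mean-square SD method;
    the study does not specify which method was used.
    """
    if not pairs:
        raise ValueError("at least one pair of scans is required")
    n = len(pairs)
    # With two measurements per subject, the within-subject variance is d^2 / 2.
    rms_sd = math.sqrt(sum((a - b) ** 2 for a, b in pairs) / (2 * n))
    # Mean BMD across subjects, using each subject's two-scan average.
    grand_mean = sum((a + b) / 2 for a, b in pairs) / n
    return rms_sd, 100.0 * rms_sd / grand_mean


# Hypothetical duplicate lumbar spine scans for two subjects.
sd, cv = duplicate_scan_precision([(1.00, 1.02), (1.10, 1.08)])
print(round(sd, 4), round(cv, 2))
```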

Body composition

Percent body fat and lean soft tissue mass were obtained from DXA whole body scans as described above. Percent fat was derived as the ratio of fat mass to whole body mass estimated by DXA. Lean soft tissue mass measured by DXA is the equivalent of whole body mass minus the fat and bone masses.
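The two definitions above can be written out as a short calculation. This is a sketch with hypothetical function names and example masses; it assumes whole body mass estimated by DXA is the sum of the fat, lean soft tissue, and bone compartments.

```python
def percent_fat(fat_kg: float, whole_body_kg: float) -> float:
    """Percent fat as the ratio of fat mass to DXA whole body mass."""
    if whole_body_kg <= 0:
        raise ValueError("whole body mass must be positive")
    return 100.0 * fat_kg / whole_body_kg


def lean_soft_tissue(whole_body_kg: float, fat_kg: float, bone_kg: float) -> float:
    """Lean soft tissue mass as whole body mass minus fat and bone masses."""
    return whole_body_kg - fat_kg - bone_kg


# Hypothetical DXA compartment masses (kg).
fat, lean, bone = 25.0, 40.0, 2.5
whole_body = fat + lean + bone  # assumed: DXA whole body mass is the compartment sum
print(round(percent_fat(fat, whole_body), 1), lean_soft_tissue(whole_body, fat, bone))
```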

Muscle strength

Muscle strength was assessed from the maximal isokinetic torque of the right knee extensors/flexors, hip extensors/flexors and back extensors/flexors measured with a LIDO isokinetic dynamometer following standard procedures (Loredan Biomedical, Sacramento, Calif., USA). Speed of movement was set at 60°/s for all three sites with a 90°, 45°, and 65° range of motion for knee extensors/flexors, hip extensors/flexors, and back extensors/flexors, respectively. After a calibration to compensate for limb or torso weight and five to seven practice repetitions including at least one maximal effort, subjects were instructed to work as hard and fast as possible throughout the range of motion. Subjects performed three sets of five repetitions for knee extensors/flexors and hip extensors/flexors, and three sets of three repetitions for back extensors/flexors, with a 1-min rest period between sets. Maximal torque was calculated as the mean of the peak one-repetition torque for the second and third sets.
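The maximal torque calculation described above (the mean of the peak single-repetition torque from the second and third sets) can be sketched as follows. The function name, the torque unit, and the example values are assumptions for illustration.

```python
def maximal_torque(sets):
    """Mean of the peak one-repetition torque from the second and third sets.

    sets: three lists of per-repetition peak torques (unit assumed to be N*m).
    The first set is recorded but does not enter the calculation.
    """
    if len(sets) != 3 or any(len(s) == 0 for s in sets):
        raise ValueError("expected three non-empty sets of repetitions")
    peak_second = max(sets[1])
    peak_third = max(sets[2])
    return (peak_second + peak_third) / 2.0


# Hypothetical knee extensor test: three sets of five repetitions.
knee_sets = [
    [100.0, 105.0, 98.0, 102.0, 101.0],
    [110.0, 108.0, 112.0, 109.0, 107.0],
    [111.0, 114.0, 110.0, 108.0, 109.0],
]
print(maximal_torque(knee_sets))
```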

In addition to LIDO estimates, muscle strength in exercise subjects was estimated from assessments of the one repetition maximum (1-RM) measured in exercise facilities every 6–8 weeks to set training loads. The 1-RMs were measured in the same exercise facility using the same equipment for all subjects at each assessment.

Diet assessment

Dietary intake was assessed from eight randomly assigned days of diet records collected at baseline (3 days), 6 months (2 days), and 12 months (3 days). Each 2- to 3-week recording period included 1 weekend day and 1–2 non-consecutive weekdays. Participants did not receive dietary advice and were instructed not to change their diets during the study. Diet records were reviewed for completeness and accuracy with the participants by trained technicians. The diet records were analyzed for nutrient intake by trained technicians using the Minnesota Nutrient Data System (NDS) versions 2.8–2.92.

Exercise intervention

Exercise subjects trained 3 days per week (non-consecutive days) in community facilities with supervision by study trainers who were trained in the study exercise protocol and who met weekly with an investigator. Exercise sessions included stretching, balance and aerobic weight-bearing activity for warm-up, weightlifting, an additional weight-bearing circuit of moderate impact activities (e.g. walk/jog, skipping, hopping), and stair-climbing/step boxes with weighted vests. Attendance, loads, sets and repetitions, steps with weighted vests, and minutes in aerobic activity were monitored with logs checked regularly by trainers on site.

Weightlifting was done using free weights and machines. Exercises included leg press, hack squats or Smith squats, lat pulldowns, lateral rows, back extensions, right and left arm dumbbell presses, and rotary torso. Two sets of six to eight repetitions were done each day at 70% (2 days per week) or 80% (1 day per week) of the one-repetition maximum (1-RM). Strength (1-RM) was measured every 6–8 weeks and the load increased to maintain loads at 70–80% 1-RM.
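The loading scheme above can be illustrated by deriving training loads from the most recent 1-RM. This is a sketch under stated assumptions: the function names are hypothetical, the ordering of the 70% and 80% days within the week is not specified in the protocol and is assumed here, and rounding to 0.1 of a load unit is an arbitrary choice.

```python
# Two days per week at 70% and one day at 80% of 1-RM; the day order is assumed.
WEEKLY_FRACTIONS = (0.70, 0.80, 0.70)


def training_load(one_rm: float, fraction: float) -> float:
    """Training load as a fraction of the one-repetition maximum."""
    if one_rm <= 0:
        raise ValueError("one_rm must be positive")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return round(one_rm * fraction, 1)


def weekly_loads(one_rm: float):
    """Loads for the three weekly sessions, recomputed after each 1-RM test."""
    return [training_load(one_rm, f) for f in WEEKLY_FRACTIONS]


# Hypothetical leg press 1-RM values (same unit as the returned loads)
# at two successive assessments 6-8 weeks apart.
print(weekly_loads(100.0))
print(weekly_loads(120.0))
```

Recomputing the loads after each 1-RM assessment is what keeps the relative intensity at 70–80% as strength increases.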

Weightlifting was supplemented with stretching, balance, and ancillary resistance exercises, using therabands and physiotherapy balls, that were done during warm-up and cool-down periods. These exercises were designed to improve balance and flexibility, both critical to falls prevention, while engaging muscles that support the thoracic cavity and the spine (e.g. lower rectus abdominis, erector spinae group, middle and upper trapezius).

Aerobic, weight-bearing activity was done for approximately 10 min during warm-up (walking) and for another 20–25 min during the circuit and stair-climbing/step boxes. The weight-bearing circuit progressed from walking to increasing time spent in jogging, skipping, hopping and similar activities with greater ground-reaction forces [10,11]. Intensity was maintained at approximately 60% of maximal heart rate, self-monitored by carotid palpation. Stair-climbing/step boxes began with 120 stairs/steps (8 inches) per session and increased progressively to 300 stairs/steps while wearing 10–28 lb in a weighted vest. Loads during stairs/steps were increased approximately monthly in 1–3 lb increments as the participants were able to tolerate greater loads.

Statistical analysis

Statistical analyses were completed using SPSS [12]. Following intent-to-treat principles, the data from all subjects were included in analyses according to their group assignments regardless of compliance. Participants who discontinued exercise and other dropouts were measured at follow-up if they agreed to assessments. Measures of central tendency and distribution were examined at each interval to describe the sample, test for normality and homoscedasticity, and to describe outcomes. Possible baseline mean differences in BMD and other characteristics between women randomized to exercise and no exercise groups were tested using independent t-tests within HRT and no HRT groups. The baseline physical characteristics of women who completed the study were compared with women who dropped out using independent t-tests. Multiple linear regression with three a priori coded contrasts (exercise versus no exercise within HRT and no HRT groups, and HRT versus no HRT) was used to test for significant group differences in changes in BMD due to exercise and HRT, while controlling for potentially important covariates such as baseline BMD and years past menopause. This approach was used for the primary analysis rather than a 2×2 analysis of variance, since randomization to exercise and no exercise occurred within the HRT and no HRT groups. The regression approach is equivalent to ANCOVA and is preferred over ANOVA since it provides estimates of intervention effects after controlling covariates. In a secondary analysis the interaction of exercise with HRT was tested controlling for the same covariates, using the same approach. Type 1 error was set at α=0.05 (two-tailed) for all tests.
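To illustrate the regression with coded contrasts, the sketch below fits an ordinary least-squares model with three contrast columns and one covariate. It is not the SPSS analysis used in the study. The contrast codes (±0.5) are an assumption, since the actual codes are not reported, and the data are synthetic and noiseless so that the fitted coefficients can be compared with known values. With these codes, the first two coefficients are the adjusted exercise effects within the HRT and NHRT groups, and the third is the adjusted HRT versus NHRT difference.

```python
# Assumed contrast codes: (EX vs NEX within HRT, EX vs NEX within NHRT, HRT vs NHRT).
CODES = {
    ("HRT", "EX"): (0.5, 0.0, 0.5),
    ("HRT", "NEX"): (-0.5, 0.0, 0.5),
    ("NHRT", "EX"): (0.0, 0.5, -0.5),
    ("NHRT", "NEX"): (0.0, -0.5, -0.5),
}


def design_row(hrt, ex, covariates):
    """Intercept, three contrast codes, then the covariates."""
    return [1.0, *CODES[(hrt, ex)], *covariates]


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve_linear(xtx, xty)


# Synthetic, noiseless example: percent change in BMD with one covariate (baseline BMD).
TRUE_COEFS = [0.5, 1.0, 0.8, 0.6, -0.4]  # hypothetical values, not study results
X, y = [], []
for hrt, ex in CODES:
    for baseline_bmd in (0.8, 0.9, 1.0, 1.1):
        row = design_row(hrt, ex, [baseline_bmd])
        X.append(row)
        y.append(sum(b * v for b, v in zip(TRUE_COEFS, row)))

coefs = ols(X, y)
print([round(c, 6) for c in coefs])
```

The sketch estimates coefficients only; standard errors and significance tests, which the study obtained from SPSS, are omitted.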

Results

Retention and compliance

Sample sizes at baseline were 86, 73, 91, and 70 for EX/HRT, NEX/HRT, EX/NHRT and NEX/NHRT, respectively. Overall, 83% (n=266) of the baseline sample completed 1-year assessments. Retention rates were 82%, 89%, 78% and 84% for EX/HRT, NEX/HRT, EX/NHRT and NEX/NHRT, respectively. The dropout rate was somewhat higher for women randomized to EX (20%) compared to women randomized to NEX (13%). More women who did not use HRT (19%) dropped out compared to women who used HRT (14%).

Attendance at exercise sessions averaged 71.8±19.9% and was similar in both EX/HRT and EX/NHRT groups. This number includes all women (n=142) who were assigned to exercise and who completed DXA scans at the 12-month follow-up, whether or not they exercised for the entire 12 months. Exercise class attendance for women (n=108) who exercised in each of the 12 months of intervention averaged 79.9±10.8%. Compliance with calcium supplements (defined as the percent of women using ≥80% of the pills) averaged 91.3±14.4%. There were no significant differences among groups for calcium supplement compliance.
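The overall retention and the dropout rates by exercise assignment can be cross-checked from the counts reported in this section. The sketch below uses only those counts; it assumes the 142 exercisers with 12-month DXA scans are the EX completers and obtains the NEX completers by subtraction from the 266 total completers.

```python
# Baseline sample sizes reported for the four groups.
BASELINE = {"EX/HRT": 86, "NEX/HRT": 73, "EX/NHRT": 91, "NEX/NHRT": 70}
COMPLETERS_TOTAL = 266  # women completing 1-year assessments
COMPLETERS_EX = 142     # exercisers with 12-month DXA scans (assumed EX completers)


def percent(part: float, whole: float) -> float:
    return 100.0 * part / whole


total_baseline = sum(BASELINE.values())
ex_baseline = BASELINE["EX/HRT"] + BASELINE["EX/NHRT"]
nex_baseline = BASELINE["NEX/HRT"] + BASELINE["NEX/NHRT"]
completers_nex = COMPLETERS_TOTAL - COMPLETERS_EX  # derived by subtraction

overall_retention = percent(COMPLETERS_TOTAL, total_baseline)
ex_dropout = 100.0 - percent(COMPLETERS_EX, ex_baseline)
nex_dropout = 100.0 - percent(completers_nex, nex_baseline)

print(total_baseline, round(overall_retention), round(ex_dropout), round(nex_dropout))
```

Rounded to whole percentages, these values agree with the 83% overall retention and the 20% (EX) and 13% (NEX) dropout rates given in the text.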

Physical characteristics at baseline

Baseline age, estrogen levels, BMD, and physical characteristics for women completing the 12 months of intervention are given in Table 1. There were no statistically significant (P<0.05) differences between EX and NEX groups compared within HRT and NHRT groups. Independent t-tests were also performed between HRT and NHRT groups (EX and NEX combined) at baseline. Women not using HRT were older (+1.6 years, P<0.01) than women using HRT and, as expected, HRT users had significantly higher levels of estrone and estradiol compared to women who did not use HRT. No other significant differences between HRT and NHRT were found at baseline. Estrogen levels at baseline, 6 months and 12 months were examined within groups using repeated measures ANOVA and no significant differences were found over time for any group.

Table 1. Baseline (mean±SD) physical characteristics for women completing the study. BMI body mass index, LST lean soft tissue, FN femur neck, FT femur trochanter, L2–L4 lumbar vertebrae, TB total body, BMD bone mineral density

Baseline characteristics of dropouts (n=54) were compared to subjects completing 12 months of intervention (n=266) using independent t-tests. Dropouts were 2 years younger than women who completed the study (53.6±4.5 versus 55.6±4.7 years; P<0.01). There were no significant differences in years past menopause, estrogen levels, height, weight, percent fat, lean soft tissue, or total body and regional BMDs.

Dietary intake

There were no significant differences in average nutrient intakes (energy, macronutrients and selected minerals and vitamins that affect bone) among groups at baseline when assessed from 3 days of diet records obtained just prior to entry into the study, nor over the 1 year of intervention when assessed from the 8 days of records.

Muscle strength

There were no differences in LIDO assessments of muscle strength between EX and NEX within HRT and NHRT groups at baseline. After 12 months of training, muscle strength had increased significantly (11–21%) (P<0.01) in exercising women while changes in the NEX groups were not significant, except for back extensor strength, which had increased by 8.0%. The strength gains were similar in women who used HRT and women who did not use HRT.

Similarly, baseline 1-RMs were not different between exercise groups. There were significant (P≤0.001) increases in strength in both groups, ranging from 27% to 71% for leg press (~71%), lat pulldown (~28%), rows (~27%), dumbbell press (~36%), and back extension (~41%). The increases were similar for exercising women within HRT and NHRT groups.

Changes in BMD

The changes in BMD are given in Table 2. Significant (P<0.01) increases in BMD were found at the femoral neck (1.5%), trochanter (2.1%), lumbar spine (0.8%) and total body (0.4%) for the HRT/EX group, lumbar spine (0.7%) and total body (0.4%) for the HRT/NEX group, and trochanter (1.2%) for the EX/NHRT group (Table 2). Total body (−0.3%) and femur neck (−0.4%) BMD decreased significantly (P<0.02) in women who did not exercise and did not use HRT. There was considerable inter-individual variation in response at all sites, for all groups, as shown for the trochanter in Fig. 1.

Table 2. Average changes in BMD (mean±SD) (g/cm2) from baseline to 12 months

Fig. 1. Change in femur trochanter BMD after 12 months of intervention. Bars are average changes and circles are individual responses

Multiple linear regression was used to compare the effect of exercise on BMD within HRT and NHRT groups while controlling for baseline BMD and the number of years past menopause. Baseline BMD was a significant predictor of the change in femoral neck BMD and years past menopause was a significant predictor of change in total body BMD. For both groups (HRT and NHRT), women who exercised increased trochanteric BMD significantly (P<0.05) more than women who did not exercise. There was also a significant (P<0.01) effect of HRT on femoral neck, lumbar spine (L2–L4), and total body BMD. On average, women who used HRT increased BMD and women who did not use HRT had smaller gains or lost BMD. In a secondary analysis, the interaction of exercise with HRT on change in BMD was tested, after controlling for baseline BMD and years past menopause. The interaction was not significant at any site, although it approached significance for total body BMD (P=0.052). The main effect of exercise, shown in Fig. 2, was significant at the femur neck and trochanter sites.

Fig. 2. Main effect of exercise on regional and total body BMD. Significant (P≤0.05) effects of exercise at femur neck and trochanteric sites

Discussion

Our results demonstrate that exercise increases femur trochanteric BMD in calcium-replete postmenopausal women who use HRT and in women who do not use HRT. The combination of exercise and HRT also resulted in significant increases in femoral neck, lumbar spine and total body BMD, although only the increase at the trochanter was significantly greater than the increase with HRT alone. Trochanteric BMD was also significantly increased in women who exercised and did not use HRT. Although the increases in BMD with exercise were modest (Fig. 2), they were comparable to changes reported in other studies of postmenopausal women [13,14,15,16,17] and overall effect sizes from recent meta-analyses [16,18,19].

Notelovitz et al. [9], Heikkinen et al. [7], and Kohrt et al. [8] have examined the effects of HRT and exercise on BMD, with conflicting results. Whereas Notelovitz and Kohrt reported synergistic and sometimes additive effects of HRT and exercise, Heikkinen found exercise did not enhance the effects of HRT on BMD. The results of these studies must be viewed with some caution since the sample sizes were quite small (n=9–13 per group), subject assignment to treatment groups was either non-random [8] or not explained [7], and the duration of HRT use was variable [9]. Despite different exercise regimens, Notelovitz et al. and Kohrt et al. reported remarkably similar increases in total body (~2%) and lumbar spine (7–8%) BMD as a result of exercise and HRT. Although our finding of ~2 times greater change in trochanteric BMD in women who use HRT and exercise versus HRT alone is comparable to the effect observed in the Kohrt study, albeit at a different site, the interaction of exercise with HRT was not significant. Thus, our findings do not support a synergistic effect of HRT and exercise on BMD.

The reasons for the conflicting findings are not clear. Undoubtedly, there are a number of factors that modify the effects of exercise on bone in addition to HRT and contribute to differences among studies. For example, the participants' ages, years past menopause, years on HRT, body mass, diet, initial BMD, and exercise type and intensity may all be important. The women in the Kohrt et al. [8] study were ~10 years older than the women in the present study, had been menopausal longer (≥10 years past menopause compared to 3–10 years in the present study), and had considerably lower initial BMDs (15–25% lower) even after adjusting for known bias among densitometers. If it is true that the bone response is greater in inactive persons with lower BMD, then the larger increase in BMD in the Kohrt study is reasonable to expect. There was also a difference in duration of HRT. The BEST study participants had been on HRT for 1–5.9 years, whereas Kohrt's subjects initiated HRT at the onset of the study and it is possible that HRT is more effective in the early stages of treatment [20]. Unfortunately, the duration of HRT use in the two treatment groups was not reported in the Notelovitz study and the effect of exercise without HRT was not evaluated. Thus, it is not possible to know to what extent differences in HRT use might have contributed to variable changes in BMD in the treatment groups. Moreover, Notelovitz et al. [9] estimated changes in spine BMD from the spine region of total body DPA scans which may be problematic since regional assessments from whole-body scans are considerably less precise than targeted regional scans and longitudinal assessments from DPA can be inflated by isotope decay without appropriate precautions.

The different findings may also be due in part to the different exercise regimens. Given the inclusion of exercises that involved both ground-reaction forces (GRF) and joint-reaction forces (JRF) in our program, it is not surprising that the greatest BMD response was observed at the trochanteric region since several of the resistance exercises recruit muscles with attachments in the trochanteric region. The combination of GRF from weight-bearing exercise with the direct pull of muscles on the bone during resistance exercise presumably would be more effective than GRF alone. The greater response at the trochanter in comparison to the femoral neck in both exercise groups supports this view.

An important design limitation has been the failure of many studies to randomly assign participants to treatment groups [21]. In this regard, Kelly [16] reported a trend for non-randomized trials to yield more positive results than randomized trials, suggesting that nonrandomized trials tend to overestimate the bone response. In a later analysis of only randomized trials, Kelly [18] found resistance training, and not aerobic activity, was associated with significant improvements in BMD. However, when only the effects for sites most likely to respond were analyzed, the treatment effect for aerobic training (1.6%) was greater than resistance training (0.6%), although both were significant. In perhaps the only study to directly compare exercise involving either ground-reaction forces (walking, jogging and climbing stairs) or joint-reaction forces (weightlifting and rowing), Kohrt et al. [22] found both types of programs resulted in significant and similar increases in BMD of the whole body (1.6–2.0%) and lumbar spine (1.5–1.8%) in older postmenopausal women, while femoral neck BMD responded only to GRF. Thus, it seems likely that it is the load on bone and not the specific activity that provides the osteogenic stimulus.

A strength of the aforementioned meta-analytic approach is the ability to assess the importance of covariates that may modify the response to exercise. The results from available meta-analyses provide surprisingly little support for important influences by covariates such as participant age, menopausal status, initial BMD, and program characteristics such as type, intensity, and duration of training [16,18,19] despite the common belief that these variables are likely to modify the BMD response to exercise. The findings from these analyses must be interpreted cautiously due to the relatively small number of studies included in most meta-analyses combined with the considerable heterogeneity in program design, although it is noteworthy that initial BMD was a significant predictor of the BMD response at only one site (femoral neck) in the present study.

Clearly, the increases in BMD with exercise are modest. However, when evaluated in the context of the typical rate of loss of BMD during the 5–7 years following menopause, any increase in BMD would seem important, especially if it was maintained. The results of a recent study [23] suggest postmenopausal women who exercise regularly with weighted vests over 5 years are better able to maintain bone mass than women who do not exercise. While this is one of the first studies to show an effect of long-term exercise on hip BMD, it was a small study including only 18 women. Additional studies of long duration exercise are needed to define the volume and intensity of exercise needed to maintain BMD and to assess whether further increases are possible with ongoing training. Studies of long duration exercise are also needed to identify differences between women who respond to exercise with increases in BMD versus non-responders.

In conclusion, the results of this study demonstrate that regional BMD can be improved with resistance exercise combined with aerobic, weight-bearing activity at clinically relevant sites in postmenopausal women who consume adequate calcium. Sites exposed to both GRF and JRF (e.g. trochanter) generally improved more than sites stressed by only GRF (e.g. femoral neck). Thus, a program of exercises that generate both GRF and JRF may be most effective for osteoporosis prevention. Significant increases in BMD were observed at more sites in women who used HRT than in women who did not use HRT. However, the interaction of exercise with HRT was not significant, and our results do not support a synergistic effect of HRT with exercise on BMD.