Introduction

Breastfeeding is widely recognized as beneficial to the well-being of mothers and children [1, 2]. However, according to a review by Kovacs, lactation is associated with significant temporary bone loss and increased bone turnover markers, especially during the exclusive breastfeeding period [3]. High levels of prolactin cause prolonged suppression of the hypothalamic-pituitary-ovarian axis, amenorrhea, and consequent hypo-estrogenemia [4]. In addition, other factors, such as higher parathyroid hormone–related protein (PTH-rP) serum levels and lower efficiency of intestinal calcium absorption, may contribute to a higher bone resorption rate [5].

As previously described by our group [6], pregnancy and lactation-induced osteoporosis (PLIO) is a rare complication related to substantial trabecular bone loss and fragility fractures, mainly spine fractures in the first weeks of lactation, whereas the cortical bone is relatively spared in this period.

Several studies used bone mineral density (BMD) measurements by dual X-ray absorptiometry (DXA) to evaluate recovery from bone loss after lactation, with conflicting data: complete recovery (studies 4 [7], 5 [8], 6 [9], 9 [10], 16 [11], 18 [12], 19 [13], 20 [14], 23 [15], 24 [16], 25 [17], 26 [18], 28 [19], 29 [20], 31 [21], and 32 [22]), incomplete recovery (studies 3 [23], 7 [24], 10 [25], 11 [26], 13 [4], 15 [27], and 17 [28]), or tendency to recovery (studies 8 [29], 12 [30], 14 [31], 21 [32], 22 [33], and 27 [26]). The lack of information regarding the bone recovery rate according to the return of menses and/or weaning, as well as the use of other methodologies, including quantitative ultrasound (QUS) (studies 16 [11], 19 [13], and 30 [34]) and high-resolution peripheral quantitative computed tomography (HR-pQCT) (studies 31 [21] and 32 [22]), are some confounding factors related to these controversial data.

It is worth highlighting the methodological heterogeneity among these studies, including follow-up timing, methods and skeletal sites of the bone measurements, and body composition changes. In the current literature, there are two systematic reviews focused on lactation-related bone loss [35, 36]. However, these reviews make no mention of the newer methods for bone loss evaluation, including HR-pQCT and hip structural analysis (HSA), or of body composition data. In addition, very few prospective studies were evaluated, and none of them included adolescents. Thus, the aim of this study was to perform a systematic review and a meta-analysis to evaluate the bone mass recovery rate after lactation-related loss.

Methods

Search strategy and selection criteria

Systematic review

In the first phase of the study, four researchers (ACJA, CMDA, FMFG, and RBP; designated as group 1) performed a search to define the set of Medical Subject Headings (MeSH) terms, including “bone diseases,” “bone resorption,” “bone density,” “osteoporosis,” “calcium,” “postpartum period,” “weaning,” “breast feeding,” and “lactation,” as well as related entry terms. The complete search strategy, performed on July 21, 2017, is shown in Electronic supplementary material 1 (EMS 1).

Secondly, the four researchers screened the selected papers by reading titles and abstracts in an independent and blinded manner. Disagreements were resolved by three different experts (SMP, MDBC, and MMP; designated as group 2) using the validated MeSH terms. In the third phase, the full-text articles were randomly distributed among the group 1 researchers. In case of disagreement, the inclusion or exclusion of a paper was decided by the experts of group 2. The papers selected by group 1 were then distributed to the three independent readers of group 2 (MDBC, MMP, and SMP) for confirmation. Full agreement between group 1 and group 2 was necessary to define the final selection of papers. Additional references from the original articles were surveyed to identify other publications of interest.

Eligibility and exclusion criteria

The following inclusion criteria were considered: prospective human studies in women of reproductive age; no other clinical conditions or concomitant diseases; and no drugs or pharmacological interventions interfering with bone measurements. In addition, two bone measurements in the postpartum period were required: the first within the first weeks of lactation and the second 12 months after delivery, 3 months after the return of menses, or 3 months postweaning.

The exclusion criteria were as follows: articles based on the same database; studies that evaluated only pregnancy and not the postpartum period, as well as those that measured calcium or bone turnover markers; studies that evaluated parity or lactation at menopause; reviews; case reports; short communications; book chapters; comments to the editor; letters; interviews; guidelines; and publications with errata.

Meta-analysis

The eligibility criteria to be included in the meta-analysis were as follows:

  1. BMD measurements using DXA methodology, in g/cm2;

  2. Mean spine, hip, forearm, or whole body BMD values with standard deviation (or, failing that, confidence intervals enabling calculation of the standard deviation).

Information sources

The following electronic databases were used: PubMed, Web of Science, and Scopus. Only full-text articles in English, French, Portuguese, or Spanish were included for review, and there were no restrictions regarding publication date.

Data extraction

The reviewers in group 1, divided into two pairs (RBP and ACJA; FMFG and CMDA), independently conducted data extraction, and the experts resolved disagreements. General characteristics of the studies were collected, such as the year of publication, authors, city and country in which the study was performed, method and chronology of the bone assessment, sites evaluated, sample size, follow-up time, breastfeeding categories, inclusion and exclusion criteria, age, and main conclusions.

Protocol and registration

This systematic review was performed in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [37, 38]. Additionally, it was registered in the PROSPERO database (https://www.crd.york.ac.uk/prospero/) (number: CRD42018096586).

Data analysis

Meta-analysis for lumbar spine and femoral neck BMD was performed using Stata (version 12.0, Stata Corporation, College Station, TX, USA). Fixed- and random-effects models were used; if heterogeneity was high, the random-effects model was chosen. The statistical heterogeneity of the studies was assessed by Cochran’s Q test, with p < 0.05 as the indicator of significance. The inconsistency across publications was rated by the Higgins and Thompson I2 statistic, with 50% or higher regarded as significant [39]. When the random-effects model was applied, the DerSimonian-Laird method was used to weight each study [40]. In addition, the weighted mean difference (WMD) was used as the effect estimate measure [41]. Publication bias risk was assessed by the methods of Begg et al. [42] and Egger et al. [43], with significance set at P < 0.05.
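The pooling steps named in this paragraph can be sketched in a few lines of Python. This is only an illustrative sketch, not the Stata procedure the authors ran: it assumes each study contributes a mean difference and its standard error, the function name `pool_mean_differences` is hypothetical, and the input values in the usage lines are invented for illustration. The p value for Cochran’s Q (a chi-square test with k − 1 degrees of freedom) is omitted because it requires a chi-square distribution function outside the standard library.

```python
import math


def pool_mean_differences(effects, std_errors):
    """Pool per-study mean differences with inverse-variance weights.

    Returns the fixed-effect and DerSimonian-Laird random-effects estimates,
    Cochran's Q, the I^2 statistic (percent), and tau^2.
    """
    k = len(effects)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two studies with matching inputs")

    # Fixed-effect model: each study is weighted by 1 / variance.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    fixed_se = math.sqrt(1.0 / sum_w)

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1

    # Higgins and Thompson I^2, truncated at zero.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random-effects model: tau^2 is added to every study's variance.
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    sum_w_re = sum(w_re)
    random_effect = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum_w_re
    random_se = math.sqrt(1.0 / sum_w_re)

    return {
        "fixed": fixed, "fixed_se": fixed_se,
        "random": random_effect, "random_se": random_se,
        "q": q, "i2": i2, "tau2": tau2,
    }


# Invented example values (g/cm2), NOT taken from the included studies.
example = pool_mean_differences([0.05, 0.07, 0.08], [0.02, 0.02, 0.04])
print(example["fixed"], example["i2"], example["tau2"])
```

When Q does not exceed its degrees of freedom, both I2 and the between-study variance are truncated to zero and the two models give the same pooled estimate, which is the situation in which a fixed-effect model is usually reported.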

Obtaining standard deviations (SD) for the calculation of the meta-analysis

The SD was obtained from the confidence interval, following the calculations below:

The confidence interval to standard error

As the confidence interval (CI) was 95%, the standard error (SE) was calculated as:

$$ \mathrm{SE}=\left(\mathrm{upper}\ \mathrm{limit}\ \mathrm{for}\ \mathrm{CI}-\mathrm{lower}\ \mathrm{limit}\ \mathrm{for}\ \mathrm{CI}\right)/3.92 $$

From SE to SD:

The SD was obtained from the SE by multiplying it by the square root of the sample size (n):

$$ \mathrm{SD}=\mathrm{SE}\times \sqrt{n} $$

The calculations were checked using the Cochrane calculator tool, available at https://training.cochrane.org/resource/revman-calculator.
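The two conversions above can be combined in a short Python sketch. The function name `sd_from_ci` and the example numbers are hypothetical; the divisor 3.92 (2 × 1.96) assumes a 95% CI built from the normal distribution, so for small samples a t-based multiplier may be more appropriate.

```python
import math


def sd_from_ci(lower, upper, n, z=1.96):
    """Recover the SD of a sample mean from its confidence interval.

    SE = (upper - lower) / (2 * z), then SD = SE * sqrt(n).
    z = 1.96 corresponds to a 95% CI (2 * 1.96 = 3.92).
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    if upper < lower:
        raise ValueError("upper limit must not be below lower limit")
    se = (upper - lower) / (2 * z)
    return se * math.sqrt(n)


# Invented example: mean BMD with 95% CI 0.95 to 1.05 g/cm2 in 25 women.
print(round(sd_from_ci(0.95, 1.05, 25), 4))
```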

From Range to SD:

The sample SD was estimated from the range according to Hozo et al. (2005) [44], using the online calculator (http://vassarstats.net/median_range.html).
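For orientation, the range-based estimate can be sketched as follows. The sample-size cut-offs and the small-sample formula are written here as the Hozo et al. rules are commonly summarized; they are an assumption of this sketch and should be checked against the original paper [44] before use. The function name `sd_from_range` and the example values are hypothetical.

```python
import math


def sd_from_range(minimum, median, maximum, n):
    """Estimate the SD from minimum, median, maximum, and sample size.

    Rules as commonly summarized from Hozo et al. (2005):
      n <= 15      : variance formula using min, median, and max
      15 < n <= 70 : range / 4
      n > 70       : range / 6
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    if not minimum <= median <= maximum:
        raise ValueError("expected minimum <= median <= maximum")

    spread = maximum - minimum
    if n <= 15:
        variance = ((minimum - 2 * median + maximum) ** 2 / 4 + spread ** 2) / 12
        return math.sqrt(variance)
    if n <= 70:
        return spread / 4
    return spread / 6


# Invented example: BMD ranging from 0.8 to 1.2 g/cm2, median 1.0, 40 women.
print(round(sd_from_range(0.8, 1.0, 1.2, 40), 3))
```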

Study quality assessment

The Newcastle-Ottawa scale (NOS) was used for assessing the quality of non-randomized trials [45]. The NOS assigns a maximum of four stars for selection, two stars for comparability, and three stars for exposure or outcome. NOS scores of seven to nine indicated high-quality studies, while scores of five to six indicated moderate-quality studies [45] (Electronic supplementary material 2, EMS 2).

Bias risk

The references of selected papers were searched manually, and experts’ suggestions were sought through email communications by group 1. This approach was highly relevant because it allowed the identification of publications that were not found in the database searches according to the descriptors and predefined search strategies.

Results

A total of 9455 papers were found after applying the first strategy. Based on the title and abstract analysis, 8812 articles were not included. In addition, 189 were excluded as duplicates. Thus, 454 were retrieved for full-manuscript analysis. Of these, 32 were used for the systematic review and 7 of them for the meta-analysis (Fig. 1).
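As a quick consistency check on the screening counts reported above (no new data are introduced), the number of papers retained for full-manuscript analysis follows directly from the two exclusion steps:

$$ 9455-8812-189=454 $$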

Fig. 1 PRISMA flow diagram

The main findings of the selected papers are shown in Table 1. The publication period of the studies ranged from 1990 to 2016. Fifteen studies were conducted in the Americas: 6 in the USA (studies 3 [23], 4 [7], 5 [8], 9 [10], 10 [25], and 11 [26]), 4 in Mexico (studies 20 [14], 23 [15], 28 [19], and 29 [20]), 2 in Argentina (studies 21 [32] and 24 [16]), 2 in Brazil (studies 18 [12] and 26 [18]), and 1 in Chile (study 6 [9]). Thirteen studies were from Europe: 4 in the UK (studies 12 [30], 16 [11], 17 [28], and 25 [17]), 3 in Sweden (studies 14 [31], 19 [13], and 31 [21]), 2 in Denmark (studies 8 [29] and 27 [48]), and 1 each in Finland (study 13 [4]), Italy (study 7 [24]), Germany (study 30 [34]), and Hungary (study 15 [27]). Of the remaining studies, three were conducted in Australia (studies 1 [46], 2 [47], and 32 [22]) and one in India (study 22 [33]). There were 1605 lactating postpartum women in the case group, 103 women in the non-lactating postpartum control group, and 363 women in the non-pregnant non-lactating control group.

Table 1 Main findings of studies performed by using bone mass measurements in breastfeeding women

All studies included in this systematic review used DXA measurements, except three (studies 1 [46], 2 [47], and 30 [34]). Three studies used heel quantitative ultrasound (QUS) (studies 16 [11], 19 [13], and 30 [34]). Another three studies assessed women by single-photon absorptiometry (SPA) (studies 1 [46], 2 [47], and 9 [10]). Three others used quantitative computed tomography (QCT): one at the spine (study 10 [25]) and two at peripheral skeletal sites, namely the ultradistal tibia (studies 31 [21] and 32 [22]) and the radius (study 32 [22]). One study was performed by hip structural analysis (HSA) (study 25 [17]).

In studies with DXA or SPA, the lumbar spine was evaluated in 24 studies, the hip in 19, the forearm in 13, the whole body in 13, and the calcaneus in just one (Table 1). The number of patients per study ranged from 10 to 115 for the lactating postpartum case group, from 8 to 36 for the non-lactating postpartum control group, and from 16 to 57 for the non-pregnant non-lactating control group (Table 1).

Most of the studies using DXA or SPA showed transient bone loss with complete recovery or a tendency to recovery in all skeletal sites evaluated (studies 1 [46], 2 [47], 4 [7], 5 [8], 6 [9], 8 [29], 9 [10], 12 [30], 14 [32], 16 [11], 18 [12], 19 [13], 20 [14], 21 [32], 22 [33], 23 [15], 24 [16], 25 [17], 26 [18], 27 [48], 28 [19], 29 [20], 31 [21], and 32 [22]). The recovery was only partial in a few studies: at the femoral neck (studies 3 [23], 13 [4], and 17 [28]), forearm (studies 2 [47], 7 [24], and 15 [27]), whole body (studies 10 [25] and 11 [26]), spine (studies 7 [24] and 15 [27]), and total hip (study 17 [28]). Figure 2 shows the evolution of BMD measurements over time among all studies, without any stratification according to time of lactation. There was complete recovery of spine BMD measurements in all of them (Fig. 2a). Regarding femoral neck BMD measurements, there was a trend to recovery (Fig. 2b). All spine and femoral neck BMD measurements are available in Electronic supplementary material 3 (EMS 3).

Fig. 2 Lumbar spine (a) and femoral neck (b) BMD measurements over time. Sámano et al.¥: adolescent; Sámano et al.¥¥: adult; Pearson et al.*: bottle; Pearson et al.**: mixed; Pearson et al.***: breast

Comparing the final (after 12–18 months) and initial (postpartum) lumbar spine BMD measurements, this meta-analysis showed a significant mean difference (p < 0.001), with an overall combined WMD of 0.067 g/cm2 (95% CI 0.044–0.089) (Fig. 4a). The weighted mean difference in spine BMD measurements remained significant among the Latin American (p < 0.001), European (p = 0.02), and Asian (p = 0.03) studies (Fig. 5a). Of all the papers included, only seven were used for this second analysis, and they showed high homogeneity (fixed-effect model; I2 = 3.2%). Although the funnel plot was asymmetric, there was no publication bias according to Begg’s test (p = 0.386) and Egger’s test (p = 0.882) (Electronic supplementary material 4, EMS 4).

On the other hand, the comparison of femoral neck BMD measurements did not show any significant difference between the final and initial values (p = 0.323). The 5 papers included in this analysis were homogeneous and were analyzed using the fixed-effect model (I2 = 25.3%) (Fig. 4b). Regarding the publication bias risk, the funnel plot was slightly asymmetric (Electronic supplementary material 4, EMS 4), but there was no bias according to Begg’s test (p = 1.00) and Egger’s test (p = 0.184). However, in the analysis by geographic area, the Latin American studies showed a significant weighted mean difference in BMD measurements (WMD = 0.047 g/cm2; 95% CI 0.012–0.083; p = 0.01) (Fig. 5b).
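The pooled estimates and their confidence intervals reported in this section can be related to the p values with a short sketch. It assumes a normal approximation and a symmetric 95% CI, and the function name `z_and_p_from_ci` is hypothetical. Because the published limits are rounded to three decimals, the recomputed values are only approximate and are not a substitute for the Stata output.

```python
import math


def z_and_p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Return the z statistic and two-sided p value implied by a CI.

    The standard error is recovered as (upper - lower) / (2 * z_crit),
    and the p value comes from the standard normal distribution.
    """
    if upper <= lower:
        raise ValueError("upper limit must exceed lower limit")
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return z, p


# Values as reported in the text (g/cm2).
z_spine, p_spine = z_and_p_from_ci(0.067, 0.044, 0.089)   # lumbar spine
z_neck, p_neck = z_and_p_from_ci(0.011, -0.011, 0.032)    # femoral neck
print(f"lumbar spine: z = {z_spine:.2f}, p = {p_spine:.2g}")
print(f"femoral neck: z = {z_neck:.2f}, p = {p_neck:.2g}")
```

Under these assumptions the lumbar spine result lies well below 0.001, and the femoral neck result lands close to the reported p = 0.323, with the small gap attributable to rounding of the interval limits.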

One study that evaluated hip geometry (study 25 [17]) showed some changes, including reductions in cortical thickness and cross-sectional area (CSA), between 2 weeks after delivery and the peak of lactation. However, after weaning and adjusting for weight changes, there were no significant differences from 2 weeks postpartum. The QUS studies did not show any bone loss during the follow-up (studies 16 [11], 19 [13], and 30 [34]).

Regarding QCT studies, one of them (study 10 [25]) showed transient volumetric spine trabecular loss with complete recovery. On the other hand, an HR-pQCT study (study 31 [21]) demonstrated reductions in cortical vBMD and in cortical and trabecular thickness in the first 12 months postpartum in women lactating 4 months or longer. Also, the cortical vBMD and trabecular thickness were still lower than baseline values in women lactating 9 months or longer. Another study of the ultradistal tibia and radius (study 32 [22]) revealed an increase in cortical porosity, deterioration of matrix mineralization, and fewer trabeculae with greater separation among them.

Considering the different breastfeeding subgroups and the behavior of BMD measurements over time, there was an earlier tendency to recovery at the lumbar spine (Fig. 3a–c) than at the femoral neck (Fig. 3d–f), except in those with longer breastfeeding.

Fig. 3 BMD measurements according to different breastfeeding time subgroups: a, d less than 1 month; b, e from 1 to 6 months; and c, f more than 6 months of breastfeeding, at lumbar spine and femoral neck, respectively

Discussion

Our results showed transient trabecular bone loss during breastfeeding, with recovery or a tendency to recovery after weaning, when assessed and monitored by DXA and HR-pQCT measurements. However, cortical bone recovery can be delayed.

Several pathophysiological mechanisms underlie these findings during lactation, including hypoestrogenism and longer hypothalamic-pituitary-ovarian axis suppression related to higher prolactin [4] and parathyroid hormone–related protein (PTH-rP) serum levels, as well as lower efficiency of gut calcium absorption [5]. Approximately 200 mg of calcium is transferred daily through maternal milk to the newborn, and higher gut absorption of this ion is one of the most important homeostatic mechanisms to meet foetal needs [49]. However, much of the calcium in milk is supplied through bone resorption of the maternal skeleton because intestinal calcium absorption returns to pre-gestational levels while breastfeeding [50]. PTH-rP is a key mediator during lactation because its high concentrations may predict the magnitude and severity of bone loss, regardless of estradiol, intact PTH, and 25-OH-vitamin D serum levels [51, 52]. After the return of menses, bone loss tends to recover in the first months of lactation, mainly in relation to estrogen status; this effect is similar, but not analogous, to that of estrogen during puberty and is opposite to its role after menopause [53].

There was wide variability among outcomes regarding the different methodologies (SPA and DXA) used to measure BMD changes during lactation. Most of the studies reported “no changes” and a tendency to recover bone loss after breastfeeding (studies 1 [46], 4 [7], 5 [8], 6 [9], 9 [10], 16 [11], 18 [12], 19 [13], 20 [14], 23 [15], 24 [16], 25 [17], 29 [20], 31 [21], and 32 [22]). However, some residual effects of long-term breastfeeding were observed in some skeletal sites, especially cortical bone (studies 3 [23], 10 [25], 11 [26], 13 [4], and 17 [28]). It is worth emphasizing that, in the DXA studies, incomplete bone recovery was associated with longer lactation, insufficient sample size, or inadequate time to demonstrate a complete recovery (studies 2 [47], 3 [23], 7 [25], 10 [25], 11 [26], 13 [4], 15 [27], and 17 [28]). Our meta-analysis showed bone recovery at the spine 12–18 months after delivery in women who breastfed. Considering that the average lactation time among the included studies ranged from 6 to 13 months and the average follow-up was 15.3 months, this could explain the incomplete bone recovery at 6 months reported by most authors. According to Kovacs’s review [3], the bone loss is completely reversed 12 months after weaning. Although our meta-analysis did not show statistical significance for femoral neck BMD measurements (Fig. 4b), the geographic area analysis demonstrated bone recovery in Latin American studies (Fig. 5b). Some aspects could explain these differences, including latitude, sun exposure, and ethnic background [54].

Fig. 4 Forest plot showing the lumbar spine and femoral neck BMD measurements from delivery to 12–18 months postpartum in lactating women. a Significant homogeneity existed among the studies (I2 = 3.2%, p = 0.405); therefore, a fixed-effects model was applied to pool the data. The results showed a significant mean difference in lumbar spine BMD measurements between the assessments at baseline and at 12–18 months postpartum in lactating women (WMD, 0.067; 95% CI, 0.044–0.089 g/cm2; p < 0.001). b Significant homogeneity existed among the studies (I2 = 25.3%, p = 0.244); therefore, a fixed-effects model was applied to pool the data. The results showed no significant mean difference in femoral neck BMD measurements between the assessments at baseline and at 12–18 months postpartum in lactating women (WMD, 0.011; 95% CI, −0.011–0.032 g/cm2; p = 0.323). WMD, weighted mean difference; CI, confidence interval. Single asterisk, adolescent sample; double asterisks, adult sample

Fig. 5 Forest plot of the meta-analysis outcomes for spine (a) and femoral neck (b) BMD measurements from delivery to 12–18 months postpartum in lactating women, according to geographic area. Significant homogeneity existed among the studies (I2 = 3.2%, p = 0.405); therefore, a fixed-effects model was applied to pool the data. For the Latin American studies, the WMD was 0.047 (95% CI, 0.012–0.083 g/cm2; p = 0.01), with high homogeneity (I2 = 0.0%, p = 0.949). WMD, weighted mean difference; CI, confidence interval. Single asterisk, adolescent sample; double asterisks, adult sample

Another important point concerns the baseline BMD measurements in Kulkarni’s study (study 22 [33]), which were lower than those of the other studies. Several aspects may explain this, including the high number of adolescents, low daily calcium intake, and other dietary inadequacies. Altogether, they could be associated with impaired acquisition of peak bone mass. Similarly, More et al. (study 15 [27]) did not show recovery of spine BMD measurements after 12 months of follow-up in women who breastfed for more than 6 months. Possible explanations for these different results include the small sample size, amenorrhea lasting longer than 1 year, and insufficient follow-up time to demonstrate recovery of the bone loss.

The better performance (precision and accuracy) of DXA measurements at baseline and over time might explain these findings in relation to the QUS methodology. Different QUS systems measure different bone properties that are not closely related to bone mineral content (BMC) measured by DXA. Broadband ultrasound attenuation (BUA) measurements depend on the trabecular architecture of cancellous bone (qualitative aspects, such as separation and connectivity of trabeculae), which explains why QUS measurements are predictive of fracture risk in the elderly but are poor for monitoring bone changes in young subjects [11].

It is well established that higher body weight has a positive effect on BMD measurements owing to the bone-loading effect or better nutrient intake (studies 22 [33] and 25 [17]). In pregnant or lactating women, the hip and spine BMD changes were attenuated after adjusting for weight changes, suggesting that body composition changes could be another confounding factor [55].

The HR-pQCT findings reported by Brembeck et al. [21] after 18 months of follow-up are in agreement with the findings previously described, as peripheral skeletal deficits remained only in women who breastfed for 9 months or more. Most likely, a longer follow-up time would demonstrate full recovery of the bone loss. In contrast, Bjornerem et al. [22] found permanent residual deficits after 3.6 years of follow-up, with greater cortical porosity, fewer trabeculae, and lower matrix mineralization. The bone loss was not reversed during the 2.6 years of follow-up after the end of lactation and 3 years after resumption of regular menses. The sample size was modest when compared with Brembeck et al. [21] (58 vs. 81 women, respectively), and the follow-up duration may have been insufficient to detect reversal of the changes at peripheral skeletal sites, including impaired matrix mineralization and a higher bone remodeling rate.

Another important question addressed by our systematic review was the hip structural geometry analysis. Laskey et al. showed a reduction of the neck and intertrochanteric transverse diameter and cortical thickness during the peak of lactation (study 25 [17]). Although these changes may induce greater susceptibility to axial overload and a higher fragility fracture rate, they were transient and reversible after 12 months of follow-up. Other studies with postmenopausal women [56, 57] have found that the cross-sectional diameter of the femur increased and that cortical bone area was restored or increased when women who have had one or more children were compared with nulliparas. It is likely that Laskey et al. [17] did not find any structural bone modifications because of insufficient follow-up time. The combination of bone microarchitecture and geometry changes, especially the increase in femur transverse diameter, could be a compensatory and adaptive mechanism for maintaining bone resistance in women who breastfed for a longer time [50]. Thus, these data are in accordance with Kovacs’s review [3], which addressed the neutral or protective role of lactation regarding bone health measurements, as well as the low fragility fracture risk in the medium and long term.

The only study that evaluated patients after a new gestation within 18 months of the last lactation reported no difference 18 months after the second gestation when compared with controls without a new gestation (study 5 [8]). There are potential mechanisms to explain why women who became pregnant reached or exceeded the baseline BMD values in spite of higher calcium demand (lactation and subsequent pregnancy): (1) early reestablishment of ovulation and a consequent recovery of estrogen status, (2) factors related to the new gestation itself, such as increased intestinal calcium absorption and the estrogen levels achieved during the third trimester, and (3) body weight gain and additional loading of the maternal skeleton (study 5 [8]).

Interestingly, 6 studies evaluated adolescent mothers (studies 18 [12], 20 [14], 21 [32], 23 [15], 28 [19], and 29 [20]). All of them demonstrated a tendency to complete recovery of bone loss, in accordance with NHANES data from 819 women aged 20–25 years [58], suggesting that adolescent pregnancy had no negative impact on the acquisition of peak bone mass [3].

Our study has some limitations. Firstly, most of the studies took into account chronological time and not the return of menses and/or weaning. A follow-up evaluation based on these two parameters would be much more reliable than a pre-established follow-up time. Secondly, several studies did not provide detailed BMD measurement data, which prevented a meta-analysis with more papers, i.e., with greater strength of evidence. Thirdly, there are no studies with bone histomorphometry, and the studies using HR-pQCT and HSA were scarce. Furthermore, the sample size of each study was relatively small, although the pooled data were more substantial.

Summary and conclusion

Although some patients can experience spinal fractures, a rare and impressive event related to pregnancy and lactation, this systematic review showed that lactation is associated with transient trabecular and cortical bone loss at axial and peripheral skeletal sites, depending on the return of regular menses and weaning. In most women, complete bone recovery occurred after lactation. Some microarchitectural deterioration of peripheral sites, such as the radius and ultradistal tibia, may occur after prolonged breastfeeding, although without hip geometry damage.

More prospective studies, with larger samples, longer follow-up, and other methodologies to detect bone fragility more accurately, are necessary to demonstrate the role of the resumption of menses and of weaning in the complete recovery of bone health status.