Researchers in the physical and social sciences have long considered important the influence of childbearing and rearing on long-term health outcomes for both parents and children. In the social sciences, interest in childbearing has increased due to changes in childbearing patterns, including parity, over the past few decades. Many couples today compared to prior decades, especially in developed countries, delay childbearing in pursuit of education and career opportunities, and more couples have fewer or no children (Martin 2004). Scholars argue that childbearing and rearing is an integral part of the life course, wherein parents’ and children’s lives are linked and interdependent (e.g., Elder and Glen 1998; Umberson et al. 2010). One component of these ‘linked lives’ is the extent to which the number of children adults bear and rear influences their long-term outcomes, particularly longevity.

The extant literature suggests that parity is associated with parental health (e.g., Högberg and Wall 1986; Penn and Smith 2007), but there is little consensus across disciplines regarding the mechanisms underlying, or the direction of, the parity-health association. Some hypotheses suggest a negative association between parity and health and longevity; some suggest a positive association, and others suggest that the association between parity and health is nonlinear (Dior et al. 2013; Doblhammer 2000; Green et al. 1988; Jaffe et al. 2009).

This gap in the literature has been addressed, at least in part, by Zeng et al. (2016), who build upon an earlier conference draft of our present paper (Moore et al. 2014). We extend this work by using meta-analysis and meta-regression techniques to examine the direction and form of the association between parity and parental health (measured as all-cause mortality) while directly controlling for important moderating factors (i.e., sex, age, and time period, as well as health selection). Moreover, our examination builds upon the small body of literature positing that the parity-health association forms a J-curve, or is nonlinear. Unlike prior studies (excepting Zeng et al. 2016), which focus on a specific country or a comparison between two countries, our data include multiple national contexts from 1946 to the present (N = 37 studies) and span a range of disciplines, theoretical perspectives, and methodological approaches. Our analytic approach allows us to extend the literature by using results from multiple existing studies (many of which do not directly address the form of the association) to examine the J-curve hypothesis. Consistent with Zeng et al. (2016), our results suggest that there is a significant J-shaped curve association between parity and all-cause parental mortality.

Background

Theoretical Perspectives

Scholarship on the association between parity and parental health spans a range of disciplinary perspectives and methodological approaches. We focus on negative and positive influences of parity on parental health from evolutionary/biological, biomedical, and sociological perspectives. In general, evolutionary/biological perspectives emphasize the direct physical repercussions of pregnancy and childrearing on the aging process. Biomedical theories generally emphasize the indirect physical links between parity and health, attending to the onset of chronic physical and mental conditions. Sociological perspectives (particularly the life course perspective), on the other hand, underscore the collective experiences of parents and children over time and the potential consequences of those experiences (e.g., Elder and Glen 1998).

Several studies underscore the negative association between parity and parental health. As we have noted, evolutionary and biological theoretical perspectives generally emphasize the direct physical repercussions of pregnancy and childrearing on parents. For example, both Disposable Soma Theory of Aging (Alter et al. 2007; Dribe 2004) and Evolution of Senescence Theory (Doblhammer 2000; Hurt et al. 2006b) consider the metabolic and physiological trade-offs between parity and mortality. These theories posit that those with higher parity invest more in fertility than in the biological resources that help maintain the body (i.e., cell reparation). Disposable Soma Theory in particular implies that one invests in either fertility or longevity, with each of these coming at the expense of the other. Thus, increasing parity accelerates the aging process (e.g., see Le Bourg 2001). Some scholars suggest that the influence of parity on aging depends on the timing of childbirth (Alter et al. 2007), and recent studies suggest that younger and older ages at first birth among women are associated with an increased risk of cause-specific and all-cause mortality (Barclay et al. 2016; Grundy and Kravdal 2010). Using Swedish register data, for example, Barclay et al. (2016) find a strong association between older ages at first birth (i.e., age 30 or older) and the onset of breast cancer.

Biomedical theories generally emphasize the indirect physical links between parity and overall health by examining how parity affects the onset of chronic diseases (e.g., Alter et al. 2007; Read et al. 2011). For example, the hormonal fluctuations that mothers experience during gestation, delivery, and lactation may be associated with an increased susceptibility to cancer, cardiovascular disease, and diabetes (Alter et al. 2007; Daling et al. 2002; Henretta 2007; Hurt et al. 2006b). The increased exposure to hormonal fluctuations which comes with increasing parity implies an increased risk of disease onset. Additional research suggests that parity increases susceptibility to both infections and depression (Grundy and Kravdal 2008).

In addition to physical risk factors, higher parity itself may be a source of stress, which is negatively associated with physical and mental health. More children may require increased emotional, physical, and financial investments (for a review, see Thoits 2010) which may elevate stress, especially when children are close in age (Alter et al. 2007). Alter et al. (2007) further suggest that having many children may reduce parents’ attention to their own needs. This argument is consistent with maternal depletion models, which suggest that good nutrition is hard to come by among women with many children as healthier foods tend to be expensive.

Studies have also underscored the positive association between parity and parental health. A small number of evolutionary/biological studies suggest that having more children might contribute positively to parental health. Read et al. (2011), for example, found that nulliparous women are at a greater risk of inactivity as a result of poor health compared to women with children. Child-motivated physical activity may have a positive impact on individual health across multiple generations, as is suggested by the adage about grandparents that grandchildren help to “keep them young.”

The majority of studies which find a positive association between parity and health are sociological. One possible social mechanism linking parity to health is that the presence of children strengthens social control processes. Parents’ heightened sense of responsibility and commitment to their family may mean they take fewer health risks, such as using or abusing drugs and alcohol (e.g., Chilcoat and Breslau 1996; Paradis 2011; Umberson 1987). New parents may quit smoking cigarettes to protect their children against the dangers of second-hand smoke and they may also adopt healthier diets and avoid reckless driving. If parents do not endogenously adopt a more restrictive lifestyle for themselves, they may feel normative pressures to do so from relatives, friends, and even strangers.

A second possible social mechanism linking parity to health is that larger families often provide greater access to emotional and instrumental social support (Alter et al. 2007). In particular, adult children may provide important support to their parent(s), buffering health concerns associated with aging (Stein et al. 1998), and this support may increase as parity increases. Parents with multiple children may have at least one child from whom they receive emotional and instrumental support in times of need. Indeed, Stein et al. (1998) found that adult children, particularly when they are younger, feel a sense of obligation to provide support to their parents. In addition, Ishii-Kuntz and Seccombe (1989) found that parenthood was associated with increased access to social support from friends and neighbors who perceive a greater opportunity or need to help. Similarly, disadvantaged mothers have made beneficial contacts through their children’s formal child care centers (Small 2009) and middle class parents have garnered social support through their adolescent children’s social activities (Offer and Schneider 2007). Nulliparous or even low parous adults may have less access to support, and consequently more difficulty with aging.

A third possible social mechanism linking parity to mortality positively centers on the normative pressures associated with both having a prescribed number of children (e.g., 2 or 2.5 per household) and having any children. Those who have fewer children than is normatively prescribed may experience role strain (Goode 1960), while parity consistent with normative expectations may reduce the likelihood of role strain and associated negative health implications. Thornton and Young-DeMarco (2001), for example, argue that gendered, normative family structures were historically central in Western societies and non-normative family behaviors resulted in sanctions.

While having significantly fewer/greater children than is the norm may induce some degree of role strain, remaining childless may be viewed as a more serious norm violation. These differences may be more or less pronounced in Western and Eastern countries given social differences in things like religious beliefs. For example, childlessness (for different reasons) has increased in Western countries over the last few decades. In the U.S., childlessness has increased from 10% in 1976 to 18% in 2002 among women aged 40–44 (Abma and Martinez 2006). In China, on the other hand, marriage and childlessness has not increased, albeit parity decreased significantly in the 1990s following the implementation of the one-child policy. In other Eastern countries like Japan, Korea, and Taiwan, however, marriage and childlessness is projected to increase (Raymo et al. 2015). Even so, some research suggests that the stigma associated with childlessness remains. For example, qualitative interviews with both men and women show that childless adults voluntarily use strategies to negotiate positive identities that help to stave off the stigma associated with their childbearing choices (Park 2002). To the extent that normative pressures associated with bearing children increase strain felt by individuals and couples, parity may reduce or eliminate negative health consequences associated with this type of strain.

Finally, social-psychological processes may underlie the link between parity and health. That is, parous adults may feel joy and a sense of fulfillment from rearing children. Edin and Kefalas (2005), for example, show that lower-income, unmarried women choose to have children (prior to marriage, which low-income women view as financially unattainable) because children give them a sense of purpose and meaning. On the other hand, other research suggests that the context (i.e., marriage, cohabitation, divorce, widowhood, or having never married) in which adults are childless is also associated with wellbeing. For example, never married childless women may be more active later in life (often associated with psychological health, see Penedo and Dahn 2005), while formerly married men may be more likely to have health problems (see Umberson et al. 2010 for a review). Indeed, Umberson et al. (2010) argue that the link between childlessness and adult wellbeing is complex and that the long-term effects of childlessness on adult health and wellbeing may be positive for some social groups and negative for others.

As we briefly noted, some studies suggest that the parity-health association may be nonlinear (e.g., Dior et al. 2013; Doblhammer 2000; Green et al. 1988; Jaffe et al. 2009). Specifically, adults with none, few, and many children may experience more adverse health outcomes than adults with a moderate number of children. This so-called J-curve hypothesis derives from the interplay between the positive and negative consequences of childbearing. For nulliparous adults, the direct and indirect negative physical consequences of childbearing and childrearing are completely absent, and the full complement of economic, physical, and emotional resources remains available for personal or couple use. However, social support levels may be very low, particularly as adults age, which itself tends to impact health negatively (e.g., House et al. 1988).

Parents with very large families may find support from children to be readily available, but there is also an increased likelihood of economic, physical, and (potentially) emotional depletion (Alter et al. 2007) and/or that parents experience a negative relationship with at least one of their children (see Umberson et al. 2010). In contrast, parents with a moderate number of children may be best situated in terms of long-term health. Support is more likely to be readily available (though not necessarily maximized), and though substantial resources are likely dedicated to children, resources are less likely to be as depleted. All else being equal, this combination of resources and social support may ease the aging process, potentially reduce the risk of adverse health outcomes, and buffer against early mortality (particularly among the unhealthy).

A small number of empirical examinations support the J-curve hypothesis. For example, using data from England, Wales, and Austria, Doblhammer (2000) found that childless women and those who had high parity (4 or more children) compared to those who had lower parity (1 or 2 children) were at a significantly increased risk of all-cause mortality. Green et al. (1988) found similar results using data from the Office of Population Censuses and Surveys in England. Jaffe et al. (2009) also found support for the J-curve hypothesis with respect to all-cause mortality among both men and women, even after controlling for age, socioeconomic status, and other demographic characteristics. Finally, examining cause-specific mortality, Dior et al. (2013) found that mothers who had one child and those who had between 5 and 10 children were at a significantly increased risk of mortality from cancer, circulatory disease, and heart disease among others. In fact, even after adjusting for health conditions and lifestyle choices such as smoking, mothers in the low and high parity groups had between an 18 and 49% increased risk of mortality from cause-specific diseases. Overall, however, the literature directly addressing the J-curve association is limited and warrants further investigation.

Moderating Factors

Prior research suggests that the association between parity and parental health may vary in magnitude and direction depending on the population. While biological and biosocial factors can be intuitively linked to maternal mortality because women bear children and remain the primary caretakers of children (see Casper and Bianchi 2002), the link between parity and mortality seems less intuitive for fathers. Research suggests that mothers are more likely than fathers to cultivate personal relationships and reap the health benefits of social support (e.g., Barefoot et al. 2005), thus implying that fathers may be insulated from these benefits. In addition, Penn and Smith (2007) find that women pay a higher health cost for fertility compared to men, especially as women age.

Some research suggests, however, that parity also impacts the health of men. Childless single men were significantly more likely to die early as a result of accidents, suicide, and other forms of violence and were generally more likely to abuse drugs and/or be violent compared to men who were not childless and single (Weitoft et al. 2004). On the other hand, it is difficult to rule out selection here because men who do not have children, for whatever reason, may be the men who are also more likely to engage in unhealthy behaviors. In addition, while higher-SES men may be less impacted by the presence of children, lower-SES men experience more of the consequences of parenthood (Keizer et al. 2011). Others have found that the health processes associated with childrearing look similar for both men and women (e.g., Jaffe et al. 2009).

Given that access to social support may underlie the association between parity and health, we also consider parents’ age as a potential moderator (e.g., Högberg and Wall 1986). Again, access to support may be particularly important at older ages (e.g., Avlund et al. 1998), where adult children often provide help with basic daily needs and with accessing routine medical care. Adult children and their children may also help to provide a sense of purpose, as aging parents often retain an important role in the rearing of grandchildren (e.g., King et al. 1997). Therefore, we might expect to find a stronger relationship between parity and health among older parents, as younger parents may not require as much support.

A third potential moderator is parents’ socioeconomic status. Numerous studies (e.g., Alter et al. 2007; Dribe 2004; Grundy and Kravdal 2008; Jaffe et al. 2009) note that the physiological risk factors associated with higher parity are exacerbated among parents who have few resources and live in lower versus higher socioeconomic environments. Lower SES parents may be less able to purchase nutritious foods during and following pregnancy and may have limited access to stress-reducing and health-improving resources and activities (e.g., time to exercise). Families from a lower socioeconomic background are also at an increased risk of experiencing depression and related illness (Lorant et al. 2003).

In addition to parent’s demographic characteristics, one might expect the strength of the association between parity and parental health to vary over time (i.e., across age cohorts). Historically, not only were adults expected to bear and rear children (Thornton and Young-DeMarco 2001), the social structures in Western societies meant that children and parents would live close to one another and work together in meeting basic needs. The increasing distance between parents and children over time may have weakened intergenerational relationships (e.g., Hank 2007; Litwak 1960; Parsons 1943). Moreover, in the past, families tended to rely more closely on children for the provision of labor and/or other key services, which conceivably linked parity more strongly to health outcomes. However, the reverse is also possible. Putnam (2000) and others have argued that, in previous periods, individuals could rely on a more diverse range of sources for emotional and instrumental support. If that is indeed the case, children’s support may be more important today and we may expect the parity-health association to be stronger.

It is worth noting here that social support from people other than one’s children can have a potentially complementary role for health outcomes; in extreme cases, one might argue that a lack of support from children may be offset by strong support from close friends or other relatives (see Shor and Roelfs 2013). Unfortunately, due to data limitations in our sample of studies, we are unable to examine the role of social support across family and other social contexts.

Selection Effects

Selection effects may account, at least in part, for the relationship between parity and all-cause parental mortality (Alter et al. 2007; Hurt et al. 2006b). First, socioeconomic selection effects are likely to be strong, since poorer adults are more likely to smoke and drink heavily, less able to invest in a healthier lifestyle (e.g., exercise, diet), and more likely to have large families. In addition to less access to nutrition and stress-reducing resources, lower socioeconomic status groups are more likely to have high parity and more likely to be at an increased risk of mortality (Hoffman 2005; Musick et al. 2009). Higher SES groups, on the other hand, are more likely to opt out of childbearing, have fewer children, and live longer than lower SES groups (Casper and Bianchi 2002; Hoffman 2005; Musick et al. 2009).

Second, health selection also may be a concern as major health problems at a young age may impede one’s ability to have children. Roughly 10–15% of the people in Western populations currently experience infertility (World Health Organization 2003) and some of this infertility is linked to health problems severe enough to prevent them from having children altogether (Chachamovich et al. 2010). In addition, people who have serious health problems may experience difficulties in forming romantic unions (including marriage). Lillard and Panis (1996) found that young men in good health were more likely to marry than young men who were in poor health. While marriage is no longer a required context for bearing children (Hamilton et al. 2010), this may mean that poor health reduces the number of children born.

Previous studies that accounted for potential selection effects, such as pre-childbearing health and socioeconomic status, suggest that selection accounts for some, but not all, of the association between parity and parental mortality. For example, Green et al. (1988) found a significant association between parity and cause-specific mortality among women in Britain, which they claimed could not be explained by social class. Similarly, Hurt et al. (2006) found that, among a cohort of mothers in Bangladesh, parity (in this case number of sons born) was positively associated with mortality even once models were adjusted for age, education, marital status, religion, and area of residence (each linked to both SES and health status). Finally, Jaffe et al. (2009) found a significant parity-mortality association among Israeli women even after adjusting for age, education, SES, and lifestyle factors. These studies suggest that selection may not be enough to explain the association.

Prior Meta-Analyses and the Current Study

In the current study, we examine the association between parity and all-cause parental mortality using meta-analysis and multivariate meta-regression. The sample studies were restricted to those where baseline data were collected after 1945 in developed nations. Our primary contribution to the literature is a larger scale (compared to prior studies), controlled examination of the J-curve hypothesis, which posits that both low and high parity tend to be associated with a higher risk of all-cause parental mortality when compared to moderate parity. Meta-analysis and meta-regression techniques are particularly well suited for the examination of the J-curve hypothesis, as heterogeneity with respect to family size, respondents’ sex and age distributions, geographic location, and other factors can be leveraged in fruitful ways. In short, studies that were not designed to test the J-curve hypothesis, moderating variable hypotheses, and/or selection hypotheses can still—when used in combination with other studies through a meta-regression—provide the basis for testing these hypotheses directly.

A number of previous meta-analytic studies have examined the relationship between parity and individual health outcomes [see Dahabreh et al. (2012) for an examination of lung cancer in women; see Ewertz et al. (1990) for an examination of breast cancer; see Guan et al. (2014) for an examination of pancreatic cancer; see Guan et al. (2013a) for an examination of kidney cancer; see Guan et al. (2013b) for an examination of colorectal cancer]. However, these all rely on a relatively small number of studies and all assume the parity-health association follows a linear “dose–response” structure. We are aware of only one prior meta-analysis of the parity-mortality association (Zeng et al. 2016), which shows a significant J-shaped curve association (with the minimum relative risk associated with between 3 and 4 children). Our analysis extends the literature by examining a broader range of studies. In addition, unlike Zeng et al. (2016), the current study examines key moderators of the parity-mortality relationship and, unlike prior studies, addresses selection.

Methods

Data Collection and Study Inclusion Criteria

The candidate pool of studies was gathered using an iterative search strategy (see Roelfs et al. 2013), beginning with a keyword search in the Medline, EMBASE, CINAHL, and Web of Science databases in 2005 (see "Appendix" for the search terms used) and updated through September of 2016. The search was designed to capture the association between various types of social support and all-cause mortality. Figure 1 illustrates our full search and exclusion process. In total, we identified 752 studies which required further examination. Of these, 415 were excluded because all-cause mortality was not the outcome, they did not use a relative risk measure, they did not include variables for any of the target measures of social support, or the analysis technique or data were noticeably flawed.

Fig. 1 Study inclusion/exclusion flow diagram

The full database of relative mortality risk measures for social support contained information from 337 studies. Of these 337 studies, 270 were excluded because they contained no measure of social support from children and 20 were excluded because they did not specifically measure number of children, but rather only looked at the effects of having versus not having children. Of the remaining publications, 4 were excluded because these studies were conducted in an incomparable, developing nation (e.g., Bangladesh), 4 were excluded because they contained redundant data, and 9 were excluded because they measured mortality during an incomparable time period (prior to 1945). At the end of this process, we were left with 37 studies on which we base this study.

Analytic Approach

To examine the relationship between parity and all-cause parental mortality, we include data on 223 measures of relative mortality risk from 37 studies (see Table 1). A meta-analysis model was used to estimate the mean hazard ratio, stratified by the number of covariates. A meta-regression model (a weighted linear regression) was used to estimate the effect of covariates on the magnitude of the hazard ratios across sample studies. We assessed the presence and magnitude of heterogeneity using Q-tests. All of our analyses were calculated by maximum likelihood using a random effects model (fixed slope, random intercept) and matrix macros provided by Lipsey and Wilson (2001). The possibility of selection and publication bias was examined using a funnel plot, with plot asymmetry evaluated using Egger’s test (Egger and Davey-Smith 1998).
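The pooling and heterogeneity steps described above can be illustrated with a minimal sketch. It computes Cochran's Q and a random-effects pooled log hazard ratio. Note that it substitutes the simpler DerSimonian-Laird method-of-moments estimate of the between-study variance for the maximum likelihood estimator used in the analysis, and that the function name and input values are hypothetical rather than taken from the Lipsey and Wilson (2001) macros or from the sample studies.

```python
import math


def random_effects_pool(log_hrs, ses):
    """Pool log hazard ratios under a random-effects model.

    Uses the DerSimonian-Laird moment estimator for tau^2 (an assumption
    made for brevity; the paper itself reports maximum likelihood).
    Returns (pooled_log_hr, pooled_se, q_statistic, tau2).
    """
    # Inverse-variance (fixed-effect) weights
    w = [1.0 / se ** 2 for se in ses]
    sum_w = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, log_hrs)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    # Under homogeneity Q is compared with a chi-square on k - 1 df.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_hrs))
    df = len(log_hrs) - 1

    # Between-study variance, truncated at zero
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * y for wi, y in zip(w_re, log_hrs)) / sum(w_re)
    pooled_se = math.sqrt(1.0 / sum(w_re))
    return pooled, pooled_se, q, tau2


if __name__ == "__main__":
    # Hypothetical log hazard ratios and standard errors, for illustration only
    est, se, q, tau2 = random_effects_pool([0.10, 0.35, -0.05], [0.08, 0.12, 0.10])
    print(f"pooled HR = {math.exp(est):.3f}, Q = {q:.2f}, tau^2 = {tau2:.4f}")
```

When Q is no larger than its degrees of freedom, the estimated between-study variance is zero and the random-effects result coincides with the fixed-effect result.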

Table 1 Summary information for studies included in the analysis

The dependent variable used in the meta-regression (and examined in the meta-analysis) was the log of the relative mortality hazard (i.e., a hazard ratio; the numerator group, or case group, included respondents with more children and the denominator, or comparison group, included respondents with fewer children). Statistical methods varied between studies, and all non-hazard-ratio point estimates were converted to hazard ratios. Where not reported, standard errors were calculated using (1) confidence intervals, (2) t statistics, (3) χ2 statistics, or (4) p-values. We sought to maximize the number of hazard ratios that were analyzed, capturing variability both between and within studies.
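As a minimal sketch of two of the conversions listed above, a standard error on the log scale can be recovered from a reported confidence interval or from a two-sided p-value. The sketch assumes that the estimate is approximately normally distributed on the log scale and that the p-value comes from a Wald-type test; the function names are hypothetical and the exact conversion formulas used for the sample studies are not stated in the text.

```python
import math
from statistics import NormalDist


def se_from_ci(lower, upper, level=0.95):
    """Standard error of ln(HR) from a reported confidence interval.

    Assumes the interval is symmetric around ln(HR) on the log scale.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for 95%
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def se_from_p(hr, p_value):
    """Standard error of ln(HR) from a two-sided p-value.

    Assumes a Wald test, so |ln(HR)| / SE equals the normal quantile.
    Undefined when p_value is 1 (z = 0); uninformative when hr is exactly 1.
    """
    z = NormalDist().inv_cdf(1.0 - p_value / 2.0)
    return abs(math.log(hr)) / z


if __name__ == "__main__":
    # Hypothetical reported result: HR with a 95% confidence interval
    print(round(se_from_ci(1.10, 1.65), 4))
```

Conversions from t and χ2 statistics follow the same logic, dividing the absolute log estimate by the test statistic (or by its square root for a one-degree-of-freedom χ2).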

For studies based on individual-level data, a curvilinear relationship between number of children and the mortality hazard rate can be readily estimated using the following Cox hazard model: \(\ln \left[ {\lambda \left( {t |X} \right)} \right] = \ln \left[ {\lambda_{0} ({\text{t}})} \right] + \beta_{1} X + \beta_{2} X^{2} + \beta_{3} X^{3} + \mathop \sum \nolimits \beta_{i} X_{i}\), where \(\ln \left[ {\lambda \left( {t |X} \right)} \right]\) denotes the natural log of the mortality hazard rate; t denotes time; X denotes the number of children a respondent has; the \(X_{i}\)s denote a vector of appropriate covariates; and the βs denote the corresponding unstandardized regression coefficients.

For meta-analyses, however, the data used contain only aggregated results and measurement strategies vary between studies. The number of children was measured in the 37 studies in our analysis using either (in 27 of the studies) discrete categories (e.g., 0–2 children vs. 3 or more children) or (in 7 of the studies) continuous measures (i.e., a count of the number of children). The three additional studies included in the analysis used both discrete and continuous measures. While studies using either type of measure can be used for meta-analyses, the continuous measures provide information about a linear association alone, whereas the central goal of the present paper is to test for a nonlinear association. The discrete measures have a methodological advantage in that they can be used in a meta-regression to produce estimates of regression coefficients one would have obtained if one had individual-level data.

Meta-regression variables—calculated from both the mean number of children among the numerator (case) group and the mean number of children among the denominator (comparison) group—allow one to estimate coefficients equal to those in an individual-level study. Estimates are based on the following:

$$\ln \left[ {\lambda \left( {t |X} \right)} \right] = \ln \left[ {\lambda_{0} ({\text{t}})} \right] + \beta_{1} X + \beta_{2} X^{2} + \beta_{3} X^{3} + \mathop \sum \nolimits \beta_{i} X_{i} .$$
(1)

Let \(\lambda \left( {t |X + k} \right)\) denote the hazard rate for group 1 and \(\lambda \left( {t |X} \right)\) denote the hazard rate for group 2, where k is any integer ≥1. Then, the natural log of the hazard ratio can be expressed as follows:

$$\ln\left( {\frac{{\lambda \left( {t |X + k} \right)}}{{\lambda \left( {t |X} \right)}}} \right) = \ln \left[ {\lambda \left( {t |X + k} \right)} \right] - \ln \left[ {\lambda \left( {t |X} \right)} \right].$$
(2)

Substituting \(X + k\) for \(X\) in Eq. (1) yields

$$\ln \left[ {\lambda \left( {t |X + k} \right)} \right] = \ln \left[ {\lambda_{0} ({\text{t}})} \right] + \beta_{1} \left( {X + k} \right) + \beta_{2} \left( {X + k} \right)^{2} + \beta_{3} \left( {X + k} \right)^{3} + \mathop \sum \nolimits \beta_{i} X_{i} .$$
(3)

Substituting Eqs. (3) and (1), respectively, into Eq. (2) yields

$$\ln\left( {\frac{{\lambda \left( {t |X + k} \right)}}{{\lambda \left( {t |X} \right)}}} \right) = \left\{ {\ln \left[ {\lambda_{0} ({\text{t}})} \right] + \beta_{1} \left( {X + k} \right) + \beta_{2} \left( {X + k} \right)^{2} + \beta_{3} \left( {X + k} \right)^{3} + \mathop \sum \nolimits \beta_{i} X_{i} } \right\} - \left\{ {\ln \left[ {\lambda_{0} ({\text{t}})} \right] + \beta_{1} X + \beta_{2} X^{2} + \beta_{3} X^{3} + \mathop \sum \nolimits \beta_{i} X_{i} } \right\}.$$
(4)

Cancelling terms and rearrangement yields

$$\ln\left( {\frac{{\lambda \left( {t |X + k} \right)}}{{\lambda \left( {t |X} \right)}}} \right) = \beta_{1} \left[ {\left( {X + k} \right) - X} \right] + \beta_{2} \left[ {\left( {X + k} \right)^{2} - X^{2} } \right] + \beta_{3} \left[ {\left( {X + k} \right)^{3} - X^{3} } \right].$$
(5)

As the derivation above shows, the coefficient for the difference between the number of children for the case group and the number of children for the comparison group (i.e., the coefficient for \(X_{case} - X_{control}\)) produces an estimate of the main effect (i.e., an estimate of \(\beta_{1}\), the coefficient for X in an individual-level study of the association between number of children and the mortality hazard rate). The coefficient for the difference between the squared parity levels (i.e., the coefficient for \(X_{case}^{2} - X_{control}^{2}\)) produces an estimate of the quadratic effect (i.e., an estimate of \(\beta_{2}\), the coefficient for \(X^{2}\) in an individual-level study of the association between number of children and the mortality hazard rate). The coefficient for the difference between the cubed parity levels (i.e., the coefficient for \(X_{case}^{3} - X_{control}^{3}\)) produces an estimate of the cubic effect (i.e., an estimate of \(\beta_{3}\), the coefficient for \(X^{3}\) in an individual-level study of the association between number of children and the mortality hazard rate). Together, the three variables allow us to use meta-regression to examine whether the relationship between parity and the risk of all-cause mortality is nonlinear.
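The construction of the three meta-regression predictors can be sketched as follows. The sketch builds the linear, squared, and cubed parity contrasts for one hazard ratio and checks numerically that Eq. (5) reproduces the difference between two individual-level cubic predictors. The function names and the coefficient values are hypothetical and serve only to illustrate the algebra.

```python
def parity_contrasts(x_case, x_comp):
    """Return the three predictors implied by Eq. (5) for one hazard ratio.

    x_case: mean number of children in the numerator (case) group
    x_comp: mean number of children in the denominator (comparison) group
    """
    return (
        x_case - x_comp,            # multiplies beta_1 (main effect)
        x_case ** 2 - x_comp ** 2,  # multiplies beta_2 (quadratic effect)
        x_case ** 3 - x_comp ** 3,  # multiplies beta_3 (cubic effect)
    )


def log_hr_from_contrasts(betas, x_case, x_comp):
    """Log hazard ratio implied by Eq. (5) for given coefficients."""
    return sum(b * d for b, d in zip(betas, parity_contrasts(x_case, x_comp)))


def cubic_predictor(betas, x):
    """Parity part of the individual-level linear predictor in Eq. (1)."""
    b1, b2, b3 = betas
    return b1 * x + b2 * x ** 2 + b3 * x ** 3


if __name__ == "__main__":
    betas = (-0.30, 0.05, -0.001)  # hypothetical values, not estimates
    via_eq5 = log_hr_from_contrasts(betas, 4, 2)
    direct = cubic_predictor(betas, 4) - cubic_predictor(betas, 2)
    # The baseline hazard and covariate terms cancel, so the two agree
    print(abs(via_eq5 - direct) < 1e-12)
```

Because the contrasts are computed from group means, non-integer inputs (for example, an estimated category mean) can be passed in the same way.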

Where the number of children was measured using discrete categories, we recorded information on the lower and upper boundaries for both the numerator (case) group and the denominator (control) group. Assuming a Poisson distribution, we used the information on these lower and upper boundaries to estimate the mean number of children for both groups (e.g., a group with a range from 4 to 5 has an estimated mean of 4.38). In cases where the upper boundary of the category was not reported, we conservatively assumed the maximum to be 25 children (we also checked for robustness by comparing the results with other assumed maximums; see the limitations section for a full description). Descriptive statistics for all variables are reported in Table 2.
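The category-mean calculation described above can be sketched as the mean of a Poisson distribution truncated to the category's lower and upper boundaries. The rate parameter is not stated in this section, so `lam` below is an assumption made for illustration, as is the function name.

```python
import math


def truncated_poisson_mean(lower, upper, lam):
    """Mean of a Poisson(lam) variable restricted to lower..upper (inclusive)."""
    ks = range(lower, upper + 1)
    # Weights proportional to lam**k / k!; the exp(-lam) factor cancels out.
    weights = [math.exp(k * math.log(lam) - math.lgamma(k + 1)) for k in ks]
    total = sum(weights)
    return sum(k * w for k, w in zip(ks, weights)) / total


LAM = 3.0  # hypothetical rate; the value used in the study is not reported here

mean_4_to_5 = truncated_poisson_mean(4, 5, LAM)
mean_5_plus_max25 = truncated_poisson_mean(5, 25, LAM)
mean_5_plus_max15 = truncated_poisson_mean(5, 15, LAM)
```

With this assumed rate, the 4-to-5 category has a mean of 4.375, which happens to be close to the 4.38 in the example above. Changing the assumed maximum from 25 to 15 barely moves the mean of an open-ended category, because the Poisson tail carries very little probability at high counts.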

Table 2 Descriptive statistics

Additional covariates include: (1) the proportion of the sample that was male (modeled as an interaction with the parity variables); (2) the mean age of the study sample (adjusted for differences in baseline dates by adding the number of years elapsed since the study baseline to the mean age reported at baseline; the adjusted mean age was also modeled as an interaction with the parity variables), divided by ten; (3) the number of years elapsed since baseline data collection began, divided by 10; (4) an indicator variable for whether or not the study sample suffered from a known chronic condition; (5) the underlying death rate in the sample; (6) the duration of follow-up particular to the study; (7) a series of dummy variables capturing the regions where the study was conducted; (8) a series of indicator variables for whether or not the study controlled in any way for age, other demographic factors, socioeconomic status, general health status, health-related behaviors (e.g., smoking, drinking), or the presence of chronic health conditions at the individual level; and (9) an indicator variable for whether or not the weighting variable for the regression needed to be estimated prior to analysis.

The proportion of the sample that was male (sex) and interactions between sex and the parity variables were included in order to examine sex differences in the magnitude of the parity-mortality association. Age (measured as the mean age of the study sample after adjustment for differences in baseline study dates) and interactions between age and the parity variables were included to examine possible changes in the parity-mortality association across age cohorts. The number of years elapsed since baseline data collection began was included to control for time trends in the parity-mortality association; this control is particularly important given the changes in parity norms over the four decades represented by these data. The indicator variable measuring the presence of a chronic health condition across the entire sample was included because death rates for both the numerator and denominator groups were high in non-healthy samples, so ratio comparisons in those samples tended to be closer to 1.

We controlled for the underlying death rate for the sample in order to account for any factors, other than chronic illness, that might also affect the magnitude of the relative mortality hazard in similar ways (i.e., the statistical artifact of being less able to detect differences in hazard rates when death rates are high). Data on death rates were obtained from the Human Mortality Database (University of California-Berkeley & Max Planck Institute for Demographic Research 2011). The underlying death rate was then calculated using a weighted average, such that the result would be matched to a particular study in terms of the nation from which the sample was drawn, the year in which the study was conducted, and the sex and age of the respondents.

We controlled for the mean follow-up duration of a study in order to account for differences in the length of time over which mortality could occur. We controlled for the region in which a study was conducted by creating categories taking into account both geographic proximity and cultural similarity. We used seven categories: East Asia (China, Japan, and Taiwan), Australia, Mediterranean Europe (Israel and Italy), Germanic Europe (Austria, Germany, the Netherlands, and Switzerland), Scandinavian Europe (Denmark, Finland, Norway, and Sweden), the United Kingdom, and the United States. We also controlled for differences in the types of covariates used in each of the articles in our sample by including a series of indicator variables. These are important indicators of selection. For example, if health selection helps explain why we might observe a higher mortality risk among people with no or few children (i.e., they cannot have any/many children because they are already unhealthy), then we should observe lower hazard ratios for studies that control for health when compared to studies that do not control for health. Controlling for these differences is also important because we did not use the presence or absence of certain covariates as a factor when making inclusion/exclusion decisions.

We also included an indicator variable to identify the minority of cases where we had to estimate the weight used for a particular hazard ratio rather than calculate the weight directly from the variance of the hazard ratio (necessary for 11.2% of the hazard ratios included in the analysis). In these cases, we estimated the regression weight using multiple regression from all 347 studies (2971 hazard ratios) in our database. Significant predictors of the standard error were sample size (log transformed), follow-up duration, publication date, the geographic region in which the study was conducted, and an indicator for whether the study controlled for age (Multiple R = .663). We also conducted meta-analyses both including and excluding studies for which we estimated the regression weight. Thus, we retained the ability to assess the impact of regression weight estimation on the final results. Sensitivity tests showed that there were only minor differences in the results when we excluded the 11.2% of the hazard ratios with estimated inverse variance weights from the analysis. We therefore chose to leave these in the reported analyses, to increase statistical power and our ability to identify important subgroup differences.
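When a study reports a confidence interval for its hazard ratio, the inverse-variance weight can be recovered directly. The sketch below assumes a symmetric 95% interval on the log scale (z = 1.96), which is a common convention but is not confirmed by this section; the function names are illustrative.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of the log hazard ratio implied by a confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def inverse_variance_weight(lower, upper, z=1.96):
    """Weight equal to 1 / variance of the log hazard ratio."""
    se = se_from_ci(lower, upper, z)
    return 1.0 / se ** 2


# Hypothetical intervals: the narrower interval receives the larger weight.
wide_weight = inverse_variance_weight(0.60, 1.40)
narrow_weight = inverse_variance_weight(0.85, 1.05)
```

Hazard ratios without a reported variance or interval cannot be weighted this way, which is why a regression-based estimate of the weight was needed for that minority of cases.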

Both the study selection process and the inclusion of the indicator variables (for whether or not a study controlled for age, other demographic factors, SES, general health, health-related behaviors, or chronic health conditions and the indicator variable for whether or not the study included precise information on the standard error of an effect estimate) serve the additional function of controlling for differences in study quality. The first six indicator variables help to assess whether the study accounted for important confounding factors. The last indicator variable helps to assess the statistical rigor of the study itself.

Results

In Table 3, we report the meta-regression results predicting hazard ratio magnitude using a discrete categorical measure of family size. The full model includes all covariates and the parsimonious model includes only significant covariates. The results of both models suggest a significant nonlinear association between the magnitude of the hazard ratio and the mean number of children in the denominator or the comparison group that interacts with both sex and age cohort. These results generally show a curvilinear relationship between parity and all-cause parental mortality such that the mortality rate initially decreases but then later increases as parity increases.

Table 3 Meta-regression models predicting HR magnitude (logged) among studies of number of children using a discrete categorical measure of family size and an all-cause mortality outcome

The relationships between sex, parity, and mortality are shown in Fig. 2 [which is derived from the interactions between sex and \(X_{case} - X_{control}\) (p = .0254), sex and \(X_{case}^{2} - X_{control}^{2}\) (p = .0118), and sex and \(X_{case}^{3} - X_{control}^{3}\) (p = .0216), as well as the main effects for sex (p = .2809; variable included so the interaction is properly specified statistically) and the three parity measures themselves (p = .0077, p < .0001, and p < .0001, respectively)]. The relationships between age cohort, parity, and mortality are shown in Fig. 3 [which is derived from the interactions between age and \(X_{case} - X_{control}\) (p = .1930; variable retained so the interaction is properly specified) and age and \(X_{case}^{2} - X_{control}^{2}\) (p = .0427), but not age and \(X_{case}^{3} - X_{control}^{3}\) (p = .1949; the interaction could be removed from the model without compromising proper model specification), as well as the main effects for age (p = .3087; variable included so the interaction is properly specified statistically) and the three parity measures themselves (p values same as reported for sex)]. For both Figs. 2 and 3, confidence intervals for the predicted hazard ratios are available in the "Appendix".

Fig. 2

Predicted mean hazard ratio by parity and sex from parsimonious model

Fig. 3

Predicted mean hazard ratio by parity and age (standardized for study baseline date)

As Fig. 2 shows, there is a clear non-linear, J-shaped curve association between parity and mortality risk for both men and women. In fact, for parity levels less than or equal to 7 children, there are no appreciable differences between men and women. Compared to men or women with no children, there is a trend of decreasing relative mortality risk until a parity of 3 children is reached. For a parity of 4 or more, the relative mortality risk increases again, reaching a point (at a parity of approximately 7 children) where the relative mortality risk is about equal to that of nulliparous men and women. Differences between men and women only meaningfully emerge at about 7 children, with the predicted mean hazard ratio rising to 2.13 for women with 10 children (again compared to women with no children) but only 1.61 for men with 10 children. This suggests that the health risks of very large family sizes are greater for women compared to men. However, it is worth noting that since most parents have 7 or fewer children, sex differences in mortality risks associated with parity affect few people.

As Fig. 3 shows, the J-shaped curve parity-mortality association is stronger for more recent cohorts than for older cohorts (if it is present at all for the latter group). For respondents 60 years of age (as of 2016), the J-curve pattern is clear: the lowest relative mortality risk occurs at approximately 3 children (the hazard ratio is about 25.1% lower for 3 versus 0 children). As in Fig. 2, the relative mortality hazard increases above 3 children, with approximately 7–8 children associated with about the same mortality risk as nulliparous persons. The mortality risk for those with 10 children is about 32.4% higher than that of nulliparous persons and about 76.9% higher than that of those with 3 children.
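The comparison between 10 and 3 children follows from the two hazard ratios that share the nulliparous reference group. A short sketch of the arithmetic, using only the rounded percentages quoted above (the variable and function names are illustrative):

```python
def pct_change(hazard_ratio):
    """Express a hazard ratio as a percentage difference from the reference."""
    return (hazard_ratio - 1.0) * 100.0


hr_3_vs_0 = 1 - 0.251    # 25.1% lower than nulliparous
hr_10_vs_0 = 1 + 0.324   # 32.4% higher than nulliparous

# Dividing the two ratios cancels the shared nulliparous reference group.
hr_10_vs_3 = hr_10_vs_0 / hr_3_vs_0
pct_10_vs_3 = pct_change(hr_10_vs_3)
```

The result is roughly 77%, in line with the reported 76.9% once rounding in the published percentages is allowed for.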

As Fig. 3 also shows, for respondents who would have been age 120 as of 2016, the parity-mortality association was different. For this age cohort, the relative mortality risk is strictly highest for nulliparous individuals, with the hazard ratio essentially falling as parity increases, even to 7, 8, 9, or 10 children. The pattern for this cohort is also one where the differences between the relative mortality hazard at different parity levels are muted. For example, in this oldest cohort, the relative mortality hazard for nulliparous people is only 21.3% higher than the relative mortality hazard for those with 10 children (with this 21.3% difference highest among all possible parity comparisons). This finding may reflect changes in family norms over the periods and cohorts reflected in the sample. For example, in developed nations in the 1960s versus later periods, there were few nulliparous adults.

The statistically significant covariates in the parsimonious model of Table 3 include the number of years elapsed since baseline data collection began (i.e., the time period); the underlying death rate for the study; the follow-up duration of the study; the geographic region where the study was conducted; whether or not the study controlled for age, health behaviors, and chronic health conditions; and whether the standard error was reported in the original study results. Non-significant covariates included whether or not an entire study’s sample had a chronic health condition (p = .0839), whether or not the study controlled for demographic factors other than age (p = .9409), whether or not the study controlled for socioeconomic status (p = .1715), and whether or not the study controlled for general health status (p = .1866).

Among the covariates included in the analysis, the indicator variables for whether or not a study controlled for socioeconomic status, general health status, health behaviors, and chronic health conditions provide some insight into the presence/strength of certain types of selection effects (and allow for the direct examination of the effect of statistical control on study results, a measure of study quality). The results indicate that health selection is important. The mean hazard ratio for studies that controlled for chronic health conditions was 29.45% higher (exponentiated coefficient = 1.2945; p < .0001) than the mean hazard ratio for studies that did not. Caution must be taken, however, when drawing conclusions with respect to health selection, as the other indicator variables related to health (general health status and health-related behaviors) provide contrasting results. The mean hazard ratio for studies that controlled for adverse health behaviors like smoking and drinking was 29.76% lower (exponentiated coefficient = 0.7124; p < .0001) than the mean hazard ratio for studies that did not. In addition, the indicator variable for whether or not a study controlled for general health status was not statistically significant, indicating no detectable difference in mean hazard ratios between studies that controlled for general health and those that did not. Health factors were important, but the direction of their influence was mixed.

The indicator variable for whether or not the original study provided enough information to calculate accurately an effect estimate’s standard error was also statistically significant—studies that failed to directly report a standard error (i.e., the lower-quality studies), on average, over-estimated the mean mortality hazard by approximately 8.52% (exponentiated coefficient = 1.0852; p = .0187).

Table 4 shows a series of mean hazard ratios from our meta-analyses. When the 223 hazard ratios were stratified solely by level of statistical adjustment, among multivariate-adjusted studies, we found the mortality hazard was, on average, 4.58% lower (p < .01) for respondents with more children when compared to those with fewer children. Not surprisingly, the difference in mortality hazard was greater, in relative terms, among studies that only controlled for age (19.25% decreased hazard; p < .001) or utilized no control variables (15.38% decreased hazard; p < .001). There were no substantial differences between the subset of hazard ratios based on discrete measures of parity and the subset based on continuous measures.

Table 4 Meta-analyses of the association between number of children and all-cause mortality

Limitations

In terms of heterogeneity, the results strongly suggest that important between-study differences exist, and therefore the meta-regression results are preferred over the simpler meta-analysis results. One should interpret the mean hazard ratios from the meta-analyses reported in Table 4 with caution. The null hypothesis of data homogeneity was rejected (at the .05 level) for 2 of the 11 mean hazard ratios, and the test was marginally significant (i.e., at the .10 level) for another 3 of the 11. The meta-regression accounted for approximately 70% of the variance among the hazard ratios, which suggests that the included covariates captured much of the data heterogeneity present in the simpler meta-analyses.
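A homogeneity test of this kind is often carried out with Cochran's Q statistic on inverse-variance weighted log hazard ratios. The specific test statistic is not named in this section, so the sketch below is an assumption about the general approach rather than a description of the procedure actually used; the function names are illustrative.

```python
def pooled_log_hr(log_hrs, weights):
    """Inverse-variance weighted mean of the log hazard ratios."""
    return sum(w * y for w, y in zip(weights, log_hrs)) / sum(weights)


def cochran_q(log_hrs, weights):
    """Return Cochran's Q and its degrees of freedom (k - 1)."""
    mean = pooled_log_hr(log_hrs, weights)
    q = sum(w * (y - mean) ** 2 for w, y in zip(weights, log_hrs))
    return q, len(log_hrs) - 1


# Hypothetical inputs: two log hazard ratios with weights 1 and 3.
q_stat, df = cochran_q([0.0, 1.0], [1.0, 3.0])
```

Under homogeneity, Q is compared against a chi-square distribution with k - 1 degrees of freedom; a large Q relative to that reference indicates between-study differences beyond sampling error.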

The results of Egger’s test for funnel plot asymmetry (p < .001) indicated publication/selection bias in the data. A visual examination of the funnel plot (see Fig. 4) suggests there may be missing studies with higher log hazards and small weights. A visual examination also shows a few outlying (low) log hazard ratios with large weights. The absence of higher log hazards with small weights suggests that the mean mortality risk reported in our findings is slightly lower than it would be if there were no publication bias. Conversely, the outlying log hazard ratios with large weights would decrease the mean hazard ratio estimate. While it is not possible to accurately assess whether the two phenomena balance out, the opposing directions of the two suggest that they at least partially cancel each other out. Consistent with the above results addressing heterogeneity, Fig. 4 further underscores the preference for the meta-regression results (Table 3) over the meta-analysis results (Table 4).
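In its classic form, Egger's test regresses each standardized effect (the log hazard ratio divided by its standard error) on its precision (the inverse of the standard error); an intercept far from zero signals funnel plot asymmetry. The sketch below follows that classic form with ordinary least squares. Which variant of the test was applied here is not stated, so this is an illustration under that assumption, with hypothetical input values.

```python
import math


def egger_regression(log_hrs, ses):
    """Return (intercept, intercept standard error, slope) for Egger's regression."""
    x = [1.0 / s for s in ses]                    # precision
    y = [e / s for e, s in zip(log_hrs, ses)]     # standardized effect
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance with n - 2 degrees of freedom.
    resid_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = resid_ss / (n - 2)
    se_intercept = math.sqrt(s2 * (1.0 / n + x_bar ** 2 / sxx))
    return intercept, se_intercept, slope


# Hypothetical symmetric case: the same true effect at every level of precision.
intercept, se_intercept, slope = egger_regression(
    [0.1, 0.1, 0.1, 0.1], [0.1, 0.2, 0.4, 0.5]
)
```

In this symmetric example the intercept is zero and the slope recovers the common effect of 0.1. With real data, the intercept divided by its standard error is compared against a t distribution with n - 2 degrees of freedom.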

Fig. 4

Funnel plot of hazard ratios (logged) vs. meta-analysis weight. Vertical line denotes the mean hazard ratio (logged) of −0.0871 among the 205 hazard ratios from studies using a categorical measure for number of children. P value from Egger’s test for funnel plot asymmetry < .001

One must also be cautious in trying to determine precisely what numbers of children constitute “low parity,” “moderate parity,” and “high parity,” particularly when drawing observations from Figs. 2 and 3. It is important to note that the studies were substantially based on somewhat older data (i.e., from the 1960s and 1970s) and that even the more recent studies are based on respondents who bore and reared their children many years prior to the study baseline date. Having a small family (or no children altogether) is much more accepted today than at these earlier times, and our study cannot provide a high level of certainty as to whether the parity-mortality association found here fully captures the variation in adults in their childbearing years. Indeed, in Western (and increasingly in Eastern, see Raymo et al. 2015) countries, family forms have changed considerably over the past few decades. While most childbearing continues to occur within marital unions, many families today are formed outside of marriage, many parental unions dissolve, and many parents have children within more than one union (e.g., Casper and Bianchi 2002). Parents who live apart from their children following parental separation or divorce often have less contact with their children, particularly if they have children with new partners or spouses (i.e., more frequently fathers as they are more likely to be the non-custodial parents). Decreased contact may mean that children provide less social support to non-custodial parents as they age. To the extent that social support offsets the effects of aging and reduces the risk of mortality (particularly among parents with 2 or 3 children), increased family complexity may complicate the association between parity and all-cause parental mortality.

Kravdal et al. (2012), for example, find that mortality rates vary considerably among parents who divorce, do not marry, and/or have step-children. These variations may be the result of biological children—who live apart from a parent during childhood—providing more care for the (childhood) custodial parent as they age. Moreover, step-children may be more likely to care for their (childhood) custodial biological versus (childhood) custodial step-parent. Thus, step-parents, particularly those who also have children from prior partnerships, may find themselves in particularly vulnerable situations as they age—biological children from whom they lived apart and step-children may provide them with little social support. While data limitations do not allow us to examine these associations, they are worth exploring in future studies.

Our study may also be limited by our choice to assume a maximum of 25 children for any parity grouping where the upper bound of the parity range was unknown (e.g., for a category of 5+ children). To address this possible limitation, we conducted sensitivity analyses in which we assumed either 15 or 20 children for the upper bound. The results were the same. The estimate for the mean number of children in the 5+ category, for example, was 5.72871560 if we assumed a maximum of 25 children and 5.72870856 if we assumed a maximum of 15 children for the same parity range. The difference between these means (from which the parity IVs are derived) was only 0.00000704, which does not influence the coefficients or their associated p-values in any concerning way.

Discussion

We used data from 37 studies and 223 measures of relative mortality risk to examine the relationship between parity and all-cause parental mortality. Meta-analysis was used to estimate the mean hazard ratio, stratified by the number of covariates. Meta-regression techniques were used to estimate the effect of covariates on the magnitude of the hazard ratios across sample studies. Our results suggest that there is a significant J-shaped curve association between parity and all-cause parental mortality. That is, the mortality hazard rate decreases as parity increases up to 3 children, but increases at higher levels of parity (for both men and women, but more so for more recent versus earlier cohorts). We found that the nonlinear association between parity and all-cause parental mortality is moderated both by parents’ sex and by cohort. Finally, we also found that the presence of statistical controls for health factors influenced the original study results, though the direction was unclear.

Nulliparous adults and adults with low parity tend to come from higher rather than lower socioeconomic backgrounds (Casper and Bianchi 2002), and therefore have financial resources which may positively affect their health. However, our findings suggest that the long-term consequences of nulliparity and low parity may be offset by other factors. One such factor may be social support and the social connections between parents and children over their life course. Emotional and/or instrumental social support is a resource that money often cannot buy. Yet research suggests that these resources are invaluable to one’s health, net of SES. In particular, aging populations benefit from receipt of social support (e.g., Avlund et al. 1998; Lyyra and Heikkinen 2006) and access to support increases parents’ ability to cope with the onset of diseases and/or disabilities associated with aging (e.g., Penninx et al. 1997). Research further suggests that adult children provide a substantial proportion of social support to their aging parents (e.g., Stein et al. 1998) and that social isolation (i.e., little to no access to support) increases the risk of mortality (see House 2001; House et al. 1988). In China, for example, the system of care for aging parents has become much more complex as a result of the one-child policy. Traditionally, aging parents in China (as in some Western countries, including the U.S.) have relied on their children to provide care, but lower overall parity has decreased the extent to which children are available to provide support and care to aging parents (e.g., see Song et al. 2016). Moreover, nulliparity and low parity may increase the risk of social isolation, particularly following the loss of a spouse or for those who never marry; parity may operate through social integration to influence parents’ risk of mortality (see Roelfs et al. 2011; Shor et al. 2012). This might explain why we observe significantly lower risks of mortality at higher levels of parity: there are more children around to provide support. While our data do not allow us to directly test this hypothesis, future studies may consider the potential mediating influence of social isolation in the association between parity (particularly nulliparity and low parity) and parental mortality.

Another factor that may account for the elevated mortality risk observed among nulliparous and low parity adults may be health selection, which our findings suggest is present in significant ways. This at least partially corroborates previous research suggesting that those who experience health problems may also experience difficulty with union formation and childbearing (Chachamovich et al. 2010; Lillard and Panis 1996). Those who have very low parity may be at an increased risk of mortality due to factors associated with either early life or pre-childbearing health conditions. Our data are limited to those confounding factors that were included in the sample studies. Thus, we are unable to account for a number of health-related factors that are likely to influence early life or pre-childbearing health conditions (e.g., diseases associated with infertility). Even so, our findings corroborate those of others (e.g., Jaffe et al. 2009) who found that the parity-health association in Western countries is not entirely explained away by health selection mechanisms.

The persistence of the SES gradient in health and mortality (see Elo 2009 for a review) suggests that SES background explains much of the variation in long-term health trajectories. Furthermore, it is certainly possible that the observed mortality risks associated with having relatively few children exacerbate the pre-existing mortality risks associated with SES as those in lower SES groups who do not have children may not be able to purchase care late in life. Due to data limitations, we are unable to explore this possibility. We did, however, attempt to (partially) address the existence of SES selection by comparing the mean hazard ratio of studies that controlled for SES to the mean hazard ratio of those that did not. We found that controlling for SES significantly increases the mean hazard.

Finally, various studies have suggested that parity has a stronger effect on the longevity of females than on the longevity of males (Alter et al. 2007; Daling et al. 2002; Dribe 2004; Henretta 2007; Penn and Smith 2007). We found no support for this assumption in our study. Both the non-significance of the main effects and the interactions between sex and parity suggest that the shape, direction, and significance of the nonlinear association between parity and all-cause mortality are similar for males and females. One possible explanation for this surprising finding may be that the social factors associated with all-cause parental mortality are more important, in the long run, than the physical factors associated with childbearing. That is, it may be that any physical toll of childbearing and childrearing that is predominantly borne by women is largely offset by the qualitatively higher level of social support that mothers often receive from their children. This possible balancing process may eventually render the health trajectories of mothers and fathers more or less equal. Indeed, Bird and Rieker (1999) convincingly argue that to better understand the differences in health outcomes among women and men, scholars must account for both biological and social influences.