1 Introduction

Countries in Sub-Saharan Africa (SSA) continue to have some of the highest rates of maternal mortality in the world. Most research on the causes of maternal mortality focuses on conditions during pregnancy and at the time of delivery. However, consistent with the fetal programming hypothesis, a woman’s maternal survival may also be related to conditions she experienced while in utero. There is growing evidence from both the economics and medical literature that fetal conditions can have long-term and permanent effects on educational attainment, income, and adult health outcomes. There exist multiple plausible pathways through which in utero conditions could affect risk factors for maternal mortality later in a woman’s life, and no study has yet investigated the potential causal impact of fetal conditions on maternal survival.

The relationship between in utero conditions and a woman’s maternal survival later in life is likely to be confounded by factors such as parental investments in children, household income, and genetic endowment. In this study, I use the level of rainfall during the in utero period as a source of exogenous variation for in utero conditions. Variation in the level of rainfall during the in utero period comes from differences in the location at birth as well as the month and year of birth for the study’s sample population. In my analysis, I estimate a reduced form regression assessing the effect of the level of rainfall during the in utero period on the probability of maternal death later in a woman’s life.

The use of rainfall as a source of exogenous variation for in utero conditions is based on the assumption that rainfall indirectly affects the health of the fetus, through its effect on crop yield. The literature on rainfall and agriculture in SSA has identified a positive relationship between the level of rainfall and crop yield, even at high levels of rainfall (Woo 2010). This study’s proposed identification strategy assumes that high levels of rainfall improve in utero conditions by improving crop yield, thereby increasing agrarian households’ income and consumption. Similarly, low levels of rainfall, resulting in drought-like conditions, are predicted to worsen in utero conditions through their negative effect on crop yield. While there may be other pathways through which rainfall may affect households’ well-being, such as its effect on malaria through mosquito vectors, I do not find evidence supporting these alternative pathways. As a result, if in utero conditions have a long-term effect on maternal survival later in life, then I hypothesize that high levels of rainfall during the in utero period should predict a lower probability of maternal death later in life and, conversely, low levels of rainfall during the in utero period should predict a higher probability of maternal death.

Similar to other studies in the economics literature, I use an identification strategy based on an unexpected shock during the in utero period in order to provide evidence of causal impact on maternal survival later in life. My approach is similar to the one used in Maccini and Yang (2009) where they assess the effect of rainfall during early years of life, including the in utero period, on height, self-reported health, schooling attainment, and asset ownership in Indonesia. Another similar study, Kudamatsu et al. (2012), uses cross-country data from SSA to assess the effect of rainfall shocks during the in utero period on infant mortality.

The data for this study come from 14 countries in SSA where Demographic and Health Surveys (DHS) were conducted from 1994 to 2007 and which include both sibling-reported maternal survival data and GPS data on the households’ current residential location. I link each individual observation (n = 365,214), including both the respondents and the sisters they report on, to the nearest weather station using the GPS location of her household. Based on each observation’s month and year of birth, I identify the level of rainfall as recorded at the nearest weather station during the period when each woman was in utero, using rainfall data from the Global Historical Climatology Network (GHCN) precipitation data. Since the DHS do not include information on the individuals’ residence at birth and because migration from birthplace may bias the results, I conduct robustness checks, using available variables, to account for migration. In the main specification, I assess whether women who experienced a positive rainfall shock during the in utero period (defined as rainfall that is at least 2 standard deviations greater than mean rainfall in the local area) have a different likelihood of surviving pregnancy and delivery later in life compared to those who did not experience a positive rainfall shock while in utero. Similarly, I assess whether women who experienced a negative rainfall shock during the in utero period, compared to those who did not, have a different likelihood of surviving pregnancy and delivery later in life.

The main results demonstrate that women who experienced a positive rainfall shock while in utero are significantly less likely to die during pregnancy or delivery later in life. A positive rainfall shock in utero decreases the probability of maternal death by 1.1 percentage points, representing a 58 % decrease from a mean of 1.9 % in the sample. There is no detectable effect from a negative rainfall shock during the in utero period, although alternative specifications do identify an effect when different thresholds are used for the level of rainfall required to classify a shock. The specification controls for rainfall during other early-life years, which suggests that the main result is identifying the effect from rainfall during the in utero period, independent from rainfall during other early years of life. The specification also controls for year of birth fixed effects and weather station fixed effects. Robustness checks are conducted to account for migration from birthplace, adjust for potential maternal mortality bias in sampling, control for birth season, and control for conditions at the time of delivery. None of the specification checks invalidate the study’s main findings.
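As a quick check on the relative effect quoted above, the percentage decrease is the percentage-point change divided by the sample mean (both figures are taken from the text; nothing else is assumed):

$$ \frac{1.1\ \text{percentage points}}{1.9\ \%}\approx 0.58 $$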

One plausible pathway to explain these findings is through early-life selection effects. Differential early-life survival and conception rates, as a function of rainfall shocks during the in utero and pre-conception period, could have changed the average characteristics of the surviving women entering their reproductive years, thereby affecting their maternal survival. For example, only better off households may have sufficient coping mechanisms to survive droughts and, therefore, their surviving offspring will be better off in adulthood, despite having been born during a drought. I investigate the possibility of early-life selection, but I do not find evidence that rainfall shocks during the in utero period affect early-life survival, fertility outcomes, or fetal survival.

I further explore the possible pathways which may explain the relationship between rainfall shocks during the in utero period and subsequent maternal survival. I do this by assessing the effect of rainfall shocks during the in utero period on various health, socioeconomic, and fertility-related outcomes as well as outcomes related to access to care at delivery. These analyses demonstrate that a positive rainfall shock while in utero reduces the probability of moderate or severe anemia while pregnant, which is a risk factor for postpartum hemorrhage, the leading cause of maternal death. In addition, a positive rainfall shock while in utero decreases body mass index (BMI) in adulthood. Adult BMI influences cardiovascular risk factors, such as hypertension, which could affect the probability of developing pregnancy-induced hypertension, another important cause of maternal deaths.

This study draws on the growing medical literature related to the long-term effects of in utero conditions on health outcomes and extends the scope of the fetal origins hypothesis in a novel direction by investigating whether and how potential risk factors for maternal death relate to in utero conditions. These findings suggest that, in addition to conditions during pregnancy and at the time of delivery, conditions experienced by women when they were themselves in utero also seem to play a significant role in affecting maternal mortality in SSA.

The paper is divided into the following sections: Section 2 presents an overview of the literature on in utero conditions and lays out a conceptual framework and the identification strategy for assessing the effect of in utero conditions on maternal survival later in life; Section 3 discusses the data sources and sample; Section 4 presents the empirical approach; Section 5 provides the main results, including analyses of potential causal pathways and robustness checks; Section 6 presents the study’s limitations; and Section 7 concludes with a discussion of the results and possible policy implications.

2 Overview of literature, conceptual framework, and identification strategy

2.1 Potential relationship between in utero conditions and maternal survival

Sub-Saharan Africa (SSA), as a region, has the world’s highest rate of maternal mortality, with one out of every 31 women dying during pregnancy or delivery (WHO et al. 2010). The Millennium Development Goal (MDG) for maternal health is aimed at reducing maternal mortality ratios (MMRs) by 75 % from their levels in 1990 (United Nations General Assembly 2000). While recent estimates indicate that maternal deaths have fallen significantly over the last three decades, countries in SSA have shown the least progress; none of them has reached the 5.5 % annual reduction in maternal deaths necessary to achieve the MDG for maternal health in 2015 (Hogan et al. 2010).

Most of the research on maternal survival focuses on conditions during pregnancy and at the time of delivery. For example, there is significant evidence that access to skilled birth attendants and facilities providing emergency obstetric care contribute to significant reductions in maternal mortality (Campbell and Graham 2006; Koblinsky 1995; Paxton et al. 2005; Starrs 2006). However, in addition to the influence of factors during pregnancy and at the time of delivery, there is also a potential role that could be played by the conditions that a woman experienced when she herself was in utero.

While no study has yet assessed the role of in utero conditions on maternal mortality later in life, there are various plausible pathways through which such an effect could occur. Figure 1 presents the conceptual framework for such a relationship. In SSA, the main causes of maternal deaths are postpartum hemorrhage, pregnancy-induced hypertension, infection, and obstructed labor (Ronsmans and Graham 2006; WHO 2005). The framework suggests possible pathways through which in utero conditions could influence the risk factors associated with these main causes of maternal deaths.

Fig. 1 Conceptual framework linking in utero conditions and maternal mortality later in life. There are three proposed pathways for the relationship: (A) a direct effect on adult health outcomes consistent with the fetal programming hypothesis; (B) an indirect effect on education and income, via an effect on early-life cognitive development and childhood health, with implications for access to care, adult health, and fertility outcomes; and (C) a direct effect on fertility outcomes, specifically age at menarche

The first pathway is based on the fetal origins hypothesis, known as “fetal programming,” depicted as pathway A in Fig. 1. This hypothesis, also known as the Barker hypothesis, posits that in utero conditions enact permanent changes in biological and physiological systems that predispose individuals to certain health conditions during adulthood (Barker 2001). Studies in the medical literature have identified long-term effects from in utero conditions on heart disease (Barker and Osmond 1986; Eriksson et al. 1999; Leon et al. 1998; Rich-Edwards et al. 1997; Stein et al. 1996), diabetes (Dabelea et al. 2008), obesity (Dabelea et al. 2008; Kral et al. 2006), schizophrenia (St Clair et al. 2005), and cognitive ability (Rauh et al. 2006). Studies in the economics literature have also found effects from shocks during the in utero period on long-term health outcomes, including adult height and infant mortality (Banerjee et al. 2010; Kudamatsu et al. 2012), as well as more immediate outcomes, such as low birth weight for the infants (Burlando 2014). In addition, one study identified the inter-generational transmission of poor health, with low birth weight mothers being more likely to have low birth weight children even when comparing sisters (Currie and Moretti 2007). Some studies have, nonetheless, failed to identify long-term health effects from in utero conditions on height, weight, and self-reported health status (Maccini and Yang 2009), life expectancy (Banerjee et al. 2010), or disability and chronic disease (Cutler et al. 2007). Despite these mixed findings, there remains evidence that fetal conditions can permanently alter biological systems that are important for maternal survival, including the hormonal and immune systems and iron metabolism (Chen and Parker 2004; Cooper et al. 1996; Griffin et al. 1999; Mahajan et al. 2004; Maisonet et al. 2010; Moore et al. 2006). These and other studies suggest that in utero conditions could be linked with specific risk factors for maternal deaths, such as anemia, hypertension, and susceptibility to infection.

A second pathway may involve the indirect effect of in utero conditions on educational attainment and income later in life, depicted as pathway B in Fig. 1. However, there is mixed evidence in the economics literature regarding this relationship. Some studies identify an effect of in utero conditions on educational attainment and income (Almond 2006; Almond and Mazumder 2011; Behrman and Rosenzweig 2004), while some studies fail to identify such an effect (Cutler et al. 2007; Maccini and Yang 2009). Nevertheless, both income and education have been shown to be strongly correlated with maternal mortality (Chowdhury et al. 2007).

Finally, a third pathway may exist through the effect of in utero conditions on the reproductive system, depicted as pathway C in Fig. 1. Evidence shows that females with lower birth weight, a common result of poor in utero conditions, have earlier age at menarche (Cooper et al. 1996; Maisonet et al. 2010). In SSA, pubertal onset is correlated with sexual initiation (Zabin and Kiragu 1998). Given low rates of contraceptive use in SSA (Khan et al. 2007), early pubertal onset will be correlated with a younger age at first birth, thereby exposing women to the risk of maternal death over a longer period of time (Trussell and Pebley 1984; Winikoff and Sullivan 1987). In addition, early pubertal onset limits the physical growth period, leading to shorter adult stature (Biro et al. 2001) and, theoretically, increasing the risk of obstructed labor due to smaller pelvic girth (Rush 2000). Evidence also suggests that early age at menarche increases a woman’s risk for hypertension and obesity (Biro et al. 2001; Lakshman et al. 2009), which are risk factors for maternal death.

Given the plausible mechanisms for a relationship between maternal mortality and the conditions that a woman experienced while she was in utero, this study investigates whether such a relationship exists and explores possible explanatory pathways.

2.2 Identification strategy for in utero conditions

The relationship between maternal survival and the conditions that a woman experienced while in utero is likely to be confounded by factors such as socioeconomic status, parental investments in children, and genetic endowment. Many studies in the economics literature have identified the impact of in utero conditions on outcomes later in life by relying on unexpected shocks during the in utero period. These shocks include in utero exposure to Ramadan (Almond and Mazumder 2011), in utero exposure to the influenza pandemic (Almond 2006), differential birth weight of twins which likely results from differential placement in utero (Behrman and Rosenzweig 2004), an insect infestation of grape vines in France (Banerjee et al. 2010), exposure to 9/11 in New York City while in utero (Eccleston 2011), and exposure to hurricanes while in utero (Currie and Rossin-Slater 2013).

Consistent with other studies assessing the impact of in utero conditions (Alderman et al. 2006; Kudamatsu et al. 2012; Maccini and Yang 2009), I use the level of rainfall during the in utero period as a source of exogenous variation for in utero conditions. The level of rainfall during the in utero period is predicted to independently affect in utero conditions (particularly nutrition for the fetus) which subsequently may affect future maternal survival (when the female fetus is an adult woman). While the analytical approach uses a reduced form regression to identify the relationship between the level of rainfall during the in utero period and maternal survival later in life, the underlying assumption is that rainfall affects agricultural output which affects agrarian households’ consumption and income, including the nutritional status of pregnant women in the household (and their female fetuses).

2.3 Background on rainfall and agriculture in SSA

Water is the most important input affecting crop yield for rain-fed agriculture (Woo 2010), and rain-fed agriculture makes up 96 % of crops planted in SSA (World Bank 2013). The main crops in SSA include sorghum (23 %), maize (21 %), millet (20 %), cassava (11 %), and rice (7 %) (National Research Council 2008). Within SSA, there is variation by climatic region in the types of crops planted: maize is the dominant crop in Eastern and Southern Africa, while sorghum and millet tend to be planted in drier areas and rice tends to be planted in the wetter areas of West Africa (National Research Council 2008). There is also variation, by crop, in terms of its vulnerability to the level of rainfall. Maize, for example, is very affected by droughts (Monsanto 2013), and historical trends have shown that maize yields vary more widely from year to year compared to rice and wheat yields (Hellin et al. 2012). In contrast, sorghum is drought resistant and can be grown in both temperate and tropical zones (Taylor 2003). These differences in the crops planted in different areas of SSA could create challenges in conducting cross-country analyses because of differences by region in the relationship between rainfall and crop yield. However, the analysis includes location fixed effects which control for these differences across regions. Using rainfall as an identification strategy is particularly appropriate in SSA because agricultural output is highly dependent on local rainfall due to limited land irrigation (Barrios et al. 2003). For example, while irrigated lands provide the best cultivating environment for rice and would decrease vulnerability to variations in rainfall, only 11 % of rice areas are irrigated in SSA (Oteng and Sant’Anna 1999).

Research has shown a positive relationship between the level of rainfall and crop yield in SSA, even at high levels of rainfall. Based on data aggregated from 1955 to 2004 on maize production in SSA, more rainfall is correlated with higher yield and lower variability (Woo 2010). Maccini and Yang (2009) similarly report that, even at high levels of rainfall, there is a positive relationship with rice output in Indonesia. Nonetheless, as flooding is expected to become more common in the future in SSA because of climate change, such flood-like conditions could negatively affect crop yield (Hellin et al. 2012). Similarly, crop yield will be lower than its potential when rainfall is less than the water requirement for the crop (Critchley and Scheierling 2012). Droughts are the main cause of food shortages in SSA through their impact on agriculture, including lower crop yields, livestock deaths or underproduction, and other detrimental environmental effects (Food and Agriculture Organization 2011). In summary, for the purposes of this retrospective analysis, I expect a monotonic relationship between rainfall and crop yield, given that high levels of rainfall in this context are not equivalent to flood-like conditions.

Rural, agrarian households in SSA are heavily dependent on agriculture both for income generation and consumption. In SSA, 55 % of the population is employed in agriculture (Frenken 2005). In addition, these households have limited access to consumption-smoothing mechanisms such as weather insurance, savings devices, or access to credit to cope with fluctuations in income (Collins et al. 2009; Dercon 2005). Research has shown that these populations are not able to fully smooth consumption in response to adverse events (Morduch 1995; Zimmerman and Carter 2003). In addition, findings show that mothers who experience economic shocks, such as a temporary reduction in earnings, have a higher probability of having low birth weight infants, thereby highlighting their inability to insure against such shocks (Burlando 2014). In summary, household consumption (including household members’ food intake) and income generation among these populations will be affected by the level of rainfall.

The positive relationship between rainfall and agricultural yield, even at high levels of rainfall, suggests that households will be better off (through greater food availability) when rainfall is high. Similarly, households will be worse off when rainfall is low, particularly during drought-like conditions. The level of rainfall, therefore, is predicted to affect in utero conditions primarily through its effect on food availability for the household, including food availability for pregnant women in the household and their female fetuses. There are other mechanisms through which rainfall may affect in utero conditions, including access to health services (through greater income availability), parental time allotment for their children (relative to time allotment to agricultural production), and changes in the disease environment (such as the prevalence of malaria which is affected by rainfall). However, these pathways are likely to be secondary compared to the main effect of rainfall on agricultural output and subsequent nutritional status of household members. Nonetheless, I do consider malaria as a potential alternative explanation for the main findings but do not find evidence in favor of this mechanism as a causal pathway.

Finally, it is important to keep in mind that there is also a delayed effect between when the rainfall occurs and when the household benefits from the crop yield. The delay between planting and harvesting varies according to the timing of the rainy season by climate area. As such, rainfall that occurs during the in utero period will affect the nutritional status of the fetus differently, based on the date of conception. Certain fetuses will be in utero during the rainy season but will be born by the time of the harvest. Other fetuses will be in utero during both the time of the rainy season and the time of the harvest. Robustness checks show that the results are not differentially affected after accounting for differences by cohorts according to their month of birth, in addition to controlling for location fixed effects.

In summary, the relationship between rainfall and in utero conditions is predicted to be positive; when rainfall is high, in utero conditions are expected to be more favorable for the fetus, and when rainfall is low, in utero conditions are expected to be less favorable. If there are long-term effects from in utero conditions on maternal survival later in life, then rainfall is expected to exogenously influence these in utero conditions which in turn will influence maternal survival. The proposed identification strategy therefore estimates the effect of the level of rainfall during the in utero period on maternal survival later in life.

3 Data sources and sample

3.1 Data and summary statistics

My dataset comes from the sub-sample of DHS which collected both sibling history data and GPS data for the household. I link each observation with rainfall data from the closest weather station, based on the GPS coordinates for both the household and the weather station. My final sample includes data from 14 SSA countries and consists of 365,214 observations from women born between 1944 and 1992 (see Footnote 1).

The DHS sibling history data include both data for the survey respondent herself (women ages 15 to 49 at the time of the survey) and data she provides on her siblings. While the DHS data include comprehensive individual-level data for the survey respondent, the data on the respondents’ siblings are relatively limited. They include the siblings’ sex, month and year of birth, year of death (if deceased), and whether the death occurred during pregnancy, delivery, or the postpartum period (for deceased sisters). The postpartum period is defined as the 6-week period following childbirth, reflecting the 42-day period recommended in the ICD definition (Stanton et al. 1997). The final dataset that I construct includes individual observations for each DHS respondent as well as observations for her sisters who have died and observations for her sisters who are still alive but have moved out of the household (Fig. 2). In addition, I limit the observations to respondents’ sisters who would have been 15 to 49 years old at the time of the survey. By including in the sample not only sisters who have died but also those who have moved away and by adjusting for double counting by respondents, the final dataset represents a simple random sample of each generation of sisters associated with each household (Appendix Text A1).

Fig. 2 Original DHS sample and final analytic dataset. The DHS sample consists of all women of reproductive age (ages 15–49) who currently reside in the household. For the DHS which collect sisterhood data, each respondent reports both on her sisters who have died (and whether this was during pregnancy, delivery, or the postpartum period) and her sisters who are alive but have moved out of the household. To create a simple random sample of sisters from each generation for my analytic dataset, I add observations to the original sample to represent the sisters who have moved away and the sisters who have died (based on the data provided by the DHS respondent)

I further limit the sample to the DHS which collected GPS coordinates; these coordinates identify each household’s location at the time of the survey. Since the observations’ residence at birth is not available but is necessary for the identification strategy, I make the assumption that the current location of residence proxies for residence at birth. While the DHS only collect a limited number of variables on migration from birthplace, I use these variables to conduct robustness checks which validate this assumption (see Footnote 2). I also restrict the sample to rural households because they are assumed to be most dependent on rainfall.

The rainfall data come from the GHCN precipitation data (see Footnote 3). These data are collected worldwide on a monthly basis from weather stations identified by GPS coordinates (Appendix Text A2; Appendix Table 7). Using ArcGIS software (version 2010, ESRI), I match each household to the closest weather station in the country (i.e., identified as the shortest distance “as the crow flies”). I calculate the level of rainfall during the in utero period by summing the 12 consecutive months of rainfall preceding the month and year of birth for each observation. I assume a 1-year period for simplicity because I measure the level of rainfall during other early-life years in 1-year increments. I impute missing monthly rainfall data by using rainfall data from previous months in the same weather station and rainfall data from nearby weather stations (Appendix Text A2).
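The matching and summation steps described above can be illustrated with a minimal Python sketch. It is not the study’s ArcGIS workflow: the haversine formula stands in for the “as the crow flies” distance, the function names and data layouts are hypothetical, the station dictionary is assumed to be already restricted to the household’s country, and no imputation of missing months is attempted.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearest_station(household, stations):
    """Return the id of the station closest to the household.

    household is a (lat, lon) tuple; stations maps station id -> (lat, lon).
    Assumption: stations contains only stations in the household's country.
    """
    lat, lon = household
    return min(stations, key=lambda sid: haversine_km(lat, lon, *stations[sid]))

def in_utero_rainfall(monthly_rain, birth_year, birth_month):
    """Sum rainfall over the 12 consecutive months preceding the birth month.

    monthly_rain maps (year, month) -> rainfall for one station.
    Returns None if any of the 12 months is missing (no imputation here).
    """
    birth_idx = birth_year * 12 + (birth_month - 1)  # months counted from year 0
    total = 0.0
    for k in range(1, 13):
        year, month0 = divmod(birth_idx - k, 12)
        value = monthly_rain.get((year, month0 + 1))
        if value is None:
            return None
        total += value
    return total

# Hypothetical example data, for illustration only.
stations = {"A": (0.0, 36.0), "B": (-1.3, 36.8)}
rain_b = {(1980, m): 10.0 for m in range(1, 13)}
rain_b[(1981, 1)] = 50.0
station_id = nearest_station((-1.0, 36.5), stations)
rain_for_feb_1981_birth = in_utero_rainfall(rain_b, 1981, 2)
```

In this toy example the household is matched to station "B", and a woman born in February 1981 is assigned the sum of rainfall from February 1980 through January 1981.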

In summary, the final dataset includes observations for each DHS respondent, her sisters who are alive and moved away, and her sisters who have died. For each observation, I have data on the level of rainfall during the period when the woman was herself in utero. In addition, the data show whether the woman survived infancy and childhood. If the woman reached her reproductive years, the data include whether she is still living at the time of the survey or not and, if she has died, whether the death is pregnancy related. The main outcome for the analysis is a dummy variable that equals 1 if the woman has died during pregnancy, delivery, or the postpartum period. For DHS respondents and their sisters who are alive but have moved out of the household, this dummy equals 0; for the respondents’ sisters who died during pregnancy, delivery, or the postpartum period, the variable equals 1. The main analysis therefore assesses the probability of maternal death as a function of the level of rainfall during the in utero period, comparing women who are still alive at the time of the survey (DHS respondents and their sisters who have moved out of the household) with women who died during pregnancy, delivery, or the postpartum period. The variation in the level of rainfall during the in utero period, for the identification strategy, comes from differences between the women based on their location at birth and their month and year of birth.

Summary statistics show that 1.9 % of women in the sample died during pregnancy, delivery, or the postpartum period (Table 1). This translates to an estimated maternal mortality ratio of 433 maternal deaths per 100,000 live births (Appendix Table 8). Data for other outcomes (health, income, access to care, and fertility-related outcomes) are available only for DHS respondents and not for their sisters, making it necessary to use the data from the DHS respondents as proxy measures of these outcomes for the entire sample. I present summary characteristics for both the whole DHS sample of respondents and DHS respondents between the ages of 15 and 19. This sub-sample is used in the sub-analyses in order to avoid bias due to potential selective mortality among older respondents. This younger cohort of respondents serves as a proxy for all women in the sample and their characteristics as they enter their reproductive years. Respondents ages 15–19 have an average BMI of 20.3, and 16 % of them are moderately or severely anemic (Table 2). Fifty percent have had no schooling. Among women who have delivered, 29 % delivered in a health facility and 11 % delivered with a skilled birth attendant. These women had their first child by age 16, on average. Summary statistics for the entire cohort of DHS respondents (women ages 15–49) show evidence of potential selection effects among the older cohorts of DHS respondents. On average, data for the entire cohort of respondents show higher BMI, lower rates of anemia, greater use of skilled birth attendants, and older age at first birth. However, some of these differences may also be related to aging and being part of an older cohort; the entire cohort also has lower educational attainment, a lower rate of facility-based deliveries, and a greater number of children.

Table 1 Main outcome variables for entire sample
Table 2 Baseline characteristics for sample of respondents

4 Empirical approach

I use multivariate regression analysis to identify the impact of rainfall shocks during the in utero period on the probability of maternal death later in life. Assuming i denotes the individual woman, j represents her closest weather station, k represents her month of birth, and t represents her year of birth, I estimate the following specification:

$$ m_{ijkt} = \alpha + {R_1}_{ijkt}\beta + {R_2}_{ijkt}\pi + \delta_t + \lambda_j + \eta x_{ij} + Y_i\theta + \varepsilon_{ijkt} \tag{1} $$

where the outcome, m, is a dummy variable that takes on a value of 1 if the woman died during pregnancy, delivery, or the postpartum period (see Appendix Table 9 for variable definitions). Vector R_1 is composed of three variables. The first variable measures the level of rainfall during the in utero period, defined as the 12-month period preceding the woman’s month and year of birth. This variable is measured as a z-score, whereby the level of rainfall during the in utero period is normalized using the mean and standard deviation for rainfall in the woman’s closest weather station. The mean and standard deviation of rainfall are estimated for each observation based on all rainfall observations available from the closest weather station (i.e., weather data from 1944 to 1992). The two other variables are dummy variables which measure whether the level of rainfall represents a positive or negative rainfall shock, respectively. A positive (negative) rainfall shock is defined as rainfall that is at least 2 standard deviations greater than (less than) mean rainfall in the nearest weather station. The effect of a positive (negative) rainfall shock is, therefore, calculated as the linear combination of two times (negative two times) the coefficient for the z-score plus the coefficient for the positive (negative) rainfall shock dummy variable. In the sample, the probability of a positive rainfall shock is 3.2 % and the probability of a negative rainfall shock is 1.7 % (see Appendix Table 10). The definition of a positive and negative rainfall shock is consistent with other studies, such as Kudamatsu et al. (2012), which use a 2 standard deviation threshold. I also present the analysis with rainfall data divided into bins, since this specification helps highlight which levels of rainfall matter, specifically for negative rainfall shocks.
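The construction of the three rainfall variables, and the linear combination used to obtain the effect of a shock, can be sketched as follows. This is a minimal illustration rather than the study's actual code: the function names are hypothetical, the station history is assumed to be a list of 12-month rainfall totals, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, pstdev

def rainfall_exposure(in_utero_rain, station_history, threshold=2.0):
    """Return (z_score, positive_shock, negative_shock) for one woman.

    in_utero_rain: rainfall total over the 12 months before birth.
    station_history: all available 12-month totals for the closest
    weather station (assumed input format).
    """
    mu = mean(station_history)
    sigma = pstdev(station_history)  # assumption: population SD
    z = (in_utero_rain - mu) / sigma
    positive_shock = int(z >= threshold)   # at least 2 SD above the mean
    negative_shock = int(z <= -threshold)  # at least 2 SD below the mean
    return z, positive_shock, negative_shock

def shock_effect(beta_z, beta_shock, sign, threshold=2.0):
    """Effect of a shock: (+/-)threshold * z-score coefficient + dummy coefficient.

    sign is +1 for a positive shock and -1 for a negative shock.
    """
    return sign * threshold * beta_z + beta_shock
```

With a 2 standard deviation threshold, `shock_effect(beta_z, beta_pos, +1)` reproduces the "two times the z-score coefficient plus the dummy coefficient" combination described above.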

Since rainfall may be serially correlated across years, vector R_2 is included to control for the level of rainfall during other early-life years; these years include the woman’s first 3 years of life and the year prior to when she was conceived. The pre-conception period is defined as the 12-month period preceding the in utero period (before the woman’s mother was pregnant with her). The first year of life is measured as the 12-month period starting with the woman’s month and year of birth. The second and third years of life (up to the woman’s third birthday) are defined similarly. For each of these years, the same three variables are included, namely the z-score and the two dummy variables for a positive and negative rainfall shock. δ is the year of birth fixed effect to control for time-invariant differences by year of birth, such as better average health for a cohort of women due to an early-life health intervention in a certain year. λ is the weather station fixed effect to control for time-invariant characteristics by geographic location. The weather station fixed effect serves to control for differences by weather station, such as the types of crops that are planted in different geographic locations which may differentially be affected by rainfall and, in turn, differentially affect in utero and maternal outcomes.Footnote 4
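The five 12-month exposure windows defined above can be made concrete with a small helper. This is an illustrative sketch only: the function names and the month-index representation are hypothetical, and each window is expressed as a half-open range so that the in utero period ends just before the birth month and the first year of life starts with it.

```python
def month_index(year, month):
    """Convert a calendar (year, month) into a single running month count."""
    return year * 12 + (month - 1)

def to_year_month(index):
    """Inverse of month_index: return (year, month) with month in 1..12."""
    return index // 12, index % 12 + 1

def exposure_windows(birth_year, birth_month):
    """Map each early-life period to a half-open [start, end) month range."""
    b = month_index(birth_year, birth_month)
    return {
        "pre_conception": (b - 24, b - 12),  # 12 months before the in utero period
        "in_utero": (b - 12, b),             # 12 months preceding the birth month
        "year_1": (b, b + 12),               # starts with the birth month
        "year_2": (b + 12, b + 24),
        "year_3": (b + 24, b + 36),          # ends at the third birthday
    }
```

For a woman born in March 1970, for example, the in utero window runs from March 1969 through February 1970.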

No individual-level characteristics are included because these data were not collected for the sisters of respondents. x represents the household religion and is the only household-level characteristic in the regression. Other household-level characteristics are not included because they potentially represent a causal pathway for the relationship between in utero conditions and maternal survival later in life (such as household income). Finally, Y is the vector of two variables to control for the potential effect of imputing missing rainfall data (Appendix Text A2); the first variable measures the number of months for which rainfall data are missing in the original dataset, and the second variable measures the number of months for which rainfall could be imputed. These control variables ensure that the imputation of missing rainfall data is not affecting the coefficient of interest.Footnote 5

I use a linear probability model, with robust standard errors clustered at the weather station level. The model includes DHS individual sampling weights which are adjusted to account for differences in sample size and population size by country, consistent with methodologies to analyze cross-country DHS data (Balk et al. 2003). I also conduct sensitivity analyses using maternal mortality sampling weights (Gakidou and King 2006); since there will be a lower probability of selection into the sample in high maternal mortality areas, these adjustments give greater weight to households with a higher proportion of sister deaths.
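For reference, a weighted linear probability model with standard errors clustered by weather station is commonly written as below. This is the textbook form, offered as a sketch: X (the stacked regressors of Eq. 1), W (a diagonal matrix of sampling weights), and the station-level blocks X_j, W_j, and residuals are notation introduced here, finite-sample corrections are omitted, and the text does not state the exact variant that was used.

$$ \hat{\beta} = (X'WX)^{-1}X'Wm, \qquad \widehat{V}(\hat{\beta}) = (X'WX)^{-1}\left(\sum_{j} X_j' W_j \hat{\varepsilon}_j \hat{\varepsilon}_j' W_j X_j\right)(X'WX)^{-1} $$

The sum runs over weather stations j, so residuals are allowed to be correlated among women matched to the same station but are treated as independent across stations.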

Since there are different potential pathways through which in utero conditions could affect maternal survival later in life, I then investigate these potential pathways. I use the same specification but with different outcome variables. First, I look at the potential role of selective mortality and selective fertility in mediating the main results. Selective mortality may occur if differential rates of early-life survival affect which women reach reproductive age and alter the average characteristics of the surviving cohort. Similarly, selective fertility and selective in utero survival occur if there are differential rates of conception and fetal survival due to the level of rainfall during the pre-conception and in utero period, which affect the size of the birth cohort and its average characteristics.Footnote 6 To test for potential selective fertility and selective in utero survival, I generate a variable measuring birth cohort size for each birth cohort in each weather station. I then collapse the data so that the unit of analysis is at the birth cohort-weather station level. Using a Poisson regression (suitable for count variables as outcomes), I estimate the effect of rainfall during the in utero period, as well as the pre-conception period, on birth cohort size, and I include birth cohort and weather station fixed effects.Footnote 7 Then, to assess differential early-life survival, I use my original specification where my outcomes are (1) a dummy variable for whether the female survived infancy (reached age 1), (2) a dummy variable for whether she survived childhood (reached age 5), and (3) a dummy variable for whether the female survived to her reproductive years (reached age 15).
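The collapse to the birth cohort-weather station level can be sketched as a simple count. This is illustrative only: the record format and function name are hypothetical, and a birth cohort is assumed here to mean a birth year, which the text does not specify. Fitting the Poisson model itself would require a statistics package and is not shown.

```python
from collections import Counter

def birth_cohort_sizes(women):
    """Count women in each (weather station, birth year) cell.

    women: iterable of dicts with hypothetical keys "station" and
    "birth_year". The result holds one count per cell, which would
    serve as the outcome in a count regression.
    """
    return Counter((w["station"], w["birth_year"]) for w in women)
```

Cells with no observed births do not appear in the result; whether such cells should enter the regression as zeros is a modeling choice the sketch leaves open.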

Second, I look at the impact of in utero conditions on key adult outcomes in order to identify potential pathways linking in utero conditions to maternal survival. These outcomes include the woman’s health, education, income, access to care at delivery, and fertility indicators. For these analyses, I must limit the sample to DHS respondents (thereby excluding the sisters they report on) because comprehensive individual characteristics are only available for the respondents. I also limit the sample to respondents between the ages of 15 and 19 to proxy for the characteristics of the entire sample as they enter their reproductive years. Due to potential selective mortality over time, older survey respondents may not be representative of the entire sample of sisters because they have survived multiple pregnancies and are likely to be healthier.

5 Main findings, causal pathways, and robustness checks

The main results of the study (Table 3) show that a positive rainfall shock during the in utero period decreases the probability of maternal death later in life by 1.1 percentage points (p < 0.01).Footnote 8 This effect represents a 58 % decrease from a mean of 1.9 % in the sample. The main results fail to identify a statistically significant effect from a negative rainfall shock during the in utero period (see Appendix Table 11 for a full set of coefficients reported).Footnote 9 When control variables are included for rainfall during other early-life years, the effect of a positive rainfall shock during the in utero period remains robust. The consistency of this effect, even after controlling for rainfall during other early-life years, suggests that this main effect is identifying the influence of conditions during the in utero period, independent from conditions during other early years of life. The analysis fails to identify statistically significant effects from levels of rainfall that are relatively close to mean local rainfall.
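The relative effect is the percentage-point estimate divided by the sample mean of the outcome, using the two figures reported above:

$$ \frac{1.1}{1.9} \times 100 \approx 58\ \% $$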

Table 3 Effect of rainfall during in utero period on maternal deaths later in life

In analyses grouping the level of rainfall by bins, where the level of rainfall is grouped into 1 standard deviation increments and compared to the omitted category (0–1 standard deviation of rainfall greater than the mean), the effect of a positive rainfall shock during the in utero period remains consistent (Table 4). Relative to women who experienced rainfall between 0 and 1 standard deviation greater than the mean, those who experienced rainfall between 2 and 3 standard deviations greater than the mean have a 0.9 percentage point reduction in the probability of maternal death (p < 0.10), representing a 47 % relative decrease. Women who experienced rainfall 3 or more standard deviations greater than the mean had a larger decrease in the probability of maternal death, a 1.6 percentage point decrease (p < 0.05) representing an 84 % decrease. There also appears to exist a protective effect of rainfall of at least 3 standard deviations less than mean local rainfall (relative to women who experienced rainfall 0–1 standard deviation greater than the mean). These women have a 1.5 percentage point reduction in the probability of maternal death later in life (p < 0.01), representing a 79 % decrease. Overall, the effect of a positive rainfall shock remains consistent across specifications and demonstrates that a positive rainfall shock during the in utero period (whether defined as 2 or 3 standard deviations greater than mean rainfall) is protective of maternal survival. There is also suggestive evidence that very low rainfall (at least 3 standard deviations less than the mean), likely drought conditions, appears to reduce the probability of maternal death.
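The binned specification can be illustrated by assigning each rainfall z-score to a 1 standard deviation bin. This is a sketch under stated assumptions: the function name and bin labels are hypothetical, and the handling of values that fall exactly on a bin boundary is assumed, since the text does not describe it.

```python
import math

OMITTED_BIN = "[0, 1) SD"  # reference category: 0-1 SD above the mean

def rainfall_bin(z):
    """Return a label for the 1-SD bin that contains the z-score z."""
    if z >= 3:
        return "3+ SD above"  # open-ended top bin
    if z <= -3:
        return "3+ SD below"  # open-ended bottom bin
    lower = math.floor(z)
    return f"[{lower}, {lower + 1}) SD"
```

Each label other than `OMITTED_BIN` would correspond to one dummy variable in the regression, so every coefficient is read relative to women whose in utero rainfall was 0 to 1 standard deviation above the local mean.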

Table 4 Effect of rainfall during in utero period on maternal deaths later in life

Subsequent analyses assess whether selection effects, from differential early-life survival, can explain the main findings. Table 5 shows that there is no detectable effect from either a positive or negative rainfall shock while in utero on these women’s own survival as infants or as children. Additionally, there is no detectable effect from rainfall shocks during the in utero period on fetal survival, as measured by birth cohort size. Rainfall shocks during the pre-conception period also have no detectable effect on cohort size, a proxy for conception rates. These results imply that the improvements in maternal survival, as a function of both positive and negative rainfall shocks while in utero, cannot be explained by selective mortality early in life or selective fertility among these women’s mothers.

Table 5 Effect of rainfall during in utero period on birth cohort size, infant deaths, and childhood deaths

The next set of analyses investigates potential causal pathways that could explain the study’s main findings, using the sub-sample of 15–19-year-old respondents. The results (Table 6) show that a positive rainfall shock while in utero reduces the probability of moderate/severe anemia by 9.9 percentage points, representing a 52 % decrease from a mean of 19 % in the sample. A positive rainfall shock while in utero also reduces BMI by 1.1 (p < 0.01), representing a 5 % decrease from a mean of 20.5 in the sample. This effect is driven by a reduction in weight by 2.7 kg, representing a 5 % decrease from a mean weight of 50.4 kg in the sample. There is no statistically significant effect on height. The analyses fail to identify detectable effects from a positive rainfall shock while in utero on education, wealth, access to care at delivery, or fertility-related outcomes. There are no detectable effects on any of these outcomes from a negative rainfall shock while in utero. Because of limitations with the data (both the absence of individual characteristics for sisters of respondents and the limited sample size for respondents ages 15–19), there may be other potential causal pathways that exist but could not be detected, given a lack of statistical power in this sub-sample.

Table 6 Effect of rainfall during in utero period on adult health, socioeconomic, access to care, and fertility outcomes (15–19-year-old respondents)

5.1 Robustness checks

I conduct a number of sensitivity analyses to assess the robustness of these results. First, I test whether migration from birthplace affects the main findings. The identification strategy relies on rainfall from the location while in utero, whereas the DHS GPS coordinates represent the current location of women’s residence (i.e., their current home). Migration from birthplace would only affect the results if it was far enough from a woman’s birthplace that she would have been matched to a different weather station. In the literature, there is little quantitative data available on the rate or distance of rural-to-rural migration in SSA, especially compared to rural-to-urban migration data (International Organization for Migration 2006). In my main specification, I have restricted the sample to rural households, which should reduce the probability of long-distance migration since urban residents are likely to migrate longer distances. The only available variable in the DHS related to migration asks respondents to report the number of years they have lived in their current residence (i.e., their current physical housing structure). When I limit the sample to respondents who report having always lived in their current residence, the effect of a positive rainfall shock while in utero is no longer statistically significant, likely because the sample is reduced by 51 % which limits statistical power (Appendix Table 12). Among women who have migrated (i.e., they reported not having always lived in this current location), the effect of a positive rainfall shock remains consistent; it reduces the probability of maternal death by 1.2 percentage points, a 69 % decrease (p < 0.01). When I test for the equality of coefficients for a positive rainfall shock, comparing the coefficient for women who have never migrated and those who have migrated, I fail to reject that these coefficients are equal to each other. 
This test suggests that using current residence as a proxy for residence while in utero is a reasonable assumption for the main specification, although a failure to reject equality may also partly reflect limited statistical power.
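One common way to compare a coefficient across two subsamples is a two-sided z-test on the difference. The sketch below is an assumption about how such a test could be run, not a description of the test actually used in the study: the function name is hypothetical, and it treats the two estimates as independent, which is a simplification when both subsamples share weather stations.

```python
import math

def coefficient_equality_test(b1, se1, b2, se2):
    """Two-sided z-test of H0: b1 == b2 for two subsample estimates.

    Assumes the estimates are independent, so the variance of the
    difference is the sum of the two squared standard errors.
    Returns (z_statistic, p_value).
    """
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

A large p-value means the data cannot distinguish the two coefficients, which is weaker than showing that they are the same.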

Second, I test whether timing of birth (by birth season) affects the results. Evidence has shown that parents may time births according to seasons because of differences in disease prevalence and economic conditions between the rainy and dry seasons in SSA (Artadi 2005). Individuals who time births according to the rainy and dry seasons may be different, on average, than individuals who do not time their births. I include a birth season fixed effect which equals 1 if the birth occurred during the rainy season for that country.Footnote 10 The main results are not affected by the inclusion of this variable (Appendix Table 13). In addition, the rainy season dummy is not statistically significant, meaning that maternal survival later in life is not differentially associated with whether a woman was born during the rainy or dry season.

Third, I use maternal mortality sampling weights in addition to the individual sampling weights. A major complication from using DHS data is that sample selection is done ex-post, after certain women in the household may have died. This ex-post sampling creates two problems: (1) households with high mortality rates will be less likely to be selected and (2) households where all siblings have died will never be selected into the sample. In the study of Gakidou and King (2006), sampling weights correct for this first potential source of maternal mortality bias by giving more weight to households with higher mortality rates; the sampling weights cannot correct for the second potential source of bias where no women have survived to reproductive age.Footnote 11 The main results for a positive rainfall shock are not affected by including maternal mortality sampling weights in the specification (Appendix Table 14).
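The survival-based weight proposed by Gakidou and King (2006) is usually described as the ratio of siblings ever born to siblings surviving at the time of the survey. The sketch below assumes that form: the function name is hypothetical, and how the weight is combined with the DHS individual sampling weight in this study is not specified in the text, so that step is not shown.

```python
def sibship_survival_weight(siblings_born, siblings_surviving):
    """Weight for one sibship: siblings ever born / siblings still alive.

    Sibships with more deaths receive a larger weight. A sibship with no
    survivors can never be sampled, so no weight can be defined for it.
    """
    if siblings_surviving <= 0:
        raise ValueError("sibships with no survivors are never sampled")
    if siblings_surviving > siblings_born:
        raise ValueError("survivors cannot exceed siblings born")
    return siblings_born / siblings_surviving
```

A sibship of four with two survivors receives a weight of 2, while a sibship with no deaths receives a weight of 1. The `ValueError` for zero survivors mirrors the second source of bias noted above, which the weights cannot correct.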

Fourth, to increase the precision of my coefficients, I include control variables for rainfall during the year of the most recent delivery (defined similarly to rainfall during the in utero period). I restrict the analysis to respondents, for whom I have the year of the most recent delivery, and sisters who have died, for whom I use the year of death. Sisters who have moved away are excluded from this analysis, since I do not have data on their most recent delivery. The main results for a positive rainfall shock remain robust; a positive rainfall shock decreases the probability of maternal death by 5.7 percentage points (p < 0.01), representing a 95 % decrease from a mean of 6 % (Appendix Table 15). This analysis highlights that a positive rainfall shock during the year preceding the most recent delivery also decreases the probability of maternal death by 2.6 percentage points (p < 0.01), representing a 43 % decrease from a mean of 6 %. There is no detectable effect from a negative rainfall shock during this period.

In summary, these robustness checks confirm that the main findings are consistent across different specifications.

6 Limitations

This study is the first to assess the relationship between in utero conditions and maternal survival later in life. While this study identifies effects from a positive rainfall shock during the in utero period on maternal survival later in life, there remain important unanswered questions regarding this relationship that this study is unable to answer.

One of the study’s main limitations is the absence of intermediate outcomes for the entire sample. Such intermediate outcomes could help directly demonstrate how rainfall during the in utero period affects household well-being, including that of pregnant women and their female fetuses. The study’s findings are consistent with the predicted positive relationship between rainfall and crop yield, even at high levels of rainfall, in the context of SSA. Additionally, other studies similarly fail to find evidence of detrimental effects from even very high levels of rainfall (Kudamatsu et al. 2012; Maccini and Yang 2009).

An alternative explanation for the protective effect of a positive rainfall shock could relate back to possible early-life selection effects. Specifically, this would occur if high levels of rainfall during the in utero period are detrimental, rather than beneficial, resulting in selective fertility, selective fetal survival, and selective early-life survival. In SSA, malaria could be a contributing factor and an alternative explanation for these findings. Malaria transmission peaks when rainfall is high (Kent et al. 2007; Odongo-Aginya et al. 2005). In addition, Plasmodium falciparum malaria can increase the chance of miscarriage (McGready et al. 2012). The effect of malaria, especially in relation to rainfall, is an important consideration in Sub-Saharan Africa. However, in order for malaria to be a mediating factor in this study’s observed effects (i.e., high levels of rainfall improving maternal survival), there would need to be early-life selection which leads to a healthier cohort of women who survive conception, birth, and early childhood. While there is some evidence that such effects may exist (Kudamatsu et al. 2012), these findings are based on a different identification strategy than my study and use the combination of rainfall and temperature data to identify malarial conditions. Nevertheless, I find no evidence of early-life selection effects in my study sample. There is no detectable effect on birth cohort size, a proxy for miscarriage, nor is there an effect on early-life survival, such as among the children under 5 years of age who are most at risk of dying from malaria.

The absence of statistically significant effects from a negative rainfall shock during the in utero period is somewhat surprising, because droughts are a major contributor to food shortages in SSA. In an alternative specification, I do find that rainfall at least 3 standard deviations less than mean rainfall affects maternal survival but in the opposite direction from what would be expected. These results suggest that very low levels of rainfall during the in utero period are protective of maternal survival later in life. In results not shown, I find that very low rainfall during the in utero period (i.e., 3 standard deviations less than the mean) also increases the probability of facility-based deliveries. This link suggests that these women may have higher-risk pregnancies and deliveries and, as a result, are more likely to be referred or transferred to a facility for delivery, which could explain the protective effect on maternal survival. However, lack of data on these intermediate pathways means that I am not able to draw strong conclusions about these specific mechanisms. Another potential explanation for the absence of an effect from a negative rainfall shock could relate back to positive selection. While I fail to find evidence of positive selection, my sample will not include generations of sisters where all sisters have died and the use of the maternal mortality weights will not correct for this potential source of bias. This means that there may be positive selection which we would observe if we could compare the analytic sample (generations of sisters where at least one sister has survived) to the sample of all generations of sisters. Sisters in the analytic sample may be more resilient to negative rainfall shocks, particularly because sisters’ survival rates may be correlated within household, and this could explain the absence of detectable effects from a negative rainfall shock. 
Further investigation is warranted to better understand these two potential interpretations.

Overall, I am significantly limited by my sample in terms of drawing strong conclusions about the intermediate pathways explaining the main effects. I restrict these sub-analyses to the youngest cohort of respondents to avoid potential selection. Indeed, there is some evidence of better outcomes for the older cohorts of respondents, which justifies using only the youngest cohort of respondents to generate unbiased intermediate outcomes for the entire sample. However, this restriction significantly limits the statistical power for assessing causal pathways and means that other causal pathways may exist.

Another limitation of this study is the absence of GPS data on the observations’ location while in utero, since the DHS GPS data only reflect the household’s current location. This introduces potential measurement error into matching observations with weather stations, if migration is far enough such that an observation would have been matched with a different weather station. As one of my main robustness checks, I assess whether the main findings are different when I compare observations who report having always lived in the current location with observations who report having moved from their birthplace. Since I fail to reject that the coefficients of interest are equal, this suggests that using current residence as a proxy for residence while in utero, at least in this sample, is a valid assumption for the main specification. In addition, we would expect migration from birthplace to attenuate the coefficient of interest since better off women from low rainfall areas would be more likely to be able to migrate to high rainfall areas, thereby attenuating the beneficial effect of a positive rainfall shock.

One potential challenge with the data is that it relies on respondents’ self-reports for birth month, and birth month is used in the identification strategy to identify the in utero period. While there is the possibility of measurement error, it is difficult to distinguish between measurement error and seasonality of birth months. Recent research using DHS data for births in SSA from the 1980s onwards found differences in birth seasonality by regions of SSA. For example, West and Central Africa have unimodal birth month distributions compared to other SSA countries with a bimodal or a flat distribution (Dorélien 2013). In contrast to these data, my dataset represents births as early as 1944 and birth month seasonality may have changed. While there may be measurement error, particularly when respondents are reporting their sisters’ birth months compared to their own, the measurement error should be random noise, since it is not plausibly associated with the level of rainfall during the in utero period. In some cases, months during the post-birth period will be included as part of the in utero period. This should attenuate the main effect since this period was not found to affect the outcome of interest. In other cases, additional months during the pre-conception period will be included; there is suggestive evidence that this period also influences maternal survival and means that the main effect could be identifying some influences coming from this earlier period. However, it is not possible to distinguish between measurement error and variation in birth month based on concurrent factors shown to influence birth timing, such as social factors, disease environment, and climatic and energetic factors (such as food availability) and demands on labor (Artadi 2005; Dorélien 2013).

Finally, to maximize the likelihood of identifying detectable effects, I combine datasets across SSA countries. The results provide estimates of the average effect of rainfall during the in utero period, controlling for differences by climate region through the weather station fixed effect. However, further country-level analyses would be beneficial to understand the relationship between rainfall and maternal survival, given country-level differences in agriculture. Such analyses are challenging, given that maternal mortality is a low probability outcome (statistically speaking) requiring large sample sizes for sufficient statistical power.

7 Discussion and conclusion

This study identifies a causal relationship between the conditions a woman experienced while in utero and her maternal survival later in life. The effect of a positive rainfall shock while in utero suggests that, in addition to factors at the time of pregnancy and delivery, conditions during the in utero period also play an important role influencing maternal survival in these SSA countries. The effects are driven by extreme conditions, rather than conditions closer to the mean, suggesting that maternal survival is affected by in utero conditions only when these conditions are substantially better than average conditions within a local weather area. There is also suggestive evidence that drought-like conditions affect maternal survival later in life.

Better in utero conditions appear to improve maternal survival later in life through their effect on anemia. This is a plausible pathway, given that anemia is a major risk factor for postpartum hemorrhage which is the leading cause of maternal death (WHO 2005; Balarajan et al. 2011). Since the relationship does not appear to be mediated through an effect on education or income, it suggests that there may be permanent programming of adult anemia consistent with the fetal origins hypothesis. While there is no known research applying the fetal origins hypothesis to adult anemia, medical evidence shows that premature and underweight babies, conditions which can proxy for a poor in utero environment, have altered hematological systems and iron metabolism early in childhood (Griffin et al. 1999). It is plausible, though not demonstrated, that these changes in childhood anemia could have permanent effects into adulthood, particularly if childhood anemia is not corrected.

Another plausible way that a positive rainfall shock while in utero could affect maternal survival is through adult BMI. Adult BMI influences cardiovascular risk factors, such as hypertension, which would increase the likelihood of pregnancy-induced hypertension. The relationship between better in utero conditions and lower BMI is consistent with studies showing that low-birth weight babies and premature infants have higher rates of obesity later in life due to early-life catch-up growth (Casey et al. 2012). Other studies have also shown that poor in utero conditions are associated with elevated blood pressure in adulthood (Hult et al. 2010). Even though this sample is not overweight, according to BMI, there is evidence of a higher than expected prevalence of metabolic risk factors, independent from BMI, among poor rural populations (Jesmin et al. 2012). These results appear to be consistent with the fetal programming hypothesis, since the effect on adult BMI does not appear to be mediated through other factors such as education or income.

By investigating a previously unexplored question relating in utero conditions to maternal survival later in life, this study expands the field of research on fetal programming and the Barker hypothesis and points towards the need for further research to elucidate the unanswered questions generated by this study. Research is needed to better understand the direct pathway for the effect of rainfall shocks while in utero on fetal health and long-term health outcomes as well as to identify whether other causal pathways related to maternal mortality exist that could not be identified by this study. While this study examines whether fetal conditions have any long-term effects on maternal survival, there may be differential effects depending on whether a positive rainfall shock occurs during different trimesters. Other research has shown that, while fasting during the first and second trimesters matters for birth weight, longer-run outcomes related to disability are affected by fasting during the first month after conception (Almond and Mazumder 2011). Similarly, another study shows that exposure to the influenza pandemic in 1918 in the USA affected poverty-related outcomes if it occurred during the first trimester (Almond 2006). A review of maternal nutrition and birth outcomes found that iron deficiency during the first trimester results in fetal growth restriction, as compared to iron deficiency due to anemia during the second or third trimester (Abu-Saad and Fraser 2010). Future research examining differential effects by trimester would not only highlight which trimester matters most for maternal survival later in life but could also help elucidate which causal pathway accounts for the study’s main findings. In addition, follow-up research on the effect of the pre-conception period on later in life outcomes, such as maternal mortality, is also warranted, given suggestive evidence that this period also matters.

In summary, the findings imply that current efforts to improve maternal survival in SSA, which focus on interventions during pregnancy and at the time of delivery, may be limited by the consequences of relatively poor in utero environments previously experienced by these women. Although the findings cannot point to specific policy interventions, they suggest that we are underestimating the benefits of interventions that improve conditions for pregnant women, such as conditional cash transfers for pregnant women, food and prenatal supplements, land irrigation, savings and insurance mechanisms, female education, and employment programs. Interventions which improve conditions for pregnant women will have inter-generational effects: not only will such interventions directly benefit pregnant women today, but they will also improve their daughters’ chances of maternal survival later in their lives.