1 Introduction

Natural disasters and extreme climate events affected over 1.9 billion people in developing countries between 2003 and 2013 and caused over USD 494 billion in financial losses (FAO 2015). River floods alone affect about 21 million people every year, largely in developing countries (World Resources Institute 2015). Given that climate change is expected to increase the probability of these extreme climate events (IPCC 2007) and that agriculture is highly sensitive to climate variability (IPCC 2014), this presents a serious barrier to farmers in poor countries: extreme weather shocks raise the probability of harvest loss and may cause underinvestment in farm management because of risk aversion (Dorward and Kydd 2004; Emerick et al. 2016). These recurrent extreme events may affect Africa more than other regions, since 82% of the poor live in rural areas (Beegle and Christiaensen 2019) and, in countries such as Kenya or Tanzania, 60% of the labor force works in agriculture and is vulnerable to extreme weather shocks. Furthermore, those events remain difficult to predict for a given location, making it challenging and costly for households to adopt ex ante risk-reduction methods to reduce the negative impact.

Tanzania is exposed to recurrent natural disasters, in particular multiple destructive flood events (Erman et al. 2019),Footnote 1 and studying flood impacts there can provide insights into how households and individuals in the region are affected. It can also identify effective channels that help affected households recover after a loss in assets (i.e., crops, livestock, machinery, irrigation systems, public infrastructure, supply chains). Moreover, the increased frequency of disasters in the region due to higher climate variability might reduce the country’s probability of achieving the Sustainable Development Goals (SDGs) of promoting sustainable agriculture, reaching food security, and improving nutrition.

In this paper, we do two things. First, we use satellite-based flood exposure data and a nationally representative panel survey in Tanzania to estimate the causal effects of two successive large floods that occurred in 2009 on agricultural households’ value of crop production, income, total expenditures, individual life satisfaction, and child nutrition in a difference-in-differences framework. We use night-time light intensity to assess whether flooded wards in Tanzania have pre-trends in their outcomes similar to those of comparison wards.Footnote 2

Second, we perform a subgroup analysis to test for heterogeneous effects of floods across different population groups. For example, we investigate whether the effects vary across smallholder farmers versus large holders, across households that received any transfer versus not at baseline, and whether households living in villages located in or near forests are disproportionately affected relative to their counterparts that are farther from forests. The results can be helpful to identify the most affected groups to target quickly during post-disaster interventions.

These results are related to several strands of literature. A growing body of research examines vulnerability and resilience in the face of various stressors and extreme events and describes the characteristics of vulnerable and resilient subpopulations (Ligon and Schechter 2003; Lybbert et al. 2004; Barrett and Constas 2014; Cissé and Barrett 2018). For example, households in Bulgaria with an employed and educated male head are less vulnerable to aggregate shocks than households with different characteristics (Ligon and Schechter 2003). There is also a long-standing literature suggesting that poor households in developing countries often face a high risk of shocks, income variability, and incomplete markets such as insurance markets, and rely on different strategies to smooth their consumption, including savings, loans, or safety nets (Townsend 1994; Udry 1995; Morduch 1995).

The body of literature investigating the consequences of climate change has mostly looked at the effects of weather variables such as rainfall and temperature on several outcomes including agriculture (e.g., Schlenker and Roberts 2009), education (Maccini and Yang 2009), health (Deschênes and Greenstone 2011), and economic growth (Dell et al. 2012). However, less work has quantified the economic impacts of natural disasters. A few studies look at the effects of storms and cyclones and report conflicting results, which range from positive impacts on human capital accumulation (e.g., Skidmore and Toya 2002) to large negative effects in the short run (Hsiang and Jina 2014). For example, Anttila-Hughes and Hsiang (2013) look at the effects of 411 typhoons that occurred in the Philippines between 1979 and 2008 on economic and health outcomes, and find an increase in female infant mortality, but also that income and asset loss caused a reduction in expenditures primarily in health and human capital. Natural disasters often cause negative and persistent direct and indirect effects, including human and animal death, trauma, asset destruction, reduction in consumption levels and underinvestment, rural poverty (Long 1978; Sivakumar 2005; Cavallo et al. 2010; Del Ninno et al. 2003), decreases in nutritional status and in investment in education and health, and income loss. In addition, individuals from emerging countries suffer most from the occurrence of those extreme events (Bakkensen and Mendelsohn 2016).

Floods are a class of disasters that affected more people than any other disaster between 1994 and 2013 worldwide (CRED 2015), and a growing number of studies analyze their effects on household and individual-level outcomes. The evidence from these studies suggests that households exposed to natural disasters often experience a negative change in their outcomes including agricultural yields, income, expenditures, welfare, subjective well-being, and are more likely to migrate as a coping mechanism to compensate for the loss (del Ninno et al. 2001; Guiteras et al. 2015; Michler et al. 2019; McCarthy et al. 2018; Alvi and Dendir 2011; Giannelli and Canessa 2021; Gröger and Zylberberg 2016). For example, del Ninno et al. (2001) use cross-sectional data to analyze the effects of large floods in Bangladesh that occurred in 1998 and find that households exposed to floods experienced crop loss of more than 42%. Nutritional status and food consumption were not severely affected due to transfers from NGOs and private-sector borrowing. Additionally, individuals working in the labor market experienced a decline in their participation after the shocks. Similarly, McCarthy et al. (2018) analyze the impacts of flooding in Malawi that occurred during the 2014/2015 agricultural season on households’ food consumption; using an instrumental variable approach, the authors find a significant reduction in crop yields but a mild impact on caloric intake and food consumption. Floods also negatively impact individual subjective well-being in the long run, and those effects are not fully eliminated over time (Hudson et al. 2019).

Our results are consistent with the literature and highlight high levels of vulnerability to shocks among agricultural households. Specifically, we find that Tanzanian households that were living in areas affected by large floods experienced a statistically significant 34% drop in the value of their crop production. The reduction is large (39%) 1 year after the shock (short run), but the effect becomes insignificant 3 years later (long run), although a t-test of the difference between the two coefficients is insignificant.Footnote 3 In contrast, the effects on total household expenditures and child nutrition are not statistically different from zero. Yet, we do find evidence of a significant and persistent decrease in individual satisfaction and other negative psychological effects. The results across subgroups, however, show that households that received some transfer income from the government, NGOs, or remittances were able to attenuate the negative effects of floods, while those with no transfer income experienced a larger decrease in their household expenditures. This result suggests that disaster relief efforts are important to improve the welfare outcomes of affected households and individuals after a disaster. Lastly, we find that forests are an important recovery mechanism, since households living in clusters near any type of forest experience a smaller drop in their crop production than those living in clusters that are far from forests.

The results are robust to a variety of checks and specifications including a standard Two-way Fixed effects Difference-in-Differences, a doubly-robust estimation, and other types of matching methods during pre-processing such as entropy balancing. We also implement Conley spatially adjusted standard errors at 10 km, 50 km, and 100 km respectively, and show that the estimates are still significant when accounting for these different cutoffs at which spatial dependence is assumed to be zero. We find no evidence of selective attrition in the sample as a response to the exposure in a way that may bias the results. We also perform a sensitivity check following Oster (2019) to assess the role of unobserved confounders and find that the results are robust to omitted variable bias.

Our work is closest to two recent papers by Baez et al. (2019) and Deryugina et al. (2018). Baez et al. (2019) analyze the impacts of three different extreme weather events (i.e., floods, cyclones, and droughts) in Mozambique using a triple differences framework. The authors show that households affected by any of the three events experience a reduction in per-capita food consumption, non-food consumption, and the likelihood of owning assets. Deryugina et al. (2018) investigate the individual-level economic impacts of Hurricane Katrina in the United States. Using an inverse propensity weighting approach with fixed effects, the authors find that Hurricane Katrina impacted individuals’ residential locations, caused an increase in short-run unemployment claims, and a decrease in labor market income.

The contribution of the present paper is threefold. First, it contributes to the growing literature exploring the impacts of natural disasters in Sub-Saharan Africa on household-level outcomes; it is one of the few studies that provide evidence on the impact of large-scale floods in Africa on household and individual outcomes using panel data. Second, it highlights the critical role that safety nets played in households’ recovery, contributing to the literature on the impacts of safety nets. Third, it contributes to the growing literature assessing the psychological effects of the distress and trauma caused by exposure to natural disasters. It is among the first papers to show how the mental well-being of Tanzanian residents is affected after exposure to disasters.

The remainder of the paper is structured as follows. Section 2.2 provides additional background on the floods and describes the data sources used in the analysis. Section 2.3 presents the estimation strategy and discusses the potential challenges in estimating the effects of floods. Section 2.4 presents the main findings. It also discusses mechanisms for the main findings, analyzes the heterogeneity effects, and presents some robustness checks and limitations of the findings. Section 2.5 concludes.

2 Background and Data

2.1 The 2009 Large Floods in Tanzania

Many regions in Tanzania experienced large-scale floods caused by heavy rains from November 2009 through January 2010. These rains are believed to be a consequence of El Niño in East Africa. The floods damaged infrastructure including roads and houses, leaving thousands homeless. The first large flood started on November 10, 2009; it affected an area of 194,788 square kilometers, and the heavy rains lasted three days (Fig. 1). The second large flood started more than a month later, on December 25, 2009, affected around 167,332 square kilometers, and lasted several weeks (Fig. 1). Multiple regions in Tanzania were affected, including Morogoro, Dodoma, and Ruvuma. Although flooding is a recurrent event in the country, with one flood per year on average, most floods that occurred from 2000 to 2007 affected areas ranging from 238 to 75,937 square kilometers and were not as large as those that occurred in 2009.Footnote 4 Both floods in November and December of 2009 affected almost the same regions in Tanzania, and approximately 50,000 people were affected.Footnote 5 The 2009 floods caused injuries and damaged local infrastructure such as schools and health facilities. These shocks (treatments) occurred after the completion of the first wave of the panel survey data used in this paper, which allows us to use that wave as a baseline survey and exposure to both floods as the treatment.

Fig. 1 Distribution of the survey enumeration areas (points) along with November and December floods (polygons)

Figure 1 shows two maps of Tanzania and the distribution of the households, with some households affected by floods during November and December. The polygon shapes, which are produced by the Dartmouth Flood Observatory (DFO), represent the flood water extent during its whole period of occurrence, while the points represent the point coordinates of the survey clusters.

The DFO uses several flood detection tools (e.g., NASA Earth Observatory, MODIS Land Rapid Response System, or Tropical Rainfall Measuring Mission data) and works with different partners worldwide (e.g., news agencies, government agencies, flood responders, and other data agencies) to discover and locate floods. MODIS is an instrument aboard NASA's Terra and Aqua satellites that scans the surface of the Earth every few days, recording reflectance values over 36 bands in the visible and infra-red spectra. The flood detection system uses satellite-based, remote-sensing tools to monitor floods (or surface ground-water discharge above a specific threshold) over the globe daily.

The initial satellite flood data are generated at the NASA Goddard Space Flight Center (GSFC) using reflectance products in a fully automated manner and with a data lag time of only a few hours; the resulting raw product is then transmitted to the DFO as it is produced, and the DFO manually converts the flood water extent into GIS polygon (i.e., shapefile) form. The daily global coverage is done at 250 m spatial resolution only for floods that affect areas larger than about 0.5 km in width (Brakenridge 2012). This high spatial resolution minimizes measurement error in mapping the extent of the areas affected by the floods.

2.2 Treatment and Control Groups Construction

We construct the dataset of treated households, that is, the households affected by the floods, by combining the geo-referenced household data from the Living Standards Measurement Study - Integrated Surveys on Agriculture (LSMS-ISA) survey for Tanzania with the shapefiles of the flood water extent. We then create a dummy variable that takes the value of one (1) if the household coordinates fall completely into the polygon shape of the flood extent, and zero (0) otherwise. Since we are studying the effects of both floods instead of just one, we consider only households living in enumeration areas that experienced both floods in November and December, and remove from the analysis households affected only once (186 households or two percent of the sample).Footnote 6 Many studies investigating the impacts of flooding construct the treatment by averaging satellite measures of rainfall intensity instead of using actual shapefiles of flood extent as done in this paper. Using rainfall as a proxy for flooding has limitations because rainfall may be an imperfect proxy given additional factors such as topography, slope, or distance to rivers (Chen et al. 2017).
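The treatment assignment described above can be sketched as follows. This is a minimal illustration in Python, not the code used for the analysis: the helper names `point_in_polygon` and `flood_status`, the planar coordinates, and the rectangular flood extents are all hypothetical stand-ins for a GIS query of the LSMS-ISA household coordinates against the DFO shapefiles.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    `polygon` is a list of (x, y) vertices; this is an illustrative
    stand-in for a GIS point-in-polygon query on a flood shapefile.
    """
    inside = False
    n = len(polygon)
    for k in range(n):
        x1, y1 = polygon[k]
        x2, y2 = polygon[(k + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def flood_status(x, y, november_flood, december_flood):
    """Return 1 (treated: inside both flood extents), 0 (never flooded),
    or None (inside exactly one extent, dropped from the analysis)."""
    in_nov = point_in_polygon(x, y, november_flood)
    in_dec = point_in_polygon(x, y, december_flood)
    if in_nov and in_dec:
        return 1
    if not in_nov and not in_dec:
        return 0
    return None


# Hypothetical flood extents (made-up coordinates, not DFO data)
november_flood = [(0, 0), (4, 0), (4, 4), (0, 4)]
december_flood = [(2, 0), (6, 0), (6, 4), (2, 4)]

households = {"A": (3, 2), "B": (1, 2), "C": (8, 8)}
status = {hh: flood_status(x, y, november_flood, december_flood)
          for hh, (x, y) in households.items()}
```

In this toy example, household A lies inside both extents and is treated, household C lies outside both and is a potential comparison unit, and household B is inside only the November extent and would be dropped.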

To infer the causal effects of floods correctly, a good comparison group is required. One could use all unaffected households as a possible comparison group, but if significant differences exist between flooded and non-flooded households before the survey period, then the coefficient estimates will be biased. For example, households living near rivers or water bodies have a higher likelihood to be flooded and households in elevated villages will be less likely to experience flooding. Therefore, one needs to account for the selection on observables and unobservables in the estimation strategy.

To minimize the possibility of differences between groups before the shock, which would make households different in their likelihood of being flooded, we use the kernel propensity score weighting approach to find a comparable counterfactual to the flooded households based on a set of covariates. This approach approximates the parallel trends assumption between treatment and comparison groups, which generally allows doing a simple difference-in-difference.

To select the variables to include in the propensity algorithm that will satisfy the conditional independence assumption, Smith and Todd (2005) suggest that one includes only variables that influence simultaneously the participation decision and the outcome variable.

There exist several balancing methods that use matching or reweighting data to increase the balance between a set of treated and control units and allow causal inference. Among those methods, there are propensity score matching (PSM), nearest neighbor matching (NNM), Mahalanobis distance matching, inverse propensity weighting (IPW), genetic matching, and recently Coarsened Exact Matching (Iacus et al. 2012), entropy matching (Hainmueller 2012) and inverse propensity weighting using covariate balance propensity scores (CBPS-weights, Imai and Ratkovic 2014). All these preprocessing methods approximate the parallel trends assumption between treatment and comparison groups before any treatment occurs, which generally allows doing a difference-in-difference (DiD) conditional on the generated weights or a selected sample.

We use kernel propensity score matching as the pre-processing method to balance the data before estimating treatment effects. The kernel matching (KM) estimator we use relies on the Epanechnikov kernel function and is a non-parametric estimator that uses weighted averages of almost all observations in the comparison group to build the relevant counterfactual.

It can be seen as a weighted regression, where weights depend on the distance between the flooded and non-flooded households in terms of propensity scores within a specific bandwidth. It differs from the more traditional nearest neighbor matching (NNM) because the latter takes the observation with the closest propensity score as a valid control, whereas kernel matching assigns a weight to all observations within a specified radius and a weight of zero to observations outside the radius. A control household with a closer propensity score is assigned a higher weight, and a more distant household has a lower one (Smith and Todd 2005). The preferred specification excludes households whose propensity scores are outside the range of propensity scores in the other group (i.e., the “common support” restriction), which means that the observations are limited to the range of propensity scores at which one observes both flooded and non-flooded households. Figure 2 shows the distribution of the estimated propensity scores and indicates that there is enough overlap between the two groups. The observations outside the common support (overlap) are not considered in the weighted regression because they are not assigned any weight.
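The weighting logic described above can be illustrated with a short Python sketch. It is a simplified version under stated assumptions: the function names, the bandwidth of 0.1, and the propensity scores are hypothetical, and the common-support restriction is enforced only on the treated side (controls farther than one bandwidth from every treated unit simply receive a zero weight).

```python
def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) for |u| <= 1, else 0."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def kernel_weights(ps_treated, ps_control, bandwidth):
    """Aggregate kernel weight for each control unit.

    For every treated unit on the common support, control weights are
    proportional to K((p_i - p_j) / h) and normalised to sum to one;
    the weights are then summed over treated units.
    """
    lo, hi = min(ps_control), max(ps_control)
    weights = [0.0] * len(ps_control)
    for p_i in ps_treated:
        if not lo <= p_i <= hi:  # common-support restriction
            continue
        k = [epanechnikov((p_i - p_j) / bandwidth) for p_j in ps_control]
        total = sum(k)
        if total == 0.0:  # no control within the bandwidth
            continue
        for j, k_j in enumerate(k):
            weights[j] += k_j / total
    return weights


# Made-up propensity scores, for illustration only
ps_treated = [0.40, 0.90]
ps_control = [0.35, 0.45, 0.80]
w = kernel_weights(ps_treated, ps_control, bandwidth=0.1)
```

Here the treated unit with score 0.90 falls outside the range of control scores and is discarded, the two controls equally close to the treated unit at 0.40 share its weight, and the control at 0.80 receives a weight of zero.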

Fig. 2 Propensity score distribution by treatment status

Before matching, differences between both groups are expected; however, after matching the observed covariates should be balanced on average in both groups, and therefore no significant differences should be found. The matching is done at the household level using household and village-level covariates. Table 1 presents the covariates' balance between both groups before and after the PSM. Columns 1 and 2 confirm that, before matching, flooded households are statistically significantly different from those that are non-flooded in terms of the use of anti-erosion technologies, as well as climatic and geographic variables. For example, non-flooded households on average live in areas with a flatter slope, a lower soil wetness index, more surrounding vegetation, a lower elevation, and had a lower number of days exposed to floods in the 10 years before the 2009 shocks. Columns 4 and 5 are the mean values of the same covariates after matching, and column 6 performs a weighted t-test to assess the balance. The results show that, within the common support region, flooded and non-flooded households no longer differ significantly on average in the distribution of the relevant covariates. The sample size at baseline decreases after matching from 2025 to 1141 households because some households could not be matched. The sample size over the 3 years after matching or weighting is 3007 household-year observations. As a robustness test, we implement the standard DiD approach with controls and fixed effects that uses the full sample size (7179 household-year observations), and a DiD approach using the average treatment on the treated (ATT) weights as suggested by Morgan and Winship (2014). The results in Table 13 in the appendix suggest that all these approaches provide consistent estimates of the effects of the floods.

Table 1 Covariate balance before and after matching at baseline, tests for differences

2.3 Agricultural and Welfare Data

We use the National Panel Survey (NPS) of Tanzania, a nationally representative survey of randomly selected households interviewed in three rounds (2008/2009, 2010/2011, 2012/2013).Footnote 7 The NPS is implemented by the Tanzania National Bureau of Statistics (NBS) in partnership with the World Bank’s Living Standards Measurement Study - Integrated Surveys on Agriculture (LSMS-ISA) program. During the first round of this panel survey, 3265 households (16,711 household members) were selected, some of which were already surveyed in the 2007 Tanzania Household Budget Survey (THBS). A multi-stage clustered sample approach was used to select the sample. First, clusters were chosen with a probability proportional to the cluster's size in a stratum; about 386 clusters were selected. In the second stage, approximately 8 households were chosen with equal probability in each cluster. A cluster in urban areas is defined as a census enumeration area, while a cluster in rural areas is equivalent to a whole village. The first wave of data collection was done from September 2008 to October 2009, covering both the post-planting and post-harvest seasons. All households from the first round were targeted for a revisit during the second round, from October 2010 to December 2011. However, some households had split or relocated, which increased the sample size from 3265 to 3924 households. The third round of data collection, which lasted from October 2012 to December 2013, resulted in an increased final sample of 5010 households because members of split original households were tracked and interviewed. According to the NPS report, marriage and migration are two of the most common reasons for households splitting over time.

Attrition over the three survey rounds is small. In the second round of data collection, the attrition rate is 3%, and was similar across strata. The primary reason for not being able to survey a household member is the failure to locate the person rather than the refusal to be surveyed. The attrition stays low, at 3.9% in the third round of data collection. If attrition is non-random or correlated with the error term, the results will be biased. Table 17 in the Appendix tests for selective attrition by regressing an attrition dummy (i.e., value = 1 if household at baseline is not present in the sample the year after the flood and 0, otherwise) on the treatment. We find no differential attrition associated with exposure to flooding in our data.Footnote 8
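As a simple illustration of the attrition test described above, the sketch below regresses a made-up attrition dummy on a made-up flood dummy. With a single binary regressor, the OLS slope equals the difference in attrition rates between flooded and non-flooded households. The data and the helper `ols_slope` are hypothetical, and the sketch omits the weights, controls, and clustered standard errors that an actual test would need.

```python
def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


# Made-up data: flooded = 1 if exposed to the floods; attrited = 1 if the
# baseline household is missing from the follow-up round
flooded = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
attrited = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0]

slope = ols_slope(flooded, attrited)

# The same quantity computed directly as a difference in attrition rates
rate_flooded = sum(a for f, a in zip(flooded, attrited) if f == 1) / flooded.count(1)
rate_control = sum(a for f, a in zip(flooded, attrited) if f == 0) / flooded.count(0)
```

A slope that is close to zero and statistically insignificant would indicate no differential attrition by flood exposure.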

We use the household-level GPS coordinates to create a binary variable for flooding: if a household's coordinates fall into the polygon shapefile of the flooded area during the 2009 floods in Tanzania (Fig. 1), that household is considered “treated” by a flood. As a robustness check, we remove all observations that are within 20 km of the border of the shapefile to avoid potential misclassification of flooding status (Fig. 8) and to test whether the main estimates change. The results in Table 18 are consistent with the main estimates.

To improve the usefulness of the survey data, the LSMS-ISA team has produced a set of geospatial variables using the unmodified GPS data at the household level and plot level. The environmental variables are produced using unmodified household coordinates and include different measures of the distance (e.g., distance to the nearest market or major road), climatology (e.g., annual mean temperature, annual mean precipitation), and soil and other environmental factors (e.g., majority land cover class, elevation, soil wetness, and nutrient availability). However, the GPS coordinates released have been subject to a random offset within a given range determined by population density to protect the anonymity of communities, households, and individuals. A different offset range is applied to urban areas and rural areas, depending on the local population density. In rural areas, where populations are more dispersed, a larger offset will be applied. This offset approach is similar to Measure’s Demographic and Health Surveys Program.

As part of its goal to improve Open Access tools and disseminate data, the Evans School Policy Analysis & Research Group (EPAR) at the University of Washington has constructed and released a set of household-level agricultural variables and development indicators using the LSMS-ISA raw panel data for three countries.Footnote 9 These variables for Tanzania include household nominal and real expenditures, farm income, the value of crop production, land and labor productivity, agricultural and non-agricultural income, and self-employment. The value of crop production is the crop production aggregated across all crops and valued using local currency. The total expenditures are aggregated across 12 months and include food (eaten either inside or outside the household) and non-food expenditures, and the non-agricultural wage or labor income is constructed by summing incomes from wage employment in all non-agricultural activities across all household members. Finally, self-employment income is the annual income from non-farm enterprises. Table 12 in the Appendix of this paper provides details describing how the variables were constructed.

We use some of these variables as outcome variables for the main specification to investigate the impacts of floods on household agricultural productivity and welfare.

For the analysis, we first restrict the observations to agricultural households, which reduces the sample from 11,895 to 7667 household-year observations over the three-year panel.Footnote 10 Since we consider only households that are in the common support region of the propensity scores, that is, those assigned positive weights in the first stage, the final sample is reduced to about 3007 household-year observations overall. Figure 3 presents the spatial distribution of those observations in the common support region that are assigned a positive weight.

Fig. 3 Distribution of the enumeration areas selected in the common support region. Note: These observations are assigned a positive kernel weight (used for final analysis). This figure has a smaller sample size than Fig. 1 because some observations have been dropped during the weight construction

3 Summary Statistics

Panel A of Table 2 presents the summary statistics for the outcomes and Panel B presents descriptive statistics for covariates. It uses the sample of agricultural households that are used in the analysis. Around 21% of households are affected by both large flood shocks in 2009. About 20% of those agricultural households use anti-erosion measures (e.g., stone bunds, dikes) on at least one of their plots, which suggests that some measures are taken to reduce the impact of floods. For households that have experienced floods between 1999 and 2008, the average number of days the flood lasted is two days.

Table 2 Summary statistics of agricultural households after matching

4 Empirical Strategy

We employ a difference-in-differences (DID) framework. This allows us to control for all group-level location and time-invariant differences between flooded and non-flooded households and thus reduces bias in the estimates of the effects of floods on outcomes. To address the issue of parallel trends, we further apply a kernel propensity score weighting approach to find a proper counterfactual group at baseline (Heckman et al. 1998; Smith and Todd 2005).Footnote 11

The complete specification of the kernel matching-DID is:

$$Y_{i,j,t} = \beta_{0} + \beta_{1} T_{j} + \beta_{2} P_{t} + \beta_{3} \left( {T_{j} \times P_{t} } \right) + \beta_{4} X_{i,j,t} + \varepsilon_{i,j,t} \quad \left[ {{\text{aw}} = {\text{weights}}} \right]$$
(1)

where i indexes households, j is the enumeration area, and t indexes the year (2008, 2010, 2012). The variable \(Y\) represents a transformation of the value of crop production, farm income, labor income, or total expenditures. We transformed those financial variables using the inverse hyperbolic sine transformation that reduces the effects of outliers, approximates the normal distribution, keeps zero-valued observations, and allows one to interpret coefficients as semi-elasticities by exponentiating both sides of the regression equation (Giles 1982; Pence 2006; Bellemare and Wichman 2020). \(T_{j}\) is a binary variable representing the eventual treatment or flood-exposure that takes the value of one if the household coordinates fall into the flooded area at the time both large floods occurred in November and December of 2009, and zero otherwise. \(P_{t}\) is a post-flood indicator variable that equals one if the time is 2010 and after, and zero otherwise. \(X_{i,j,t}\) represents household-level socio-economic and demographic characteristics. The coefficient of interest, \(\beta_{3} ,\) measures the changes in means (from pre-flood to post-flood) of the outcome variable for the treatment group relative to the change in means for the control group. Weights represent the kernel weights. Standard errors allow for heteroskedasticity and are clustered at the enumeration area level.
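To make the role of the interaction coefficient concrete, the following Python sketch computes a weighted 2x2 difference-in-differences on outcomes transformed with the inverse hyperbolic sine. It is a stripped-down illustration under stated assumptions: it omits the covariates and clustered standard errors of Eq. (1), the function names are hypothetical, and the crop values and weights are made up. The conversion of the coefficient into a percentage change via exp(β) − 1 is an approximation that is reasonable when outcome values are large (Bellemare and Wichman 2020).

```python
import math


def weighted_mean(values, weights):
    """Weighted average of `values` using `weights`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def did_estimate(rows):
    """Weighted 2x2 difference-in-differences on IHS-transformed outcomes.

    Each row is (outcome, treated, post, weight). Without covariates,
    the coefficient on treated x post in a weighted regression equals
    this double difference of weighted cell means.
    """
    cells = {}
    for y, treated, post, w in rows:
        ys, ws = cells.setdefault((treated, post), ([], []))
        ys.append(math.asinh(y))  # inverse hyperbolic sine transform
        ws.append(w)
    m = {key: weighted_mean(ys, ws) for key, (ys, ws) in cells.items()}
    return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])


# Made-up crop values (local currency); control weights mimic kernel weights
rows = [
    (1000.0, 1, 0, 1.0), (600.0, 1, 1, 1.0),   # flooded household
    (1000.0, 0, 0, 0.5), (1000.0, 0, 1, 0.5),  # comparison households
    (2000.0, 0, 0, 0.5), (2000.0, 0, 1, 0.5),
]
beta3 = did_estimate(rows)
pct_change = 100.0 * (math.exp(beta3) - 1.0)  # approximate % effect
```

In this toy example the comparison households' outcomes do not change, so the estimate reflects only the drop in the flooded household's crop value from 1000 to 600, that is, a negative coefficient corresponding to a fall of roughly 40%.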

The key assumptions of Eq. (1) to estimate causal effects are the overlap assumption of the propensity scores and that changes in outcomes for both control and treatment groups are uncorrelated with the treatment, conditional on the propensity scores or the weights constructed. Following Glewwe and Todd (2020), the equation for the conditional independence assumption can be expressed as:

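$$\left( {\Delta Y_{0} ,\Delta Y_{1} } \right) \perp\!\!\!\perp {\text{P}} \mid {\text{Prob}}\left[ {{\text{P}} = 1|{\text{ Z}}} \right]$$
(2)

where \(\Delta Y_{0} = Y_{0t''} - Y_{0t'}\) and \(\Delta Y_{1} = Y_{1t''} - Y_{1t'}\); \(t'\) and \(t''\) represent the periods before and after the flooding, respectively; \(\perp\!\!\!\perp\) denotes statistical independence; and Prob[P = 1 | Z] represents the probability of being flooded.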

The common support condition or overlap assumption can be expressed as:

$$0 < {\text{Prob}}\left[ {{\text{P}} = 1|{\text{ Z}}} \right] \, < \, 1.$$
(3)
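One common way to check the overlap condition empirically is a min-max comparison of the estimated propensity scores across groups. The sketch below is an assumption about how such a check could be coded, not necessarily the procedure used in this paper. The scores are invented.

```python
def common_support(scores_treated, scores_control):
    """Return the (low, high) propensity-score interval shared by both groups."""
    low = max(min(scores_treated), min(scores_control))
    high = min(max(scores_treated), max(scores_control))
    if low > high:
        raise ValueError("no overlap between treated and control scores")
    return low, high


def on_support(score, bounds):
    """True if a score lies inside the shared interval."""
    low, high = bounds
    return low <= score <= high


# Hypothetical propensity scores for flooded and non-flooded households.
treated = [0.35, 0.50, 0.72, 0.90]
control = [0.05, 0.20, 0.40, 0.65, 0.80]

bounds = common_support(treated, control)
kept_treated = [p for p in treated if on_support(p, bounds)]
kept_control = [p for p in control if on_support(p, bounds)]
print(bounds, kept_treated, kept_control)
```

Observations outside the shared interval have no comparable counterpart in the other group and would be dropped before weighting.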

The choice of the kernel propensity score matching or weighting with DID approach in the absence of common trends is justified by the fact that it is more robust than cross-sectional matching to misspecification in the number of observables and unobservables at the household level, and thus is less biased (Heckman et al. 1998). While matching alone controls only for selection on time-varying observables, DID-matching also addresses selection on time-invariant unobservables, because differencing the outcomes after and before the treatment removes the unobserved time-invariant individual effects that may be correlated with both the treatment and the outcome variables. In fact, in their re-analysis of the National Supported Work (NSW) experiment studied by LaLonde (1986) and Dehejia and Wahba (2002), Smith and Todd (2005) found that the matching-DID estimator is the closest to the experimental approach. Their findings also suggest that the type of matching method used matters little if the common support assumption is met. Furthermore, Abadie (2005) finds that in the absence of the strong assumption of parallel trends, a weaker version of conditioning on covariates can be used.

Another benefit of using the kernel propensity score regression approach over the traditional regression estimators of causal effects with controls (e.g., DiD with controls and Fixed Effects) is that the latter approach has substantial weaknesses, especially when individual-level causal effects are heterogeneous in ways that are not explicitly parameterized, and simply adding controls would not account for that non-linearity (Morgan and Winship 2014). Similarly, Abadie (2005) suggests that when there is an imbalance in characteristics between groups before the treatment, those variables should enter the regression non-parametrically, to avoid any potential inconsistency created by misspecification in the functional form. Therefore, the weighted regression approach gives the researcher two chances to "get it right" via matching and differencing or regressing such that the shortcomings at one stage can be remedied by the other. Arkhangelsky et al. (2019) also find that the weighted DiD, which relaxes the common trends assumption in the standard DiD, has more attractive and robust properties than the standard DiD estimation. For example, their findings suggest that constructing correct weights and then using a two-way fixed effects approach allows for consistent coefficients, even if the two-way fixed effects are not correctly specified over the full panel.

Estimating Eq. (1) using kernel propensity weighting DID regression is done in two stages. In the first stage, a probit approach is used to calculate the probability of being flooded. The flood indicator is constructed at the household level. The equation for the propensity score estimates is as follows:

$$T_{i,j,2009} = \alpha_{0} + \alpha_{1} H_{i,j} + \alpha_{2} D_{{\text{j}}} + \eta_{i,j}$$
(4)

where \(T\) represents an indicator for exposure to both floods in 2009; \(H_{i,j}\) represents household-level characteristics measured in the baseline year before the floods (e.g., use of anti-erosion technology on at least one of the plots); \(D_{j}\) characterizes features of the grid in which households live (e.g., soil elevation, slope, annual temperature, and rainfall); and \(\eta_{i,j}\) represents the error term.

Table 3 summarizes the findings. Each control observation is assigned a weight depending on how distant its propensity score is relative to that of the flooded observation. In the second stage, a weighted DID regression approach is used to estimate the impacts of the floods. The weights from the first stage are used to ensure that flooded and non-flooded observations are on average balanced on covariates or that outcomes for both groups follow the same trends.
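The weighting step described above can be sketched as follows. The Epanechnikov kernel and the bandwidth of 0.06 are assumptions chosen for illustration; the text does not state which kernel or bandwidth was used. The propensity scores are invented.

```python
def epanechnikov(u):
    """Epanechnikov kernel: positive inside |u| <= 1, zero outside."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def kernel_weights(p_treated, p_control, bandwidth=0.06):
    """Kernel matching weights for control observations.

    For each treated score, controls within the bandwidth share a total
    weight of one, with closer controls receiving more of it. A control's
    final weight is the sum of its shares across treated observations.
    """
    weights = [0.0] * len(p_control)
    for p_i in p_treated:
        k = [epanechnikov((p_j - p_i) / bandwidth) for p_j in p_control]
        total = sum(k)
        if total == 0.0:
            continue  # no control within the bandwidth of this treated unit
        for j, k_ij in enumerate(k):
            weights[j] += k_ij / total
    return weights


# Hypothetical scores: two flooded households, four non-flooded households.
p_treated = [0.50, 0.70]
p_control = [0.48, 0.52, 0.69, 0.10]
w = kernel_weights(p_treated, p_control)
print([round(x, 3) for x in w])
```

The control with a score of 0.10 is far from every flooded household and receives zero weight, so it does not contribute to the second-stage regression.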

Table 3 Probit model to predict probability of being flooded at baseline in 2008 (propensity score)

Several issues are worth addressing to justify the use of the covariates to construct kernel weights, since those variables should be simultaneously correlated with the treatment and outcome of interest. Exogeneity in the timing of the flood does not necessarily imply exogeneity in placement. First, households or individuals that have previously experienced a natural disaster likely imagine that they now face a greater risk of another disaster (Cameron and Shah 2014). In response, they may prepare for future disasters by adopting strategies that will reduce or eliminate their likelihood of having their plots flooded. For example, farm managers may adopt soil and water management inputs or practices, including applying ridge tillage, adopting adequate drainage facilities, placing dikes or erosion control stone bunds around the plot (to reduce the surface run-off of the land), or buying a plot in a higher elevation village on a permeable soil, which also affects the agricultural output. We address this issue by matching households on baseline variables that will account for this behavior, including a dummy for use of any anti-erosion technology on household plots (e.g., stone bunds, contour bunds).

Second, differentials in past flood exposure between households also tend to be correlated with whether a household is affected by future floods. In the analysis, we address this issue by using the total number of days the enumeration area has been exposed to floods in past years as a matching variable before the shocks. Furthermore, the use of historical flood data to account for the history of exposure allows us to check for potential differential effects between floods experienced recently and those experienced further back in the past (e.g., 2 years, 10 years, or 20 years before the first wave of the panel survey started).

Third, to ensure that the estimated impact of large floods on agricultural output or welfare captures the direct effects, we need to disentangle the effects of potential confounding phenomena including temperature and rainfall. Heavy rainfall is by far the most important factor contributing to flood occurrence (Mirza 2011) and affects agriculture as well. In addition, higher temperatures increase air moisture, which might intensify the rainfall level. For example, Mirza (2011) finds that expected changes in flood depth and extent would occur between zero and two degrees Celsius warming in Bangladesh. If floods, rainfall, and temperature levels are not modeled simultaneously, the effects of the omitted variables might contaminate or bias the estimates of the large floods. We address this issue in the study by using the annual mean temperature and the precipitation levels of the wettest quarter where the household lives as variables in the matching process. This ensures that on average, flooded and non-flooded households have similar climatic conditions in the baseline year.

Fourth, other important determinants of floods are the topography of the land, the soil moisture, the presence of vegetation, and land cover. For example, a steep slope or clay soil will increase the speed and amount of runoff, and wet soil is less permeable and facilitates flooding. We address these issues by controlling for elevation, slope, soil’s potential wetness index, the land cover class (e.g., presence of vegetation), the agro-ecological area type, and the terrain roughness where the household lives. Table 3 shows the full results of the probit model used to predict the propensity scores or the probability of being flooded. Some significant determinants of flooding include the use of anti-erosion technologies, the altitude, and the soil wetness index.

Standard errors are clustered at the enumeration area level in all estimates. Spatially clustered standard errors along the lines of Conley (1999) are used as a robustness check.Footnote 12 With spatial panel data, conventional inference relies on the extra assumption that the errors are not spatially or temporally correlated. Spatial correlation means that households located close to each other tend to be more similar than households located far apart. Ignoring it can produce incorrect standard errors. Kelly (2019) analyzed 27 studies from top journals to examine the degree of spatial correlation in their errors. The results suggest that about three-quarters of them have severe spatial autocorrelation in the spatial noise, which distorts the t-statistics or significance levels. We address this issue in this study by first testing for the presence of spatial correlation of the outcome variable using Moran’s test (Moran 1950). The results of the test reject the hypothesis that the error terms are independent and identically distributed (i.i.d.).
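The statistic behind Moran’s test can be sketched in a few lines. The example below computes Moran's I for values on a line of four locations with a binary adjacency matrix. Both the values and the weights matrix are invented, and the inference step (comparing I with its null distribution) is omitted.

```python
def morans_i(values, weights):
    """Moran's I for a list of values and a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * num / den


# Four locations on a line; each location neighbours the adjacent ones.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1.0, 2.0, 3.0, 4.0], adjacency)    # similar neighbours
alternating = morans_i([1.0, 4.0, 1.0, 4.0], adjacency)  # dissimilar neighbours
print(round(clustered, 3), round(alternating, 3))
```

A positive value indicates that neighbouring observations resemble each other, which is the pattern that motivates spatially adjusted standard errors.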

Figure 4 also illustrates the spatial distribution of the log value of crop production across municipalities in Tanzania. Observations appear to be spatially correlated: high crop production districts (i.e., areas with a darker shade) are clustered together, as are those with lower crop production (i.e., areas with a lighter shade).

Fig. 4 Choropleth map or heat map of log value of crop production across municipalities in Tanzania

We compute the spatially and temporally clustered standard errors following the work by Hsiang (2010) and Fetzer (2014), which ensures that estimates are adjusted to account for spatial correlation, serial correlation, and heteroskedasticity (Hsiang 2010). We impose the assumption that spatial dependence fades away at three different cutoff radii (i.e., 10 km, 50 km, and 100 km) from the household coordinates. This procedure allows us to check whether the main conclusions obtained with standard errors clustered at the enumeration area level change when we allow for a greater degree of spatial correlation between neighboring EAs. Table 16 in the appendix reports the spatially adjusted standard errors at the three cutoff radii, computed using households’ coordinates. In columns 1 and 2, where spatial dependence is assumed to be zero beyond a 10 km or 50 km radius from each household’s coordinates, the results are still statistically significant and similar to the original results. In column 3, where the cutoff radius increases to 100 km, the negative effect of floods becomes weakly significant.
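The role of the cutoff radius can be illustrated by listing which pairs of households are allowed to have correlated errors at each distance. The sketch assumes a uniform kernel (a pair either counts fully or not at all) and great-circle distances; the coordinates are invented points and not survey locations. It shows only the pairing step, not the full variance estimator.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def pairs_within(coords, cutoff_km):
    """Index pairs whose errors may be correlated under a uniform kernel."""
    pairs = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if haversine_km(*coords[i], *coords[j]) <= cutoff_km:
                pairs.append((i, j))
    return pairs


# Hypothetical household coordinates (latitude, longitude).
coords = [(-6.80, 39.28), (-6.85, 39.30), (-7.05, 39.28), (-6.16, 35.75)]
for cutoff in (10, 50, 100):
    print(cutoff, pairs_within(coords, cutoff))
```

Widening the radius adds more pairs to the covariance calculation, which is why standard errors can grow and significance can weaken as the cutoff increases.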

5 Main Results

5.1 Effects of Large Floods on Household Agricultural Productivity and Welfare

Value of crop production—Table 4 presents the preferred weighted difference-in-differences estimates without and with enumeration area fixed effects, using the inverse hyperbolic sine transformation of the value of crop production as the outcome variable. All coefficients reported in this section can be adjusted to estimate semi-elasticities by exponentiating both sides of the empirical equation. Column 3 shows a large and statistically significant drop in the value of crop production by 39% for the flooded group.Footnote 13 After applying the enumeration area (EA) fixed effects, the estimated coefficient falls slightly to 34%.
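The conversion from a coefficient on an inverse hyperbolic sine outcome to a percentage effect can be sketched as below. It uses the approximation \(100 \times (e^{\beta} - 1)\), which applies to a binary regressor when outcome values are large. The coefficient of -0.3 is hypothetical and is not taken from Table 4.

```python
import math


def ihs(x):
    """Inverse hyperbolic sine: log(x + sqrt(x^2 + 1)); defined at zero."""
    return math.asinh(x)


def percent_effect(beta):
    """Approximate percentage change in the outcome implied by a coefficient
    on a binary regressor when the outcome is IHS-transformed."""
    return (math.exp(beta) - 1.0) * 100.0


print(ihs(0.0))                             # zero-valued observations are kept
print(ihs(1000.0) - math.log(2 * 1000.0))   # close to log(2x) for large x
print(round(percent_effect(-0.3), 1))       # hypothetical coefficient
```

Because the transformation behaves like a logarithm for large values, the exponentiated coefficient reads much like one from a log-linear model, while households with zero production stay in the sample.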

Table 4 Effects of the 2009 large floods on the log value of crop production

We also investigate the heterogeneous effects across time by looking at the short-run (the year following the shock) and long-run (3 years post-treatment) effects, interacting each post-flood year variable with the treatment variable. Figure 5 and Table 5 show that the negative effects of floods are significant 1 year after the event but become insignificant 3 years after, although the difference between these effects is not statistically significant.

Fig. 5 Average economic effects of the 2009 floods in the short-run and long-run in Tanzania

Table 5 Decomposition of the effects on value of crop production by year post-flood: short and long-run effects

For example, looking at the estimation with fixed effects (Column 2, Table 5), floods reduce the value of crop production by 40% in the short run; in the longer run, the effects remain negative but are insignificant.Footnote 14 An explanation for this result is that after the destruction of assets and capital, farmers may underinvest in the short run as a self-insurance strategy, adopting fewer improved inputs (e.g., inorganic fertilizer and hybrid seeds) and more traditional inputs that provide lower but more stable crop production.

Total household Expenditures—In Table 6, we present the effects of the floods on households’ welfare using the inverse hyperbolic sine transformation of total household real expenditures (i.e., food and non-food expenditures) as the outcome. Column 1 shows no significant effect of the treatment on household expenditures. Even after decomposing the effects for different years post-treatment, no significant effect is found.

Table 6 Decomposition of the effects on other economic outcomes: short and long-run effects

Baez et al. (2019) find similar results of no effects of flood shocks on households’ total consumption using a triple DiD that looks at the households affected by a shock during the growing season. This null effect on consumption seems counterintuitive at first, given the large negative impacts on agricultural production. However, the disaggregated results in Table 10 show that only households that received transfer income at baseline were able to smooth their consumption. This suggests the importance of safety nets or savings and credit to allow households to borrow in bad times and smooth their expenditures. Similar studies that look at the effects of floods and coping mechanisms find that access to credit or loans is important to reduce the financial burden, help households maintain their consumption, and reduce the need for child labor (Del Ninno et al. 2003; Alvi and Dendir 2011).

Self-employment and Employment incomes—In columns 2 and 3 of Table 6, we consider whether the negative impacts of the 2009 floods on crop production are mitigated by transitions into non-farm self-employment and employment by using non-agricultural self-employment income and income from labor as the outcome variables. The findings show no significant effect on employment income in either the short or the long run. Even though the results on labor income are insignificant, it may be the case that hours worked increase while labor income remains the same because of an increase in labor supply due to the agricultural shock. As a robustness check, Table 14 in the Appendix shows that using entropy balancing as another pre-processing method to balance groups on covariates at baseline leads to balance across all covariates without leading to a significant drop in the sample size. The results in Table 15 in the Appendix are similar to the main results in this paper, except that we find a significant 14% drop in household consumption in the short run after the shock for flooded households.

In Column 1 of Table 7, we test the hypothesis that hours worked increase using off-farm hours worked per capita, and the results suggest no significant change in off-farm hours worked after flood exposure. Regarding self-employment income, we find a statistically significant and negative effect only 3 years after the floods. The null result on employment income contrasts with Deryugina et al. (2018), who find a large drop in employment income in the short run after Hurricane Katrina, with effects that become significantly positive in the long run. They also find that self-employment income remains higher in both the short and the long run after the shock.

Table 7 Effects of flood exposure on livestock sales

Farm Income and Value of Livestock Sold—Column 4 of Table 6 shows a significant decrease in the farm income among flooded households. We also test whether agricultural households increasingly sold livestock in response to the flood to buffer against crop damage from the flood, using the value of livestock sold (live and slaughtered) as an outcome variable. We restrict the sample to agricultural households for which the Tropical Livestock Unit (TLU) is greater than zero (0). Column 2 of Table 7 shows no significant effects of flood exposure on livestock sales both in the short and the long run.

Mental Wellbeing—We also explore the short- and long-run psychological effects of the distress and trauma caused by the memory of the suffering and losses connected with flooding, which are oftentimes overlooked in risk assessment or impact evaluation studies. Floods can also cause intangible losses (e.g., stress and anxiety) that need to be considered in a complete analysis of their effects (Hudson et al. 2019; Lamond et al. 2015). To explore the emotional impacts, we look at the individual-level reported satisfaction with health, job, finances, house, and overall quality of life as outcome variables.Footnote 15 Table 8 shows that individuals living in flooded households experience at least a 0.07 percentage point decrease in their satisfaction with their finances, housing situation, and health in the short and/or long run. These results on reported satisfaction with finances and housing are consistent with other studies that find large negative effects of flood exposure on subjective well-being in France, which are also incompletely attenuated over the years (Hudson et al. 2019), or on life satisfaction using data from 17 OECD countries (Luechinger et al. 2002). Given that the psychological outcomes are qualitative, one potential explanation for these effects is that the flooded households are in a different range of the qualitative scale than those in the comparison group.

Table 8 Effects of flood exposure on reported well-being

Lastly, we estimate the effects of flood exposure on child nutrition using the sample of children under 5 years. This is important because UNICEF estimates that more than 2.7 million Tanzanian children under 5 years old suffered from stunting in 2015, among whom 600,000 suffered from acute malnutrition. This issue can be exacerbated by the increased frequency of flooding and will likely have permanent effects on future outcomes for those children (Alderman et al. 2006). In the analysis, we use height-for-age (HAZ) and weight-for-height (WHZ) z-scores as outcome variables. In columns 1 and 2 of Table 9, we find no statistically significant effects of the floods on HAZ and WHZ respectively, both in the short and long run. This result differs from that of Rodriguez-Llanes et al. (2016), who find that children affected by floods in India are more likely to be stunted and underweight, with a larger effect on those who are younger than 1 year old, and from findings that children exposed to an extreme flood in Bangladesh are shorter than their counterparts and do not recover from it.

Table 9 Effects of flood exposure on height-for-age and weight-for-height

There are potential explanations for the insignificant effects of the floods on children's anthropometric indicators. It could be the case that flood-exposed households were able to perfectly smooth their consumption after the shock by using assets (Ninno and Lundberg 2005). An alternative explanation is that children can maintain their weight because their mothers lower their own caloric intake and transfer it to them, which affects the mother’s weight but keeps the children’s weight constant (Block et al. 2004).

5.2 Heterogeneity of the Effects

We explore four key dimensions across which one might expect the economic impact of the large floods to be heterogeneous: whether the household received some assistance or transfer income (from the government, NGOs, or remittances), whether the household is a livestock owner, whether the household is a smallholder farmer, and whether the spatial grid in which the household lives has any type of forest. Although natural disasters overall increase a household’s vulnerability, it is important to investigate heterogeneity because disasters do not affect people equally. Such analysis allows one to explore whether different subpopulations display different resilience levels, and to investigate the mechanisms that affected households can rely on to reduce their losses from large disasters. We compare smallholders and large landowners because smallholder agriculture is the principal form of farming system in Tanzania and in many parts of Africa, where 33 million small farm households cultivating less than two hectares account for 80% of all farms (FAO 2009), and these smallholders might be impacted differently than large holders. Here, we define a smallholder farming household as one that has a total farm area of less than or equal to two hectares, and a large holder as one with more than two hectares.

It is important to explore the effects of receiving social safety nets or transfers because they serve as a tool to support households impacted by adverse shocks, raise households’ capabilities, and reduce the likelihood that they fall into chronic or persistent poverty (Barrett and Constas 2014). In the sample, about 30% of all agricultural households have a positive amount of transfer income received from either the government, NGOs, or remittances at baseline.

We also investigate whether there are heterogeneous effects in the short and long run across livestock owners. The reason is that livestock serve as a source of food (e.g., milk and meat) and also provide services such as manure, traction, and transport (Lybbert et al. 2004) for many smallholders. Rural households in African countries such as Niger, Madagascar, Malawi, and Tanzania depend heavily on livestock. In some cases, it is an important asset through which poor households store their wealth. These households are extremely vulnerable to weather shocks or large floods that often cause massive livestock death directly and through diseases, which might push the households further into a poverty trap. Thus, it is interesting to see whether farmers with livestock assets are disproportionately affected by those shocks.

Lastly, we explore whether households living near forests are affected differently than those living far from forests. Recent studies have emphasized the important role of the natural environment to decrease the occurrence of natural disasters and the associated losses. Forest cover has multiple benefits including flood control and improvement of agriculture via soil protection, moisture retention, nutrient storage, protection against pests and diseases, rainfall and temperature regulation, and important watershed functions (e.g., controlling water flow quantity and quality) (Stephenson and Petersen 1991; Myers 1997). Therefore, it is important to see whether the presence of forests is an important recovery mechanism among flooded households.

We estimate separate equations for the different subgroups and we still focus on the same household-level economic outcomes as in the previous section: the value of crop production, total real expenditures, farm income, self-employment income, and non-farm income. To estimate heterogeneous treatment effects, we split the data into a separate sample for each subgroup and run Eq. (1) on each. For the forest subgroup, we focus only on the value of crop production, and we estimate coefficients separately for households living in villages with and without forests.

Panels A and B of Table 10 show the effects for households that received some transfer income and their counterparts who did not receive transfers at baseline.Footnote 16 The results suggest that transfer mechanisms have large mitigating effects because households with no transfer income experience a large and persistent decrease in their value of crop production and self-employment income, and a short-run decrease in their expenditures, while households with transfers can attenuate the negative effects. However, after performing a t-test of the difference between coefficients across both panels, we reject the null hypothesis only for household expenditures or consumption. McCarthy et al. (2018) find similar results in Malawi, where households that received safety nets after exposure to a flood were able to increase their food consumption compared to their counterparts. Given that transfers can serve as a consumption-smoothing instrument, we would expect a differential effect only on consumption. One possible explanation behind the results that transfer recipients experienced a large decline in crop production and farm income is that those recipient households used the transfer to purchase productive assets lost during the floods and increase their production.
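A test of the difference between coefficients from two separately estimated subsamples can be sketched as below. It assumes the two estimates are independent and uses a normal approximation, which may differ from the exact test behind Table 10. All coefficients and standard errors in the example are hypothetical.

```python
import math


def coefficient_difference_test(b1, se1, b2, se2):
    """z statistic and two-sided p-value for H0: b1 == b2.

    Assumes the two estimates come from independent samples, so the
    variance of the difference is the sum of the two variances.
    """
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return z, p_value


# Hypothetical estimates: transfer recipients vs. non-recipients.
z, p = coefficient_difference_test(b1=-0.10, se1=0.08, b2=-0.45, se2=0.12)
print(round(z, 2), round(p, 3))
```

Two coefficients can differ in individual significance without differing significantly from each other, which is why the direct test matters for the subgroup comparisons.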

Table 10 Heterogeneity effects of floods across transfers or assistance recipients, smallholder farmers, and livestock owners

Panels C and D of Table 10 present the results for the smallholder farm households subgroup. The findings show no statistically significant differential effects of floods between small and large holders of farms across all outcomes of interest, except a small drop in the value of crop production in the short run for the large holders.

Panels E and F of Table 10 show the results for the households that possess livestock. Interestingly, the findings show that the households that do not own any livestock experience a significant and persistent decrease in their value of crop production relative to the households that own livestock. This result makes sense because crop-only farmers are more vulnerable in flood-prone areas. In addition, after a large flood, which often affects soil quality, livestock can provide organic fertilizer for the plots or be used for labor purposes to improve crop production in subsequent years.

Table 11 shows the results for the households living in clusters with and without forests. Flooded households living in clusters with any type of forest (broadleaf, deciduous or mosaic forests) experience a much lower reduction in their crop production than households living in clusters with no forests. The findings corroborate the idea that forests and natural ecosystems play a protective role and are an important recovery tool after exposure to a large flood. For example, Indian villages with little to no mangroves encountered higher human losses than villages that had larger mangroves between them and the coast during a 1999 super cyclone (Das and Vincent 2009). Yamamoto et al. (2018) find that deforestation in rural Indonesia generated severe biodiversity loss, causing a 44% loss in agricultural productivity between 2001 and 2014, which is approximately $2.63 billion among farmers.

Table 11 Heterogeneity effects of floods on households living in villages with and without forests

5.3 Robustness Checks

The results in this study will be biased if external events that happened after 2009 affected households in the treatment group differently than those from the comparison group.

One robustness check performed is to test whether the presence of other floods that occurred after the 2009 floods could bias the results. Tanzania was exposed to other floods in 2011 and 2012 and Fig. 9 in the Appendix shows the extent of both floods. To test the effects of these other floods, we run Eq. (1) including indicators of the occurrence of the 2010 and 2011 floods. The results in Table 20 in the Appendix show that including the indicators for those shocks does not bias the coefficients of interest. This result validates the idea that extreme weather events occurring in different periods and in one specific area are essentially random conditional on fixed effects (Dell et al. 2014).

We also perform randomization inference by randomly reallocating treatments across Tanzania multiple times to obtain a distribution of the treatment effects and examine whether the original coefficients obtained are contaminated by spatial noise. This approach is a variant of the permutation test (Fisher 1935; Rosenbaum 2002) and allows one to create a distribution through Monte Carlo simulations to test whether the statistic obtained is consistent with that distribution. We simulated the assignment mechanism by creating 1000 placebo floods randomly across Tanzania, which assigns different households in and out of the treatment. Figure 7 in the Appendix shows that the distribution of the placebo estimates across all outcomes is centered around zero, meaning that spatially simulated floods do not affect the outcomes of interest and that the results are not driven by spatial correlation across floods.
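The logic of this exercise can be sketched with a simple permutation test. The sketch shuffles treatment labels across households instead of drawing placebo flood polygons on a map, so it is a simplified stand-in for the spatial simulation described above. The outcome changes, the number of draws, and the seed are illustrative assumptions.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def diff_in_means(changes, treated):
    """Mean outcome change of treated units minus that of control units."""
    t = [c for c, d in zip(changes, treated) if d]
    u = [c for c, d in zip(changes, treated) if not d]
    return mean(t) - mean(u)


def randomization_inference(changes, treated, draws=1000, seed=0):
    """Placebo distribution from shuffled treatment labels and the share of
    placebo estimates at least as extreme as the observed one."""
    rng = random.Random(seed)
    observed = diff_in_means(changes, treated)
    labels = list(treated)
    placebo = []
    for _ in range(draws):
        rng.shuffle(labels)
        placebo.append(diff_in_means(changes, labels))
    p_value = sum(abs(b) >= abs(observed) for b in placebo) / draws
    return observed, placebo, p_value


# Hypothetical pre-to-post changes in the outcome; first four are flooded.
changes = [-0.6, -0.4, -0.5, -0.7, 0.1, 0.0, -0.1, 0.2, 0.05, -0.05, 0.1, -0.2]
treated = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
observed, placebo, p_value = randomization_inference(changes, treated)
print(round(observed, 4), round(mean(placebo), 3), p_value)
```

A placebo distribution centered near zero with the observed estimate far in its tail indicates that the estimated effect is unlikely to arise from an arbitrary assignment of flood exposure.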

A final concern is that flood exposure could be correlated with time-varying unobservable factors that affect our outcomes of interest. To address this concern, we perform an omitted variable test by implementing the Oster bounds as suggested by Oster (2019). It is a selection-on-unobservables diagnostic that allows one to understand how important selection bias on time-varying unobserved confounders would have to be to render the estimates null. A traditional way to assess the robustness of the estimates to potential omitted variable bias is to observe how coefficients change after changing the covariates. Columns 1 to 4 of Table 4 show that after controlling for relevant time-varying observables and fixed effects, the magnitude of the effect on the value of crop production changes little and remains significant. The delta values originating from the Oster diagnostic in Table 18 in the Appendix are closer to zero (0), suggesting that the correlation between the treatment and the unobservables is low and that the estimates are unlikely to be overturned by potential unobserved confounders.

5.4 Limitations

A limitation of this study is that we do not directly observe the differentials in the depth of the floods; we only observe flood exposure at the extensive margin (i.e., whether a household is flooded or not). Since households will likely experience floods of different magnitude and depth, it would be interesting to explore how the intensity of flooding is related to the economic loss in agricultural outputs for better targeting purposes during post-flood interventions. However, since we control for most variables that are responsible for flood depth (e.g., elevation, soil wetness, slope), we expect the differentials in flood depth between enumeration areas to matter very little.

Another limitation is that of external validity. The current results are valid in the context of Tanzania, and specifically for the farming households selected. The effects might differ for non-agricultural households and for households in other countries. We cannot extend the current analysis to the LSMS-ISA panel data from Nigeria and Ethiopia to address some of the external validity issues. In the case of Nigeria, the dataset lacks some important variables (e.g., use of anti-erosion technologies), which would cause omitted variable bias, while in the case of Ethiopia, there is no flood in between different waves of data collection to set up and use a difference-in-differences framework.

5.5 Conclusion

In this paper, we estimate the effects of large shocks from flooding on households’ value of crop production, welfare, and individual subjective well-being using a panel dataset of agricultural households in Tanzania. Overall, we find that households living in flooded enumeration areas experience a large significant reduction in their crop production, which is the most salient result in this paper. This result is consistent with previous suggestions that climate change negatively impacts agriculture and that the yield losses from climate change could be as high as 82% for some crops by the end of this century (Schlenker and Roberts 2009; Schlenker and Lobell 2010; Welch et al. 2010; Schauberger et al. 2017). We also find no significant effects on overall household total real expenditures or child nutrition.Footnote 17 When looking across subgroups, we find that households that received transfers from the government, NGOs, or remittances at baseline can smooth their expenditures, while those who did not receive any transfer experience a large and persistent drop in their crop production in both the short and the long run, and a significant reduction of expenditures in the short run. This result is important for policy-making because it suggests that policymakers can rely on social safety net programs or cash transfers as mitigating factors during and after flooding to help improve affected individuals’ states or lower their vulnerability (Deryugina 2013). Another mitigation mechanism is the introduction of weather-resistant seeds (Barrett and Constas 2014). Households do not seem to transition into self-employment or formal employment after the shocks since there are negative effects on self-employment and no significant change in labor income. Given the massive destruction reported by the Tanzanian authorities after 2009, the effects on the value of agricultural output are not surprising. These negative results are consistent with the empirical evidence discussed in the literature review (del Ninno et al. 2001; McCarthy et al. 2018; Michler et al. 2019).

Another important result for policymakers is that forest cover in some enumeration areas or villages, transfers, and remittances appear to be very important mechanisms to mitigate the negative impacts of floods. The fact that households living in villages with forest cover are significantly less impacted provides evidence that future policies should improve forest protection and implement reforestation/afforestation programs as a preventive action against future floods. Additionally, given that the use of erosion control measures (e.g., stone bunds or contour bunds) is an important determinant that seems to reduce the likelihood of being flooded (Table 3), policymakers could better assist or train farmers on the importance of implementing such measures for land sustainability.

This paper contributes to a growing literature that looks at the impacts of natural disasters on economic outcomes, and the factors that allow households to mitigate the negative effects of disasters on production and expenditures. These disasters often represent significant shocks to poor households in developing countries by destroying the environmental quality, causing psychological harm, as well as indirect losses (i.e., post-disaster diseases). Given the low presence of weather index-based insurance markets in those developing countries to lower production risk, it would be interesting to investigate and understand the adaptation mechanisms or strategies that households adopt when facing future flood risk as well as their post-disaster responses (e.g., use of savings to weather disasters’ impacts). This will be the subject of future work.