Introduction

Payment for ecosystem services (PES) schemes are an approach taken by many governments and organizations around the world to address natural resource degradation. Through market mechanisms, PES programs put a price on the ecosystem services that, theoretically, would otherwise be consumed for free, mitigating the potential problem of overuse. The Grain for Green (GfG) Program in China (also known as the Sloping Land Conversion Program or the Conversion of Cropland to Forest Program) is the largest land reforestation scheme in the world in terms of investment and public participation (Delang and Yuan 2015). The primary goal of the GfG Program is to incentivize farmers to convert croplands on steep slopes and otherwise ecologically sensitive areas into forests by paying them an amount equivalent to the lost crop revenue. Begun in 1999 and expanded in 2001, the GfG Program covers the largest area and engages the most participants of any ecological program in Chinese history (Song and Zhang 2009). By increasing forest cover, the program is expected to provide ecological services, including reducing soil erosion, conserving water resources, and increasing carbon storage. Further, the GfG Program is expected to have positive socioeconomic effects such as poverty alleviation and greater diversification of labor within families. The withdrawal of land from crops was expected to increase household income by freeing up labor, allowing individuals to migrate to employment opportunities elsewhere. These socio-demographic outcomes are the subject of our research (Footnote 1).

While government payments help compensate families for the loss of income from reduced agricultural production, they are not intended to continue forever. For the program to preserve forest cover in the long run, participants must pursue other sustainable sources of income so that they neither revert to farming GfG-designated land when the payments cease, nor fall into poverty. Participating households, therefore, must make durable changes to their livelihood strategies, drawing on sources such as off-farm employment, business ventures, livestock, forest products, and remittances from labor migration.

The link between forests and rural labor migration is an increasingly important subject for researchers of environmental outcomes and management, particularly in China where the migrant population increased by 90 million between 2000 and 2010 (Liang et al. 2014). Rural areas contain most of the world’s natural resources, and while future population growth will occur predominantly in cities, virtually all forest-cover changes will occur in rural environments (Carr 2009). Researchers have found statistically significant and positive correlations between remittances and reforestation in El Salvador (Hecht 2010), and between rates of off-farm labor and changes in forest and scrub cover under the GfG Program in China’s Yunnan Province (Zhang et al. 2017). Other studies have focused on the processes through which increased rural labor migration positively affects forest cover. Rural households with fewer family members due to migration have lower demand for fuelwood, reduced labor for fuelwood collection, and upgraded household assets such as electric stoves and other modern appliances (Chen et al. 2012; Qin 2010).

In this paper, we examine the role of migration in household responses to the GfG Program, estimating the extent to which individuals move away from home for long periods of time following a reduction in croplands. Policymakers should interpret an increase in migration among participating households as an encouraging sign that they will be able to diversify their long-term income streams, which should also help preserve the gains in forested land and sustain any ecological benefits.

This paper distinguishes itself in a few ways from previous studies on the effects of the GfG Program. First, we use 15 years of data on migration activity, including a pre-treatment period from which we can establish baseline migration behavior. Second, we rigorously test the typical assumption in the GfG literature that participation is free of selection bias. Finally, our dataset is both newer (2014) and generally larger (1887 individuals, 465 households) than previous studies.

Background and Conceptual Framework

In 1997, China suffered severe drought along the Yellow River and, in the following year, disastrous flooding along the Yangtze River. The flooding alone cost 250 billion yuan ($31 billion in 1998 U.S. dollars) in damages and led to the loss of more than 4000 lives (Sun et al. 2002). Both events were attributed in large part to widespread soil erosion and the accompanying loss of water retention capacity resulting from extreme deforestation in the upper and middle reaches of the river basins. The GfG Program was one of the main government responses to rescue the land from degradation, its primary goal being to reduce soil erosion and increase forest cover by retiring less productive croplands on sloping terrain and converting them into forest or grassland. A pilot program was launched in 1999 in three ecologically sensitive provinces (Sichuan, Shaanxi, and Gansu) and subsequently rolled out to 22 additional provinces starting in 2002 in pursuit of a goal of converting 14.67 million hectares of cropland to forest or grassland by 2010. However, the government scaled back those plans in 2004 due to concerns about food security after the conversion of 9 million hectares (Yin and Yin 2010). More recent studies report that by the end of 2013, GfG land had reached only 9.26 million hectares (China State Forestry Administration 2014), falling short of the original goal though contributing significantly to ecological restoration and poverty alleviation (Liu and Lan 2015).

The government incentivizes farmers under the GfG Program using a variety of mechanisms, including cash, in-kind payments, and tax breaks, along with the option of planting commercial forest (trees bearing fruits, nuts, timber, medicinal goods, etc.) from which they could draw income, or ecological forest that primarily protects the soil, but may also provide some fuelwood and salable timber. Originally, households were given 150 kg of grain annually for every mu (about 1/15th of a hectare) of cropland converted to forest in the Yangtze River Basin, or 100 kg/mu/year in the Yellow River Basin (Song et al. 2014). In addition to the grain, participating households received a cash subsidy of 20 yuan/mu/year ($1 US equaled 8.2 yuan at the time). To reduce transportation costs, the grain subsidy was replaced starting in the second year with increased cash compensation: a total of 230 yuan/mu/year and 160 yuan/mu/year for the Yangtze and Yellow River Basins, respectively. Croplands converted to forests were also exempt from agricultural taxes. The cash subsidy was intended to continue for 8 years for ecological forests, 5 years for commercial forests, and 2 years for grasslands. In 2007, however, the Chinese government renewed the GfG cash compensation, at half the initial rate, for another contract period (up to 8 years) following the end of the initial contract. The total national investment in the GfG Program is estimated to reach 431.1 billion yuan by 2021 (Yin and Yin 2009, citing paper presentation by Tang 2007).

Like many PES projects, the objectives of the GfG Program were not only to reduce environmental degradation, but also to alleviate rural poverty (Grosjean and Kontoleon 2009). Families were assisted in the transition from over-reliance on agriculture to a more sustainable mix of income from crops and other sources such as off-farm employment, business pursuits, livestock, forest products, and remittances. Early on, most households did not have a plan for a sustainable income after withdrawal of subsidies (Wang et al. 2003). In a survey of two provinces, about 30% of farmers predicted they would convert forests back to croplands once government support ended, a potential barrier to long-term program success (Uchida et al. 2005).

The theory underlying PES schemes relies on using market-like mechanisms (in the case of the GfG Program, cash and in-kind transfers, tax breaks, and revenue from forest products) to influence suppliers (landholders) to internalize the social benefits and costs of their decisions (afforestation on certain croplands) so that they provide the socially efficient level of ecological services (watershed management and reduced soil erosion). In reality, empirical results have not always been consistent with theory, nor have PES schemes always achieved policy goals. They have yielded a range of ecological outcomes depending on the macro-policy, political-economic context, and social and cultural environments in which programs are implemented and the unpredictable ways in which those factors interact (Pfaff et al. 2013). For instance, in rural China, community-level vegetation cover was found to be influenced by elevation and household density but not by GfG incentives (Zhang et al. 2017). A number of studies found that a prominent Costa Rican PES scheme had little or no impact on deforestation (with a small increase in net forest cover due to reforestation) but cautioned against extrapolating those findings due to unique conditions in the country (Pattanayak et al. 2010). Another meta-analysis of Costa Rica found that effects vary by scale: PES schemes made no incremental reductions at the national level, but they reduced deforestation in sub-national cases (Daniels et al. 2010). A program in Mexico resulted in a 3–4% reduction in deforestation, but the effects varied considerably depending on property type and region, and some of the positive effects were mitigated by new deforestation in nearby, non-enrolled plots (Alix-Garcia et al. 2010). The range of ecological outcomes and the fact that they do not always align with theory suggest that PES schemes must be tailored to local conditions. Community-level social, economic, and biophysical features are important mediating factors and influence the relationship between population dynamics and other rural restructuring processes (Qin and Liao 2016).

Similarly, research into the social and economic effects of the GfG Program specifically has shown heterogeneous results. Studies undertaken only a few years after implementation showed no effect on household income (Xu et al. 2004) and migration rates (Uchida et al. 2007) and virtually no effect on off-farm employment (Uchida et al. 2007). However, subsequent research, after the program had a number of years to take hold, found program effects on several different outcomes. Researchers studying program impacts on off-farm labor in different provinces found the association to be positive and significant (Uchida et al. 2009); positive and significant, but only in households required by local government and markets to meet a pre-program minimum level of agricultural production (Groom et al. 2010); positive and significant, though not large (Yao et al. 2010); and positive and significant (Kelly and Huo 2013). Regarding the association between the GfG Program and household income, researchers concluded that the relationship was positive and significant, especially for low-income households (Li et al. 2011); and positive and significant, in part reinforced by the local economy, political conditions, and program extent (Yao et al. 2010). Moreover, a significant, positive association between the GfG Program and income diversification was found, especially among poorer households and only once the program had been in effect for 3 years (Liu and Lan 2015). While most research has been conducted on outcomes other than migration, one study did look at migration in three poor areas and found it significantly, positively affected by the GfG Program (Démurger and Wan 2012).

Particularly salient to GfG households, families in poorer, rural regions are uniquely vulnerable to income shocks due to extreme market fluctuations, disastrous weather events, crop raiding, death or illness of family members, and government fiat, among other factors. However, since their agricultural land is largely fixed, and effective credit and insurance markets do not exist, responding to shocks is more difficult compared to rural households in developed nations or regions with strong local institutions. Yet the need to smooth income persists since households have limited capacity to reduce consumption in response to collapses in purchasing power. A key way of reducing the risk of a shock to agricultural income is for households to engage in multiple income-generating activities. Empirical studies have demonstrated that rural families smooth household income, accumulate wealth, and reduce the risk of poverty by pursuing a diverse set of income-generating activities (Nielsen et al. 2013). A well-diversified income stream can serve as an effective hedge against extreme volatility of any single source (Zhao and Barry 2014). Additionally, the larger a family’s size, the more effectively it can make household-level decisions that take advantage of specialization among members (Reardon et al. 2007).

The GfG Program has the potential not just to enhance the supply of environmental goods and services in rural China, but also to help families achieve smoother, more reliable—perhaps even greater—income streams. To the extent that the program’s cash and grain subsidies relieve capital constraints and free up household labor to pursue other activities such as migration, off-farm employment, or harvesting forest products, the new sources of income can reduce the risk of poverty by acting as effective forms of insurance against income fluctuations. On the other hand, participant households that fail to make durable improvements to their income-generating activities are essentially relying heavily on subsidies to diversify their incomes, a strategy that is likely to fail after the government eventually withdraws its support (and may be insufficient even in subsidized years, given the small size of the payments). Our research explores whether GfG households have turned to migration as an alternate income stream and whether overall income levels and income diversification are affected.

Methods and Data

Survey

Our data source is an extensive survey consisting of 20 modules capturing individual- and household-level information. It was conducted in Tiantangzhai Township, located in the eastern Dabieshan mountainous region in southern Anhui Province of China. The township covers an area of 189 km² and has a population of 17,000 people. The landscape of the township is dominated by natural forest aided by favorable weather and partially designated as a nature reserve. The study area is remote from the major development activities in the county, which is recognized by the central government as a county in poverty. Local residents depend primarily on farming staple crops, including paddy rice, corn, and sweet potatoes. Many also earn an income from other activities such as raising animals, construction, local retail businesses, and migration out of the county. The government implemented the GfG Program in Tiantangzhai Township in 2002. Approximately 20% of rural households enrolled cropland into the program, with most planting ecological forests, specifically maple or Italian poplar (Song et al. 2014).

Sampling and Data Collection

We used two-stage random sampling to select communities and then households. To achieve a balanced sample from a region where only 753 of 4369 households were participants, both stages were stratified and GfG households were oversampled. In the first stage, 40 of the 165 resident groups were randomly selected. Resident groups are clusters of 10 to 50 households living relatively close to each other and who shared communal croplands before China’s rural reforms of the 1980s. In the second stage, households were grouped according to GfG participation status, then randomly selected from each group. No more than 20 households were selected in a community. Seventeen households were dropped due to incomplete or unreliable data, leaving the final sample for our analysis of 465 households, of which 259 were enrolled in the GfG Program (Footnote 2).

In addition to the data reflecting the survey year (2014), the questionnaire also captured information on migration in each year from 2000 to 2014, a feature that distinguishes our dataset from those used in other GfG research. Respondents were asked whether, and in which years, any household member had been a migrant (defined as living outside the county for at least six continuous months in a calendar year). From this question, we created our main, binary outcome variable capturing the migration status of each household member in each of 15 years. While recall error is possible, it is difficult to conceive of plausible reasons why errors would be systematically different for participants versus non-participants, suggesting a low likelihood of bias. To the extent that any randomly distributed recall error attenuates the coefficients, our results represent conservative estimates.
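As a minimal sketch of how such a recall question could be turned into the binary person-year outcome, the snippet below expands each member's reported migrant years into one row per year from 2000 to 2014. The identifiers (`build_person_year_panel`, `person_id`, the sample recall data) are hypothetical and are not taken from the survey instrument.

```python
from typing import Dict, List, Set

YEARS = range(2000, 2015)  # the 15 recall years covered by the survey, 2000..2014


def build_person_year_panel(migrant_years: Dict[str, Set[int]]) -> List[dict]:
    """Expand {person_id: years reported as a migrant} into one row per person-year.

    A person counts as a migrant in a year if the respondent reported that they
    lived outside the county for at least six continuous months in that year.
    """
    rows = []
    for person_id, years_away in sorted(migrant_years.items()):
        for year in YEARS:
            rows.append({
                "person_id": person_id,
                "year": year,
                "migrant": 1 if year in years_away else 0,
            })
    return rows


# Hypothetical recall data for two members of one household.
recall = {"hh01_m1": {2005, 2006, 2007}, "hh01_m2": set()}
panel = build_person_year_panel(recall)
```

Each individual contributes 15 rows, so the two hypothetical members above yield 30 person-year observations.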

From the household data for survey year 2014, we created two additional outcome variables:

  1. Total income: total household gross income, equal to the sum of all income sources: businesses, off-farm wages, remittances, agriculture, forest products, livestock, and “other” (social gifts, rental property, and government subsidies, including the GfG Program). Respondents provided data from all sources with the exception of 40 missing values (about 9%) for government subsidies (interviewers reported that these respondents were simply not sure of the amount of government subsidies they received). Considering that a number of regressions and t-tests suggested that there was no association between subsidy amounts and other variables, we imputed missing values only for subsidies and only for the 40 affected households using the sample mean (1826 yuan) (Footnote 3).

  2. Income diversity index: each of the seven income categories was divided by total household income, and each result was then squared. The reciprocal of the sum of these seven squared terms measures household income diversification (higher numbers indicating greater diversification).
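The two constructions described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, missing subsidy values are represented as `None`, and all income amounts are assumed to be non-negative gross figures.

```python
from typing import List, Optional, Sequence


def impute_with_mean(values: Sequence[Optional[float]]) -> List[float]:
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def income_diversity_index(incomes: Sequence[float]) -> float:
    """Reciprocal of the sum of squared income shares.

    `incomes` holds one household's income from each category
    (seven categories in the survey). Higher values mean more diversification.
    """
    total = sum(incomes)
    if total <= 0:
        raise ValueError("total household income must be positive")
    shares = [x / total for x in incomes]
    return 1.0 / sum(s * s for s in shares)


# Hypothetical households: one relying on a single source, one evenly spread.
single_source = income_diversity_index([12000, 0, 0, 0, 0, 0, 0])
evenly_spread = income_diversity_index([1000] * 7)
```

The index ranges from 1 (all income from a single category) up to the number of categories (income split evenly across all of them).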

Analysis

First we examined the individual-level data to determine whether the GfG Program influenced the likelihood of migration. One prominent challenge of measuring treatment effects is that rural migration rates, especially to China’s growing cities, were already high during the 15 years covered by this survey independent of any single policy or intervention. Liang, Li, and Ma (2014) noted that the size of the country’s migrant population increased by nearly 90 million between 2000 and 2010. Further, they described fundamental changes in the living conditions of migrants to urban centers that affect migration rates, such as the passage of more protective labor laws, rising wages, and increased educational spending.

In order to isolate the effects of GfG participation, we used a difference-in-difference (DID) method comparing participants and non-participants before and after program implementation. Specifically, we ran DID logit regressions on both the full sample and a subsample restricted to individuals in the core working/migrating ages of 16 to 50 to predict the likelihood of an individual being a migrant in pre-treatment year 2001 and post-treatment year 2006 (Footnote 4). We initially selected 2006 to represent the post-treatment period, recognizing that family decisions as profound as household migration may plausibly take a few years to initiate. Subsequently, we extended the analysis using other years as post-treatment to determine whether and how the effects changed over the many years captured by the survey’s migration question.

The first pair of regressions used a parsimonious model that included three predictor variables: indicators capturing GfG participation and the post-treatment year, along with an interaction of the two.

$$\begin{array}{*{20}{l}}{\mathrm{Prob}}\left({{\mathrm{Y}}_{it}} \right) = & {\mathrm{\Phi }}\left(\beta _0 + \beta _1{\mathrm{GfG}}_i + \beta _2{\mathrm{PostTreatment}}_t\right.\\ &+\left. \beta _3{\mathrm{GfG}}_i \ast {\mathrm{PostTreatment}}_t + \varepsilon _{it} \right)\end{array}$$
(1)

where Y_it = 1 if individual i was a migrant in year t; t represents either 2001 or 2006; and Φ is the logistic function (the inverse of the logit link).

Interaction terms like our DID variable function differently and are more complex to interpret in a non-linear logit model than in ordinary least squares (OLS) regressions. In OLS, the coefficient on an interaction term directly captures how the effect of a treatment varies with the value of another variable such as PostTreatment. In a logit model like ours, though, the effect of GfG on the predicted probability already varies between the pre- and post-treatment periods even without the DID interaction variable. We nevertheless kept GfG*PostTreatment in our models because it improves their ability to fit the data. The marginal effects we report below incorporate the effects of both GfG and GfG*PostTreatment.
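To make this point concrete, the sketch below evaluates Eq. 1 (without the error term) for the four GfG-by-period cells and computes the difference-in-difference on the probability scale. The coefficient values are hypothetical and are not estimates from this study, and the cell-level calculation is a simplification that is not necessarily how the reported average marginal effects were computed.

```python
import math


def logistic(z: float) -> float:
    """Logistic function mapping a linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def cell_probability(b0, b1, b2, b3, gfg, post):
    """Predicted migration probability for one GfG-by-period cell (Eq. 1, no error term)."""
    return logistic(b0 + b1 * gfg + b2 * post + b3 * gfg * post)


def did_on_probability_scale(b0, b1, b2, b3):
    """(Change for GfG individuals) minus (change for non-GfG individuals), in probabilities."""
    p = {(g, t): cell_probability(b0, b1, b2, b3, g, t)
         for g in (0, 1) for t in (0, 1)}
    return (p[1, 1] - p[1, 0]) - (p[0, 1] - p[0, 0])


# Hypothetical coefficients with the interaction term set to zero.
did_without_interaction = did_on_probability_scale(-2.5, 0.3, 0.8, 0.0)
# The same model with a hypothetical positive interaction coefficient.
did_with_interaction = did_on_probability_scale(-2.5, 0.3, 0.8, 0.2)
```

Even with the interaction coefficient set to zero, the difference-in-difference in predicted probabilities is not zero, because the logistic curve is steeper for the group that starts from a higher baseline. This is why the interaction coefficient alone cannot be read as the treatment effect.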

In our second set of regressions, we added a fixed effect for the resident group along with three covariates—sex, age, and years of education—that are well accepted by researchers as generally associated with migration (Greenwood 2016; White and Johnson 2016).

$$\begin{array}{*{20}{l}}{{\mathrm{Prob}}\left( {{\mathrm{Y}}_{it}} \right)} \hfill & = \hfill & {\mathrm{\Phi }}\left(\beta _0 + \beta _1{\mathrm{GfG}}_i + \beta _2{\mathrm{PostTreatment}}_t \right. \hfill \\ {} \hfill & {} \hfill & + \beta _3{\mathrm{GfG}}_i \ast {\mathrm{PostTreatment}}_t\\ {} \hfill & {} \hfill & \left. + \beta _4{\mathrm{female}}_i + \beta _5{\mathrm{educ}}_i + \beta _6{\mathrm{age}}_{it} + \beta _7{\mathrm{age}}_{it}^2\right.\hfill\\ {} \hfill & {} \hfill & \left. + i.{\mathrm{residentgroup}}_i + \varepsilon _{it} \right) \end{array}$$
(2)

Our age variable is not the static age of individuals captured in the 2014 survey but was expanded to reflect the actual age of each individual in each of the 15 years. Because the influence of age on migration may not be linear, we also added age² to the regressions. Finally, the resident group fixed effects are a recognition that these tight-knit networks of households and their precise geographic location likely have an unobserved influence on families’ migration decisions.
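The age expansion can be illustrated with a short sketch. It assumes age is recorded in whole years as of the 2014 survey; the function names are hypothetical.

```python
SURVEY_YEAR = 2014  # year in which ages were recorded


def age_in_year(age_at_survey: int, year: int) -> int:
    """Back-cast a person's age from the survey year to an earlier year."""
    return age_at_survey - (SURVEY_YEAR - year)


def age_terms(age_at_survey: int, year: int) -> tuple:
    """Return (age, age squared) for use as regressors in a given year."""
    age = age_in_year(age_at_survey, year)
    return age, age ** 2


# A hypothetical respondent aged 40 at the time of the survey.
age_2001, age_sq_2001 = age_terms(40, 2001)
age_2006, age_sq_2006 = age_terms(40, 2006)
```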

For our third and final set of regressions, we turned our attention to the household-level end results of migration decisions and examined whether GfG participation is associated with the two 2014 household outcomes: total income and income diversification. We ran OLS regressions using a GfG dummy variable, age of the household head, and family size. Additionally, we controlled for resident groups and used robust standard errors.

$$\begin{array}{*{20}{l}}Y_i \hfill & = \hfill & \beta _0 + \beta _1{\mathrm{GfG}}_i + \beta _2{\mathrm{AgeHouseholdHead}}_i\\ \hfill & \hfill & + \beta _3{\mathrm{FamilySize}}_i + i.{\mathrm{residentgroup}}_i + \varepsilon _i\end{array}$$
(3)

Endogeneity Tests

In order to obtain unbiased estimators, we must rule out endogeneity possibly created by households self-selecting into the program. Many other researchers who have published previous studies on the impacts of the GfG Program have discounted the likelihood of self-selection by accepting the program’s merely quasi-voluntary nature (Xu et al. 2006; Yao et al. 2010; Li et al. 2011; Démurger and Wan 2012; Liang et al. 2012; Kelly and Huo 2013). That is, while participation is voluntary in principle, in reality the areas to be converted to forest under the GfG Program have been targeted by local governments according to the specific biophysical conditions (for instance, slope and contiguousness with other forested lands to be put in the program), and participation is considered strongly encouraged if not outright mandatory for those selected by local planners. Early research confirmed households’ general lack of autonomy, with only 15% of GfG participants and 28% of non-participants declaring that they had any autonomy in participation (Xu et al. 2004). Participants also generally felt they had little or no say about which plots to retire.

With a view to establishing that participating households are similar to non-participants, we took two extra steps, starting with t-tests of sample means (Table 1). The few differences that exist between the two groups are largely expected. The difference in elevation of household location was highly significant, as the program explicitly targets sloping croplands. Gross income in 2014 from forest products was also significantly higher for GfG participating households. Distances to a paved road and to the county capital were only weakly significant. Age of all individuals was also weakly significant, although notably neither the age of the household head nor the age of the working-age subsample significantly differed between the two groups. Importantly, average family size is the same across the two groups, suggesting that any differences in migration are not due to GfG households having more members to send away.

Table 1 Mean, standard deviation, and t-test results for GfG participating and non-participating individuals and households

Second, we sought to establish a critical identifying condition for a DID model: that treatment and non-treatment households were on identical paths with regard to migration before the program effect kicked in. To create a baseline period, we lumped 2002 and 2003 together with years 2000 and 2001 because implementation of the GfG Program began in 2002 in the study area, and took a year to extend across the province and take full effect. Parallel trends between individuals in treatment and non-treatment households in the 4 years from 2000 to 2003 suggest that any divergence in migration in the post-treatment period would more likely be the result of the GfG Program and provide evidence that program participation was exogenously determined. We ran a simple OLS regression twice, once for individuals in participating households and a second time for those in non-participating households:

$${\mathrm{Y}}_{it} = \beta _0 + \beta _1{\mathrm{Year}}_t + \varepsilon _{it},$$
(4)

where Y_it = 1 if individual i was a migrant in year t, and t ranges from 2000 to 2014.

We then ran a second OLS regression to calculate the difference between the two lines for the years from 2000 to 2003:

$${\mathrm{Y}}_{it} = \beta _0 + \beta _1{\mathrm{Year}}_t + \beta _2{\mathrm{Year}}_t \ast {\mathrm{GfG}}_i + \varepsilon _{it},$$
(5)

where Y_it = 1 if individual i was a migrant in year t, and t ranges from 2000 to 2003.
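A rough analogue of this pre-period comparison can be sketched by fitting the linear trend of Eq. 4 separately for each group over 2000 to 2003 and comparing the slopes. The data below are fabricated purely for illustration, the function names are hypothetical, and the sketch yields point estimates only. Judging whether a slope gap is statistically significant, as in Eq. 5, would additionally require standard errors.

```python
def ols_slope(xs, ys):
    """Slope of a simple OLS regression of ys on xs (with an intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def pre_period_slopes(rows, first=2000, last=2003):
    """Fit the linear trend separately for non-GfG (0) and GfG (1) individuals."""
    slopes = {}
    for group in (0, 1):
        sub = [r for r in rows if r["gfg"] == group and first <= r["year"] <= last]
        slopes[group] = ols_slope([r["year"] for r in sub],
                                  [r["migrant"] for r in sub])
    return slopes


# Illustrative person-year rows: in both groups the migrant share rises
# from 10% in 2000 to 40% in 2003 (ten hypothetical individuals per group).
rows = []
for gfg in (0, 1):
    for k, year in enumerate(range(2000, 2004)):
        for person in range(10):
            rows.append({"gfg": gfg, "year": year,
                         "migrant": 1 if person <= k else 0})

slopes = pre_period_slopes(rows)
slope_gap = slopes[1] - slopes[0]  # near zero when pre-period trends are parallel
```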

Results of these robustness checks are presented in Fig. 1 and Appendix Table 5. Figure 1 graphs the coefficients from both versions of Eq. 4, providing visual evidence that before and during program implementation, GfG and non-GfG trend lines track each other closely, then diverge after 2004, frequently at increasing rates. Appendix Table 5 quantifies the differences from Eq. 5. The coefficients on the year variables are either weakly (2001) or highly (2002 and 2003) significant, suggesting that compared to 2000, overall migration rates were on the rise. The coefficients on all four interaction variables, however, are not significant, indicating that independent of broader trends toward more migration, the probability of individuals migrating from GfG households prior to 2004 was indistinguishable from the probability for individuals in non-GfG households.

Fig. 1 Predicted likelihood of being a migrant, individuals aged 16 to 50, by GfG participation status

Findings from both the t-tests and the regressions provide support that the GfG participants are, in fact, similar to non-participants and that the two groups had been on similar paths with respect to migration. Like previous authors, but supported by our more rigorous analysis, we conclude it is likely that GfG participation is not endogenous.

Results

Regressions on Migration Behavior

Results of our logit DID regressions from Eqs. 1 and 2 are presented in Table 2. All four specifications use 2001 as the pre-treatment period and 2006 as post-treatment, and test whether being enrolled in the GfG Program exerts an influence over an individual’s likelihood of being a migrant (that is, living away from home for at least six continuous months in a calendar year). Specifications #1 and #3 include family members of all ages, while #2 and #4 are restricted to individuals from 16 to 50 years old, who our analysis suggested were the most likely to consider migrating. Though the interaction terms in Table 2 are not significant, we cannot rely solely on that evidence (unlike in a linear OLS model) to determine the effects of GfG participation. For that purpose, we examine the average marginal effect of the GfG Program (which, in our logit regressions, incorporates the effects of both the GfG and GfG*PostTreatment variables) and find it to be positive and significant in Regressions #1, #3, and #4, providing evidence that members of participating families migrate at higher rates than non-participants. In the full specification for individuals 16–50, the average marginal effect is 0.0371 (p = 0.044). That is, on average, the likelihood of migrating increased 3.71 percentage points more for GfG participants than for non-participants. Relative to the average predicted probability of 0.1317, that equates to a 28.17% increase.
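As a quick check on the relative increase, using only the two rounded figures reported in the text (the average marginal effect and the average predicted probability):

$$\frac{0.0371}{0.1317} \approx 0.2817 = 28.17\%$$

The 3.71 figure is thus an absolute change in percentage points, while 28.17% expresses the same change relative to the baseline probability of migrating.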

Table 2 Logit coefficient estimates, standard errors, and average marginal effects comparing 2001 (pre) to 2006 (post) (resident group fixed effects in Regressions #3 and #4 not reported)

While 2006 was a reasonable starting point as a DID post-treatment period, we were interested in whether our results were robust to using other years as post-treatment, and in how households adjust their migration strategies over time. To examine these questions, we re-estimated the full specification on the working-age subsample (Regression #4 in Table 2) ten times, setting the post-treatment year to each of the other years from 2004 to 2014. Partial results are presented in Table 3. For simplicity, we limit our reporting to the coefficients on GfG and GfG*PostTreatment (columns 2 and 3), along with the average marginal effects (column 4). In addition to the initial findings using 2006 as the post-treatment period, results show that differences in changes in migration rates were highly significant when 2011 was used as post-treatment; significant when 2007 and 2013 were used; and weakly significant using 2004, 2010, 2012, and 2014. These quantitative results are also consistent with a rough visual examination of Fig. 1.

Table 3 Select difference-in-difference logit regression results on probability of being a migrant (full specification, working age individuals, pre-treatment year 2001)

Regressions on Income and Income Diversification

Having established that the GfG Program positively affected migration decisions over several years, we examine the question of whether program participation is associated with other outcomes: household income (logged value) and income diversification. Positive associations would provide evidence that families are benefiting from either the program subsidies or from the decisions (including increased migration) influenced by the program. Table 4 presents results for both versions of Eq. 3. They show a weakly significant (p = 0.059), positive association between GfG participation and income diversification, but no significant association between the GfG Program and annual income (Footnote 5).

Table 4 Regression results on 2014 household-level outcomes (resident group fixed effects not reported)

Discussion

The significant, positive association we found between the GfG Program and migration rates suggests that participating households are reallocating freed-up labor from farming to migration at greater rates than non-participating households. Such a reallocation of labor suggests that participants are, in fact, using migration to diversify their income streams and make sustainable changes to their livelihood strategies. We caution that because the data comes from a single, relatively poor and remote Chinese township from 2000 to 2014, care should be taken before generalizing to different locations and times. Previous analyses of other case studies have demonstrated the difficulty of interpreting programs and their effects and highlight the importance of specific context in determining how interventions function (Agrawal et al. 2014).

The increase in migration was not, however, constant over time. The marginal results in the first couple of years after implementation suggest that many households needed time to plan and carry out a decision as substantive as the migration of a family member. Alternatively, migration may have been delayed in the initial years because some families used the household labor freed up by cropland reductions to perform labor-intensive planting and replanting before migrating (Zhang et al. 2017). Thus, not until 2006 and 2007 do we see increases in migration rates among GfG families that are significantly greater than increases among non-GfG families.

In 2008 and 2009, the differences in migration between treatment and non-treatment households decreased, becoming statistically insignificant. Most plausibly, this 2008–2009 shift reflects diminished economic opportunities throughout China triggered by the global recession, since in 2010 and 2011 GfG and non-GfG families diverged again in their migration decisions. Migration rates among working-age individuals in participating households exceeded those of non-GfG households by 5.9 percentage points in 2011 (p = 0.003). A more comprehensive question is how all the other income sources (businesses, off-farm wages, and animal and forest products) adjusted over the years in response to the loss of some croplands, recognizing that not all families may be able or willing to allow a member to migrate. We were not able to explore that question because our historical data (2000–2014) were restricted to migration activity. Perhaps other GfG families compensated for the agricultural loss by turning more to other income sources. All are worthy subjects for future analysis.

Our subsequent regression results suggest that after about 10 years, the overall incomes of GfG and non-GfG households did not significantly differ from each other. Unlike our individual-level migration analysis, for which we had 15 years of data, our income regression was limited to just 1 year. We therefore could not estimate changes or growth attributable to participation, only that the two groups had statistically indistinguishable results in the survey year. Nevertheless, we believe our null results contribute to the under-explored and somewhat contradictory body of existing literature. Our analysis corroborates the findings of Xu et al. (2004), who concluded that there were no significant program effects even with two periods of panel data (1999 and 2002) and a richer DID model than our OLS. However, our findings contradict two studies that gave the program a few more years to take hold than Xu et al. (2006): Li et al. (2011) found significant program effects, albeit with a single year of data from 2007, as did Yao et al. (2010) with two periods of panel data (1999 and 2006). Finally, in our last regression, the weakly significant, positive association between the GfG Program and income diversification is consistent with our findings on migration: GfG households showed greater diversification among income sources (suggesting enhanced risk management) than did non-participants (p = 0.059). This evidence is a welcome sign for policymakers attempting to create a sustainable program that incentivizes families to maintain their forests over the long term. However, given data restrictions, it remains inconclusive whether GfG households changed their diversification strategies in response to the program or whether other unobserved reasons or predispositions explain these positive income strategies. Determining this would be a productive topic for future research.

Conclusion

This paper contributes to the existing literature on payment for ecological services schemes in general, and the Grain for Green Program in particular, by demonstrating that participating households in Tiantangzhai Township responded to government incentives by increasing the migration of their household members at greater rates than non-participating households in a number of years between 2004 and 2014. Our analysis advances the research on GfG outcomes in two new and significant ways. First, our conclusions regarding program effects on increased migration are based on 15 years of information. Second, we corroborated other authors’ prevailing assumption of exogeneity with respect to selection bias by confirming that the GfG and non-GfG families had statistically similar migration rates before program effects fully kicked in. Our logit difference-in-difference results suggest that the GfG Program was associated with increased migration, but the effects were not uniform across all years, disappearing during the Great Recession and then rebounding and reaching their peak in 2011, 8–9 years after implementation. Additionally, the GfG Program was weakly positively associated with income diversification during the survey year 2014, though there was no significant difference in annual income. These are promising results because they suggest that households are finding other income sources beyond government payments to compensate for reductions in agricultural production.

However, policymakers should also note that our analysis examined only one township in which the GfG Program has been implemented. Because this and other studies have found that the results of PES schemes depend on cultural, political, and biophysical contexts, subsequent research would better advance GfG analysis by examining data from a variety of geographic regions. Finally, it would also be valuable to examine our conclusions in the context of other income sources. That is, policymakers would be well served by knowing whether and to what extent GfG households turned to income diversification strategies beyond migration, such as business income, local employment, forest products, and animal products.