1 Introduction

Land set-aside (fallow) schemes, involving farmers being paid to take land out of production, are widely used in the European Union (EU) and the USA as an agricultural policy tool. In many cases, the initial aim of this measure was to reduce excess supply of cereals while lowering the level of public agricultural stocks. It was later complemented by tools to promote the development of non-food crops and maintain good environmental status. In the USA, the Conservation Reserve Program (CRP) was introduced in 1985 as a voluntary set-aside program designed to control crop overproduction, reduce soil erosion, improve water quality, and provide wildlife habitat by taking vulnerable agricultural land out of production. In the EU, compulsory set-aside was one of the most important measures introduced at the time of the 1992 Common Agricultural Policy (CAP) reform. In 2008, the policy package associated with the CAP Health Check abolished set-aside for arable crops, but farmers could continue to set land aside on a voluntary basis while adopting agri-environmental schemes with cross-compliance. With the reform of the CAP for 2014–2020, the European Commission introduced the Green Direct Payment program, which links direct payments to farmers to requirements for mandatory “greening” farming practices. These “greening” practices include: (1) crop diversification, (2) the maintenance of permanent grassland and (3) Ecological Focus Areas (EFAs). EFAs are areas considered to have environmental benefits, such as fallow land, margins, catch crops, green cover and nitrogen-fixing crops. From 2015 on, farmers with more than 15 ha must have at least 5% of their land as an EFA. This “greening” of the CAP was described as a move back to compulsory set-aside (Footnote 1).

The objective of this paper is to evaluate the potential impacts of a set-aside policy on the environment, more precisely in terms of intensification of input use (fertilizer and pesticide) on cultivated crop area (intensive margin). To capture these effects, we need to account for changes in the distribution of crops, because fertilizer and pesticide requirements are often heterogeneous across crops, as are their potential environmental impacts. This means that crop-level data from individual farms are preferable for policy impact evaluation. However, this type of data raises corner solution issues that are critical in empirical work at the individual farm level.

We apply a dedicated estimation procedure to estimate acreage and agricultural practice responses to the set-aside policy, for a sample of farmers from the département of Meuse (eastern France), observed from 2006 to 2010. We estimate a multi-output profit function based on a panel of individual farmers, controlling for multiple selection using both a parametric and a semi-nonparametric quasi-maximum likelihood (QML) estimator. The purpose of this econometric strategy is to check for possible deviations from the homoscedasticity assumed by the parametric QML estimator, by comparing it with a more flexible semi-nonparametric estimator. The latter is based on sieve estimation [1], which allows for consistent, and in general more efficient, estimates than semiparametric estimation based on, e.g., kernel nonparametric estimation of the underlying distribution of error terms. Parameters estimated from a structural multicrop system of equations allow us to estimate the change in pesticide and fertilizer demand corresponding to a set-aside policy. We propose two indicators to measure these impacts: chemical input (pesticide and fertilizer) demand and intensity elasticities with respect to the set-aside subsidy rate.

Paper Contributions

The present paper makes two contributions to the literature. The first contribution is methodological and concerns the econometric strategy for dealing with multiple selection in a micro-panel sample of farms. Our contribution in this regard is to propose a consistent and flexible estimator for output and input decisions from individual farm data. To allow for correlation between (possibly censored) equations in the multivariate Tobit framework while avoiding multiple integration, we consider a QML approach, based on pairwise joint probabilities of structural equations. Yen et al. [2] and Fezzi and Bateman [3] use a (parametric) QML estimator, which is known to be consistent under the assumption that the conditional mean of the model is correctly specified and that the density used for QML belongs to the linear exponential family [4]. However, the QML may not be consistent if higher moments of the error distribution are introduced in the QML criterion but are misspecified (typically, variance–covariance terms). To address this issue, we propose in this paper a semi-nonparametric version of the QML estimator applied to the multivariate selection model, which allows us to relax both distributional and homoscedasticity assumptions. By doing so, we go beyond the procedure proposed by [5], which is consistent only if the selection equations are uncorrelated. Our estimator is based on semi-nonparametric sieve estimation, which is faster to compute than semiparametric estimators based on kernel approximations (as in, e.g., [6]). To the best of our knowledge, such semi-nonparametric methods to obtain output supply and input demand estimates from farm-level data have not yet been used in the literature. Because crops are different in terms of input requirements and environmental impacts, accounting for such heterogeneity at the farm level is likely to produce more precise estimates of farm input demands. 
This will allow us to more precisely estimate the environmental impacts of the set-aside policy in terms of agrochemical input intensification.

The second contribution of our paper concerns the ex ante environmental assessment of a land set-aside policy, in terms of agro-chemical input intensification. We contribute to the debate on the environmental impacts of greening policies in general, and of set-aside policies in particular. Some studies show that set-aside land supports biodiversity, with much higher population densities and numbers of species of birds, insects, spiders, and plants [7]. This contradicts studies in which set-aside land has been found to be ineffective and inefficient for conservation [8]. More recently, the report in [9] concludes that “greening, as currently implemented, is unlikely to significantly enhance the CAP’s environmental and climate performance”. This is mainly due, according to the same report, to the fact that “Greening lacks a fully developed intervention logic with clearly defined, ambitious targets...”. Indeed, farmers were not given any environmental target to achieve through the greening reform.

Our paper proposes a quantitative assessment of a “greening” policy, which aims to increase the set-aside area by 5%. The first feature of this contribution consists of evaluating the indirect impact of this set-aside policy on crop intensification using two indicators: chemical input demand elasticities and a new indicator, namely input demand intensity elasticities, both with respect to the set-aside subsidy rate. Such a complementary indicator is better suited, in our opinion, to agricultural settings where input use per unit of land is more relevant than total input demand. The second feature of this contribution is to simulate the impacts, in terms of fertilizer and pesticide demand variation, of a policy aimed at increasing set-aside area by 5 percent, and to compute the level of the input tax that will be necessary to cope with the increased demand. Reducing fertilizer and pesticide use is an environmental objective that has been explicitly introduced in the recent Farm to Fork Strategy of the European Commission, as part of the European Green Deal initiative. This strategy aims to reduce the use of fertilizers and pesticides in EU agriculture by at least 20% and 50%, respectively, by 2030. It is important to note that the impact evaluation concerns only potential benefits (or negative externalities) from a reduction in nutrient load, following a change in land use and input use (fertilizer, pesticide). Strictly speaking, we are not evaluating actual environmental effects of the set-aside policy on, e.g., water quality, greenhouse gas emissions (nitrous oxide from chemical fertilizer) and biodiversity, but rather the potential of such policy to limit chemical input use at the farm level.

The paper is organized as follows: Section 2 discusses land set-aside policies in the context of the European CAP and presents a literature survey of both the impacts of set-aside policies and econometric models with corner solutions. Section 3 presents a production model based on a multicrop framework and econometric methods to deal with corner solutions. Parametric and semi-nonparametric estimators are discussed in the context of panel data. Section 4 describes data used in our econometric analysis and discusses estimation results. In Section 5, we discuss results of chemical input demand and intensity elasticities. We then simulate the impact of a policy consisting of a 5-percent set-aside rate as an agricultural greening measure, and calculate the input tax on fertilizer and pesticide that would be necessary to cope with increased demand for these inputs. Section 6 concludes the paper.

2 Policy Context and Related Literature

2.1 CAP and Set-aside Policy Context

Compulsory set-aside was one of the most important measures introduced in the European Union (EU) at the time of the 1992 reform of the Common Agricultural Policy (CAP), which introduced a new support system for producers of cereals, oilseed and protein crops. See Table 1 for a summary of CAP measures regarding set-aside policies. Farmers with production greater than 92 tons were eligible for set-aside payments. In order to alleviate their revenue decrease due to the compulsory set-aside, farmers were allowed to cultivate energy crops (diester from rapeseed in our data) on set-aside land without losing the subsidy [10]. Among the many changes to set-aside rules during the period 1993–2007, the major ones concerned adjustments to the rate of compulsory set-aside, the introduction of voluntary set-aside against payment, and the possibility of a fixed instead of a rotational set-aside, which was the only form available at the outset. In 2008, the policy package associated with the Health Check of the CAP abolished both the energy crop scheme and the compulsory set-aside scheme. However, farmers could continue to set land aside on a voluntary basis, while adopting agri-environmental schemes with cross-compliance. The eligibility requirement for payment in that case is that at least 5 percent of land should be within an ecological focus area. Moreover, eligibility conditions remain the same regarding crop area: farms are eligible for area-based payment only if the area planted with cereals, oilseed, and protein crops is greater than 0.3 hectares, under European directive CE/1973/2004 (October 29, 2004). Since 2008, the ceiling for the maximum voluntary set-aside area eligible for CAP payment is one-ninth of the arable area. It can be extended to 25 percent for arable crops (cereals, oilseed, and protein crops) for energy, chemical use, or animal feed.
In practice, for rapeseed production, farmers were eligible for area-based payment under CAP after the 1992 reform, while voluntary set-aside over and above the compulsory rate was allowed. After 2008, rapeseed for diester production was also possible on set-aside areas, with the same subsidy rate as for non-set-aside areas, i.e., a different subsidy rate from the “agronomic” set-aside rate. In this case, farmers have to show evidence of a farming contract with energy or industrial buyers.

Table 1 Past and ongoing Common Agricultural Policy (CAP) measures regarding set-aside

In December 2013, the EU enacted the CAP for the period 2014–2020 where, in addition to the Basic Payment Scheme, each farmer now receives a Green Direct Payment per hectare for respecting specific greening agricultural practices to reduce biodiversity loss and greenhouse gas emissions [11]. Member States are requested to use 30 percent of their national budget in order to contribute to this program. The three greening measures are as follows: (i) maintaining permanent grassland; (ii) crop diversification; and (iii) maintaining an Ecological Focus Area (EFA) of at least 5 percent of the arable area of the holding for farms with an area larger than 15 hectares (excluding permanent grassland), i.e., field margins, hedges, trees, fallow land, landscape features, biotopes, buffer strips, afforested area. Indeed, set-aside originally introduced for supply control purposes could have important environmental benefits, especially where land was left fallow [12]. However, the literature review by [13] shows that the CAP greening measures could have mixed effects on ecosystem services, as they can lead to intensification on cultivated land.

In May 2020, the European Commission officially presented a “Farm to Fork” strategy as part of the European Green Deal initiative. This strategy aims to make “the entire food chain from production to consumption more sustainable and neutral in its impact on the environment”. Among the targets considered to reach these objectives, the strategy mentions a 50% reduction in the use of chemical pesticide and a reduction in fertilizer use of at least 20%, both by the year 2030.

For the latest CAP reform, to be implemented from 2023, Member States are required to “contribute” to the Green Deal via national agricultural policies. For example, among Good Agricultural and Environmental Conditions (GAEC), GAEC9 stipulates that all farms should have at least 10% of their agricultural area devoted to non-productive features in order to improve farmland biodiversity [14].

2.2 Related Literature

2.2.1 Literature on the Impacts of Set-aside Policies

By modifying the opportunity cost of farm land, a set-aside policy is expected to modify farmers’ production decisions in terms of crop choice and input use. However, the impact of set-aside policies on input use and intensification remains unclear. On the one hand, removing a proportion of land from production might reduce input use and increase extensification [15, 16]. On the other hand, set-aside might have adverse effects on water and soil quality if input use (fertilizer, pesticide) increases in order to balance the reduction in cultivated land. The impact of a set-aside policy therefore depends on the extent to which farmers rely on the intensive margin (i.e., intensification of agrochemical input on the same crop land) vs. the extensive margin (adapting the distribution of land to crops, some being more intensive in chemical inputs than others, but leaving input level per hectare unchanged) in their production decisions. See [16–22] for more details on slippage effects. Gorddard [23] explores the relationships between crop production, land allocation, and input-use decisions, examining the consequences of assuming that production of multiple crops is non-joint but subject to a constraint on total land area. His results illustrate that the effectiveness of environmental policies (aimed at modifying input use) may be affected by land allocation decisions. In the case of the CRP, early analysis found that contracts were targeted to reduce production, rather than to achieve environmental benefits [24]. Hendricks and Er [25] show that the government has adjusted CRP acreage over time in response to changes in market conditions but not to environmental impacts (Footnote 2).
In the case of the new EFA policy implemented in the EU and aimed at preserving biodiversity, the intensity of the impact depends on the type of EFA (fallow, grassland), the specific environmental issue considered (water pollution, soil pollution, biodiversity loss), and site-specific environmental conditions.

The impact of set-aside policies on the environment is associated not only with input intensification and its consequences for nutrient load, but also with climate change and biodiversity, although such environmental dimensions are more difficult to connect directly to changes in input use. For example, [26] present an impact evaluation of a set-aside policy in Finland, by combining a production model with area-based set-aside payments and estimates of monetary benefits of reduced nutrient pollution. A cost–benefit analysis is then conducted on the nutrient load reductions attributable to CAP agri-environmental payments in Finland between 1996 and 2005. In any case, policy impacts of set-aside are also relevant to the debate over land sparing versus land sharing (see, e.g., [27]).

Relying on the intensive margin in cropping systems is one of the unintended impacts of set-aside policy, as is the slippage effect that was observed in the CRP in the USA [22]. Both impacts can be explained by the increased output prices associated with reduced production on set-aside land. In particular, input decisions may be modified following a change in the set-aside subsidy rate, if a) it is more profitable to rely on the extensive margin; and/or b) with the same distribution of crops, it pays to intensify production on a smaller proportion of land, using the intensive margin by increasing application rates of fertilizer and pesticide by the same proportion on all crops. These two changes can be measured by a combination of elasticities: agrochemical input demand and intensity of agrochemical input use, both with respect to the set-aside subsidy rate.
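To illustrate how the two elasticities relate, the following minimal sketch assumes that intensity is defined as input demand per hectare of cultivated land (our reading of "input use per unit of land"; the exact definition used later in the paper may differ). Under that assumption, the intensity elasticity equals the demand elasticity minus the cultivated-area elasticity. The response functions, parameter values and the subsidy level `t0` are purely hypothetical.

```python
def elasticity(f, t, rel_step=1e-6):
    """Central-difference approximation of d ln f / d ln t at t."""
    h = t * rel_step
    return (f(t + h) - f(t - h)) / (2.0 * h) * t / f(t)

# Hypothetical responses to the set-aside subsidy rate t (illustrative only).
def fertilizer_demand(t):
    return 120.0 * t ** -0.10   # total input demand falls slowly with t

def cultivated_area(t):
    return 80.0 * t ** -0.25    # cultivated land falls faster with t

def intensity(t):
    # Input use per hectare of cultivated land (assumed definition).
    return fertilizer_demand(t) / cultivated_area(t)

t0 = 300.0  # hypothetical subsidy rate per hectare
e_demand = elasticity(fertilizer_demand, t0)
e_area = elasticity(cultivated_area, t0)
e_intensity = elasticity(intensity, t0)

# Identity for a ratio: intensity elasticity = demand elasticity - area elasticity.
gap = abs(e_intensity - (e_demand - e_area))
```

In this hypothetical case total input demand falls while per-hectare intensity rises, which is the kind of situation in which the two indicators carry different information.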

2.2.2 Literature on Econometric Modelling of Corner Solutions

Although there have been major advances in the estimation of production technology, there remain some challenging econometric issues. One problem specifically relevant to agricultural land-use allocation in empirical studies is corner solutions. Corner solutions arise when it is optimal for the farmer not to grow a crop (or combination of crops). Using aggregate data on land allocation, for example at the regional level, we observe positive values for all (region-specific) crops, but this does not imply that all farmers grow all crops. When using individual farm-level data, we need to account for the fact that some farmers may choose not to grow some crops, thus possibly causing selection bias in the parameter estimates. For this reason, models that analyze the effects of agricultural policies should adopt an explicit methodology that accommodates and explains the existence of corner solutions in the context of micro-level farm data.

Most of the literature on multi-crop estimations relies on aggregate data or does not deal explicitly with corner solutions that occur in land-use decisions. Guyomard et al. [28] estimate a quadratic profit function with several crop groups and inputs using French aggregate data but do not discuss the issue of corner solutions in production. Moro and Sckokai [29] employ a normalized quadratic multi-output profit function on Italian FADN (Farm Accounting Data Network) data but do not exploit the panel data structure of their (individual) data set. Moreover, although they recognize the presence of possible sample selection when dealing with multiple crop groups, they do not control for these sample selection effects explicitly or consistently.

The most common technique used to estimate a structural model subject to censored observations is Tobit estimation. This was proposed in the econometric literature by [30] and has been widely utilized in the empirical literature on demand estimations. Although the Tobit model is useful, it is an ad hoc modification of the regression model, allowing it to be used in cases where there are observations “piled up” at some limiting value (usually zero), and it has no convincing behavioral theory foundation [31]. The standard solution to the problem of a censored dependent variable is to estimate a Tobit model using maximum likelihood (ML) or the [32] two-step method.
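As a point of reference, the log-likelihood of a single-equation Tobit model censored at zero can be sketched as follows. This is a minimal illustration with one regressor, normal homoscedastic errors and made-up data; the function and variable names are ours and not taken from the cited references.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_loglik(beta, sigma, xs, zs):
    """Log-likelihood of a Tobit model censored at zero.

    Latent variable: z* = beta * x + e, with e ~ N(0, sigma^2).
    Observed variable: z = max(z*, 0).
    """
    ll = 0.0
    for x, z in zip(xs, zs):
        mean = beta * x
        if z > 0:
            # Uncensored observation: normal density of the residual.
            ll += math.log(norm_pdf((z - mean) / sigma) / sigma)
        else:
            # Censored observation: probability that the latent value is <= 0.
            ll += math.log(norm_cdf(-mean / sigma))
    return ll

# Made-up data: the first observation is a corner solution (zero).
xs = [0.5, 1.0, 2.0, 3.0]
zs = [0.0, 0.9, 2.2, 2.8]
ll_plausible = tobit_loglik(1.0, 1.0, xs, zs)
ll_implausible = tobit_loglik(-1.0, 1.0, xs, zs)
```

The censored observations contribute a probability rather than a density, which is what the multivariate extensions discussed below have to generalize to several equations at once.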

The pioneering works of [33] and [34] offer an economic interpretation of corner solutions and a direct and appropriate method for specifying the econometric model. They explain that the set of producer choices can be analyzed employing the Kuhn–Tucker conditions associated with the cost minimization program under the usual technical constraints and nonnegativity constraints on input demand. The fully structural approach implied by [35] and [34] for estimating demand while accounting for corner solutions relies on nonlinear simultaneous zero-censored equation models. When the number of equations is large, the subset of decision outcomes in the system likely to occur at kink points increases, requiring multiple integration for ML estimation.

In response to the issue of dimensionality when estimating demand systems with binding nonnegativity constraints, many strategies have been adopted in the literature. Alternative estimation methods to the ML procedure include the maximum entropy estimator [36, 37], the two-step Tobit system [38], and generalized method of moments (GMM) techniques [39]. Yen et al. [2] propose a QML approach which they claim is more efficient for small- to medium-sized samples. Comparison with other estimation methods shows that the QML and SML (simulated maximum likelihood) procedures produce remarkably similar demand parameter and elasticity estimates, whereas the results of [40]’s two-step estimator differ widely. Yen and Lin [41] propose a sample-selection alternative with more flexible parameterization (than the Tobit system), using the [40] two-step estimator. This approach is used by [5] in the multiple selection case. However, the sample selection system presents a heavier computational burden than the Tobit system, since the sample likelihood function contains probability integrals with dimensions as large as the number of selection equations for all sample observations.

Simulation-based estimation methods have been suggested to overcome this problem of high-dimensional numerical integration in multivariate limited dependent variable systems. These methods include simulated moments, simulated maximum likelihood, and simulated scores. The simulated ML approach is applied by [42] and [43]. Another way to overcome the computational issue is to use Bayesian methods. Millimet and Tchernis [44] use the Gibbs sampling technique with the data augmentation algorithm to solve both the dimensionality and coherency problems (Footnote 3), while [45] employ the two-step approach proposed by [40] to take account of both censoring and unbalanced panel data structure. Platoni et al. [46] account also for the heteroscedastic structure of the error terms in the (second-step) estimation of the expectation-conditional maximization (ECM) model.

3 The Model and Estimation Issues

3.1 The Model

Consider a risk-neutral (Footnote 4) farmer using K variable inputs x and a fixed but assignable factor (land) to produce C different crops, where c is the crop index, \(c=1,\ldots ,C\); \(p_{c}\) is the price of crop c; \(y_{c}\) is the output level of crop c; \(w_k\) is the price of input k; \(l_{c}\) is the land allocated to crop c; and L is the total available land (\(\sum _{c=1} ^{C}l_c=L\)). \(\tau _c\) is the area-based (per hectare) subsidy rate for crop c. To simplify notation, we consider set-aside as a particular land use. Therefore, it is also indexed by c, although it is not a crop, technically speaking.

Maximizing profit over land and input choices, given the predetermined system of prices, subsidies, and total available land, will produce a unique set of solutions, provided that regularity conditions are satisfied for the profit function (Footnote 5). Following [47] and [28], the multi-crop profit function for a joint input technology given the fixed factor allocation (land) may be written as:

$$\begin{aligned} \Pi (p,w,\tau ,L)=& \ \max _{y,x,l}\left\{ \sum _{c=1} ^{C} p_{c}y_{c}-\sum _{k=1} ^{K}w_{k}x_{k}\right.\\&+\left.\sum _{c=1} ^{C}\tau _{c}l_{c}; \sum _{c=1} ^{C}l_{c}=L \; ; \; y \le F(x, L)\right\} , \end{aligned}$$
(1)

where p, y, l, and \(\tau\) are C-dimensional vectors, x and w are K-dimensional vectors, and \(y \le F(x,L)\) represents the (multi-output) technological feasibility set.

Under partial decoupling of public payments to agriculture, the equation above has to be modified, to accommodate the situation where only a proportion of area-based payments is crop-specific. Let \(\phi\), \(\phi \in [0,1]\), denote the proportion of area-based subsidies that is decoupled from production, so that \(100 (1-\phi )\) percent are still “coupled” and depend on the area allocated to production for each crop c.

Land set-aside as a particular land use can be divided into a compulsory part (set-aside obligation) and a voluntary component, defined as the fraction of land set-aside above the minimum requirement of an agricultural policy. In our case, the area-based payment for set-aside is the same in both cases; we denote by G the land set-aside obligation, with unit payment g (same notation as in [28]). As a consequence, land used for crops or voluntary set-aside is a fraction of total farm land and is denoted \(L^*=L - G\).

Farmer profit therefore includes a fixed payment, denoted FP, consisting of area-based subsidies coupled with neither production nor land use (proportional to \(\phi\)) and the compulsory set-aside payment, gG.

Farmer’s profit then becomes

$$\begin{aligned} \Pi (p,w,\tau ,L^*)=& \ \max _{y,x,l}\left\{ \sum _{c=1} ^{C} p_{c}y_{c}-\sum _{k=1} ^{K}w_{k}x_{k}+(1-\phi )\sum _{c=1} ^{C}\tau _{c}l_{c}+FP\right.\\&\left. ; \; \sum _{c=1} ^{C}l_{c}=L^* \; ; \; y \le F(x, L^*) \right\} . \end{aligned}$$
(2)

Following [5], the normalized quadratic profit function is written as:

$$\begin{aligned} \overline{\Pi }=& \ \alpha _{0}+ \sum _{c=1} ^{C} \alpha _{c}\overline{p}_{c} + \sum _{k=1} ^{K-1} \beta _{k} \overline{w}_{k} + \sum _{c=1} ^{C}\gamma _{c} \overline{\tau }_{c}+ \frac{1}{2}\sum _{c=1} ^{C}\sum _{c'=1} ^{C}\alpha _{cc'} \overline{p}_{c} \overline{p}_{c'}\\&+ \frac{1}{2}\sum _{k=1} ^{K-1}\sum _{k'=1} ^{K-1}\beta _{kk'} \overline{w}_{k} \overline{w}_{k'} +\frac{1}{2}\sum _{c=1} ^{C}\sum _{c'=1} ^{C}\gamma _{cc'} \overline{\tau }_{c} \overline{\tau }_{c'} \\& + \sum _{k=1} ^{K-1} \sum _{c=1} ^{C} \delta ^{pw}_{ck}\overline{p}_{c} \overline{w}_{k} + \sum _{c'=1} ^{C} \sum _{c=1}^{C} \delta ^{p\tau }_{cc'} \overline{p}_{c'}\overline{\tau }_{c} +\sum _{k=1} ^{K-1} \sum _{c=1} ^{C} \delta ^{w\tau }_{ck} \overline{w}_{k}\overline{\tau }_{c}\\& + \sum _{c=1} ^{C} \lambda ^{pL^*}_{c}\overline{p}_{c} L^* + \sum _{c=1}^{C} \lambda ^{\tau L^*}_{c} \overline{\tau }_{c} L^* +\sum _{k=1} ^{K-1} \lambda ^{wL^*}_{k} \overline{w}_{k}L^*, \end{aligned}$$
(3)

where \(\overline{\Pi }=\frac{\Pi }{w_{K}}, \overline{p}_{c}=\frac{p_{c}}{w_{K}}, \overline{w}_{k}=\frac{w_{k}}{w_{K}}, \overline{\tau }_{c}=\frac{\tau _{c}}{w_{K}}\) indicate, respectively, normalized profit, normalized output price, normalized input price and normalized subsidy rate, and input price \(w_{K}\) is chosen as numeraire.

Differentiating the profit in (3) with respect to output prices \(\overline{p}_{c}\) yields the output level of crop c (Hotelling's lemma):

$$\begin{aligned} y_{c}=\frac{\partial \overline{\Pi }}{\partial \overline{p}_{c}}= & \ \alpha _{c}+ \sum _{c'=1} ^{C}\alpha _{cc'} \overline{p}_{c'} + \sum _{k=1} ^{K-1} \delta ^{pw}_{ck} \overline{w}_{k} + \sum _{c'=1} ^{C}\delta ^{p\tau }_{c'c} \overline{\tau }_{c'}\\&+\lambda ^{pL^*}_{c} L^*, \, \forall \, c=1,\ldots ,C. \end{aligned}$$
(4)

Differentiating the profit in (3) with respect to input prices \(\overline{w}_{k}\) yields the variable input demand equation (Hotelling's lemma):

$$\begin{aligned} -x_{k}= \frac{\partial \overline{\Pi }}{\partial \overline{w}_{k}}=& \ \beta _{k}+\sum _{k'=1} ^{K-1}\beta _{kk'} \overline{w}_{k'} + \sum _{c=1}^{C} \delta ^{pw}_{ck} \overline{p}_{c} + \sum _{c=1} ^{C}\delta ^{w\tau }_{ck} \overline{\tau }_{c}\\&+\lambda ^{wL^*}_{k} L^*, \, \forall \, k=1,\ldots ,K-1. \end{aligned}$$
(5)

Differentiating the profit in (3) with respect to subsidy rates \(\overline{\tau }_{c}\) yields the land allocation equation:

$$\begin{aligned} (1-\phi )\,l_{c}= \frac{\partial \overline{\Pi }}{\partial \overline{\tau }_{c}}=& \ \gamma _{c}+ \sum _{c'=1} ^{C}\gamma _{cc'} \overline{\tau }_{c'} + \sum _{k=1} ^{K-1} \delta ^{w\tau }_{ck} \overline{w}_{k} + \sum _{c'=1} ^{C}\delta ^{p\tau }_{cc'} \overline{p}_{c'}\\&+\lambda ^{\tau L^*}_{c} L^*, \, \forall \, c=1,\ldots ,C. \end{aligned}$$
(6)
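The derivative property behind Eqs. (4)–(6) can be checked numerically. The sketch below stacks normalized prices and subsidy rates into one vector \(v=(\overline{p},\overline{w},\overline{\tau })\), so that the quadratic form in (3) becomes \(\alpha _0+b'v+\frac{1}{2}v'Hv+L^*\lambda 'v\) with symmetric H, and compares the analytic gradient with finite differences. The dimensions (two land uses, one non-numeraire input) and all parameter values are hypothetical and chosen only for illustration.

```python
def profit(v, a0, b, H, lam, L_star):
    """Normalized quadratic profit in stacked form, v = (p, w, tau)."""
    n = len(v)
    linear = sum(b[i] * v[i] for i in range(n))
    quadratic = 0.5 * sum(H[i][j] * v[i] * v[j]
                          for i in range(n) for j in range(n))
    land = L_star * sum(lam[i] * v[i] for i in range(n))
    return a0 + linear + quadratic + land

def gradient(v, b, H, lam, L_star):
    """Analytic gradient of the quadratic form (valid for symmetric H)."""
    n = len(v)
    return [b[i] + sum(H[i][j] * v[j] for j in range(n)) + lam[i] * L_star
            for i in range(n)]

def numerical_gradient(f, v, step=1e-5):
    """Central finite differences, one coordinate at a time."""
    grad = []
    for i in range(len(v)):
        up, down = list(v), list(v)
        up[i] += step
        down[i] -= step
        grad.append((f(up) - f(down)) / (2.0 * step))
    return grad

# Hypothetical parameters; ordering is (p1, p2, w1, tau1, tau2).
a0 = 1.0
b = [10.0, 8.0, -5.0, 2.0, 1.5]
H = [[2.0, 0.3, -0.5, 0.1, 0.0],
     [0.3, 1.5, -0.4, 0.0, 0.1],
     [-0.5, -0.4, 1.0, -0.1, -0.1],
     [0.1, 0.0, -0.1, 0.5, -0.2],
     [0.0, 0.1, -0.1, -0.2, 0.4]]
lam = [0.05, 0.04, -0.02, 0.6, 0.4]
L_star = 100.0
v = [1.2, 1.0, 0.8, 0.5, 0.3]

analytic = gradient(v, b, H, lam, L_star)
numeric = numerical_gradient(
    lambda u: profit(u, a0, b, H, lam, L_star), v)
max_gap = max(abs(a - n) for a, n in zip(analytic, numeric))

supply = analytic[0:2]          # y_c, as in Eq. (4)
input_demand = [-analytic[2]]   # x_k: note the sign change in Eq. (5)
coupled_land = analytic[3:5]    # (1 - phi) * l_c, as in Eq. (6)
```

Because the profit function is quadratic, the finite-difference and analytic gradients should agree up to rounding error, which makes this a convenient check on index conventions when coding the system.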

Profit function properties imply that the profit function is (i) non-decreasing in output prices p and non-increasing in input prices w, (ii) homogeneous of degree 1 in prices (p, w), (iii) convex in prices (p, w), and (iv) continuous in prices (p, w). These properties impose conditions on the parameters. With the normalized form of the profit function, linear homogeneity is automatically satisfied. Convexity requires the Hessian matrix to be symmetric and positive semi-definite. Imposing convexity restrictions is therefore equivalent to imposing positive semi-definiteness on the matrix of parameters (Footnote 6).
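One common device for imposing positive semi-definiteness is to reparameterize the Hessian as \(H=AA'\), with A lower triangular, and to estimate the elements of A instead of H. Whether this is the procedure used here is not stated in this section, so the sketch below is only an illustration of the general idea, with a hypothetical factor A.

```python
def psd_from_factor(A):
    """Return H = A A', which is symmetric positive semi-definite for any real A."""
    n = len(A)
    return [[sum(A[i][k] * A[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

def quadratic_form(H, q):
    """Compute q' H q."""
    n = len(q)
    return sum(H[i][j] * q[i] * q[j] for i in range(n) for j in range(n))

# Hypothetical lower-triangular factor (these would be the free parameters).
A = [[1.0, 0.0, 0.0],
     [-0.4, 0.7, 0.0],
     [0.2, -0.3, 0.5]]
H = psd_from_factor(A)

# q' H q = |A' q|^2 >= 0 for any q; check a few directions.
directions = ([1.0, 0.0, 0.0], [1.0, -1.0, 0.0],
              [0.5, 2.0, -3.0], [-1.0, -1.0, -1.0])
checks = [quadratic_form(H, q) for q in directions]
```

The appeal of this parameterization is that convexity holds for every value of the free parameters, so no inequality constraints need to be handled during optimization.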

Another regularity condition is the land adding-up condition \(\sum ^{C}_{c=1} l_{c}=L^*\), which implies the following conditions on the parameters:

$$\begin{aligned}\sum _{c=1} ^{C}\gamma _{cc{^\prime}}=\sum _{c=1} ^{C} \delta ^{w\tau }_{ck}=\sum _{c=1} ^{C}\delta ^{p\tau }_{cc{^\prime}}=\sum _{c=1} ^{C}\gamma _{c} =0; \; \forall k, \forall c, \end{aligned}$$
(7)
$$\sum _{c=1} ^{C} \lambda ^{\tau L^*}_{c}= 1.$$
(8)
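The role of conditions (7) and (8) can be illustrated with a small numerical sketch. For simplicity it assumes that the land equations are written directly in terms of \(l_c\) (that is, \(\phi =0\)), and it imposes the restrictions by recentering unrestricted coefficients, which is only one of several possible ways to do so (dropping one equation and recovering its parameters is another). All coefficient values are hypothetical.

```python
def impose_adding_up(gamma0, slopes, lam_land):
    """Recenter coefficients so that conditions (7) and (8) hold.

    gamma0[c]    : intercept of land equation c
    slopes[c][j] : coefficient of regressor j (price or subsidy) in equation c
    lam_land[c]  : coefficient of total crop land L* in equation c
    """
    C = len(gamma0)
    g_mean = sum(gamma0) / C
    gamma0_r = [g - g_mean for g in gamma0]              # intercepts sum to 0
    n_cols = len(slopes[0])
    col_means = [sum(slopes[c][j] for c in range(C)) / C for j in range(n_cols)]
    slopes_r = [[slopes[c][j] - col_means[j] for j in range(n_cols)]
                for c in range(C)]                        # each column sums to 0
    shift = (1.0 - sum(lam_land)) / C
    lam_r = [l + shift for l in lam_land]                 # coefficients sum to 1
    return gamma0_r, slopes_r, lam_r

def land_allocation(gamma0, slopes, lam_land, regressors, L_star):
    """Evaluate the linear land equations for every land use c."""
    return [gamma0[c]
            + sum(s * r for s, r in zip(slopes[c], regressors))
            + lam_land[c] * L_star
            for c in range(len(gamma0))]

# Hypothetical unrestricted coefficients: 3 land uses, 4 regressors.
gamma0 = [5.0, -2.0, 1.0]
slopes = [[0.8, -0.3, 0.1, -0.2],
          [-0.5, 0.9, 0.2, -0.1],
          [0.1, -0.2, -0.4, 0.6]]
lam_land = [0.5, 0.3, 0.1]
regressors = [1.2, 1.0, 0.8, 0.5]
L_star = 120.0

g_r, s_r, lam_r = impose_adding_up(gamma0, slopes, lam_land)
land = land_allocation(g_r, s_r, lam_r, regressors, L_star)
total_land = sum(land)
```

With the restrictions imposed, the predicted areas sum to \(L^*\) for any values of prices and subsidies, which is exactly what conditions (7) and (8) are designed to guarantee.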

The model to be estimated consists of the system of Eqs. (3)–(6) after imposing convexity restrictions and the land adding-up conditions (7)–(8). We let \(s_{j}(\theta )\) denote the j-th structural equation in the system, depending on exogenous covariates and a vector of parameters \(\theta\), where the number of equations depends on the number of inputs, outputs and land uses. In order to obtain precise results for policy analysis, we propose in this paper the estimation of this system of equations while explicitly dealing with corner solutions and the panel structure of our data. Contrary to other applications of land-use models in agriculture [3], the system of output, input and land equations depends only on observed prices and subsidies, and on total crop land. Other applications of land-use models consider input and output equations as an explicit function of land percentages for crops. We do not follow this approach here, as the demand for crop land is part of our structural system, which contains only exogenous covariates (more precisely, from the farmer’s point of view, assuming total crop land is fixed).

Imposing regularity conditions, as discussed above, is not the only issue when estimating a system of equations derived from profit maximization with farm-level data. Another concern is the existence of corner solutions, that is, zero land area and output level for some crops, and possibly zero expenditure on some inputs. As farmers rarely consider the same cropping system for every agricultural season because of agronomic and pest-management considerations, not all possible crops are planted every year by a particular farmer, implying that land and production variables are equal to 0 for the crops not planted. It is beyond the scope of this paper to provide a structural representation of cropping rotations through a dynamic model; see, e.g., [5, 48, 49]. Nevertheless, we provide below an original solution to this problem by considering a multivariate selection problem and relaxing some assumptions underlying the usual multivariate Tobit model.

3.2 Estimation Issues

3.2.1 Corner Solutions

Estimating models with multivariate selection often implies a trade-off between computer-intensive numerical procedures and strong distributional assumptions to achieve parameter consistency and efficiency. For example, [2] discuss solutions based upon the [34] approach, which requires normality and homoscedasticity of structural error terms.

Our strategy is based on a multivariate version of the Tobit model for censored equations and, contrary to [5], we consider that prices and subsidies jointly determine the probability of growing a crop and its associated output level and land use (as would be the case in the original Tobit model). A drawback of the procedure described in [5] is that their estimator is consistent only if selection equations are uncorrelated, conditional on the set of covariates. Although this condition can be tested in practice, it may limit the scope of the method. Furthermore, the correlation pattern between structural and selection equations is also restricted to a linear form, which may depend, however, on the period, as in [50]. In order to allow for correlation between all structural equations in the case of a multivariate Tobit model, one possibility is to consider the QML approach, based on pairwise joint probabilities of equations.Footnote 7 This is a simple alternative to multiple integration, which consists of exploring potential correlation between all pairs of structural equations.

Let \(s_{ij}(\theta )\) denote observation i of the j-th structural equation, evaluated at parameter vector \(\theta\). The residual of equation j is denoted \(h_{ij}(\theta )= Z_{ij } -s_{ij}(\theta )\), where \(Z_{ij}\) is the dependent variable in equation j.

The bivariate likelihood for observation i is given by

$$\begin{aligned} \mathscr{L}_i = \prod _{j=2}^{N-1}\prod _{k=1}^{j-1} \mathscr{L}_{ijk}, \end{aligned}$$
(9)

where the products run over pairs of structural equations (j, k), so that N here denotes the number of structural equations, and

$$\begin{aligned} \mathscr{L}_{ijk} =& \ \left\{ F_2(h_{ij},h_{ik})\right\} ^{1(Z_{ij}=0, Z_{ik}=0)}\\& \times \left\{ \ f_2(h_{ij},h_{ik}) \right\} ^{1(Z_{ij}>0, Z_{ik}>0)}\\ &\times \left\{ F(h_{ik}|h_{ij}) \times f(h_{ij}) \right\} ^{1(Z_{ij}>0, Z_{ik}=0)} \\&\times \left\{ F(h_{ij}|h_{ik}) \times f(h_{ik})\right\} ^{1(Z_{ij}=0, Z_{ik}>0)}. \end{aligned}$$
(10)

Denote \(h_{ij}^*(\theta )=\left[ Z_{ij } -s_{ij}(\theta )\right] /\sigma _j\) the standardized residual of equation j, where \(Z_{ij}\) is the dependent variable and \(\sigma _j\) the standard deviation of the residual in equation j. F is the cumulative distribution function and f is the density function of \(h_{ij}^*(\theta )\).

Under the normality assumption, the individual contribution to the likelihood becomes

$$\begin{aligned} \mathscr{L}_{ijk} = & \ \left\{ \Psi (h_{ij}^*,h_{ik}^*,\rho _{jk})\right\} ^{1(Z_{ij}=0, Z_{ik}=0)} \\&\times \left\{ \psi (h_{ij}^*,h_{ik}^*,\rho _{jk})/(\sigma _j\sigma _k)\right\} ^{1(Z_{ij}>0, Z_{ik}>0)}\\ &\times \left\{ \frac{\phi (h_{ij}^*)}{\sigma _j}\Phi \left[ \frac{h_{ik}^* - \rho _{jk} h_{ij}^*}{\sqrt{1-\rho _{jk}^2}}\right] \right\} ^{1(Z_{ij}>0, Z_{ik}=0)} \\&\times \left\{ \frac{\phi (h_{ik}^*)}{\sigma _k}\Phi \left[ \frac{h_{ij}^* - \rho _{jk} h_{ik}^*}{\sqrt{1-\rho _{jk}^2}}\right] \right\} ^{1(Z_{ij}=0, Z_{ik}>0)}, \end{aligned}$$
(11)

where \(\Psi (.,.,.)\) and \(\psi (.,.,.)\) are the bivariate cumulative distribution and density functions, respectively; \(\phi (.)\) and \(\Phi (.)\), respectively, denote the univariate density and cumulative distribution functions of the standard Normal distribution. Maximizing (10) with respect to \(\theta\) and the variance–covariance matrix yields consistent quasi-maximum likelihood (QML) estimates. As discussed in [4] and in [51], the QML estimator is consistent and asymptotically normal if the conditional mean and variance structure of the model are correctly specified, and if the log-likelihood function used for QML belongs to the generalized linear exponential family.Footnote 8
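To make the four selection regimes in (11) concrete, the following is a minimal sketch of the log contribution of one observation and one pair of equations (j, k) under normality. It is an illustration only, not the estimation code used in the paper: the function names are hypothetical, and the bivariate normal cdf is obtained with a simple Simpson rule (an assumption made here to keep the example self-contained).

```python
import math

def phi(z):
    """Univariate standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Univariate standard normal cdf."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def biv_pdf(a, b, rho):
    """Standard bivariate normal density with correlation rho."""
    q = (a * a - 2.0 * rho * a * b + b * b) / (1.0 - rho * rho)
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(1.0 - rho * rho))

def biv_cdf(a, b, rho, n=400, lower=-8.0):
    """Standard bivariate normal cdf, computed with Simpson's rule as the
    integral over x in [lower, a] of phi(x) * Phi((b - rho*x)/sqrt(1-rho^2)).
    The truncation point `lower` and the grid size n are illustrative choices."""
    if a <= lower:
        return 0.0
    s = math.sqrt(1.0 - rho * rho)
    g = lambda x: phi(x) * Phi((b - rho * x) / s)
    h = (a - lower) / n  # n must be even for Simpson's rule
    total = g(lower) + g(a)
    for m in range(1, n):
        total += (4.0 if m % 2 else 2.0) * g(lower + m * h)
    return total * h / 3.0

def pair_loglik(z_j, z_k, s_j, s_k, sigma_j, sigma_k, rho):
    """Log of the pairwise contribution in (11) for one observation.
    z_*: observed dependent variables (0 when censored);
    s_*: structural equations evaluated at theta."""
    hj = (z_j - s_j) / sigma_j  # standardized residual of equation j
    hk = (z_k - s_k) / sigma_k  # standardized residual of equation k
    s = math.sqrt(1.0 - rho * rho)
    if z_j > 0 and z_k > 0:      # both equations observed
        lik = biv_pdf(hj, hk, rho) / (sigma_j * sigma_k)
    elif z_j > 0:                # j observed, k censored at zero
        lik = phi(hj) / sigma_j * Phi((hk - rho * hj) / s)
    elif z_k > 0:                # k observed, j censored at zero
        lik = phi(hk) / sigma_k * Phi((hj - rho * hk) / s)
    else:                        # both censored at zero
        lik = biv_cdf(hj, hk, rho)
    return math.log(lik)
```

Summing `pair_loglik` over observations and over pairs of equations would give the pairwise quasi-log-likelihood to be maximized over \(\theta\), the \(\sigma _j\) and the \(\rho _{jk}\).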

3.2.2 The Semi-nonparametric Estimator

As discussed above, pairwise computation of distribution functions is much more straightforward than full-information estimators such as maximum likelihood based on all (potentially censored) structural equations. Although QML estimation as described above is a convenient alternative to maximum likelihood under multivariate censoring, it has several drawbacks. First, the efficiency of QML estimates may depend on the criterion used in practice (in the family of linear exponential distributions, see [4]) in the maximization of the log-likelihood. Second, as a corollary to the conditions above, [38] and [2] acknowledge that the QML estimator above is not robust to deviations from homoscedasticity assumptions. To deal with the first issue, an interesting alternative is a flexible estimation method, such as a semiparametric alternative to the parametric QML.Footnote 9 Semiparametric QML estimation of univariate Tobit-censored models was proposed as early as [52], and its asymptotic properties are discussed in, e.g., [6]. In semiparametric estimation, the normal density and cumulative distribution functions are replaced in [2]’s QML objective by nonparametric approximations. The distribution of equation residuals is then left unrestricted and structural parameters \(\theta\) can be estimated jointly with nonparametric estimation of the univariate and bivariate distributions.

The semiparametric QML may also provide a solution to the second issue, because variance–covariance parameters can be considered nuisance parameters and be replaced by data-driven bandwidth parameters in, e.g., kernel approximation of distribution functions. By doing so, semiparametric estimation may significantly reduce the number of parameters to estimate, but it may be computer-intensive and have a lower convergence rate. Another alternative, which we adopt in this paper, is semi-nonparametric estimation.Footnote 10 The approach behind our semi-nonparametric model with selection is to replace unknown joint distributions by a series approximation (see [53]), while addressing the fact that such an approximation may cause the number of terms to grow with the sample size [54].

This solution uses sieve estimation, i.e., replacing the infinite-dimensional space associated with the nonparametric functions by a flexible parametric one. Conditions for consistency and asymptotic normality of sieve ML estimation have been provided by [55] and [1].

Inspecting the form of the QML estimator (11), we see that four functions need to be estimated: a) the bivariate cumulative distribution function, \(F_2(h_{ij},h_{ik})\); b) the bivariate density function, \(f_2(h_{ij},h_{ik})\); c) the univariate conditional cumulative distribution function, \(F(h_{ij}|h_{ik})\); d) the univariate density function, \(f(h_{ik})\). The bivariate density function can be computed as the derivative of an approximation to the bivariate cumulative distribution function obtained from a tensor-product spline; such tensor-product spline approximations are easy to compute from a B-spline representation [55].

An interesting aspect of B-splines is that shape and smoothness restrictions are easy to impose on them, and that their derivatives of any order are readily available. This provides us with a natural procedure for estimating probability density and cumulative distribution functions:

  • Compute the empirical cdfs in the univariate and bivariate cases, for each selection regime;

  • Approximate the univariate and bivariate cdfs for every selection regime by B-splines, imposing smoothness and non-decreasing approximations over the [0, 1] interval;

  • Compute the derivative of the spline approximation to obtain the pdf;

  • Combine joint and marginal distributions to obtain the conditional cdfs for each selection regime.
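The first three steps above can be sketched in the univariate case as follows. This is an illustration under stated assumptions, not the authors' implementation: residuals are mapped into (0, 1) with a logistic transform (a hypothetical choice), and the spline coefficients are set to the empirical cdf evaluated at the Greville abscissae (Schoenberg's variation-diminishing approximation) rather than obtained from a constrained fit. Because the empirical cdf is non-decreasing, so are the coefficients, which guarantees a non-decreasing spline and a non-negative derivative. All function names are hypothetical.

```python
import bisect
import math
import random

def basis(i, k, t, x):
    """Cox-de Boor recursion: i-th B-spline of degree k on knot vector t."""
    if k == 0:
        if t[i] <= x < t[i + 1]:
            return 1.0
        # Close the last non-empty knot interval on the right.
        if x == t[-1] and t[i] < t[i + 1] == t[-1]:
            return 1.0
        return 0.0
    out = 0.0
    if t[i + k] > t[i]:
        out += (x - t[i]) / (t[i + k] - t[i]) * basis(i, k - 1, t, x)
    if t[i + k + 1] > t[i + 1]:
        out += ((t[i + k + 1] - x) / (t[i + k + 1] - t[i + 1])
                * basis(i + 1, k - 1, t, x))
    return out

def fit_cdf_spline(sample, n_interior=8, k=3):
    """Non-decreasing B-spline approximation of the empirical cdf on [0, 1]."""
    xs = sorted(sample)

    def ecdf(x):
        return bisect.bisect_right(xs, x) / len(xs)

    interior = [(m + 1) / (n_interior + 1) for m in range(n_interior)]
    t = [0.0] * (k + 1) + interior + [1.0] * (k + 1)  # clamped knot vector
    n_basis = len(t) - k - 1
    # Greville abscissae: averages of k consecutive knots.
    greville = [sum(t[i + 1:i + k + 1]) / k for i in range(n_basis)]
    coefs = [ecdf(g) for g in greville]  # non-decreasing by construction
    return t, coefs, k

def cdf(spl, x):
    """Spline approximation of the cdf at x."""
    t, c, k = spl
    return sum(c[i] * basis(i, k, t, x) for i in range(len(c)))

def pdf(spl, x):
    """Derivative of the cdf spline: a degree k-1 spline whose coefficients
    are scaled differences of the cdf coefficients, hence non-negative."""
    t, c, k = spl
    total = 0.0
    for i in range(len(c) - 1):
        width = t[i + k + 1] - t[i + 1]
        if width > 0:
            total += k * (c[i + 1] - c[i]) / width * basis(i + 1, k - 1, t, x)
    return total

# Illustrative data: simulated residuals mapped into (0, 1).
rng = random.Random(0)
resid = [rng.gauss(0.0, 1.0) for _ in range(500)]
u = [1.0 / (1.0 + math.exp(-r)) for r in resid]
spl = fit_cdf_spline(u)
grid = [m / 200 for m in range(201)]
cdf_vals = [cdf(spl, x) for x in grid]
pdf_vals = [pdf(spl, x) for x in grid]
```

The bivariate case would use a tensor product of two such bases, and the conditional cdfs of the last step would then be formed from the joint and marginal approximations.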

An important aspect of our semi-nonparametric procedure is that variance and covariance parameters are not estimated jointly with structural parameters, but are replaced by data-driven dispersion measures. The latter are equivalent to bandwidth parameters in nonparametric estimation and correspond to normalization parameters in sieve (B-spline) estimation. Hence, they are considered nuisance parameters, and homoscedasticity or other restrictions on variances and covariances are completely avoided, in contrast to parametric QML estimation.

3.2.3 Panel Data

Equations (4), (5) and (6) in Sect. 3 have an implicit static form that can accommodate panel data, by simply adding a time index, t, to dependent and explanatory variables. This also applies to the residuals from structural equations, which become \(h_{ijt}(\theta )\) for farmer i, equation j and time period t.

To consider QML estimation, whether parametric or semi-nonparametric, an important condition is that sample observations are independently and identically distributed (i.i.d.). With panel data, however, this is obviously not the case because unobserved individual effects are likely to be present. The estimation procedures discussed above therefore have to be adapted to the case of panel data. Consider the total number of observations \(N= \sum _{i=1}^nT_i\), where n is the number of individuals (cross-sectional units), and \(T_i\) denotes the number of time periods observed for cross-sectional unit i. In other words, unbalanced panels can be accommodated, with \(T_i\) differing across cross-sectional units.

To control for unobserved individual heterogeneity, possibly correlated with explanatory variables, we may consider a fixed-effect approach to the production model. However, as the above model is nonlinear, within-type estimators would not be consistent with a fixed number of time periods. We choose to control for such unobserved heterogeneity by implementing the Mundlak method (see [50]). Assume

$$\begin{aligned} h_{ijt}(\theta ) = Z_{ijt} - s_{ijt}(\theta ) - \eta _{ij}, \quad i=1,2,\dots ,n, \end{aligned}$$

where \(\eta _{ij}\) is the individual effect in equation j, possibly correlated with explanatory variables in \(s_{ijt}(\theta )\). Consider first the balanced panel data case, with \(T_i=T, \; \forall i\). We further assume

$$\begin{aligned} \eta _{ij} = X_{ij1}\gamma _{j1} + \dots + X_{ijT}\gamma _{jT} + v_{ij}, \end{aligned}$$

where \(X_{ijt}\) is the K-vector of explanatory variables in structural equation \(s_{ij}(\theta )\), such that \(E(v_{ij}|X_{ijt})=0\). Under this conditional moment condition, \(\eta _{ij}\) can be replaced in the structural equation by its projection onto explanatory variables. The Mundlak approach corresponds to the special case of [56] with \(\gamma _{j1}=\dots = \gamma _{jT} =\bar{\gamma _j}/T,\; \forall j\). It is preferred in practice if the number of explanatory variables is large, because the unrestricted specification of [56] inflates the number of parameters by \(K\times T\) in each equation, whereas the Mundlak restriction adds only K. Such conditioning is typically designed for discrete-choice models (Probit, Logit) and is naturally adapted to the Tobit framework. Because unobserved heterogeneity in structural and selection equations is likely to be correlated with explanatory variables, we do not consider random-effects estimation but only fixed-effects estimation, to control for this possible source of bias in parameter estimates.

The Mundlak approach for fixed effects is easily adapted to the unbalanced panel-data case, with

$$\begin{aligned} \eta _{ij} =\left( \frac{1}{T_i} \sum _{t=1}^{T_i} X_{ijt}\right) \gamma _{j} + v_{ij}, \end{aligned}$$

where \(T_i\) is the individual-specific number of periods. With large N, we can consistently estimate parameters \(\gamma _j\) even when the panel data set is unbalanced.
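As an illustration of the unbalanced-panel projection above, the following minimal sketch computes the farm-specific time means \(\frac{1}{T_i}\sum _t X_{ijt}\) over the years in which each farm is actually observed and appends them to each farm-year record, so that they can enter the structural equations as additional regressors. The record layout and the names (`farm`, `year`, the `_bar` suffix) are hypothetical.

```python
from collections import defaultdict

def add_mundlak_means(rows, covariates):
    """rows: list of dicts with a 'farm' key and one key per covariate.
    Returns copies of the rows with '<name>_bar' added, the mean of each
    covariate over the T_i periods in which the farm is observed."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)  # T_i for each farm
    for r in rows:
        counts[r["farm"]] += 1
        for c in covariates:
            sums[r["farm"]][c] += r[c]
    out = []
    for r in rows:
        new = dict(r)
        for c in covariates:
            new[c + "_bar"] = sums[r["farm"]][c] / counts[r["farm"]]
        out.append(new)
    return out

# Farm A is observed three years, farm B only once (unbalanced panel).
panel = [
    {"farm": "A", "year": 2006, "p_fert": 1.0},
    {"farm": "A", "year": 2007, "p_fert": 2.0},
    {"farm": "A", "year": 2008, "p_fert": 3.0},
    {"farm": "B", "year": 2008, "p_fert": 4.0},
]
augmented = add_mundlak_means(panel, ["p_fert"])
```

For a farm observed only once, the time mean coincides with the single observed value, so such observations do not help separate \(\gamma _j\) from the direct effect of the covariate.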

When proceeding with estimation, we make the important assumption that, conditional on fixed effects \(\eta _{ij}\), the error terms are i.i.d., so that the QML framework for the parametric and the semi-nonparametric estimators applies.

4 Data and Econometric Estimation Results

4.1 Data

The empirical application is conducted on a sample of French farmers from the département of Meuse. The data were provided by the Centre d’Economie Rurale de La Meuse, an agricultural extension service that provides farmers with assistance in bookkeeping and auditing. The original sample consists of 2,356 farm-year observations on 638 farmers observed between 2006 and 2010. This sample represents about 21 percent of the total number of farms in this département (2,975 in 2010, of which about 2,100 are classified as medium- to large-sized farms, see [57]). We remove yearly observations corresponding to no-crop production (animal production only, or no agricultural production for a particular year) or no land use for any of the following: wheat, barley, rapeseed, diester and non-compulsory set-aside. The latter selection restricts the sample to consider only major groups of crops cultivated in the Meuse département, discarding specialty and some industrial crops that are much less represented in the region, and which are not always associated with CAP area-based payments. The final sample consists of 2,014 farm-year observations on 524 farmers over five years. The final dataset represents about 56 percent of specialized arable farms in the Meuse département in 2010 (927 farms within this category, according to [57]). An interesting feature of our data and the period considered is that, during that period, France operated a decoupling scheme with a hybrid status. Following the European CAP reform in 2003 (Luxembourg Agreement, see [58]) decoupling was introduced with environmental cross-compliance among other measures, aimed at making farmers’ production decisions more market-oriented [59]. Member States retained some flexibility in the choice between partial and full decoupling. For example, Spain, France and Portugal opted to maintain the maximum permitted amount of coupled payments (25%), in both the livestock and arable crop sectors. 
In the French case, the 2008 CAP health check provided for an “à la carte” selection of the tools, allowing voluntary implementation to start by 2010 and end by 2012 at the latest [60]. In terms of the model notation in subsection 3.1, the proportion of area-based subsidies that is decoupled from production, \(\phi\), is equal to 0.75. Another interesting feature of our sample is the higher variability in area-based unit payment rates over the 2006–2010 period, compared with the subsequent years. Such variability is necessary to identify subsidy effects alongside price effects in the structural equations, as partial coupling was also necessary to identify crop-specific output price effects.

4.1.1 Sample Description

The final dataset is unbalanced in the following way: farmers present from 2006 to 2010: 49.8 percent; farmers present four years out of five: 15.45 percent; three years out of five: 10.69 percent; two years out of five: 7.82 percent; one year only: 7.63 percent. In terms of yearly observations, 417 farmers were observed in 2006, 420 in 2007, 422 in 2008, 386 in 2009 and 369 in 2010.Footnote 11 The average total farm area is 197.9 ha, of which 121.41 ha is arable land (standard deviation of 75.65 ha) and 74.9 ha is permanent grazing and temporary pasture land. Average production cost is 263,777.82 Euros per year and total profit per farm is 56,783.74 Euros per year, about 285.18 Euros per ha (standard deviation of 598.82). These statistics of dispersion indicate that farm diversity is limited, as far as size and economic performance are concerned. In terms of spatial location, farms are widespread over the whole Meuse département, as can be seen from Fig. 1, which represents the spatial distribution of sample farms in the cantons (an administrative unit, between the commune and the département). The total area covered by the sample is relatively limited (about 6,200 square kilometers, about 2,400 sq. miles), so that differences in climate and soil characteristics are fairly limited as well. Moreover, as fixed-effect procedures are employed in the estimation, farm-specific or site-specific non-time-varying characteristics will be filtered out from the model.

Fig. 1
figure 1

Location of sample farms in cantons of the Meuse département. Note. Numbers for each canton represent the total number of farms in the sample

Concerning crops and inputs, we use the major cropping systems in the Meuse département and select wheat, barley, rapeseed and diester, the last of which is used for biofuel production. Because wheat is, in the vast majority of cases, associated with barley, and farm-gate prices of both crops move in parallel, we combine them to form a composite cereal output. Land decisions are associated with these crops (cereals, rapeseed and diester) as well as voluntary set-aside, as discussed above (it should be recalled that set-aside obligations under the CAP are already accounted for in the model, as total land \(L^*\) considered in the estimation is net of the compulsory set-aside area). We consider only three inputs: seed, fertilizer and pesticide, which are considered the most crop-specific and whose demand is therefore more likely to be influenced by a change in cropping pattern (for our set of arable crops, as opposed to, e.g., labor and fuel).Footnote 12 To obtain farm-specific input prices that vary over time, we proceed as follows. As database records contain only input expenditures, and neither physical input quantities nor unit prices, we use official yearly statistics on agricultural input price indexes at the département level. We then convert these to farm-specific price indexes using the Tornqvist formula, with 2004 as the baseline year.Footnote 13 We then check for possible multicollinearity in input and output prices using the condition number, computed for every year and over the whole sample. The condition number is always less than 15, confirming that multicollinearity in input prices is limited.Footnote 14 Crop outputs are in tons, fertilizer input is in kg, and pesticide input is computed from pesticide expenditure divided by its price index.
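As a reminder of how a Tornqvist price index combines component price indexes with expenditure shares, a minimal sketch is given below. It is illustrative only: the component items, the use of farm-level expenditure shares as weights, and the function name are assumptions made here, and the exact construction used for the sample is the one described in the corresponding footnote.

```python
import math

def tornqvist_index(p0, pt, e0, et):
    """Tornqvist price index of period t relative to the base period.
    p0, pt: dicts item -> price (or price index) in base / current period;
    e0, et: dicts item -> expenditure in base / current period.
    The log index is a weighted sum of log price relatives, with weights
    equal to the average of base and current expenditure shares."""
    tot0 = sum(e0.values())
    tott = sum(et.values())
    log_index = 0.0
    for item in p0:
        w = 0.5 * (e0[item] / tot0 + et[item] / tott)
        log_index += w * math.log(pt[item] / p0[item])
    return math.exp(log_index)

# Hypothetical fertilizer components with département-level price indexes
# and one farm's expenditures in the base and current years.
base_prices = {"nitrogen": 100.0, "phosphate": 100.0}
curr_prices = {"nitrogen": 120.0, "phosphate": 105.0}
base_spend = {"nitrogen": 6000.0, "phosphate": 2000.0}
curr_spend = {"nitrogen": 7500.0, "phosphate": 2500.0}
index = tornqvist_index(base_prices, curr_prices, base_spend, curr_spend)
```

Because the weights are farm-specific expenditure shares, two farms facing the same département-level component prices can end up with different aggregate price indexes.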

All prices and unit subsidies are normalized to the unit cost of seed so that in the estimation we consider only the other two inputs.Footnote 15

Our sample covers the periods before and after 2008, when compulsory set-aside was abandoned and industrial crops such as biofuels were allowed on set-aside land with the same subsidy rate as other arable crops (different from the agronomic set-aside subsidy rate). This concerns, in our case, rapeseed that can be produced for energy use or for the agrofood industry, with the same area-based subsidy rate after 2008. Our data at the farm and crop level allow us to account for the different cases, i.e., area under rapeseed for industrial use has a different area-based unit payment before and after 2008, while it remains at the same rate for rapeseed sold to the food industry. Because changes in use-specific area-based payments over the period are fully accounted for at the farm level when constructing our payment \(\tau _c\) and area \(l_c\) variables, we assume that the model parameters will not depend on such policy changes. Therefore, we do not include in the model a dummy variable for the pre- and post-2008 period, as we assume parameters are neither period- nor policy-dependent. Moreover, as mentioned in the data subsection above, the eligibility condition for payments under CAP was a minimum arable area of 0.3 ha. No farmers in our sample failed to meet this criterion, so that all were eligible and actually received some form of area-based CAP payments every year from 2006 to 2010.

Table 2 presents descriptive statistics for the sample. Almost all farmers grow cereals (wheat and barley) while the respective proportions of positive land percentages for rapeseed, diester and land set-aside are 83%, 31% and 70% of the full sample. These descriptive statistics need to be interpreted with caution since some farmers may not grow a particular crop over the whole period.

Table 2 Descriptive statistics

4.1.2 Intensity of Input Indicators

As discussed above, one reason for using detailed farm-level and crop-level production data is that input use (and therefore, environmental indicators) may differ across cropping systems, implying that farmers’ decisions including corner solutions should be modeled explicitly. Using the sample of farmers, we compute average environmental indicators for various cropping systems: cereals only, cereals-rapeseed, etc. \(II_k\) denotes the intensity of input k (= fertilizer, pesticide) demand on cultivated land, defined as

$$\begin{aligned} II_k= \frac{\text {input } k \text { demand}}{\text {crop land}}= \frac{x_k }{l_{\text {cereal}}+l_{\text {rapeseed}}+l_{\text {diester}}}=\frac{x_k}{L-l_{\text {set-aside}}}. \end{aligned}$$
(12)
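The indicator in (12) can be computed directly from farm-level records. The sketch below is a minimal illustration with hypothetical numbers (they are not taken from the sample), and the function name is an assumption.

```python
def input_intensity(x_k, total_land, set_aside):
    """Intensity of input k per unit of cultivated land, as in (12):
    II_k = x_k / (L - l_set_aside)."""
    cropped = total_land - set_aside
    if cropped <= 0:
        raise ValueError("no cultivated land: intensity is undefined")
    return x_k / cropped

# Hypothetical farm: 20,000 kg of fertilizer, 120 ha of total crop land.
ii_low_set_aside = input_intensity(20000.0, 120.0, 20.0)   # 100 ha cropped
ii_high_set_aside = input_intensity(20000.0, 120.0, 40.0)  # 80 ha cropped
```

With total land and input demand held fixed, a larger set-aside area mechanically raises the intensity on the remaining cropped area, which is why input demand and input intensity have to be distinguished when assessing a set-aside policy.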

Descriptive statistics for our environmental indicators are presented in Table 3. Means and standard deviations are computed for a given combination of crops (e.g., cereals-diester) over all corresponding observations across farmers and years. Crop combinations with fewer than 10 observations are discarded. Table 3 confirms that input use is strongly heterogeneous across cropping systems. This results in very different average environmental indicators and implies that accounting for farmers’ decisions over a whole cropping system (i.e., including decisions leading to corner solutions for some crops) is preferable in terms of policy impact evaluation. Cereal-only cropping systems (C) have a lower fertilizer and pesticide input use (\(x_F\) and \(x_P\)), and a lower input intensity indicator, except for fertilizer where \(II_F\) is slightly higher for cereals only than for cereal and rapeseed (\(C+R\) only). The cereal-rapeseed (\(C+R\)) crop combination is associated with a fertilizer input intensity very close to cereal alone (C), while pesticide input intensity is much higher in the (\(C+R\)) system than in C. This is presumably due to the fact that the total area of the cereals cropping system is smaller than the area of the cereals plus rapeseed cropping system, while intensities of input use for fertilizer are comparable. The difference in input intensity is more pronounced, especially for fertilizer, when diester is included in the crop combination (\(C+D\) or \(C+R+D\)). Interestingly, cropping systems that involve diester (\(C+R+D\)) have a lower fertilizer use intensity than \(C+D\) only, but a higher intensity of pesticide use (4.72 compared with 1.82, 4.52 and 4.13).

Table 3 Environmental indicators, by crop combination

4.2 Empirical Results

4.2.1 Relative Performance of Parametric and Semi-nonparametric Methods

To evaluate the relative performance of the parametric and semi-nonparametric QML estimators with respect to our panel data sample, we compute goodness-of-fit measures for the continuous and discrete parts of the model. We first compute R\(^2\)s on the full sample, for each structural equation, without accounting for the contribution of estimated fixed effects. Because such goodness-of-fit measures are difficult to interpret when the proportion of zero observations is large, we also produce R\(^2\)s for each equation on the subset of positive observations only. Columns 2 and 3 of Table 4 present computed R\(^2\)s for the parametric and semi-nonparametric QML estimators. We note that the difference between R\(^2\)s on the full sample and on the subsample of positive observations is noticeable mostly when the proportion of censored observations is higher (case of q rapeseed, q diester, l rapeseed, l diester and l set-aside), with the R\(^2\)s on the full sample being lower than the restricted version in 12 cases out of 18. Interestingly, the model fits the data better with parametric PQML than with semi-nonparametric SNPQML on the full sample (6 cases out of 9), but the opposite holds on the restricted samples, where the semi-nonparametric estimator performs better than PQML.

To check for serial correlation in residuals estimated from PQML and SNPQML, we compute the heteroskedasticity-robust test statistic (HR) proposed by [61] in the context of a panel data model with fixed effects. Test results reported in Table 4 indicate that serial correlation is not present in 15 cases out of 18, with a 5-percent significance level, so that the static specification is valid in a majority of structural equations.

Table 4 Goodness-of-fit measures for parametric and semi-nonparametric estimators

Turning now to the goodness-of-fit measures for discrete outcomes (last column of Table 4), it is more interesting to focus on the equations with a significant proportion of zero observations (q rapeseed, q diester, l rapeseed, l diester and l set-aside). For these five equations, the proportion of correct predictions (positive and negative outcomes) is fairly similar for PQML and SNPQML, so that the gain associated with the latter is only minor.
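The three goodness-of-fit measures discussed in this subsection can be sketched for a single equation as follows. This is a hedged illustration, not the exact computation behind Table 4: it assumes the usual definition of R\(^2\) as one minus the ratio of residual to total sum of squares, and it assumes that a positive outcome is predicted whenever the fitted value is strictly positive. Function names are hypothetical.

```python
def r_squared(actual, fitted):
    """1 - SSR/SST for one equation (assumed definition)."""
    mean = sum(actual) / len(actual)
    sst = sum((a - mean) ** 2 for a in actual)
    ssr = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    return 1.0 - ssr / sst

def fit_measures(actual, fitted):
    """Returns (R2 on the full sample, R2 on positive observations only,
    share of correctly predicted zero/positive outcomes)."""
    pos = [(a, f) for a, f in zip(actual, fitted) if a > 0]
    r2_full = r_squared(actual, fitted)
    r2_pos = r_squared([a for a, _ in pos], [f for _, f in pos])
    # Assumed prediction rule: positive outcome iff fitted value > 0.
    correct = sum((a > 0) == (f > 0) for a, f in zip(actual, fitted))
    return r2_full, r2_pos, correct / len(actual)

# Toy example with two censored observations.
actual = [0.0, 0.0, 2.0, 4.0, 6.0]
fitted = [0.0, 1.0, 2.0, 5.0, 5.0]
r2_full, r2_pos, share_correct = fit_measures(actual, fitted)
```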

4.2.2 Estimation Results

Estimated elasticities for the parametric and semi-nonparametric estimators are reported in Tables 5 and 6, respectively. The magnitudes of the SNPQML elasticities are similar to those obtained with PQML, except for the impact of the set-aside subsidy, which is less significant in the output and land equations in the SNPQML estimation. Moreover, the own-price elasticity of fertilizer demand is no longer significant. Recall that Table 5 presents minus the elasticities of fertilizer and pesticide demand with respect to the set-aside subsidy. In the SNPQML estimation, these elasticities are not significant, even though their values are very close to those estimated with the parametric model. To compare the PQML and SNPQML model specifications, we use the closeness specification test proposed by [62], which is feasible here because it is based on likelihood or quasi-likelihood contributions that are directly available from our parametric and semi-nonparametric estimations. This test is useful to decide between two non-nested models (e.g., the same subsets of parameter sets but different functional forms, see [63]). The Vuong specification test statistic equals 1.9077 (p-value=0.0564), when testing the null that both specifications are equal (against the alternative that the parametric model is preferable to the semi-nonparametric one). We therefore do not reject, at the 5-percent level, the hypothesis that both model specifications are equivalent.
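For reference, the basic form of the Vuong statistic can be sketched from the per-observation (quasi-)log-likelihood contributions of the two models. The sketch below is illustrative: it omits any correction for the number of parameters, the function names are hypothetical, and the contributions in the example are made up. The last line only checks arithmetic: a two-sided standard normal p-value for the reported statistic of 1.9077 is about 0.0564, which matches the reported p-value.

```python
import math

def vuong_statistic(ll_a, ll_b):
    """Basic Vuong statistic from per-observation log-likelihood
    contributions of models A and B (no degrees-of-freedom correction).
    Asymptotically standard normal under the null of equivalence."""
    n = len(ll_a)
    d = [a - b for a, b in zip(ll_a, ll_b)]
    mean = sum(d) / n
    var = sum((x - mean) ** 2 for x in d) / n
    return math.sqrt(n) * mean / math.sqrt(var)

def two_sided_p(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Made-up contributions for four observations.
stat = vuong_statistic([-1.0, -2.0, -1.5, -0.5], [-1.5, -2.0, -2.5, -1.0])
p_reported = two_sided_p(1.9077)
```

A large positive value favors model A, a large negative value favors model B, and values close to zero do not allow the two specifications to be distinguished.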

Table 5 Parametric QML elasticity estimates with fixed effects
Table 6 Semi-nonparametric QML elasticity estimates with fixed effects

To summarize our results in terms of comparison of the PQML and SNPQML estimators, the semi-nonparametric QML is less efficient than the parametric version even if the parameter estimates are fairly close. However, the semi-nonparametric estimator performs at least as well in terms of predicting output, input and land-use decisions, as well as in predicting the probability of corner solutions (discrete outcomes).

In the next section, we discuss the elasticity estimates from both estimators in terms of environmental impacts in subsections 5.1 and 5.2, but we consider only the more efficient parametric PQML estimation for the simulation exercise in subsection 5.3.

5 Environmental Impacts of Set-aside Subsidy

The actual efficiency of set-aside policies for conserving the environment and protecting farmland biodiversity is widely debated. Some studies show that set-aside land improves biodiversity [7], while other studies find that set-aside policies are inefficient for conservation purposes [8]. Our purpose here is not to provide an assessment of the direct environmental consequences of set-aside policy implementation. Rather, we emphasize the importance of an ex ante evaluation of land-use policies such as set-aside and greening of the CAP, in a context of increasing “social demand” for more sustainable agricultural practices.

In the broader context of the CAP reform and, more generally, of agricultural policies embedding environmental objectives, our simulation experiment aims to document the implementation issues for tax policies on nonpoint source emissions such as from pesticide and fertilizer inputs. In the context of regulating nonpoint source pollution from agricultural sources, indirect tax-subsidy policy is sometimes advocated [64], such as subsidizing less input-intensive crops or alternative land use (e.g., land set-aside). To address the possibility that subsidizing set-aside could worsen the environmental effects associated with chemical inputs, we propose two environmental indicators, which can be linked to a policy instrument such as a set-aside subsidy.Footnote 16 Note that, because we use only production data, these indicators will be “environmental effect proxies” and will tend to measure environmental “pressure” from production rather than an actual impact on the ecosystem. The first indicator is the elasticity of input demand (fertilizer, pesticide) with respect to the set-aside subsidy rate, which measures the sensitivity of farm-level input demand to a change in the unit set-aside subsidy rate, all else remaining equal. Assuming total farm land is constant, this indicator is relevant at the farm level and depends indirectly on land set-aside and crop decisions. The drawback of using this indicator is that the intensive margin (i.e., increasing input intensity per unit of land) is relevant only if computed for the cropped area [65]. Therefore, we consider a second indicator based on intensity of input use per unit of cultivated land, which allows us to measure the intensification effect of changes in the set-aside subsidyFootnote 17. 
Note that this environmental indicator, as a proxy based on input use data, cannot accommodate possibly nonlinear effects of cropping practices and cropping systems on the environment, which would require more detailed data at the plot level. Moreover, depending on the local ecosystem features, spatial spillover effects are also likely to exist between land plots with intensification and another set of plots with land set-aside. Because our analysis is conducted at the farm level only, such environmental effects are not identified (see, for example, [66] and [67]).

We discuss below the results of the estimation of our first indicator, namely the chemical input demand elasticities with respect to the set-aside subsidy (subsection 5.1). Then, we present the measurement and the results of calculations of the second indicator, namely the chemical input intensity elasticities with respect to the set-aside subsidy rate (subsection 5.2). Finally, in subsection 5.3, we evaluate the environmental impacts of a set-aside policy, obtained as proxies from pesticide and fertilizer demand and input intensity elasticities with respect to the land set-aside subsidy.

5.1 Chemical Input Demand Elasticities with Respect to Set-aside Subsidy

Elasticity estimates from parametric QML, presented in Table 5, show that the set-aside area is significantly sensitive to its unit subsidy (elasticity = 0.1479), an increase in the latter also implying an increase in output and planted area of rapeseed, as well as a minor increase in the cereal output and acreage. This means that, when the set-aside unit subsidy rate increases, farmers tend to intensify their production of these crops as they increase their set-aside area in parallel. This is confirmed by positive and significant elasticities of fertilizer and pesticide demand with respect to the set-aside subsidy, equal to 0.0589 and 0.029, respectively. In the case of diester, an increase in the set-aside subsidy implies a reduction in both output and area of this crop, with a stronger substitution effect than the increase in cereal and rapeseed output and land use discussed above. Fertilizer demand increases with cereal and rapeseed prices (elasticities of 0.0117 and 0.0555, respectively) and decreases with the price of diester (elasticity of −0.1261). Fertilizer demand also increases with the rapeseed subsidy (0.0350), decreases with the diester subsidy (elasticity of −0.1200) and does not vary significantly with the subsidy for cereals (elasticity of −0.0010). Finally, demand for pesticide increases with diester price (elasticity of 0.0958) and decreases with the prices of cereals and rapeseed (elasticities of −0.0120 and −0.0548, respectively). Pesticide demand also increases with the subsidy for diester and decreases with the unit subsidy for cereals and rapeseed (elasticities of 0.0498 and −0.0316, respectively).

We compare our estimated elasticities with those found in the literature. Concerning the own-price elasticity of pesticide demand, the meta-analysis by [68] concludes that the median own-price elasticity of pesticide demand in Europe is equal to −0.30. More specifically, in the case of France, [69] find an elasticity equal to −0.77 (−1.25 to −0.28) and [70] find a value of −0.17 (−0.24 to −0.10). Using a simulation procedure to evaluate the magnitude of the tax change (or price increase) necessary to result in a given quantity of pesticide, [71] find very low short-term pesticide demand elasticities, larger in absolute value for specialized farms (between −0.026 and −0.049) than for diversified farms (between −0.011 and −0.023). According to the literature review by [72], the own-price elasticity of pesticide is equal to −0.30 in the case of France. Our own-price elasticity of pesticide (−0.37) therefore falls within the range of values found in the literature for Europe, and particularly for France. Our estimate of the own-price elasticity of fertilizer demand (−0.05) is lower in absolute value than those in the literature for France: [70] (−0.50 to −0.16), [73] (−0.278) and [5] (−0.371). Our elasticity estimates for fertilizer and pesticide demand with respect to the set-aside subsidy (0.0589 and 0.029, respectively) can also be compared with the empirical literature, which, however, remains limited on the subject. Estimates provided in the European context are fairly heterogeneous in magnitude, ranging from 0.12 in [5] and 0.13 in [74] to 1.52 in [26].

Several conclusions emerge from our results on the environmental impact of a set-aside policy. All else being equal, a set-aside subsidy has a positive impact on farm-level fertilizer and pesticide demand: a 1 percent increase in the set-aside subsidy implies an increase of 0.0589 percent in fertilizer demand and of 0.029 percent in pesticide demand. However, the elasticity of chemical input demand with respect to the set-aside subsidy is calculated for total land, with no distinction between cultivated and set-aside land. The next subsection presents our second indicator, which explicitly accounts for chemical input use on cultivated land only.

5.2 Chemical Input Intensity Elasticities with Respect to Set-aside Subsidy

The second indicator we consider here is defined as the elasticity of chemical input quantity per unit of cultivated land with respect to the set-aside subsidy. This indicator accounts explicitly for land set-aside by considering chemical input use intensity per unit of cultivated land. It accounts for farmers’ decisions about land set-aside following a change to the subsidy but is not dependent on crop distribution (only on total cultivated area). This indicator is calculated as follows:

$$\begin{aligned} \varepsilon _{II_{k}\overline{\tau }_{s}} &= \frac{\partial II_k}{\partial \overline{\tau }_{s}}\times \dfrac{\overline{\tau }_{s}}{II_{k}} = \frac{\partial (\frac{x_k}{L-l_{setaside}})}{\partial \overline{\tau }_{s}}\times \dfrac{\overline{\tau }_{s}}{(\frac{x_k}{L-l_{setaside}})} \\&= \frac{\partial x_k}{\partial \overline{\tau }_{s}}\times \dfrac{\overline{\tau }_{s}}{x_k} - \frac{\partial (L-l_{setaside})}{\partial \overline{\tau }_{s}}\times \dfrac{\overline{\tau }_{s}}{L-l_{setaside}}\\&= \frac{\partial x_k}{\partial \overline{\tau }_{s}}\times \dfrac{\overline{\tau }_{s}}{x_k} + \frac{\partial l_{setaside}}{\partial \overline{\tau }_{s}}\times \dfrac{\overline{\tau }_{s}}{L-l_{setaside}} \\&= \varepsilon _{x_k\overline{\tau }_{s}}+ \varepsilon _{l_{setaside}\overline{\tau }_{s}} \times \frac{l_{setaside}}{L-l_{setaside}}. \end{aligned}$$
(13)

The first term, \(\varepsilon _{x_k\overline{\tau }_{s}}\), is the demand elasticity of input k with respect to the set-aside subsidy; it measures the percentage change in the demand for input k following a 1 percent change in the set-aside subsidy. The second term contains \(\varepsilon _{l_{setaside}\overline{\tau }_{s}}\), the elasticity of set-aside area with respect to the set-aside subsidy, which measures the percentage change in the set-aside area following a 1 percent change in the set-aside subsidy; it is weighted by the ratio of set-aside to cultivated land. A positive \(\varepsilon _{II_{k}\overline{\tau }_{s}}\) means that a 1 percent increase in the set-aside subsidy increases the demand for input k per unit of cropped land, that is, an intensification effect is observed.
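
The same decomposition can be read off from a log-differentiation of \(II_k = x_k/(L-l_{setaside})\). The following is only a sketch, under the assumption (implicit in Eq. (13)) that total land L does not respond to the subsidy:

$$\begin{aligned} \ln II_k &= \ln x_k - \ln (L-l_{setaside}), \\ \frac{\partial \ln II_k}{\partial \ln \overline{\tau }_{s}} &= \frac{\partial \ln x_k}{\partial \ln \overline{\tau }_{s}} + \frac{l_{setaside}}{L-l_{setaside}}\times \frac{\partial \ln l_{setaside}}{\partial \ln \overline{\tau }_{s}}. \end{aligned}$$

Written this way, it is apparent that the intensity elasticity exceeds the demand elasticity whenever the set-aside area is positive and responds positively to the subsidy, and that the gap grows with the ratio of set-aside to cultivated land.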

Table 7 presents the results for the input intensity elasticities of fertilizers \(\varepsilon _{II_{f}\overline{\tau }_{s}}\) and pesticides \(\varepsilon _{II_{p}\overline{\tau }_{s}}\) with respect to the set-aside subsidy for the PQML and SNPQML specifications. These results show that input intensity elasticities with respect to the set-aside subsidy are positive and significant for both fertilizer (0.0645) and pesticide (0.0346) in the PQML model. This means that, when the set-aside subsidy increases, farmers tend to increase both their set-aside area \(l_{setaside}\) and their input use (fertilizer and pesticide): to compensate for the loss due to a reduced crop area, farmers intensify their production by increasing their chemical input demand per hectare of crop area. In our case, this means that increasing the set-aside subsidy leads to agrochemical input intensification, which could have a negative impact on the environment (in terms, for example, of water contamination and biodiversity loss). Comparing the input intensity elasticities for fertilizer and pesticide, the value of \(\varepsilon _{II_{f}\overline{\tau }_{s}}\) is always higher than \(\varepsilon _{II_{p}\overline{\tau }_{s}}\). Fertilizer is usually considered a risk-increasing input, since it jointly increases the expected crop yield and its variance. In contrast, previous results in the literature regarding the direction of the risk effects of pesticides are ambiguous. Möhring et al. [75] show that the choice of pesticide indicator affects the magnitude and sign of estimated risk effects.
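
As an illustrative check of Eq. (13), the short Python sketch below plugs in the PQML point estimates quoted in the text and backs out the ratio of set-aside to cultivated land that they imply. The function and variable names are ours, and the implied ratio is not a figure reported in this section: it is inferred from rounded elasticities, on the assumption that the reported intensity elasticities are evaluated at a single land ratio (for instance at the sample mean).

```python
# PQML point estimates quoted in Sects. 5.1 and 5.2 (rounded as in the text).
EPS_AREA = 0.1479  # set-aside area w.r.t. set-aside subsidy
EPS_DEMAND = {"fertilizer": 0.0589, "pesticide": 0.0290}
EPS_INTENSITY = {"fertilizer": 0.0645, "pesticide": 0.0346}


def intensity_elasticity(eps_demand, eps_area, land_ratio):
    """Eq. (13): eps_II = eps_x + eps_l * l_setaside / (L - l_setaside)."""
    return eps_demand + eps_area * land_ratio


def implied_land_ratio(eps_intensity, eps_demand, eps_area):
    """Invert Eq. (13) for l_setaside / (L - l_setaside)."""
    return (eps_intensity - eps_demand) / eps_area


# Assumption: both inputs share the same land ratio, so the two implied
# values should coincide up to rounding of the reported elasticities.
implied_ratios = {
    k: implied_land_ratio(EPS_INTENSITY[k], EPS_DEMAND[k], EPS_AREA)
    for k in EPS_DEMAND
}

for k, ratio in implied_ratios.items():
    rebuilt = intensity_elasticity(EPS_DEMAND[k], EPS_AREA, ratio)
    print(f"{k}: implied ratio = {ratio:.4f}, rebuilt eps_II = {rebuilt:.4f}")
```

Both inputs yield the same implied ratio, which is what Eq. (13) requires, since the second term does not depend on the input k.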

Table 7 Elasticities of fertilizer and pesticide intensity with respect to unit set-aside subsidy

5.3 Set-aside Policy Simulation

Under the rules of the 2014–2020 CAP reform, farmers are required to implement greening measures or lose up to 30 percent of their basic payment scheme income. The greening rules cover three areas: crop diversification, ecological focus areas and non-intensification measures to maintain permanent grassland.

As the results of our elasticity estimation show, an increase in the set-aside subsidy could imply an increased demand for fertilizer and pesticide inputs. This means that a set-aside policy introduced as an EFA in order to preserve biodiversity could have adverse environmental impacts due to intensification at the farm level. We use our elasticity estimates to simulate the impacts on fertilizer and pesticide demand of a public policy that imposes a 5-percent increase in the set-aside area. To do this, we use our elasticity of set-aside area with respect to its subsidy to calculate the subsidy increase required to achieve a 5-percent increase in the set-aside area. This value is chosen with reference to the 2014–2020 CAP, which contains a similar requirement but with a major difference: the absence of an area-based subsidy associated with land set-aside.

Let us start with the elasticity of set-aside area, \(\varepsilon _{l_{s}\overline{\tau }_{s}}\), with respect to the set-aside subsidy, \(\overline{\tau }_{s}\) (where s denotes set-aside):

$$\begin{aligned} \varepsilon _{l_{s}\overline{\tau }_{s}} = \frac{\partial l_{s}}{\partial \overline{\tau }_{s}}\times \dfrac{\overline{\tau }_{s}}{l_{s}}= \frac{\partial l_{s}}{l_{s}}/ \dfrac{\partial \overline{\tau }_{s}}{\overline{\tau }_{s}}. \end{aligned}$$
(14)

If we assume that \(\frac{\partial l_{s}}{l_{s}}=0.05\) (5 percent), we can calculate the corresponding (equivalent) variation in the set-aside subsidy as

$$\begin{aligned} \dfrac{\partial \overline{\tau }_{s}}{\overline{\tau }_{s}} = \frac{0.05}{\varepsilon _{l_{s}\overline{\tau }_{s}}}. \end{aligned}$$

We then combine this variation in the set-aside subsidy with the fertilizer and pesticide demand elasticities with respect to the set-aside subsidy (\(\varepsilon _{f \overline{\tau }_{s}}\) and \(\varepsilon _{p \overline{\tau }_{s}}\), respectively) to calculate the corresponding fertilizer and pesticide demand variations, denoted \(\dfrac{\partial x _{f}}{x_{f}}\) and \(\dfrac{\partial x _{p}}{x_{p}}\), respectively. Finally, we use these input demand variations and the fertilizer and pesticide own-price elasticities (\(\varepsilon _{x_f \overline{w}_{f}}\) and \(\varepsilon _{x_p \overline{w}_{p}}\), respectively) to calculate the corresponding fertilizer and pesticide “net” price variations (\(\dfrac{\partial w _{f}}{w_{f}}\) and \(\dfrac{\partial w _{p}}{w_{p}}\)).
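
Written out, the two steps described in words above take the following form. This is a sketch of the first-order calculation as we read it from the text, using the notation \(\varepsilon _{x_k\overline{\tau }_{s}}\) of Eq. (13) for the demand elasticities; the sign in the second line reflects our assumption that the price change is chosen to exactly offset the increase in demand:

$$\begin{aligned} \frac{\partial x_{k}}{x_{k}} &= \varepsilon _{x_k\overline{\tau }_{s}}\times \frac{\partial \overline{\tau }_{s}}{\overline{\tau }_{s}}, \qquad k=f,p, \\ \frac{\partial w_{k}}{w_{k}} &= -\frac{1}{\varepsilon _{x_k\overline{w}_{k}}}\times \frac{\partial x_{k}}{x_{k}}. \end{aligned}$$

Because own-price elasticities are negative, the offsetting price variation is positive, and it is larger the less price-responsive the input demand is.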

Strictly speaking, output and input market prices are exogenous (because farmers are price takers), but the final prices to the farmer are “net of tax”, that is, they incorporate the unit tax. As a result, price variations only concern the part of the final price associated with the policy instrument (the tax).
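
The chain of calculations described above can be sketched in a few lines of Python. This is an illustration only, not the procedure used to produce Table 8: it uses the point estimates quoted in the text, the own-price elasticities are the rounded values (−0.05 and −0.37) reported in Sect. 5.1, and the function and variable names are ours.

```python
# Point estimates quoted in the text (PQML model). Own-price elasticities are
# the rounded values from Sect. 5.1, so the resulting tax rates are approximate.
EPS_AREA_SUBSIDY = 0.1479
EPS_DEMAND_SUBSIDY = {"fertilizer": 0.0589, "pesticide": 0.0290}
EPS_OWN_PRICE = {"fertilizer": -0.05, "pesticide": -0.37}


def simulate_set_aside(target_area_change=0.05):
    """First-order simulation of a relative increase in set-aside area.

    Returns the required relative subsidy change and, for each input, the
    induced relative demand change and the relative price (tax) change
    that would exactly offset it.
    """
    # Step 1: subsidy change needed to reach the target area change.
    subsidy_change = target_area_change / EPS_AREA_SUBSIDY
    results = {}
    for k, eps_demand in EPS_DEMAND_SUBSIDY.items():
        # Step 2: induced change in input demand.
        demand_change = eps_demand * subsidy_change
        # Step 3: price change that brings demand back to its initial level.
        price_change = -demand_change / EPS_OWN_PRICE[k]
        results[k] = {
            "demand_change": demand_change,
            "offsetting_price_change": price_change,
        }
    return subsidy_change, results


subsidy_change, results = simulate_set_aside()
print(f"required subsidy change: {100 * subsidy_change:.2f}%")
for k, r in results.items():
    print(
        f"{k}: demand {100 * r['demand_change']:+.2f}%, "
        f"offsetting price change {100 * r['offsetting_price_change']:+.2f}%"
    )
```

The sketch only propagates point estimates. The intervals reported in Table 8 are based on robust standard errors of the parameter estimates, which are not reproduced here.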

For this simulation, we use elasticities calculated from the parametric QML (PQML) model with fixed effects (Table 5). This choice is motivated by its more efficient estimates (relative to the semi-nonparametric model, as discussed above), with most elasticities significant at the 1 percent level. The results of this exercise are summarized in Table 8. Confidence intervals obtained from robust standard errors of the parameter estimates are presented in brackets for the key expressions in this table.

Table 8 Simulation results of a 5-percent increase in set-aside area

We find that, in order to obtain a 5-percent increase in the set-aside area, the set-aside subsidy rate must increase by 33.81 percent. Using the fertilizer and pesticide elasticities with respect to the set-aside subsidy, our simulations show that a 33.81 percent increase in the set-aside subsidy implies an increase in fertilizer demand ranging from 1.87 to 2.11 percent and an increase in pesticide demand ranging from 0.81 to 1.15 percent (columns 3 and 4 of Table 8). This could have adverse effects on the environment, e.g., nitrogen runoff and groundwater pollution. Using the own-price elasticities of fertilizer and pesticide demand, we can calculate the tax level necessary to offset such an increase in fertilizer and pesticide demand. Our simulations show that such tax rates would range from 36.9 to 41.63 percent for fertilizer, and from 2.18 to 3.12 percent for pesticide (changes in input prices in the last two columns of Table 8). In line with most empirical papers dealing with elasticities of input use in agriculture (see, e.g., [68]), our results show that a substantial tax on pesticide and fertilizer is required to yield a significant reduction in input use: a tax on fertilizer of between 36.9 and 41.63 percent to offset an increase in demand of between 1.87 and 2.11 percent, and a tax on pesticide of between 2.18 and 3.12 percent to offset an increase in demand of between 0.81 and 1.15 percent. Note, however, that this result holds if a “homogeneous” value-added tax is applied. Average tax levels might be lower and the tax might be more efficient (in reducing potential environmental risks from pesticides) if tax levels were adjusted for heterogeneous pesticide properties; see [76] and [77]. These results are also in line with those of [78], who considers a theoretical framework with a fiscal scheme consisting of both a tax on nitrogen application and a subsidy on land with cover crops. That paper shows that such a scheme is efficient in improving water quality and that it has the potential to balance the budget dedicated to the public policy.

6 Conclusion

Like any policy modifying the marginal benefits of land, set-aside policies imply changes to crop choices and production practices, whose effects and intensity depend on various factors. These policies may even increase crop yield, since farmers tend to use low-yield soils to meet set-aside requirements. As a consequence, average cultivated land quality may increase, which also implies an increase in aggregate crop yield per hectare. Such an effect may also result from input intensification, with potentially adverse impacts on the environment, which would conflict with the initial objectives of the policy. To investigate these potential effects of input intensification, this paper evaluates the effect of a set-aside policy, based on changes in agricultural practices and land use, for a sample of French farmers in the Meuse Département between 2006 and 2010.

We first derive fertilizer and pesticide input demand in the case of multiple crops from a structural multi-output production model, estimated with both a parametric and a semi-nonparametric QML procedure to account for multiple corner solutions. We use the most efficient estimator, the parametric QML, to compute elasticities with respect to the set-aside subsidy for two indicators: pesticide and fertilizer demand, and pesticide and fertilizer demand intensity per unit of cultivated land. We find that a policy targeting a 5-percent increase in the set-aside area results in an increase in fertilizer (resp. pesticide) demand of 1.87 to 2.11 (resp. 0.81 to 1.15) percent. Such a policy would therefore be associated with chemical input intensification, with potentially adverse environmental effects in terms of biodiversity loss and water pollution. To offset such an intensive-margin effect, taxes ranging from 36.9 to 41.63 (resp. 2.18 to 3.12) percent on fertilizer (resp. pesticide) would be necessary. Environmental effects are likely to be more harmful if the input demand intensity indicator increases for a reduced (cultivated) area. Note, however, that this indicator measures only the environmental pressure as a potential impact from input use, and not the actual environmental impact, which is likely to depend on a variety of factors (soil type, slope, climate, distance to surface or groundwater, and the characteristics of the local ecosystem).

Because this is a nonpoint source pollution issue, only second-best outcomes may be achieved, using, e.g., indirect taxation (of pesticide and fertilizer sales, output level, land use, etc.). Policies consisting of taxing chemical inputs, alongside subsidizing land set-aside, are often advocated as feasible second-best policies, because the first-best policy of taxing environmental damage would be prohibitively costly to implement (monitoring and management costs to the environmental regulator, valuing environmental damages, etc.). Taxing fertilizer and pesticide use to correct for market failures (externalities including water contamination and human poisoning) is generally considered a cost-effective policy in theory (as opposed to command-and-control policies), provided implementation costs are limited and the tax level does not deviate excessively from the optimal (first-best) level. Note also that reducing pesticide use through a tax scheme may be beneficial to the farmer in the long run, because of a potential reduction in, e.g., the resistance of plant pests. Although relevant in principle, accommodating such an extension would require additional information at the plot or the farm level on the benefits of such a tax in terms of pest resistance reduction, in a way similar to our discussion above on a differentiated, risk-specific pesticide tax (see the end of Sect. 5.3).

Revenue streams generated by fertilizer and pesticide taxes can be earmarked to subsidize more sustainable agricultural practices [79]. This implies that, in principle, a policy complementing an input tax with a subsidy on set-aside would correspond to the earmarking strategy above. However, the objective of the tax simulation considered here is to offset the increase in input demand following intensification, so as to illustrate the magnitude of the taxes on fertilizer and pesticide as equivalent policies. In other words, the level of the input tax required to offset the negative consequences of a set-aside policy can also be interpreted as the tax level that would be necessary to reduce fertilizer and pesticide use by the same amount (as the increase in demand in the first place). We do not discuss the relative advantages of taxes vs. area-based subsidies as policy instruments; such a discussion can be found, e.g., in [80], which reviews alternative policy instruments that can be considered to correct for market failures associated with agricultural pollution.

Our analysis could be extended in several directions. First, the model could be improved by incorporating other policy instruments, for example, in the case of the recent European Common Agricultural Policy reform, the number of crops in rotation and the proportion of grassland area. Another extension would involve linking our production model to observed farm-level environmental variables, such as water quality and biodiversity. Second, crop rotations may be considered in an extended framework where previous crop decisions are accounted for in current production decisions. As crop rotations may impose restrictions on farmer choices in subsequent years, such an extension may provide interesting insights into the potential impact of policies designed to modify agricultural land use decisions. A further difficulty, however, is that plot-level data are necessary to fully capture the benefits associated with crop rotations on a set of land plots. Third, concerning our indicators capturing environmental pressure from cropping practices and land use, we could consider other pesticide use indicators [81] or the application of different weights to each crop when computing these indicators. Based on “technical” parameters, the extended indicators could then better reflect the heterogeneous environmental risks associated with each crop.

From an econometric viewpoint, several extensions could also be considered, starting with the correction of possible errors-in-variables (EIV) bias when deflating the price system by the price of seed as numeraire, as proposed by [82]. Furthermore, to obtain results that do not vary with the choice of numeraire, one could consider a symmetric normalized quadratic profit function, as in [83], albeit at an additional computational cost (see Footnote 18). This is left for future research.