1 Introduction

Land-use changes rank among the most significant drivers of change in ecosystem services worldwide. Enhancing important services such as biodiversity and carbon sequestration requires modifications in land use that can lead to a decline in other ecosystem services. Hence, for policy and management decisions the value of these services is often most meaningfully understood in terms of the trade-offs with other valuable uses of the land (Turner et al. 2003; Bateman et al. 2014). Decision makers are sensitive to the costs of actions, particularly where these involve non-marketed ecosystem services. Costs vary spatially but in a different way from benefits. Benefits of many ecosystem services (such as carbon sequestration) accrue at regional and global scale whereas the opportunity cost (such as foregone agricultural expansion) is borne by the local population. An improved understanding of the geographical pattern of opportunity costs is essential for making new land-use policy interventions effective and equitable (Balmford et al. 2011). Moreover, estimates of opportunity costs can provide an evidence base for more informed policy decision-making and increased efficiency in the use of public resources. Given the concerns about food security, biodiversity and carbon sequestration there is an increasing demand for such economic information at higher spatial scales.

The previous literature includes numerous regional scale and national level analyses as well as several global analyses of ecosystem services. These studies can be divided into two categories: mapping exercises and production function studies (see Nelson et al. 2009). The first category takes stock of and maps the availability of specific ecosystem services for entire regions or higher spatial scales (e.g., Foley et al. 2011; Johnson et al. 2014). In addition to mapping the spatial pattern as such, the spatial congruence between specific services is sometimes also assessed (e.g., Raudsepp-Hearne et al. 2010; Holland et al. 2011; Maes et al. 2012). As there is a paucity of primary data on most ecosystem services, proxies (spatial value transfer) are often used. Several authors have combined these mapping exercises with value proxies to assess the global value of ecosystem services (Hussain et al. 2011; Costanza et al. 2014). From a policy design perspective a shortcoming of these mapping studies is that these are often not linked to changes in land use. Measuring the total value of ecosystem services can help to raise awareness but does little to help a decision maker trying to assess the consequences of alternative actions (Keeler et al. 2012).

The second category of literature starts from economic models to predict spatially explicit land-use change. These models are combined with detailed ecological functions which represent how specific ecosystem services depend on local biophysical variables and land-use. Econometric land use models are employed by e.g. Polasky et al. (2008), Keeler et al. (2012), Bateman et al. (2013, 2014), and Lawler et al. (2014). Others use GIS, Bayesian and heuristic routines to combine the ecological and economic concepts (e.g., Bateman 2009; Bateman et al. 2011; Polasky et al. 2011; Jellinek et al. 2014; Stehfest et al. 2014). These production function studies capture the heterogeneity of the physical and economic decision environment at a disaggregate level. The simulation results can then be combined with information on market prices and also with results from non-market valuation studies. Importantly, in these production function studies the biophysical models enable the effect of alternative land-use and land management regimes on the provision of ecosystem services and goods to be estimated. This approach however has methodological as well as practical limitations for the evaluation of opportunity costs of ecosystem services.

The methodological difficulty with many of the approaches above is that they proceed in two parts: projection of land-use change based on an econometric model and then an assessment of the implications of land-use change on selected ecosystem services using biophysical models. Treating ecosystem services as separable facilitates scenario analysis but the assumption of outcome separability may eliminate statistically significant interactions in the multiple-output context. In fact, the production relationships between ecosystem services may be complementary, competitive or substitutive and can depend on local conditions. When various ecosystem services are derived from the same ecosystem, changes in their levels are physically connected through the basic biophysical function of the ecosystem. This in turn will have an impact on opportunity costs.

A second, practical issue concerns data limitations, in particular for applications at higher spatial scales, which are relevant for many policy decisions. Ideally we would have spatially referenced observed panel data on land-use to assess the ecosystem services associated with it. Panel data are preferred because they allow fixed effects to be included. In practice, panel data on actual land-use changes are available only for within-country studies and then only for selected examples such as the USA and the UK (Bateman et al. 2014; Lawler et al. 2014).Footnote 1 For many other countries or larger areas, including our case study countries, these are not available. In contrast, cross-sectional data on land-use are more widely available. In addition, biophysical simulation models, which can be used to generate data on how land-use plus other local conditions affect the generation of ecosystem services (viz., agricultural production, carbon sequestration and biodiversity), are also widely available.

To make best use of the limited land-use data available and especially to deal with the non-separability of ecosystem services, this paper presents a methodology new to the analysis of ecosystem services and land use change. We use the cross-sectional output data generated by the biophysical models by grid cell to estimate a transformation function accounting for the production relationships between multiple ecosystem services.Footnote 2 In econometric studies, the problem of multiple output production functions is typically approached using parametric distance functions. We instead employ a two-step approach where the first step is fully non-parametric.

The method we use builds on the conditional two-stage semi-parametric frontier approach (Florens and Simar 2005), which offers several advantages given our research objective. A first major advantage is flexibility with regard to assumptions on the convexity of the production possibility set. Convexity restrictions are often introduced in economic studies based on the production function literature. Assuming convexity relations where they do not exist may result in incorrect conclusions on the potential of multifunctional land-use or specialization and in flawed policy recommendations (Chavas 2009; Brown et al. 2011; Tschirhart 2012). The method presented in this paper does not require convexity assumptions and so we are able to allow for, and test for, non-convexities. A second advantage of the method lies in the way in which (continuous and discrete) exogenous variables are accommodated. The conditional nonparametric two-stage frontier method is free of separability assumptions for the input–output space and the space of the heterogeneous background variables. This is important in our application because local conditions are clearly non-separable from how the land can be used and the outputs that result from it. The conditional non-parametric frontier approach as employed in this paper has been used in a variety of public policy settings where input–output relations are difficult to define and depend on local conditions (see e.g. De Witte and Geys 2011; Vidoli 2011; Verschelde and Rogge 2012; De Witte and Kortelainen 2013; Halkos and Tzeremes 2013; Cordero et al. 2015). We are not aware of the use of this frontier method in the context of ecosystem services or biodiversity.

Other authors have investigated non-separability and trade-offs of outputs using multiple-output frontier methods. Recent work in this area estimates directional distance functions; see e.g. Agee et al. (2011, 2014). Direct comparison between this paper and the approach used by Agee et al. is difficult, because the latter employ panel data and use a different estimation procedure. In both Agee et al. papers, identification and estimation of shadow prices is based on fixed effects, although instrumental variables are also utilized to address potential endogeneity problems. We do not have panel data or additional instrumental variables. Moreover, it would likely be infeasible to utilize instrumental variables in our two-stage estimation method, even if we had additional variables that could be considered valid instruments. Another difference is that the approach in the two papers by Agee et al. is fully parametric in the specification of the structural model, while our approach does not require restrictive functional form assumptions.

The proposed approach enables the estimation of the marginal rates of transformation over a range of levels of ecosystem services and goods accounting for spatial heterogeneity. This trade-off derives its economic meaning from the scarcity of the underlying resources and the jointness in the generation of ecosystem goods and services. Thus the trade-off reflects the underlying relationship between priced and non-priced ecosystem services and enables ecosystem service synergies to be covered. The results provide relevant information on spatial differences in trade-offs between ecosystem services and on the areas that have a comparative advantage for supplying particular ecosystem services. This is useful information to guide decision makers in targeting and prioritizing those areas that are most or least suitable for conservation or agricultural development. If for example the regional objective is to improve biodiversity, the areas with low opportunity costs for biodiversity have a comparative advantage for conservation. Properly targeting the most suitable areas for particular land-uses will result in considerable savings for society.

We illustrate our approach for a case study of 18 Central and Eastern European countries to analyse the trade-offs between agricultural revenues, cultural services, carbon sequestration and biodiversity.Footnote 3 The empirical results show that the production possibility frontier is non-concave and that there are large differences in trade-offs between regions. Each country has regions that have a comparative advantage for a particular ecosystem service. The relationship between agricultural revenues and the other output variables exhibits diseconomies of scope. Reallocation of land use, where regions more or less specialize in the outputs for which they have a comparative advantage, can lead to a win-win situation in terms of the aggregate output. The spatial distribution of these results is displayed using GIS maps.

The remainder of this paper is organized as follows. Sections 2 and 3 present the approach and the estimation methodology in more detail, respectively. Section 4 discusses the data. The results of the analysis are presented in Sect. 5. Finally, Sect. 6 discusses the method and results and concludes.

2 Approach

Production theory states that providing multiple interacting services can be ‘costly’ in the sense that more of one service means a loss in terms of the other ones. This trade-off depends on the interactions and synergies within the production structures. How much of the various outputs can be generated as a function of land-use depends to a large extent on biophysical characteristics affecting growth potential. To represent the bundle of ecosystem services levels supplied in a certain location, let conventional marketed outputs be denoted by vector \(y_{m}\) and non-marketed outputs by vector \(y_{n}\). Let factors exogenous to the decision makers, such as geographical location and soil type, be covered by the vector of conditional variables z. The transformation function \(F(y_{m},y_{n}|x,z)=0\) then describes how in a specific area outputs \(y_{m}\) and \(y_{n}\) are jointly produced using inputs x (including land) in a given environment described by the vector z. Due to differences in z, different land-use choices will have different effects for each location within a larger region. The slope of the transformation curve \({F_{y_n^*} }/{F_{y_m^*} }\) reflects the foregone output of \(y_{m}\) due to a marginal increase of \(y_{n}\) at \((y_{m}^{*},y_{n}^{*})\). Assuming the price of the marketed outputs, \(p_{m}\), is known, it follows that the implicit producer price of the non-marketed outputs, \(p_{n}\), at output levels \((y_{m}^{*},y_{n}^{*})\) is given by

$$\begin{aligned} p_n =p_m {F_{y_n } \left( {y_m^*,y_n^*} \right) }/{F_{y_m } \left( {y_m^*,y_n^*} \right) }. \end{aligned}$$

This implicit price reflects the trade-off between the marketed and the non-marketed outputs based on a marginal land-use change. With this information it can be evaluated which areas have comparative advantages in producing more of any of the outputs.
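As a minimal numerical illustration of the implicit price formula, the sketch below assumes a hypothetical transformation function \(F(y_m,y_n)=y_m^2+y_n^2-1\) (a quarter-circle frontier) and a hypothetical marketed price \(p_m=100\). Neither is taken from this paper; they only show how the ratio of partial derivatives translates into a price.

```python
def implicit_price(p_m, grad_m, grad_n):
    """Implicit price p_n = p_m * F_{y_n} / F_{y_m} at a frontier point."""
    return p_m * grad_n / grad_m


def grad_F(y_m, y_n):
    """Partial derivatives (F_{y_m}, F_{y_n}) of the assumed
    transformation function F = y_m**2 + y_n**2 - 1 (illustrative only)."""
    return 2.0 * y_m, 2.0 * y_n


# A point on the assumed frontier: 0.8**2 + 0.6**2 = 1
y_m_star, y_n_star = 0.8, 0.6
F_m, F_n = grad_F(y_m_star, y_n_star)

# Hypothetical marketed price of 100 per unit of y_m
p_n = implicit_price(p_m=100.0, grad_m=F_m, grad_n=F_n)
print(round(p_n, 6))  # approximately 75
```

In this toy case one extra unit of the non-marketed output costs about 0.75 units of the marketed output at the chosen point, so its implicit price is about three quarters of \(p_m\).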

Implementation of the opportunity cost framework to address our research question requires a number of choices to be made: which ecosystem services to include, which data to use and several considerations to identify the appropriate estimation method. We begin by noting that the opportunity costs assessment is typically restricted to marginal land-use changes in the short run. In addition we focus on the trade-offs between non-marketed ecosystem services and agricultural production only. We consider these specific trade-offs as critical because they connect local actions with global issues. In particular the trade-off between food production, carbon stocks and biodiversity is evident in recent opposing trends (Secretariat of the Convention on Biological Diversity 2014). In line with this perspective, we consider only the conversion from different types of forest to different types of grassland or to cropland. The unit of analysis in our empirical assessment is spatial (grid cells of size 50 × 50 km). Thus we take current areas of agricultural, grass- and forest land in a grid cell and analyse trade-offs between agricultural production, biodiversity and carbon sequestered within a grid cell if this land-use were to change marginally.

Second, after this simplification the empirical implementation is still non-trivial. A key element of the analysis is the marginal rate of transformation for each grid cell, but this cannot be directly observed for the non-marketed ecosystem services. Biophysical simulation models can be used to estimate the levels of various non-market ecosystem services and agricultural production at the grid cell level. These simulation models require detailed information on essential inputs including land-use/cover, water and climatic conditions. Variable input requirements for the agricultural output result from the calibration process of the biophysical model for the agronomic conditions of the specific grid cells and a given technology. Thus, biophysical simulation models would enable the evaluation of comparative advantages (see e.g. Costinot and Donaldson 2012). However, deriving transformation functions accounting for spatial heterogeneity would require running simulation models for each cell for a range of alternative land uses. This is not feasible when the research involves a large region as in our case. Due to this practical limitation we have to estimate the production frontier from a sample of cross-sectional data provided by biophysical simulation models for a large number of grid cells. Grid cells are identical in size but vary in other characteristics z. Land has considerable variation in terms of soil, terrain, climate, and other dimensions which are important for determining how much of the various outputs can be generated as a function of land-use. In addition there will be differences in agronomic and socio-cultural backgrounds affecting productivity. For our empirical application, the characteristics of the land in a grid cell influence how much of it can be used as crop, grass or forest land and the ecosystem services that can be generated. Thus, in our approach the production frontier is identified from the cross-sectional variation in the grid-cell data after properly accounting for given exogenous conditions.

Third, as we have no reason to believe that the relationship between the inputs, outputs and exogenous characteristics follows a specific functional form, the first step of our modelling approach is fully non-parametric. Given that we want to control for spatial heterogeneity in the exogenous characteristics of grid-cells, the choice of modelling techniques is further narrowed to conditional non-parametric estimators (Daraio and Simar 2005, 2007) and more specifically to the conditional model which allows for discrete and continuous exogenous variables (De Witte and Kortelainen 2013). The conditional non-parametric approach can be seen as a matching procedure (e.g. Ho et al. 2007), used to pre-process the dataset such that for each observation (here grid cells) only the most similar observations are selected from the original sample in the estimation of the frontier.Footnote 4 In our application, selection of a subsample of matching grid-cells is based on observable background or conditional variables, the z-variables. This subsample is then used to assess the input–output performance of a particular grid cell (De Witte and Kortelainen 2013).Footnote 5 This implies that for each grid cell, the resulting production frontier may differ, depending on the exogenous background variables z. Basically, our approach involves for each grid cell mixed (i.e., both discrete and continuous) kernel smoothing around the background variables of the grid cell such that observations with similar characteristics have a higher probability of being selected for the analysis. This avoids the separability condition, i.e. that the background variables do not influence the attainable input–output set.Footnote 6 Thus, the conditional frontier model used in this paper assumes that the exogenous variables directly influence the shape of the frontier.

Fourth, two options arise for the non-parametric frontier resulting from the first step of the approach. One could impose convexity on the production possibilities (as in Data Envelopment Analysis) or not (as in the Free Disposal Hull, FDH, method). Not imposing convexity clearly implies a more general approach. Moreover, ecological studies suggest violations of the convexity hypothesis as far as ecosystem services are concerned (see e.g., Tschirhart 2012). Hence, we concentrate on the FDH model. A known disadvantage of the traditional non-parametric FDH model is that atypical observations can influence the frontier. To mitigate the influence of outlying observations we estimate a partial frontier that allows some observations (e.g. outliers) to be above the frontier. By repeatedly drawing with replacement we obtain a so-called robust FDH frontier (Cazals et al. 2002; De Witte and Kortelainen 2009).

Fifth, in the second stage we approximate the nonparametric frontier with a parametric function. The resulting differentiable function enables unique opportunity costs to be derived. We adopt the translog function for its flexibility and tractability. The estimation in the first stage also shows the efficiency with which grid cells are using the available land and technology. We do not further investigate the inefficient grid cells as these are situations where land-use change can lead to a gain in services at no cost, regardless of the shape of the frontier. We do however use the inefficiency results as these are crucial for the second stage of our procedure.

Finally, for the estimation of the opportunity costs, the framework requires at least one of the potential outputs to have an observed monetary value. In our application, agricultural revenue serves as the sole priced output. This implies that agriculture has to be a viable land-use in (at least a part of) each of the grid cells included in the empirical exercise. It also assumes that land-use associated with other margins of adjustment (housing or commercial activity) can be excluded. Whether these are justifiable assumptions is an empirical question. Our case study application is for Central and Eastern Europe. In these countries rural landscapes are farming landscapes.Footnote 7 There is limited opportunity of other employment or other types of land-use. Changes in agricultural policies in Europe have led to intensification in some European landscapes, accompanied by cropland abandonment in others. In Central and Eastern Europe it is land abandonment rather than conversion to commercial or residential use that is a policy concern (Renwick et al. 2013). Hence agricultural revenue foregone is to be seen as an upper bound on the lowest opportunity cost of using available land to deliver ecosystem services. It is an upper bound because the opportunity costs reflect the trade-offs at the production possibility frontier. The production possibility frontier gives the lowest possible opportunity cost as the wider economic costs of restrictions on land-use are excluded from the analysis (Batie and Mabbs-Zeno 1985; Parson and Wu 1991). Agricultural revenue does not reflect the wider economic costs of foregone rural development opportunities leading to loss of income from prevented commercial and industrial development and foregone opportunities for job creation.

The approach outlined above is static as it is based on the (simulated) bundles of ecosystem services for each grid cell in the study area at a specific moment in time. It would be interesting to extend this to a dynamic analysis using panel data. Panel data would enable the use of a grid cell indicator as an additional conditional variable. However, dynamic non-parametric methods are still in their infancy. Moreover, it would require obtaining simulated data representing the dynamic effects of land-use changes on ecosystem services which is beyond the scope of this paper.

3 Estimation Procedure

We now more formally discuss the suggested estimation procedure. The two-stage procedure is adapted from Florens and Simar (2005). In line with the choices made in Sect. 2 above, for the first stage we employ the robust conditional FDH method proposed by De Witte and Kortelainen (2013). In the second stage, the non-parametric frontier is approximated parametrically such that unique opportunity costs can be derived.

3.1 Stage 1: Robust Conditional FDH

To explain the first stage, we first introduce the conditional FDH estimator. The basic FDH method is discussed, after which we show how the method is adapted to account for conditional variables and for atypical observations.

Introduce a vector of outputs \(y\in \mathbb {R}_+^R \) (which covers the vectors \(y_{m}\) and \(y_{n}\) introduced above), inputs \(x\in \mathbb {R}_+^S \) which cover the land-use choices, and conditional variables \(z\in \mathbb {R}_+^T \) which are beyond the control of the decision makers. The production possibility set of feasible output levels is defined as

$$\begin{aligned} \Psi =\left\{ {\left( {x,y,z} \right) \left| {x\hbox { can produce }y\hbox { given characteristics }z} \right. } \right\} . \end{aligned}$$

For a sample of L observations of the input, output and conditional variables for the L grid cells, \(\left\{ {\left( {x_l ,y_l ,z_l } \right) \left| {l=1,\ldots ,L} \right. } \right\} \), the Free Disposal Hull (FDH) estimator for the production possibility set \(\Psi \) is (with bandwidth parameter h):

$$\begin{aligned} \Psi ^{FDH}\left( {x,y,z} \right) =\left\{ {\left( {x,y,z} \right) \in \mathbb {R}_+^{S+R+T} \left| {y\le y_l ,\ x\ge x_l ,\ z_l \in \left[ {z-h,z+h} \right] \hbox { for some }l=1,\ldots ,L} \right. } \right\} \end{aligned}$$
(1)

The FDH frontier of \(\Psi \) is a stairway-shaped curve connecting the Pareto optimal input–output combinations from the sample. In the left-hand panel of Fig. 1 line AB represents the FDH-frontier and region OAB the production possibility set of feasible outputs.

Fig. 1 Left-hand panel: representation of stage 1 showing the feasible output set \(\Psi \), the FDH-frontier (AB) and the distance \(\delta _{l}\) of an observation \(y_{l}\) to the frontier, for an example with two outputs. Right-hand panel: representation of stage 2 showing the parametric frontier function with the observations projected onto the frontier. Note: the minimum distance of observation \(y_{l}\) to point a at the frontier of Pareto optimal observations is \(1-\delta _a^1 \) and to point b is \(1-\delta _b^2 \). So, the minimum distance to frontier AB is \(\sup \left\{ {1-\delta _a^1 ,1-\delta _b^2 } \right\} =\inf \left\{ {\delta _a^1 ,\delta _b^2 } \right\} \). If \(y_{l}=(y^{1}, y^{2})\) improves by \(1/{\delta _b^2 }\cdot 100\,\% \), the observation would move from \(y_{l}\) to \({y_l }/{\delta _b^2 }\), which is located on the frontier

Under the assumption of free disposability (see e.g. Färe and Grosskopf 2000, for an explanation of the assumptions), the distance \(\delta _{i}\) of the ith observation \((x_{i},y_{i},z_{i})\) to the frontier can be estimated using the Shephard distance function—see also Fig. 1.

$$\begin{aligned} \delta \left( {x_i ,y_i |z_i } \right) =\inf \left\{ {\delta _i |(x_i ,y_i /\delta _i ,z_i )\in \Psi } \right\} \end{aligned}$$
(2)

If observation i is a Pareto efficient observation, \(\delta _{i}\) = 1. If it is not at the frontier, \(\delta _{i} <\) 1 and \((1/\delta _{i}-1)\cdot 100\,\%\) measures the percentage increase of each output of observation i necessary to reach the frontier and become Pareto optimal (Daraio and Simar 2007). As discussed in Sect. 2, we do not further interpret these distance levels but note these are necessary for the parametric approximation of the frontier function in stage 2.
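The distance in (2) can be computed directly from the sample when the conditional variables are ignored. The following is a minimal sketch, assuming NumPy arrays with one row per grid cell; the function name and the toy data are hypothetical and serve only to illustrate the free-disposal logic. For every cell that uses no more of any input, it finds the largest proportional expansion of \(y_{i}\) that this cell still dominates, and the distance is the inverse of the best such expansion.

```python
import numpy as np


def fdh_output_distance(x_i, y_i, X, Y):
    """Shephard output distance of (x_i, y_i) to the FDH frontier of (X, Y).

    X has shape (L, S), Y has shape (L, R). Returns 1 on the frontier and a
    value below 1 for dominated observations (when (x_i, y_i) is in the sample).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Cells that use no more of any input than x_i
    feasible = np.all(X <= x_i, axis=1)
    if not feasible.any():
        raise ValueError("no reference observation with x_l <= x_i")
    # Largest factor by which y_i can be scaled and still be dominated by y_l
    expansion = np.min(Y[feasible] / y_i, axis=1)
    return 1.0 / expansion.max()


# Toy sample: four cells, one identical input (land), two outputs
Y = np.array([[4.0, 1.0], [3.0, 3.0], [1.0, 4.0], [1.5, 1.5]])
X = np.ones((4, 1))

delta_inside = fdh_output_distance(X[3], Y[3], X, Y)    # dominated by (3, 3)
delta_frontier = fdh_output_distance(X[1], Y[1], X, Y)  # lies on the frontier
print(delta_inside, delta_frontier)
```

In the toy sample the cell with outputs (1.5, 1.5) would have to double both outputs to reach the frontier, so its distance is 0.5.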

We now discuss step-by-step the method used to derive (2). First, consider a situation without conditional variables z. Daraio and Simar (2005), De Witte and Kortelainen (2009) and Bădin et al. (2012) show that for that situation the distance measure of (2), can be written in probabilistic format as:

$$\begin{aligned} \delta \left( {x_i ,y_i } \right)= & {} \inf _{\delta _i } \left\{ {\delta _i \left| {S_Y \left( {\left. {{y_i }/{\delta _i }} \right| x_i } \right) >0} \right. } \right\} \nonumber \\= & {} \inf _{\delta _i } \left\{ {\delta _i \left| {\frac{\sum _{l=1}^L {I\left( {{y_i }/{\delta _i }\le y_l ,x_i \ge x_l } \right) } }{\sum _{l=1}^L {I\left( {x_i \ge x_l } \right) } }>0} \right. } \right\} \end{aligned}$$
(3)

for observation \((x_{i},y_{i})\), with \(S_Y \left( {y\left| x \right. } \right) ={\Pr \left( {y\le Y,x\ge X} \right) }/{\Pr \left( {x\ge X} \right) }\) the survivor function of Y and \(I(\cdot )\) the indicator function. The resulting distance \(\delta _i =\delta \left( {x_i ,y_i } \right) \) represents the smallest distance between observation \(\left( {x_i ,y_i } \right) \) and the frontier.Footnote 8

If the shape of the frontier for given grid cell i is conditioned on the cell characteristics \(z_{i}\), (3) changes to \(\delta \left( {x_i ,y_i \left| {z_i } \right. } \right) \). This implies that for each grid cell i a unique frontier is estimated, depending on the values of the conditional variables. The conditional frontier is determined from cells whose characteristics z match those of the ith grid cell, \(z_{i}\). The matching procedure uses smoothing techniques based on kernel density functions such that grid cells with matching z-values have a higher probability of being chosen (see Daraio and Simar 2005; De Witte and Kortelainen 2009, 2013). The constraints in (3) change to

$$\begin{aligned} S_Y \left( {y_i \left| {x_i ,z_i } \right. } \right) =\frac{\sum _{l=1}^L {I\left( {y_i \le y_l ,x_i \ge x_l } \right) K_h \left( {{\left( {z_i -z_l } \right) }/h} \right) } }{\sum _{l=1}^L {I\left( {x_i \ge x_l } \right) K_h \left( {{\left( {z_i -z_l } \right) }/h} \right) } } \end{aligned}$$
(4)

with \(K_{h}\)(.) a kernel function with bandwidth parameter h.
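A minimal sketch of the kernel-weighted survivor function in (4) follows. It assumes continuous conditional variables only and a product Epanechnikov kernel with a fixed, user-supplied bandwidth h; the mixed discrete and continuous kernel and the bandwidth selection used in the paper are not reproduced here. Function names and toy data are hypothetical.

```python
import numpy as np


def epanechnikov(u):
    """Product Epanechnikov kernel over the columns of u (assumed kernel);
    the weight is zero as soon as one coordinate has |u| > 1."""
    k = 0.75 * (1.0 - u ** 2) * (np.abs(u) <= 1.0)
    return np.prod(k, axis=1)


def conditional_survivor(y, x_i, z_i, X, Y, Z, h):
    """Kernel-weighted empirical S_Y(y | x_i, z_i) in the spirit of Eq. (4)."""
    w = epanechnikov((z_i - Z) / h)          # weight of each cell l
    in_x = np.all(X <= x_i, axis=1)          # I(x_i >= x_l)
    in_xy = in_x & np.all(Y >= y, axis=1)    # I(y <= y_l, x_i >= x_l)
    denom = np.sum(in_x * w)
    if denom == 0.0:
        return float("nan")                  # no comparable cell within bandwidth
    return float(np.sum(in_xy * w) / denom)


# Toy data: one input, two outputs, one continuous conditional variable
X = np.ones((4, 1))
Y = np.array([[4.0, 1.0], [3.0, 3.0], [1.0, 4.0], [1.5, 1.5]])
Z = np.array([[0.0], [0.1], [0.9], [0.05]])

s_mid = conditional_survivor(Y[3], X[3], Z[3], X, Y, Z, h=0.3)
s_high = conditional_survivor(np.array([5.0, 5.0]), X[3], Z[3], X, Y, Z, h=0.3)
print(s_mid, s_high)
```

The third cell lies outside the bandwidth around z = 0.05 and therefore receives zero weight, which is how conditioning changes the reference set for each grid cell.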

Finally, in order to correct for the known sensitivity of FDH to atypical observations, we further adapt (4). For each grid cell i, the distance \(\delta (x_{i},y_{i}|z_{i})\) to the frontier is determined repeatedly according to (3) and (4) for a subsample of m grid cells drawn with replacement from the original sample, after which the expectation is taken. The resulting distance measure is less sensitive to extreme observations than (3). Cazals et al. (2002) showed that the conditional order-m distance measure can be written as (see also Daraio and Simar 2005)

$$\begin{aligned} \delta ^{m}\left( {x_i ,y_i \left| {z_i } \right. } \right) =\left( {\int \limits _0^\infty {\left[ {1-\left( {1-S_Y \left( {uy_i \left| {x_i ,z_i } \right. } \right) } \right) ^{m}} \right] du} } \right) ^{-1} \end{aligned}$$
(5)

As a rule-of-thumb, m should be in the order of magnitude of the full sample size. The resulting distance measure for each grid cell reflects whether output is at the frontier \((\delta =1)\) or by how much this is to be increased to reach the frontier.
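Because the estimated survivor function is a step function in u, the integral in (5) can be evaluated exactly rather than by repeated resampling. The sketch below does this under stated assumptions: the reference outputs have already been restricted to cells with \(x_l \le x_i\), and the kernel weights are supplied by the caller. The function name and toy numbers are hypothetical.

```python
import numpy as np


def order_m_distance(y_i, Y_ref, w, m):
    """Order-m output distance as in Eq. (5) for a step-function survivor.

    Y_ref : outputs of reference cells (already restricted to x_l <= x_i)
    w     : nonnegative kernel weights of those cells
    m     : order of the partial frontier
    """
    Y_ref = np.asarray(Y_ref, dtype=float)
    w = np.asarray(w, dtype=float)
    keep = w > 0
    # u_l: largest scaling of y_i that cell l still dominates
    u = np.min(Y_ref[keep] / y_i, axis=1)
    order = np.argsort(u)
    u, wk = u[order], w[keep][order]
    # Survivor value on each interval (u_(k-1), u_(k)]: weight share with u_l >= u_(k)
    S = np.cumsum(wk[::-1])[::-1] / wk.sum()
    widths = np.diff(np.concatenate(([0.0], u)))
    integral = np.sum(widths * (1.0 - (1.0 - S) ** m))
    return 1.0 / integral


Y_ref = np.array([[4.0, 1.0], [3.0, 3.0], [1.0, 4.0], [1.5, 1.5]])
weights = np.ones(4)
y_cell = np.array([1.5, 1.5])

delta_m1 = order_m_distance(y_cell, Y_ref, weights, m=1)
delta_large = order_m_distance(y_cell, Y_ref, weights, m=10_000)
print(delta_m1, delta_large)
```

For very large m the order-m distance approaches the full FDH distance (0.5 in the toy data), while small m compares the cell with only a few randomly drawn peers and can yield values above 1 for cells lying above the partial frontier.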

3.2 Stage 2: Estimation of Opportunity Costs

In the second stage, following Florens and Simar (2005) and Daraio and Simar (2007), the frontier function is approximated parametrically with a flexible parametric production function using the distance measures estimated in stage 1 (see the right-hand panel of Fig. 1). Even though the non-parametric frontier does not allow for deriving unique opportunity costs, its parametric approximation does. In what follows, input variables are dropped from the notation. In the empirical analysis only one input variable is included, namely land, which is identical across all observations.

Let \(\delta (y_{i}|z_{i})\) again be the Shephard output distance function for which values are estimated non-parametrically for each grid cell i using (2) and (5). Introduce a parametric distance function \(\varphi (y_{i},z_{i};\theta )\), which is homogeneous of degree one in y, and with the unknown parameters given by vector \(\theta \). The aim is to estimate the values of \(\theta \) which give the best approximation of the non-parametric multivariate output distance function \(\delta (\cdot )\). Following Florens and Simar (2005) and Daraio and Simar (2007), this requires solving the following minimization problem:

$$\begin{aligned} \theta _0 =\arg \mathop {\min }\limits _\theta \left[ {\sum _{i=1}^L {\left( {\ln \delta \left( {y_i |z_i } \right) -\ln \varphi \left( {y_i ,z_i ;\theta } \right) } \right) ^{2}} } \right] . \end{aligned}$$
(6)

Assume a translog function

$$\begin{aligned} \ln \varphi \left( {y_i ,z_i ;\theta } \right) =\alpha _0 +\beta ^ {\prime }\ln y_i +\frac{1}{2}\ln y_i^{\prime }\Gamma \ln y_i +\gamma ^{\prime }\ln z_i \end{aligned}$$
(7)

in which \(\theta =\left( {\alpha _0 ,\beta ,\Gamma ,\gamma } \right) \) are the parameters of function \(\varphi \), with \(\alpha _{0}\) a scalar, \(\beta \in \mathbb {R}^{R}\), \(\gamma \in \mathbb {R}^{T}\) and \(\Gamma \in \mathbb {R}^{R\times R}\) with \(\Gamma =\Gamma ^\prime \) (see Daraio and Simar 2007). Note that (6) is given in log-terms because of the functional form chosen. Due to homogeneity of degree one in \(y_{i}\), it has to hold that \(\beta ^{\prime }\cdot i_R =1\) and \(\Gamma \cdot i_R =0\), with \(i_{R}\) the vector of ones of size R.Footnote 9 It follows that for the translog function, (6) can be rewritten as [with values \(\delta _i =\delta \left( {y_i |z_i } \right) \) following from (5)]

$$\begin{aligned} \theta _0= & {} \arg \mathop {\min }\limits _\theta \left[ {\sum _{i=1}^L {\left( {\ln \delta _i -\left( {\alpha _0 +\beta ^{\prime }\ln y_i +\frac{1}{2}\ln y_i^{\prime } \Gamma \ln y_i +\gamma ^{\prime }\ln z_i } \right) } \right) ^{2}} } \right] \\= & {} \arg \mathop {\min }\limits _\theta \left[ {\sum _{i=1}^L {\left( {-\ln y_{i1}^*-\left( {\alpha _0 +\beta _{-1}^{\prime } \ln \tilde{y}_{i,-1} +\frac{1}{2}\ln \tilde{y}_{i,-1}^{\prime } \Gamma _{22} \ln \tilde{y}_{i,-1} +\gamma ^{\prime }\ln z_i } \right) } \right) } ^{2}} \right] \end{aligned}$$

with \(y_{i1}^*=y_{i1} /\delta _i \) the values of \(y_{i1}\) projected on the frontier and \(\tilde{y}_{i,-1} =y_{i,-1} /y_{i1} =y_{i,-1}^*/y_{i1}^*\).
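
The step from the first to the second line can be written out as follows. This is a sketch under the definitions above, reading \(\beta _{-1}\) as \(\beta \) without its first element and \(\Gamma _{22}\) as the block of \(\Gamma \) belonging to outputs 2, ..., R (a reading of the notation that this section does not state explicitly). Homogeneity of degree one in y allows the first output to be factored out, and the first element of \(y_i /y_{i1}\) equals one, so its logarithm vanishes:

$$\begin{aligned} \ln \varphi \left( {y_i ,z_i ;\theta } \right)&=\ln y_{i1} +\ln \varphi \left( {y_i /y_{i1} ,z_i ;\theta } \right) \\&=\ln y_{i1} +\alpha _0 +\beta _{-1}^{\prime } \ln \tilde{y}_{i,-1} +\frac{1}{2}\ln \tilde{y}_{i,-1}^{\prime } \Gamma _{22} \ln \tilde{y}_{i,-1} +\gamma ^{\prime }\ln z_i . \end{aligned}$$

Subtracting this from \(\ln \delta _i \) and using \(\ln y_{i1}^*=\ln y_{i1} -\ln \delta _i \) gives the term \(-\ln y_{i1}^*\) that appears in the second line.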

In words, to estimate the best parametric approximation of the multivariate output distance function, the output values are projected on the frontier using the distance values estimated in the first stage, after which the frontier function

$$\begin{aligned} \ln y_{i1}^*=-\left( {\alpha _0 +\beta _{-1}^{\prime } \ln \tilde{y}_{i,-1} +\frac{1}{2}\ln \tilde{y}_{i,-1}^{\prime } \Gamma _{22} \ln \tilde{y}_{i,-1} +\gamma ^{\prime }\ln z_i } \right) \end{aligned}$$
(8)

is estimated using OLS. Using the conditions on \(\beta \) and \(\Gamma \) as given above, distance function \(\varphi \left( {y_i ,z_i ;\theta } \right) \) immediately follows. One of the major advantages of this approach is that no restrictive homoscedasticity or distributional assumptions have to be made for the error term. A disadvantage, because of the first-stage estimation, is that in the second stage standard errors have to be obtained using a computationally intensive bootstrapping procedure (see Florens and Simar 2005).
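
The second-stage regression can be illustrated with a short numerical sketch. It is not the authors' implementation: the function name `fit_translog_frontier`, the use of NumPy least squares and the array layout are assumptions made for illustration, and the bootstrap for standard errors is omitted. The sketch projects the first output on the frontier, regresses \(-\ln y_{i1}^*\) on the translog terms of (8), and recovers \(\beta \) and \(\Gamma \) from the homogeneity conditions.

```python
import numpy as np

def fit_translog_frontier(y, z, delta):
    """Illustrative second-stage OLS for the translog frontier (8).

    y     : (L, R) strictly positive outputs, first column is output 1
    z     : (L, T) strictly positive conditional variables
    delta : (L,)  first-stage output distance estimates
    """
    L, R = y.shape
    ln_y1_star = np.log(y[:, 0] / delta)      # output 1 projected on the frontier
    ln_yt = np.log(y[:, 1:] / y[:, [0]])      # normalised outputs, shape (L, R-1)
    ln_z = np.log(z)

    # Second-order terms: 1/2 on squares, 1 on cross products, so that each
    # regression coefficient equals the corresponding element of Gamma_22.
    iu, ju = np.triu_indices(R - 1)
    weights = np.where(iu == ju, 0.5, 1.0)
    quad = ln_yt[:, iu] * ln_yt[:, ju] * weights
    X = np.column_stack([np.ones(L), ln_yt, quad, ln_z])

    # (8) reads ln y1* = -(alpha0 + ...), hence the regressand is -ln y1*.
    coef, *_ = np.linalg.lstsq(X, -ln_y1_star, rcond=None)

    alpha0 = coef[0]
    beta_rest = coef[1:R]
    gamma22 = np.zeros((R - 1, R - 1))
    gamma22[iu, ju] = coef[R:R + iu.size]
    gamma22 = gamma22 + gamma22.T - np.diag(np.diag(gamma22))
    gamma_z = coef[R + iu.size:]

    # Homogeneity of degree one: beta sums to one, rows of Gamma sum to zero.
    beta = np.concatenate([[1.0 - beta_rest.sum()], beta_rest])
    Gamma = np.zeros((R, R))
    Gamma[1:, 1:] = gamma22
    Gamma[1:, 0] = Gamma[0, 1:] = -gamma22.sum(axis=1)
    Gamma[0, 0] = gamma22.sum()
    return alpha0, beta, Gamma, gamma_z
```

The returned `beta` and `Gamma` are the full parameter sets of (7), so they can be used directly in the opportunity cost and returns to scope expressions that follow.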

As a final step, for each grid cell opportunity costs or trade-offs between the output combinations are derived in physical and monetary terms. For simplicity of notation, the subscript i in \(y_{i}\), indexing the observations \(i \in \{1, {\ldots }, L\}\), is dropped. Below, \(y_{r}\) denotes the rth element of vector y, with \(r \in \{1, {\ldots }, R\}\). The slope of the frontier function (8) at the output values of a grid cell represents the marginal rate of transformation (see footnote 10). This gives the opportunity costs: the output foregone due to an increase in one of the other outputs. This opportunity cost ratio can be derived using the duality relationship between the benefit function and the distance function (Färe and Grosskopf 2000). For an output price vector \(p \in \mathbb {R}^{R}\), the maximum benefits attainable are defined as \(B\left( p \right) =\sup _y \left\{ {p^{\prime }y|(y,z)\in \Psi } \right\} \). As \(y/{\delta \left( {y|z} \right) }\) is a feasible output vector on the frontier of \(\Psi \), it has to hold that \(B\left( p \right) \ge p^{\prime }y/{\delta \left( {y|z} \right) }\) and so \(\delta \left( {y|z} \right) =\max _p \left[ {{p^{\prime }y}/{B\left( p \right) }} \right] \). Taking the partial derivative, it follows that for each output r = 1, ..., R

$$\begin{aligned} \frac{\partial \delta \left( {y|z} \right) }{\partial y_r }=\frac{p_r }{B\left( p \right) } \end{aligned}$$
(9)

This marginal distance function can be derived from (7). If the market price is known for one of the outputs, e.g. for the first output, (9) can be used to derive opportunity costs in monetary terms for the other outputs. For the translog distance function, opportunity costs are

$$\begin{aligned} p_r =p_1 \cdot \frac{{\partial \delta \left( {y|z} \right) }/{\partial y_r }}{{\partial \delta \left( {y|z} \right) }/{\partial y_1 }}=p_1 \frac{y_1 }{y_r }\left( {\frac{\beta _r +\Gamma _r^{\prime } \ln y}{\beta _1 +\Gamma _1^{\prime } \ln y}} \right) \end{aligned}$$
(10)

with \(\Gamma _{r}\) the rth row of matrix \(\Gamma \).
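
As an illustration of (10), the sketch below evaluates the opportunity costs for one grid cell. The function name `opportunity_costs` and the zero-based indexing (output 1 is index 0) are assumptions of this example, not part of the original analysis.

```python
import numpy as np

def opportunity_costs(y, beta, Gamma, p1):
    """Opportunity costs p_r from (10), with output 1 (index 0) priced at p1."""
    y = np.asarray(y, dtype=float)
    beta = np.asarray(beta, dtype=float)
    Gamma = np.asarray(Gamma, dtype=float)
    A = beta + Gamma @ np.log(y)          # A_r = beta_r + Gamma_r' ln y
    return p1 * (y[0] / y) * (A / A[0])   # element 0 reproduces p1
```

The first element of the result always equals `p1`, which offers a quick consistency check when the function is applied to estimated coefficients.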

For example, if \(y_{1}\) is defined as agricultural production and \(p_{1}\) its market price, the second stage gives opportunity costs of the non-monetary outputs in terms of foregone agricultural revenues. These opportunity costs show in a positive (not a normative) way the trade-offs between the monetary and non-monetary outputs and can provide useful information to support decisions on whether society is willing to make this trade-off.

3.3 Estimation of Returns to Scope

The opportunity cost estimation above shows in monetary terms the trade-offs between two output variables, one of which is traded in the market. For policy advice, however, it is also relevant to know how the other, non-marketed outputs are related to each other. Especially for non-marketed ecosystem services it is relevant to know whether joint production yields efficiency gains. The scope properties of the estimated frontier can provide important information in this context.

Economies of scope have received relatively little attention in the applied literature (see e.g. Paul and Nehring (2005) for an application to agriculture and Pope and Johnson (2013) and Preyra and Pink (2006) for applications to health care). Application of the theoretical work of Panzar and Willig (1981) gives relevant insights into the strategic question of whether efficiency gains can be obtained from specialization or from diversification. The same strategic management question can be raised for land use within each grid cell: should a cell specialize in agriculture or biodiversity conservation, or should it attempt to realize both? In the case of a concave frontier function a certain level of diversification is preferable within each grid cell, whereas for a convex frontier it is more efficient to specialize. In the latter case the losses in one output variable will be more than proportionally compensated for by the gains in the other output variable, and specialization across grid cells may lead to a win-win situation where the aggregate of each output variable benefits.

Pope and Johnson (2013) discuss how to derive the scope properties for a multi-output production frontier in separation from scale properties and without relying on price information. They show that positive returns to scope exist between two outputs if the frontier is concave for these outputs. Returns to scope are negative if the frontier for the two outputs is a convex function. More formally, for the frontier \(F(y|z) = 0\)—for a proof see Pope and Johnson (2013),

$$\begin{aligned}&\text {Positive returns to scope exist between } y_{m} \text { and } y_{n} \text { if } \frac{\partial ^{2}y_n }{\partial y_m^2 }\le 0.\\&\text {Negative returns to scope exist between } y_{m} \text { and } y_{n} \text { if } \frac{\partial ^{2}y_n }{\partial y_m^2 }\ge 0. \end{aligned}$$

with \(m,n \in \{1,{\ldots },R\}\). For frontier function \(F\left( {y,z} \right) =\alpha _0 +\beta ^{\prime }\ln y+\frac{1}{2}\ln y^{\prime }\Gamma \ln y+\gamma ^{\prime }\ln z=0\) we can show that

$$\begin{aligned} \frac{\partial ^{2}y_n }{\partial y_m^2 }=-\frac{\partial y_n }{\partial y_m }\cdot \frac{1}{y_m }\cdot \left( {1+\frac{\tau _{nm} }{A_n }-\frac{\tau _{mm} }{A_m }} \right) \end{aligned}$$
(11)

with \(\tau _{nm}\) the (n, m)th element of matrix \(\Gamma \), \(\Gamma _n^{\prime } =\left[ {\tau _{n1} ,\ldots ,\tau _{nR} } \right] \), \({\partial y_n }/{\partial y_m }=-{y_n }/{y_m }\cdot {A_m }/{A_n }\) and \(A_n =\beta _n +\Gamma _n^{\prime } \ln y\).
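
Expression (11) can be evaluated per grid cell with a few lines of code. The sketch below implements (11) exactly as stated; the function name `frontier_curvature` and the zero-based output indices are assumptions of this example.

```python
import numpy as np

def frontier_curvature(y, beta, Gamma, m, n):
    """Second derivative of y_n with respect to y_m as given in (11).

    m and n are zero-based output indices. A value <= 0 indicates positive
    returns to scope, a value >= 0 negative returns to scope.
    """
    y = np.asarray(y, dtype=float)
    beta = np.asarray(beta, dtype=float)
    Gamma = np.asarray(Gamma, dtype=float)
    A = beta + Gamma @ np.log(y)                          # A_n = beta_n + Gamma_n' ln y
    slope = -(y[n] / y[m]) * (A[m] / A[n])                # dy_n / dy_m
    bracket = 1.0 + Gamma[n, m] / A[n] - Gamma[m, m] / A[m]
    return -slope / y[m] * bracket
```

With \(\Gamma =0\) the translog reduces to a Cobb-Douglas form, for which the function returns a positive value: a convex frontier and hence negative returns to scope in the terminology used here.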

4 Data

We illustrate the approach discussed above for a case study area of eighteen Central and Eastern European countries—see Fig. 2. We employ detailed biophysical simulation models to generate ecosystem services output variables at the level of grid cells of approximately 50 \(\times \) 50 km. The study area is divided into 1166 grid cells. Some of the data is available at a higher resolution. For presentation purposes, it is aggregated to a grid cell level of 50 \(\times \) 50 km.

Fig. 2 Map of the Central and Eastern European countries and four sub-regions considered

Data were provided by the integrated assessment model IMAGE (Bouwman et al. 2006; Stehfest et al. 2014), the biodiversity model GLOBIO (Alkemade et al. 2009) and additional ecosystem services models (Schulp et al. 2012). These models have been calibrated using statistics on land-cover, land-use, input use and agricultural production for the period 1970–2000. Results from these models were recently used for the OECD Environmental Outlook (OECD 2012), UNEP Global Environmental Outlook (UNEP 2012) and the Global Biodiversity Outlook (Secretariat of the Convention on Biological Diversity 2014).

The Integrated Model to Assess the Global Environment (IMAGE) is a modelling framework to simulate the environmental consequences of drivers such as climate change, land-use change, biodiversity loss, modified nutrient cycles, and water scarcity. IMAGE consists of interlinked modules for land-use, crop growth, vegetation, water and climate and the agricultural economy. The agricultural economy module is based on the computable general equilibrium model GTAP extended with a highly disaggregated agricultural sector. It includes a land supply function that accounts for the availability and suitability of land for agricultural use, based on biophysical information from IMAGE. A nested land-use structure accounts for the differences in substitutability of the various types of land use (Van Meijl et al. 2006). Detailed descriptions of IMAGE and of its agricultural economy module are provided by Stehfest et al. (2014) and Woltjer and Kuiper (2014), respectively.

The GLObal BIOdiversity model GLOBIO calculates the impact of environmental drivers on biodiversity. The model is based on cause-effect relationships between biodiversity in different biomes and a number of environmental drivers (land use change, atmospheric nitrogen deposition, infrastructure, fragmentation and climate change). The model uses spatial information on environmental drivers from the IMAGE model. GLOBIO consists of modules for terrestrial ecosystems, the freshwater environment and marine ecosystems. Impacts on biodiversity are given in terms of the biodiversity indicator Mean Species Abundance (MSA). MSA is a measure of natural intactness and measures the mean abundance of original species relative to their abundance in an undisturbed ecosystem. Details of GLOBIO are provided by Alkemade et al. (2009).

IMAGE and GLOBIO use current land use patterns for each grid cell as input. Shares of different types of land-use (share of each cell covered with agricultural land, grassland, shrub land, forest or cultivated land) are based on the GLC2000 land-use map (EC-JRC 2003). Table 1 shows the frequency distribution of the land use types over the grid cells. Of the total land area, only 1 % is classified as ‘artificial surface and associated areas’, 57 % is used for agriculture and grazing and 42 % is nature (different types of forest and shrubland). The table shows that most cells have at least some forest or shrubland and some agricultural land. Only a small number of cells are almost entirely used as agricultural land or covered with forest.

Table 1 Frequency distribution of the fractions of a cell used for a particular land use

For the empirical application, we used simulation results of IMAGE and GLOBIO for the year 2000. This yields a vector of outputs for each grid cell. The following output variables are included in the empirical analysis.

  • Agricultural revenues (provisioning services; in 2000 international $/km\(^{2})\). For each cell total revenues for the production of cereals, grass, maize, pulses, roots, tubers, and oil crops are calculated based on land-use data from the GLC2000 map, the cropping pattern, cropping intensities and potential yields from IMAGE, and prices from FAOstat. Aggregate production per crop is based on FAO data, which are allocated over the cells using IMAGE and the GLC2000 land use pattern. Note that calculating net revenues is not feasible because production costs are difficult to estimate reliably.

  • Cultural services: a composite index consisting of attractiveness for tourism and recreation and for hunting and gathering activities (Schulp et al. 2012). Tourist and recreation attractiveness is an index ranging from 0 (unattractive) to 1 (attractive) and depends on the percentage protected area, percentage urban and arable land, distance to coast and geographic relief. Potential for gathering and hunting is based on statistics from FAO and the European Forestry Institute.

  • Biodiversity: As discussed above, we measure biodiversity as mean species abundance (MSA) as calculated by GLOBIO. MSA is contingent upon land-cover, habitat, percentage of the cell covered with certain vegetation, nitrogen deposition, and distance to roads and cities.

  • Carbon sequestration: Carbon sequestration is used as a proxy variable for climate regulation. It is measured as net biome productivity in tonnes C per km\(^{2}\), which is calculated as net primary production minus soil respiration minus the carbon sequestered in the biomass harvested. For respiration and sequestration factors, long-term averages are taken that are a function of land-use, climate and location conditions. Data generated by IMAGE are based, among other sources, on the GLC2000 land-use map and the EURURALIS carbon model (see e.g. Schulp et al. 2008).

Land-use is the only input variable. To proxy for the heterogeneity in the operational and natural environment in which land-use across grid cells takes place a number of conditional variables are included (see Sect. 2). The conditional variables included are composite proxy variables. The following conditional variables are used:

  • Potential yield: potential yield of temperate zone cereals in tonnes/km\(^2\). This conditional variable is a composite proxy variable depending on climate, soil and slope characteristics. Temperate zone cereals (wheat, rye, corn, barley, oats) are chosen as these are the main crops grown (around 60 % of the cropland is covered with temperate zone cereals).

  • Share of agricultural and grassland: share of each cell used for production of agricultural crops and for grazing. In the analysis, a distinction is made between arable land, grassland, forests, shrub and herbaceous land, and artificial surface.

  • Sub-region typology: categorical variable reflecting differences in historical, political and social development patterns which may affect the technical possibilities available to the regions. Four sub-regions are considered: (1) Member countries of the Commonwealth of Independent States (CIS): Belarus, Estonia, Latvia, Lithuania, Moldova and Ukraine; (2) Central European countries (CE): Czech Republic, Hungary, Poland and Slovakia; (3) The republics that formerly constituted Yugoslavia (YUG): Bosnia, Croatia, Macedonia, Serbia and Slovenia; (4) The south-eastern European countries (SE): Albania, Bulgaria and Romania (see Fenger 2007).

  • GDP PPP per \(\hbox {km}^{2}\) for the year 2000 (in international $/\(\hbox {km}^{2})\): GDP levels per grid cell are based on World Bank data for GDP per country, agricultural shares in GDP and for urban and rural population per cell from the IMAGE modelling system. National GDP is allocated over the grid cells by considering differences in income for the rural and urban population.

The question can be raised to what degree these four proxy variables capture the essential characteristics of a grid cell, and how reasonable it is to estimate a production possibility frontier for a given grid cell using other grid cells chosen for their similarity as captured by the vector of conditioning variables. We intend these four proxy variables to capture the major factors affecting regional potentials. Potential yield depends on ecological, climatological, hydrological, morphological, geographical and agronomic characteristics of each cell. The share of agricultural land conditions the agricultural revenues that can be generated. GDP per \(\hbox {km}^{2}\) reflects population density, regional welfare and therefore also partly captures other differences due to variation in human capital and other factors influencing productivity. Finally, the sub-regional typology reflects differences in agronomic and socio-cultural backgrounds that also affect productivity and human capital.

The selection of the type and number of conditional variables is an empirical question as there is no test to underpin the choice. In addition, the number of conditional variables that can be included is dependent upon the number of observations. The more conditional variables, the lower the number of matching cells and so the less accurate the estimates of the frontiers’ shape will be. We repeated our analysis with several variants of the vector of conditional variables in which we included or excluded individual conditional variables. As the nature and pattern of the results did not change significantly, we conclude that the four proxy variables included in this paper satisfactorily represent the essential characteristics of the grid cells.

Table 2 Averages and standard deviations of the different variables included for the different sub-regions and different land-covers (standard deviations are given in brackets)
Table 3 Correlation coefficients of the variables included in the analysis

Maps of the base data are shown in the Appendix. Tables 2 and 3 give some descriptive statistics. The large standard deviations for some variables reflect differences in population density and differences between average country development levels. Signs of correlation coefficients are as expected and related to land-cover and land-use. Agricultural production is higher in cells with higher percentages of agricultural land. MSA, cultural services and carbon sequestration levels are higher in areas with fewer cultivated plots and are therefore higher in cells with lower agricultural production. Levels of cultural services are higher in areas with higher MSA levels as these areas are more attractive for recreation and hunting. These are, generally, also the areas sequestering more carbon. The high correlation between some of the ecosystem services corresponds with similar observations by Raudsepp-Hearne et al. (2010) for Canada.

5 Results

The two-stage approach discussed in Sects. 2 and 3 was applied to the data discussed above (see footnote 11). We are particularly interested in the following questions: What are the shape characteristics of the frontier function? What are the opportunity costs of marginal changes in the different output variables, and to what extent do they depend on regional characteristics? Are there economies or diseconomies of scope? This information can guide decision makers in targeting the regions that have a comparative advantage for conservation or for agricultural development.

The estimation results, discussed in detail below, can be summarized in the following three main conclusions:

  • The production possibility frontier, showing the Pareto optimal output combinations, is non-concave. This has implications for the interpretation of the opportunity costs.

  • Differences in trade-offs are large. Each country has regions that have a comparative advantage for a particular ecosystem service. Opportunity costs for provisioning services (agriculture) are in general higher in the regions characterised by higher agricultural production values. In contrast, enhancing carbon sequestration levels or biodiversity is less expensive in regions with higher carbon sequestration or biodiversity levels.

  • The relationship between agricultural revenues and the other output variables (biodiversity, carbon sequestration and cultural services) exhibits diseconomies of scope. As a result, specialization in agricultural production or any of the other ecosystem services considered may lead to efficiency gains. Reallocation of land use across the cells, where cells more or less specialize in the outputs for which they have a comparative advantage, leads to a win-win situation in terms of the aggregate output.

5.1 Shape of the Production Possibility Frontier

We first test for concavity of the production possibility frontier or, similarly, for convexity of the output distance function. Few applied studies test for concavity of the frontier function (O’Donnell and Coelli 2005). As discussed above, most studies estimating opportunity costs with frontier methods simply impose it by the particular choice of the functional form or by adding concavity constraints (see e.g. Färe et al. 2005; Bostian and Herlihy 2014). This ensures the production possibility frontier is concave, but as Pope and Johnson (2013) argue, this is merely “for its analytical convenience rather than its economic realism” (p. 241). Curvature violations have consequences for the interpretation of the opportunity costs because the duality assumption between the distance function and benefit function no longer holds. For an observation on a concave frontier, opportunity cost \(p_{m}\) in (9) is a benefit maximizing opportunity cost. For a downward sloping but convex frontier this does not apply. In all cases, however, the opportunity cost ratio (10) reflects the trade-off between \(y_{m }\) and \(y_{1}\) in the neighbourhood of the observation y.

We tested for concavity in two ways: by means of assessing the returns to scope of the frontier function and the Hessian of the distance function. The coefficients of the Shephard output distance function (7) and frontier function (8) are listed in Table 4. Based on the coefficients in Table 4 and using (11), the returns to scope were determined for each grid cell. The second order derivatives of the frontier function (11) were found to be positive in all grid cells for the pairs of agricultural revenues with any of the other output variables. They are negative for the pairs of cultural services with biodiversity or carbon. Finally, for the output pair of biodiversity and carbon sequestration, they are positive for some but negative for other grid cells. These results imply that the trade-off between agricultural revenues and the other outputs, and for some cells also the trade-off between biodiversity and carbon sequestration, is non-concave in shape.

Table 4 Parameter estimates of translog functions (7) and (8)\(^{1} \)

The non-concavity of the frontier (or non-quasi-convexity of the distance function) is confirmed by the eigenvalues of the Hessian of the distance function. All grid cells have both positive and negative eigenvalues, implying that the frontier has a saddle point. This result is robust, as the same result is also obtained for model formulations with more or fewer output and conditional variables and for subsets of data containing only the data-rich portions of the sample. As an illustration, Fig. 3 visualises the frontier in 3-dimensional plots for the sub-region SE, each time fixing one of the output variables and the conditional variables at their mean value. As the plots are extrapolations, they should be interpreted with care, especially at the boundaries and for the areas with few estimates.
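
One way to carry out such an eigenvalue check for the translog distance function (7) is sketched below. This is an illustration under stated assumptions rather than the procedure used for the reported results: the function name `distance_hessian_eigenvalues` is hypothetical, and the Hessian is computed analytically with respect to the outputs and divided by the positive value \(\varphi \), which leaves the signs of the eigenvalues unchanged.

```python
import numpy as np

def distance_hessian_eigenvalues(y, beta, Gamma):
    """Eigenvalues of the output Hessian of the translog distance function,
    scaled by 1/phi (phi > 0, so the signs are those of the true Hessian).

    For ln phi as in (7), with A = beta + Gamma ln y:
        H_rs / phi = (A_r A_s + Gamma_rs - [r == s] A_r) / (y_r y_s)
    """
    y = np.asarray(y, dtype=float)
    beta = np.asarray(beta, dtype=float)
    Gamma = np.asarray(Gamma, dtype=float)
    A = beta + Gamma @ np.log(y)
    core = np.outer(A, A) + Gamma - np.diag(A)
    H = core / np.outer(y, y)
    return np.linalg.eigvalsh(H)      # ascending order; Gamma assumed symmetric
```

Because the distance function is homogeneous of degree one in y, one eigenvalue is zero up to rounding error, so a small tolerance is advisable when counting positive and negative eigenvalues.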

Fig. 3 3-Dimensional plots of the frontier for the sub-region SE. Note: The plots for the other three sub-regions are similar. Contour plots are given on the xy plane. The dots on the frontier are the observations on which the regression is based, projected on the frontier. To draw the different plots, the output not given in the plot and the conditional variables are fixed at their mean values. For the range of values given by the xy coordinates, the corresponding level of z-values is determined using (8)

Dasgupta and Maler (2003) and Tschirhart (2012) argue that violations of the convexity assumptions are common. In fact, Dasgupta and Maler (2003) argue that “the word ‘convexity’ is ubiquitous in economics, but absent from ecology”. Our results show that non-convexity arises with the production of multiple ecosystem services given a fixed input (land). Other applied studies have come to the same conclusion when analysing the joint production of multiple ecosystem services (including agriculture, biodiversity, carbon and regulating services); see e.g. Bowes and Krutilla (1989), Boscolo and Vincent (2003), Tschirhart (2012), Vincent (2012), and Hart et al. (2014). There is a caveat in our case. The translog function used to approximate the frontier function in the second stage may also have contributed to the result. Other studies have noted that translog functions violate regularity conditions more often than other flexible functional forms (Sauer 2006). In addition, the use of the FDH approach in the first stage may have contributed to the concavity violations, especially in the data-poor parts of the output space, because of its known sensitivity to outliers. Despite this caveat, following Brown et al. (2011), the results suggest that it is crucial to test the standard concavity assumption before making policy recommendations for individual ecosystem services. The price system can be an efficient allocation mechanism if transformation possibilities constitute a convex set. However, in non-convex environments, taxes or subsidies are untenable and command-and-control regulation is likely the only admissible policy.

5.2 Opportunity Costs and Returns to Scope

The second set of results concentrates on opportunity costs. Using equation (10) and the coefficients in Table 4, opportunity costs or marginal rates of transformation were estimated for each grid cell (see Tables 5, 6 and Fig. 4, and footnote 12). These reflect the agricultural revenue foregone due to a marginal increase in one of the other output variables.

Table 5 Estimated opportunity cost for biodiversity (MSA), cultural services and carbon
Table 6 Median opportunity costs and standard deviations for biodiversity (MSA), cultural services and carbon by country

As shown in Fig. 4, opportunity costs differ substantially both across the four sub-regions distinguished in the application (CIS, CE, YUG and SE) and within countries. A further analysis of the opportunity costs for carbon sequestration reveals interesting patterns. The opportunity costs are in general higher in grid cells characterized by high levels of agricultural revenues (according to the GLC2000 land-use map and the simulation results). These are the important agricultural production areas with above-average agricultural potentials and shares of agricultural land, especially in Ukraine, Romania, Bulgaria, Serbia and Poland. Due to the higher yield potentials, foregone agricultural revenues are higher in these main agricultural production areas (note that potential yields are assumed homogeneous throughout a cell in the parts of the cell that are suitable for agriculture). Because these grid cells have low shares of forest land, carbon sequestration levels are relatively low.

Fig. 4 Maps of opportunity costs per cell for mean species abundance ($ per % MSA – left panel), cultural services ($ per % cultural services index – middle panel) and carbon sequestration ($ per tonne carbon – right panel). Note: Classification of the cells is such that each colour corresponds with 10 % or 20 % of the observations. Grey cells are non-monotonic observations or outliers. The cells with low estimated opportunity costs (the yellow cells) have a comparative advantage for delivery of biodiversity, cultural services or carbon sequestration, respectively. (Color figure online)

The cells with a comparative advantage in sequestering more carbon, i.e. those with low opportunity costs, are generally the grid cells characterized by less agricultural production but higher sequestration levels. These are especially the areas surrounding the Alps in Slovenia, the Carpathian Mountains ranging from Slovakia to Romania and the Pinsk Marshes on the border of Belarus and Ukraine. The share of agricultural and grassland in these grid cells is generally lower than in the cells that make up the important agricultural production areas. It may well be that in these cells less carbon can be sequestered per unit of land than in the productive lands. Yet the agricultural revenues foregone to sequester an extra unit of carbon are still lower in these cells because of their lower yield potentials. This implies that it may be cost-effective to have a certain level of specialization per cell, with those cells having a comparative advantage in sequestration focusing on carbon instead of attempting to improve all services simultaneously.

The estimates for carbon sequestration were compared to those from other studies. Antle et al. (2003) estimate marginal opportunity cost of 20–100 per ton carbon sequestered based on five cropping systems in the US. Similarly, MacLeod et al. (2010) estimate a carbon abatement cost curve for agricultural emissions from crops and soils for the UK and find that 11.5 % of agricultural emissions can be abated at a marginal cost of £168 \(\approx \) $261 per ton carbon sequestered. In both of these studies, the opportunity cost estimates are given in terms of net farm revenue and not in terms of gross revenue, as in our case. To transform gross to net revenue, the social profit rate should be used, which relates the value added of an economic activity to the value of gross output of this activity at world prices. Hughes and Hare (1994) provide an estimate of 7.25 % for the average medium-run social profitability of agriculture in Eastern Europe using this rate. Our average opportunity cost of $263 of gross revenue lost due to an extra ton of carbon sequestered implies a loss of net revenues of $19 per ton of carbon sequestered, which is relatively low compared to the studies mentioned above.
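
The conversion from gross to net revenue is a single multiplication of the two figures given in the text, the gross opportunity cost and the social profit rate:

$$\begin{aligned} \$263\times 0.0725\approx \$19.07 \end{aligned}$$

per ton of carbon sequestered, which rounds to the $19 reported above.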

For biodiversity and cultural services, opportunity costs are again higher for the grid cells in the main agricultural areas. Thus, similar to carbon sequestration, the cells characterized by high levels of agricultural productivity have a comparative disadvantage in providing more biodiversity and cultural services. The relationship between biodiversity levels and their opportunity costs is, however, more complex than for carbon sequestration (compare Fig. 4 with the maps in the Appendix). The grid cells with a comparative advantage in providing biodiversity are generally the biodiversity-rich cells, but some cells with low or intermediate biodiversity levels have low opportunity costs as well. For many biodiversity-rich cells, especially those in the mountainous and marsh areas, agricultural potentials are low and therefore the loss of agricultural revenues due to a marginal increase of biodiversity is low. The results further show a more than proportional increase in MSA with an increase in forested land for those grid cells with larger shares of forested land. This is likely due to a simultaneous reduction in pressures (reduced nitrogen deposition, reduced disturbance from roads or urbanized areas) leading to positive feedback effects. Such positive feedback effects are absent if a cell contains mainly agricultural land. Low opportunity costs in cells with low and intermediate biodiversity levels occur especially in the more urbanized cells, which are characterized by low MSA levels and low agricultural revenues but higher levels of non-agricultural economic development. For these cells opportunity costs in terms of the loss of agricultural revenues may be low. Opportunity costs in terms of economic development will likely be higher in these cells, though. For cultural services, more or less the same pattern is observed as for biodiversity. As there are no related studies using similar biodiversity and cultural service indicators, it is difficult to directly compare our opportunity cost estimates for these two ecosystem services with those from other studies.

Finally, Fig. 4 visualises which regions have comparative advantages (low opportunity costs) in the provision of biodiversity, cultural services or carbon sequestration. There is considerable variation in opportunity costs within each country, but each country has regions where increasing any of the outputs considered is more cost-effective. Interesting from a cost-effectiveness perspective are regions with low opportunity costs. Investments in the outputs having low opportunity cost are more cost-effective in these areas than in other areas.

Similar conclusions can be drawn on the basis of the returns to scope characteristics of the frontier. Above, we showed that there are diseconomies of scope between agricultural revenues and the other three ecosystem services considered. Moreover, cultural services exhibit economies of scope characteristics with biodiversity and carbon sequestration. Biodiversity and carbon sequestration exhibit both economies and diseconomies of scope, depending on grid cell characteristics. This implies that efficiency gains can come from specialization, either in agriculture or in a combination of the other ecosystem services considered—see also Vincent (2012). The loss of e.g. biodiversity in a region specializing in agricultural production will be more than compensated by the gain of biodiversity in a region that specializes in biodiversity conservation.

The mixed results for the returns to scope between biodiversity and carbon sequestration become evident when comparing the maps for biodiversity and carbon in Fig. 4. Only a limited number of cells have low opportunity costs for carbon sequestration and biodiversity conservation simultaneously. Many cells have high opportunity costs for one of them, making it costly to increase both simultaneously. In the cells with low opportunity costs for both output variables, a win-win situation can be obtained. In that case, an investment in biodiversity, for example, also results in higher levels of carbon sequestration (although at the expense of agricultural revenues, as agricultural land has to be taken out of production). Grid cells with a large share of extensive grassland or monoculture forests have low opportunity costs for sequestration. These cells already provide a high level of sequestration and have a comparative advantage in providing more, but in most of them opportunity costs for MSA are considerable. Low opportunity costs for both MSA and carbon sequestration can be found only in cells with a large share of non-agricultural land and suitable biodiversity characteristics. Only in those cells are external pressures low and ecosystem processes complementary rather than competitive for carbon sequestration and biodiversity generation.

5.3 Policy Implications

To illustrate the policy consequences of the observed economies and diseconomies of scope, and the potential gains from smart land management, we compared two management regimes. In both regimes a social planner aims to increase biodiversity and carbon sequestration. In the first, equity-based management regime, this change has to be realised within each cell. In the second, optimal management regime, the increase in biodiversity and carbon sequestration can be realised across all grid cells.

To implement the second regime, the social planner maximizes agricultural production over the entire area, subject to four constraints. First, total biodiversity in each country has to change by a certain percentage. Secondly, total carbon sequestration over all cells together has to increase by a certain percentage. Thirdly, the total level of cultural services in each country remains constant. Finally, land use cannot change too much within each cell (Footnote 13). Table 7 illustrates the effect of the imposed changes in terms of agricultural revenues. Note that the results are for the comparative static situation and show neither the path towards nor the full economic costs of the imposed change in land use.
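The planner's problem described above can be written compactly as follows. This is only a sketch in notation introduced here for illustration, not the specification used in the paper: $I$ is the set of grid cells, $I_c$ the cells in country $c$, $x_i$ the land-use vector of cell $i$ with baseline $x_i^0$, and $R_i$, $B_i$, $C_i$ and $S_i$ the agricultural revenue, biodiversity, carbon sequestration and cultural services attainable in cell $i$ on the estimated frontier. The percentages $\beta$ and $\gamma$ and the land-use bound $\delta$ are placeholders. Whether the constraints are imposed as equalities or inequalities is not stated in the text, so equalities are an assumption.

```latex
\begin{aligned}
\max_{\{x_i\}_{i \in I}} \quad & \sum_{i \in I} R_i(x_i) \\
\text{s.t.} \quad
& \sum_{i \in I_c} B_i(x_i) = (1+\beta) \sum_{i \in I_c} B_i(x_i^0) \quad \forall c, \\
& \sum_{i \in I} C_i(x_i) = (1+\gamma) \sum_{i \in I} C_i(x_i^0), \\
& \sum_{i \in I_c} S_i(x_i) = \sum_{i \in I_c} S_i(x_i^0) \quad \forall c, \\
& \lVert x_i - x_i^0 \rVert \le \delta \quad \forall i \in I .
\end{aligned}
```

Under the equity-based regime the biodiversity and carbon targets would instead have to hold for every cell $i$ separately, which shrinks the feasible set and can therefore only lower the attainable agricultural revenue.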

Table 7 Simulated changes in agricultural revenues due to imposed changes on total biodiversity and carbon sequestration

Table 7 shows that a simultaneous increase of carbon and biodiversity by 10 % in each cell results in a loss of agricultural revenues of 27 %. In contrast, an optimal allocation of land use can lead to a situation where, in addition to a 10 % increase in carbon and biodiversity, agricultural revenues also increase by 20 %. Reallocating land use such that cells specialize in those activities for which they have a comparative advantage can yield a gain in agricultural revenues of 50 % without a loss in carbon and biodiversity.
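The mechanism behind this contrast can be illustrated with a deliberately simple toy example. The sketch below is not the paper's model: it uses two hypothetical cells with invented linear productivities, a single biodiversity target (carbon and cultural services are left out), and a hypothetical bound on land-use change, and it finds the planner's allocation by brute force. The frontier estimated in the paper is non-concave, which this linear toy does not capture.

```python
from itertools import product

# Toy illustration with made-up numbers; NOT the paper's model or data.
# Each cell: (agricultural revenue, biodiversity) per percentage point of land.
CELLS = {
    "fertile_plain": (10, 1),  # hypothetical: productive farmland, little nature value
    "mountain": (2, 4),        # hypothetical: poor farmland, high nature value
}
BASE_SHARE = 50   # % of land under nature in every cell at baseline (assumed)
MAX_SHIFT = 30    # hypothetical cap on land-use change per cell, in % points
TARGET_GAIN = 10  # required % increase in biodiversity


def revenue(shares):
    """Total agricultural revenue; farmland is the land not under nature."""
    return sum(CELLS[c][0] * (100 - s) for c, s in shares.items())


def biodiversity(shares):
    """Total biodiversity, linear in the nature share of each cell."""
    return sum(CELLS[c][1] * s for c, s in shares.items())


def equity_regime():
    """Every cell raises its own biodiversity by TARGET_GAIN percent."""
    return {c: BASE_SHARE * (100 + TARGET_GAIN) // 100 for c in CELLS}


def optimal_regime():
    """Maximise revenue subject to the target holding only in aggregate."""
    baseline = {c: BASE_SHARE for c in CELLS}
    required = biodiversity(baseline) * (100 + TARGET_GAIN)  # scaled by 100
    allowed = range(BASE_SHARE - MAX_SHIFT, BASE_SHARE + MAX_SHIFT + 1)
    best = None
    for combo in product(allowed, repeat=len(CELLS)):
        shares = dict(zip(CELLS, combo))
        if biodiversity(shares) * 100 < required:
            continue  # aggregate biodiversity target not met
        if best is None or revenue(shares) > revenue(best):
            best = shares
    return best


def pct_change(shares):
    """Revenue change relative to the baseline allocation, in percent."""
    base = revenue({c: BASE_SHARE for c in CELLS})
    return 100 * (revenue(shares) - base) / base


if __name__ == "__main__":
    for name, regime in (("equity", equity_regime()), ("optimal", optimal_regime())):
        print(name, regime, f"{pct_change(regime):+.1f} % revenue")
```

In this toy setting the equity regime lowers agricultural revenue, whereas the planner concentrates farming in the productive cell and conservation in the other, meeting the same aggregate biodiversity target while revenue rises. The magnitudes are artefacts of the invented numbers and should not be compared with Table 7.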

The policy implication of this is that spatial regulation that considers the comparative advantages of each region would lead to efficiency gains. Agriculture would be concentrated in certain areas and a limited number of regions would implement land use policy to promote biodiversity, carbon sequestration and cultural services simultaneously. The biodiversity degradation in areas specializing in agriculture would be more than compensated for in other areas (Footnote 14).

6 Discussion and Conclusion

The main aim of this paper was to present a method capable of providing monetary estimates of opportunity costs of ecosystem services, to capture the dependence on regional characteristics in this method and apply it to assess the existence of comparative advantages for producing particular ecosystem services across a case study area.

The method is based on the two-stage semi-parametric frontier technique (Florens and Simar 2005). Important advantages of the proposed frontier approach are that no assumptions have to be made on the concavity of the frontier and the distribution of the error term and that the approach allows multiple outputs without imposing any restrictive functional form assumptions. These advantages turned out to be crucial for our application—the empirical analysis clearly shows that concavity assumptions would have led to spurious trade-off relationships between the output variables.

The empirical implementation of the method adds to the growing literature on land-use change and ecosystem services by addressing three main questions: what are the opportunity costs of changes in ecosystem services, to what extent do they differ per region and which regions have a comparative advantage in producing particular ecosystem services? These empirical insights are helpful in the design of cost-effective policies and in understanding how the trade-offs depend on the spatial variation in biophysical interactions between ecosystem services.

The application to a case study of 18 countries in Central and Eastern Europe provides relevant policy insights into the trade-off between agricultural revenues, biodiversity (mean species abundance), cultural services and carbon sequestration. First, the production possibility frontier was found to be non-concave. While an inconvenient result from an economic point of view, this will come as no surprise to ecologists. Secondly, opportunity cost information shows that trade-offs differ substantially between regions. On average, higher income countries have lower opportunity costs than poorer countries in our sample. Within-country variation, however, is large. Generally, opportunity costs are higher in the regions characterized by higher levels of agricultural revenues. In addition, opportunity costs are lower in the regions with higher existing levels of biodiversity, cultural services or carbon sequestration. The latter regions have a comparative advantage for the delivery of more biodiversity, cultural services or carbon sequestration, and expanding these services becomes cheaper the more of them is already provided. Finally, the analysis has shown that considering returns to scope may yield substantial efficiency gains from which all services benefit. Losses of one output variable can be more than proportionally compensated for by gains in other regions.

There are several possible extensions to the application of the method. First, the number and type of policy implications offered by future analyses will benefit from fewer restrictions on data availability. If more outputs are included in the biophysical simulation models employed in our approach, more trade-offs can be analysed. This applies to both regulating and supporting ecosystem services affecting the provisioning and cultural services (pollination, erosion prevention, water infiltration and natural pest management). Similarly, it would be interesting to include changes in the intensity of land use (capturing variable inputs including fertiliser, pesticide, labour and machinery input). By including both more ecosystem services and variation in land-use intensity, trade-offs between conventional and less intensive agriculture can be analysed in more detail. To enable such an extension, reliable spatial data on variation in land-use intensity need to become available first.

Secondly, with pooled cross-section and annual data, changes in the shape and position of the frontier can be assessed. The position of the frontier may change due to technical change or changes in climate. Moreover, due to differences in economic development patterns, the evolution of country frontiers may follow different patterns. In addition, the position of each region on the frontier may change over time. Evaluation of the inter-temporal changes of the frontier and of the position on the frontier provides relevant information on the dynamic effects of land-use choices on the opportunity cost of ecosystem services. Such an analysis, however, requires dynamic non-parametric methods that still need further development.