Keywords

JEL codes

1 Introduction

Concerns about the sustainability of economic activity has led to an increasing interest in the concept of eco-efficiency and the literature on this topic has been growing in recent years (Oude Lansink and Wall 2014). The term eco-efficiency was originally coined by the World Business Council for Sustainable Development in their 1993 report ( Schmidheiney 1993) and is based on the concept of creating more goods and services using fewer resources. In turn, the OECD defines eco-efficiency as “the efficiency with which ecological resources are used to meet human needs” (OECD 1998). Clearly, the concept of eco-efficiency takes into account both the environmental and economic objectives of firms.

When evaluating firm performance in the presence of adverse environmental impacts, production frontier models are a popular tool ( Tyteca 1996; Lauwers 2009 ; Picazo-Tadeo et al. 2011; Pérez-Urdiales et al. 2015). The measurement of eco-efficiency in a frontier context, which Lauwers (2009) refers to as the ‘frontier operationalisation’ of eco-efficiency, involves comparing economic results derived from the production of goods and services with aggregate measures of the environmental impacts or ‘pressures’ generated by the production process. To date, only the non-parametric Data Envelopment Analysis (DEA) method has been used in the literature. While DEA has many advantages, it has the drawback that it can be extremely sensitive to outliers and measurement errors in the data.

In the present work we propose a Stochastic Frontier Analysis (SFA) approach to measuring eco-efficiency, which has the advantage that it well-suited to dealing with measurement errors in the data. Using a stochastic frontier model to measure eco-efficiency involves the estimation of only a few parameters, so the model can be implemented even when the number of observations is relatively small. Moreover, the SFA approach permits an analysis of the potential substitutability between environmental pressures and can incorporate determinants of eco-efficiency in a one-stage procedure.

We illustrate our simple proposal with an empirical application using a sample of 50 dairy farmers from the Spanish region of Asturias. This data set includes information from a questionnaire specifically carried out to permit the accurate measurement of eco-efficiency and provides information on farmers’ socioeconomic characteristics and attitudes towards the environment, and has been used by Pérez-Urdiales et al. (2015) to measure eco-efficiency and identify its determinants using the DEA-based bootstrapped truncated regression techniques of Simar and Wilson (2007). The results from that paper therefore provide a useful point of comparison for the results from our proposed stochastic frontier model.

The paper proceeds as follows. In Sect. 12.2 we discuss the concept of eco-efficiency and the DEA approach often used to estimate eco-efficiency scores. Section 12.3 introduces our stochastic frontier model, which can be viewed as a counterpart of the DEA eco-efficiency model. Section 12.4 describes the data we use. The results are presented and discussed in Sect. 12.5, and Sect. 12.6 concludes.

2 Background

To measure eco-efficiency using frontiers, Kuosmanen and Kortelainen (2005) defined eco-efficiency as a ratio between economic value added and environmental damage and proposed a pressure-generating or pollution-generating technology set \( T = \{ (\pi ,p) \in R^{{\left( {1 + K} \right)}}| \, \pi\, {\text{can be generated by}}\,p\} \). This technology set describes all the feasible combinations of economic value added, π, and environmental pressures, \( p \). Environmental damage, \( D(p) \), is measured by aggregating the K environmental pressures \( \left( {p_{1} , \ldots ,p_{K} } \right) \) associated with the production activity.

Figure 12.1 provides an illustration for the simple case of two environmental pressures, \( p_{1} \) and \( p_{2} \). The set of eco-efficient combinations is represented by the eco-efficient frontier, which represents the minimum combinations of the two environmental pressures which can be used to produce an economic value added of π0. Combinations of pressures below the frontier are unfeasible whereas combinations above it are eco-inefficient. For example, the combination of pressures represented by point A is clearly eco-inefficient as the environmental pressures could be reduced equiproportionally to point E on the frontier without reducing value added.

Fig. 12.1
figure 1

Eco-efficiency

Eco-inefficiency can be measured using the radial distance from a point A to the efficient frontier. The eco-efficiency score is given by the ratio OE/OA which takes the value 1 for eco-efficient combinations of pressures and economic value added and values less than 1 for inefficient combinations such as A. This is the approach we will consider, although it should be pointed out that alternative measures of eco-efficiency could be devised if we depart from radial (equiproportional) reductions in pressures. For example, instead of measuring the extent to which pressures can be reduced while maintaining value added, we could measure the extent to which the firm, given its present combination of pressures, could increase its value added. Thus, if the firm was using the combination of pressure represented by A efficiently, it would be operating on a new eco-efficient frontier passing through that point, and could achieve a higher value added corresponding to this new frontier. Other alternatives exist where the possibility of simultaneously reducing pressures and increasing economic value added can be explored . Picazo-Tadeo et al. (2012), for example, propose using a directional distance function approach which allows for this possibility, as well as that of reducing subsets of pressures in order to reach the eco-efficient frontier.

These different ways of approaching the eco-efficient frontier will all lead to valid measures of eco-inefficient behaviour but we will follow the existing literature by focusing on the capacity of firms to reduce environmental pressures equiproportionally while maintaining value added. It should be underlined that our eco-efficiency scores are defined directly in terms of environmental pressures and not in terms of reductions of input quatities which can be transformed into an associated reduction in in overall environmental damage. This latter approach was followed by Coelli et al. (2007) and permitted them to disaggregate environmental inefficiency into technical and allocative components using “iso-pressure” lines.

Individual eco-efficiency scores for producer \( i \) can be found using the following expression:

$$ EEF_{i} = \frac{\text{Economic value added}}{\text{Environmental pressure}} = \frac{{\pi_{i} }}{{D_{i} (p)}} $$
(12.1)

where \( D_{i} (p) \) is a function that aggregates the environmental pressures into a single environmental pressure indicator. This can be done by taking a linear weighted average of the individual environmental pressures:

$$ D_{i} (p) = w_{1} p_{1i} + w_{2} p_{2i} + \cdots + w_{K} p_{Ki} $$
(12.2)

where \( w_{k} \) is the weight assigned to environmental pressure \( p_{k} \). Kuosmanen and Kortelainen (2005) and Picazo-Tadeo et al. (2012), among others, use DEA as a non-subjective weighting method. The DEA eco-efficiency score of firm i can be computed from the following programming problem

$$ \mathop {\hbox{max} }\nolimits_{{w_{ki} }} EEF_{i} = \frac{{\pi_{i} }}{{\mathop \sum \nolimits_{k = 1}^{K} w_{ki} p_{ki} }} $$
(12.3)

subject to the constraints

$$ \begin{aligned} & \frac{{\pi_{j} }}{{\sum \nolimits_{k = 1}^{K} w_{ki} p_{kj} }} \le 1 \quad j = 1, \ldots ,N \\ & w_{ki} \ge 0 \quad k = 1, \ldots ,K \\ \end{aligned} $$

This formulation involves a non-linear objective function and non-linear constraints, which is computationally difficult. This problem is often linearized by taking the inverse of the eco-efficiency ratio and solving the associated reciprocal problem (Kuosmanen and Kortelainen 2005 ; Picazo-Tadeo et al. 2011).

The two constraints in the problem force weights be non-negative and eco-efficiency scores take values between zero and one, that is:

$$ EEF_{i} = \frac{{\pi_{i} }}{{\sum \nolimits_{k = 1}^{K} w_{ki} p_{ki} }} \le 1, \quad \forall i = 1, \ldots ,{\text{N}} $$
(12.4)

The DEA eco-efficiency score which solves this problem for firm i indicates the maximum potential equiproportional reduction in all environmental pressures that could be achieved while maintaining economic value constant, i.e., it corresponds to the ratio OE/OA for a firm operating at point A in Fig. 12.1 and would take the vaue 1 for an eco-efficient firm.

3 The SFA Eco-efficiency Model

In this section we introduce our SFA counterpart of the above DEA eco-efficiency model. We first introduce a basic (i.e. homoskedastic) specification of the model in order to focus our attention on the main characteristics of the model and the differences between the SFA and DEA approachesFootnote 1. We then present a heteroskedastic specification of the model that allows us to identify determinants of firms’ eco-efficiency in a simple one-stage procedure. Finally, we explain how we obtain the estimates of eco-efficiency for each farm.

3.1 Basic Specification

Our SFA approach to modelling eco-efficiency relies on the constraint in Eq. (12.4). If we assume that the environmental pressure weights, \( w_{k} \), in (12.2) are parameters to be estimated, we can impose that these be positive by reparameterizing them as \( w_{k} = e^{{\beta_{k} }} \). As each environmental pressure contributes positively to overall environmental damage, this restriction, which is stated in (12.3), follows naturally. The natural logarithm of Eq. (12.4) can be written as:

$$ \ln EFF_{i} = { \ln }\left(\frac{{\pi_{i} }}{{\sum \nolimits_{k = 1}^{K} e^{{\beta_{k} }} \cdot\,p_{ki} }}\right) \le 0 $$
(12.5)

The above equation can be rewritten as:

$$ \ln \left( {\pi_{i} } \right) = \ln \left(\sum\nolimits_{k = 1}^{K} {e^{{\beta_{k} }} \cdot\,p_{ki} } \right) - u_{i} $$
(12.6)

where \( u_{i} = - \ln \;EFF_{i} \ge 0 \) can now be viewed as a non-negative random term capturing firm i’s eco-inefficiency.

Equation (12.6) is a non-linear regression model with a nonpositive disturbance that can be estimated using several techniques, including goal programming, corrected ordiinary least squares (COLS) and modified ordinary least squares (MOLS)—see Kumbhakar and Lovell (2000, Sect. 3.2.1). If we were using a multiplicative aggregation of environmental pressures, we would get a linear (i.e. Cobb-Douglas) regression model where positive parameter values would need to be imposed. Both models are rougly equivalent but the Cobb-Douglas specification would depart from the tradition in the eco-efficiency literature of using linear combinations of environmental pressures.

Regardless of the technique, however, note that in (12.6) we are measuring firms’ eco-efficiency relative to a deterministic environmental pressure frontier. This implies that all variation in value added not associated with variation in individual environmental pressures is entirely attributed to eco-inefficiency. In other words, this specification does not make allowance for the effect of random shocks, which might also contribute (positively or negatively) to variations in value added.

As is customary in the SFA literature in production economics, in order to deal with this issue we extend the model in (12.6) by adding a symmetric random noise term, \( v_{i} \), and a non-zero intercept θ:

$$ \ln \left( {\pi_{i} } \right) = \theta + \ln \left(\sum\nolimits_{k = 1}^{K} {e^{{\beta_{k} }} \cdot\,p_{ki} } \right) + v_{i} - u_{i} $$
(12.7)

This model is more complex than a deterministic eco-efficiency frontier model but it is also more realistic as deviations from the frontier due not only to eco-inefficiency but also to uncontrollable or unobservable factors (i.e. random noise) are incorporated. We have also added a non-zero intercept in order to obtain unbiased parameter estimates in case the unobservable factors or measurement errors have a level effect on firms’ profit.

The error term in (12.7) thereby comprises two independent parts. The first part, \( v_{i} \), is a two-sided random noise term, often assumed to be normally distributed with zero mean and constant standard deviation, i.e. \( \sigma_{v} = e^{\gamma } \). The second part, \( u_{i} \), is a one-sided error term capturing underlying eco-inefficiency that can vary across firms. Following Aigner et al. (1977) it is often assumed to follow a half-normal distribution, which is the truncation (at zero) of a normally-distributed random variable with mean zero. Moreover, these authors also assumed that the variance of the pre-truncated normal variable (hereafter \( \sigma_{u} \)) is homoskedastic and common to all farms, i.e. \( \sigma_{u} = e^{\delta } \). The identification of both random terms in this model (ALS henceforth) relies on the asymmetric and one-sided nature of the distribution of \( u_{i} \) (see Li 1996) If the inefficiency term could take both positive and negative values, it would not be distinguishable from the noise term, \( v_{i} \).

It should be pointed out that under these distributional assumptions the density function of the composed error term \( \varepsilon_{i} = v_{i} - u_{i} \) in (12.7) is the same as the well-known density function of a standard normal-half normal frontier model. Following Kumbhakar and Lovell (2000, p. 77), the log likelihood function for a sample of N producers can then be written as:

$$ \ln \;L(\theta ,\beta ,\gamma ,\delta ) = \frac{N}{2}\ln \left[ {\sigma_{v}^{2} + \sigma_{u}^{2} } \right] + \sum\limits_{i = 1}^{N} {\ln \;\Phi } \left[ { - \frac{{\varepsilon_{i} \left( {\theta ,\beta } \right)\cdot\sigma_{u} /\sigma_{v} }}{{\left( {\sigma_{v}^{2} + \sigma_{u}^{2} } \right)^{1/2} }}} \right] - \frac{1}{{2\left( {\sigma_{v}^{2} + \sigma_{u}^{2} } \right)}}\sum\limits_{i = 1}^{N} {\varepsilon_{i} (\theta ,\beta )} $$
(12.8)

where \( \beta = (\beta_{1} , \ldots ,\beta_{K} ) \), and

$$ \varepsilon_{i} \left( {\theta ,\beta } \right) = \ln \left( {\pi_{i} } \right) - \theta - \ln \left( {\sum\limits_{k = 1}^{K} {e^{{\beta_{k} }} \cdot p_{ki} } } \right) $$
(12.9)

The likelihood function (12.8) can be maximized with respect to \( (\theta ,\beta ,\gamma ,\delta ) \) to obtain consistent estimates of all parameters of our eco-efficiency model. The only difference between our SFA eco-efficiency model and a traditional SFA production model is the computation of the error term \( \varepsilon_{i} \left( {\theta ,\beta } \right) \). In a traditional SFA production model, this is a simple linear function of the parameters to be estimated and hence the model can be estimated using standard econometric software, such as Limdep or Stata. In contrast, \( \varepsilon_{i} \left( {\theta ,\beta } \right) \) in Eq. (12.9) is a non-linear function of the β parameters. Although the non-linear nature of Eq. (12.9) prevents using the standard commands in Limdep or Stata to estimate our SFA eco-efficiency model, it is relatively straightforward to write the codes to maximize (12.8) and obtain our parameter estimates.

The model in (12.7) can also be estimated using a two-step procedure that combines ML and method of moments (MM) estimators. In the first stage, the intercept \( \theta \) and the environmental pressure parameters \( \beta \) of Eq. (12.7) can be estimated using a non-linear least squares estimator. In the second step, the aforementioned distributional assumptions regarding the error terms are made to obtain consistent estimates of the parameters describing the variance of \( v_{i} \) and \( u_{i} \) (i.e., \( \gamma \) and \( \delta \)) conditional on the estimated parameters from the first step. This two-step approach is advocated for various models in Kumbhakar and Lovell (2000). The main advantage of this two-step procedure is that no distributional assumptions are used in the first step. Standard distributional assumptions on \( v_{i} \) and \( u_{i} \) are used only in the second step. In addition, in the first step the error components are allowed to be freely correlated.

An important issue that should be taken into account when using a two-step procedure is that the expectation of the original error term in (12.7) is not zero because \( u_{i} \) is a non-negative random term. This implies that the estimated value of the error term \( \varepsilon_{i} \) in Eq. (12.7) should be decomposed as follows:

$$ \varepsilon_{i} = v_{i} - u_{i} + E(u_{i} ) $$
(12.10)

If \( u_{i} \) follows a half-normal distribution, then \( E(u_{i} ) = \sqrt {2/\pi } \cdot\sigma_{u} \). Thus, the stochastic frontier model in the second stage is:

$$ \varepsilon_{i} = v_{i} - u_{i} - \sqrt {2/\pi } \cdot e^{\delta } $$
(12.11)

Note that there are no new parameters to be estimated. The parameters \( \gamma \) and \( \delta \) are estimated by maximizing the likelihood function associated to this (adjusted) error term. As Kumbhakar et al. (2013) have recently pointed out, the stochastic frontier model based on (12.11) can accommodate heteroskedastic inefficiency and noise terms simply by making the variances of \( \sigma_{u} \) and \( \sigma_{v} \) functions of some exogenous variables (see, for instance, Wang 2002; Álvarez et al. 2006). This issue is addressed later on.

Before proceeding, it should be pointed out that an alternative two-step approach based only on MM estimatorscan also be used. This empirical strategy relies on the second and third moments of the error term \( \varepsilon_{i} \) in Eq. (12.7). This approach takes advantage of the fact that the second moment provides information about both \( \sigma_{v} \) and \( \sigma_{u} \) whereas the third moment only provides information about the asymmetric (one-sided) random inefficiency term. Olson et al. (1980) showed using simulation exercises that the choice of estimator (ML vs.MM) depends on the relative value of the variances of both random terms and the sample size. When the sample size is large and the variance of the one-sided error component is small compared to the variance of the noise term, ML outperforms MM. The MM approach has, in addition, some practical problems. It is well known in the stochastic frontier literature, for example, that neglecting heteroskedasticity in either or both of the two random terms causes estimates to be biased. Kumbhakar and Lovell (2000) pointed out that only the ML approach can be used to address this problem. Another practical problem arises in homoskedastic specifications of the model when the implied \( \sigma_{u} \) becomes sufficiently large to cause \( \sigma_{v} < 0 \), which violates the assumptions of econometric theory.

Compared to the DEA eco-efficiency model, our SFA approach will attenuate the effect of outliers and measurement errors in the data on the eco-efficiency scores. Moreover, it is often stressed that the main advantage of DEA over the SFA approach is that it does not require an explicit specification of a functional form for the underlying technology. However, the ‘technology’ here is a simple index that aggregates all environmental pressures into a unique value. Thus, we would expect that the parametric nature of our SFA approach is not as potentially problematic in an eco-efficiency analysis as it may be in a more general production frontier setting where theses techniques are used to uncover the underlying (and possibly quite complex) relationship between multiple inputs and outputs. Another often-cited advantage of the DEA approach is that it can be used when the number of observations is relatively small. We reiterate, however, that the ‘technology’ of our SFA model is extremely simple, with few parameters to be estimated, so that the model can be implemented even when the number of observations is not large.

Finally, note that the estimated \( \beta \) parameters have an interesting interpretation in the parametric model. In the expression for eco-efficiency in (12.1), we note that eco-efficiency is constant and equal to 1 along the eco-efficiency frontier. Differentiating (12.1) in this case with respect to an individual pressure \( p_{k} \) for firm i we obtain:

$$ \frac{{\partial D_{i} (p)}}{{\partial p_{k} }} = \frac{{\partial \pi_{i} }}{{\partial p_{k} }} $$
(12.12)

For any two pressures \( p_{j} \) and \( p_{k} \), therefore, we have:

$$ \frac{{\frac{{\partial D_{i} (p)}}{{\partial p_{j} }}}}{{\frac{{\partial D_{i} (p)}}{{\partial p_{k} }}}} = \frac{{\frac{{\partial \pi_{i} }}{{\partial p_{j} }}}}{{\frac{{\partial \pi_{i} }}{{\partial p_{k} }}}} $$
(12.13)

From the expression for eco-efficiency in the reparameterized model in (12.5) it is clear that \( \frac{{\partial D_{i} (p)}}{{\partial p_{k} }} = e^{{\beta_{k} }} \), so that in this particular case (12.13) becomes:

$$ \frac{{e^{{\beta_{j} }} }}{{e^{{\beta_{k} }} }} = \frac{{\partial \pi_{i} /\partial p_{j} }}{{\partial \pi_{i} /\partial p_{k} }} $$
(12.14)

Once the \( \beta \) parameters have been estimated, \( e^{{\beta_{k} }} \) therefore represents the marginal contribution of pressure \( p_{k} \) to firm i’s value added, i.e., it is the monetary loss in value added if pressure \( p_{k} \) were reduced by one unit.

As expression (12.14) represents the marginal rate of technical substitution of environmental pressures, it provides valuable information on the possibilities for substitution between pressures. If this marginal rate of substitution took a value of 2, say, we could reduce pressure \( p_{j} \) by two units and increase \( p_{k} \) by one unit without changing economic value added. This also sheds light on the consequences for firms of legislation requiring reductions in individual pressures. Continuing with the previous example, it would be relatively less onerous for the firm to reduce pressure \( p_{k} \) rather than \( p_{j} \) as the fall in value added associated with a reduction in \( p_{k} \) would be only half that which would occur from a reduction in \( p_{j} \).

3.2 Heteroskedastic Specification

Aside from measuring firms’ eco-efficiency, we also would like to analyse the determinants of eco-efficiency. The concern about the inclusion of contextual variables or z-variables has led to the development of several models using parametric, non-parametric or semi-parametric techniques. For a more detailed review of this topic in SFA and DEA, see Johnson and Kuosmanen (2011, 2012). The inclusion of contextual variables in DEA has been carried out in one, two or even more stages. Ruggiero (1996) and other authors have highlighted that the one-stage model introduced in the seminal paper of Banker and Morey (1986) might lead to bias. To solve this problem, other models using several stages have been developed in the literature. Ray (1988) was the first to propose a second stage where standard DEA efficiency scores were regressed on a set of contextual variables. This practice was widespread until Simar and Wilson (2007) demonstrated that this procedure is not consistent because the first-stage DEA efficiency estimates are serially correlated. These authors proposed a bootstrap procedure to solve this problem in two stages which has become one of the most-widely used method in DEA to identify inefficiency determinants.

As the inefficiency term in the ALS model has constant variance, our SFA model in (12.7) does not allow the study of the determinants of firms’ performance. It might also yield biased estimates of both frontier coefficients and farm-specific eco-inefficiency scores (see Caudill and Ford 1993). To deal with these issues, we could estimate a heteroskedastic frontier model that incorporates z-variables into the model as eco-efficiency determinants. The specification of \( u_{i} \) that we consider in this paper is the so-called RSCFG model (see Alvarez et al. 2006), where the z-variables are treated as determinants of the variance of the pre-truncated normal variable. In other words, in our frontier model we assume that

$$ \sigma_{ui} = h\left( {z_{i} } \right) \cdot \sigma_{u} $$
(12.15)

where

$$ h(z_{i} ) = e^{{\alpha^{\prime}z_{i} }} $$
(12.16)

is a deterministic function of eco-inefficiency covariates, \( \alpha = (\alpha_{1} , \ldots ,\alpha_{J} ) \), is a vector of parameters to be estimated, and \( z_{i} = (z_{i1} , \ldots ,z_{iJ} ) \) is a set of J potential determinants of firms’ eco-inefficiency. This specification of \( \sigma_{ui} \) nests the homoskedastic model as (12.15) colapses into \( e^{\delta } \) if we assume that \( h(z_{i} ) = 1 \) or α = 0.

The so-called ‘scaling property’ ( Alvarez et al. 2006) is satisfied in this heteroskedastic version of our SFA model in the sense that the inefficiency term in (12.7) can be written as \( u_{i} = h(z_{i} ) \cdot u_{i}^{*} \), where \( u_{i}^{*} \to N^{ + } (0,e^{\delta } ) \) is a one-sided random variable that does not depend on any eco-efficiency determinant. The defining feature of models with the scaling property is that firms differ in their mean efficiencies but not in the shape of the distribution of inefficiency. In this model \( u_{i}^{*} \) can be viewed as a measure of “basic” or “raw” inefficiency that does not depend on any observable determinant of firms’ inefficiency.

The log likelihood function of this model is the same as Eq. (12.8), but now \( \sigma_{ui} \) is heteroskedastic and varies across farms. The resulting likelihood function should then be maximized with respect to \( \theta , \beta , \gamma , \delta \) and \( \alpha \) to obtain consistent estimates of all parameters of the model. As both frontier parameters and the coefficients of the eco-inefficiency determinants are simultaneously estimated in one stage, the inclusion of contextual variables in our SFA model is much simpler than in DEA.

3.3 Eco-efficiency Scores

We next discuss how we can obtain the estimates of eco-efficiency for each firm once either the homoskedastic or heteroskedastic model has been estimated. In both specifications of the model, the composed error term is simply \( \varepsilon_{i} = v_{i} - u_{i} \). Hence, we can follow Jondrow et al. (1982) and use the conditional distribution of \( u_{i} \) given the composed error term ε i to estimate the asymmetric random term \( u_{i} \). Both the mean and the mode of the conditional distribution can be used as a point estimate of \( u_{i} \). However, the conditional expectation \( E\left( {u_{i} |\varepsilon_{i} } \right) \) is by far the most commonly employed in the stochastic frontier analysis literature (see Kumbhakar and Lovell 2000).

Given our distributional assumptions, the analytical form for \( E\left( {u_{i} |\varepsilon_{i} } \right) \) can be written as follows:

$$ E\left( {u_{i} |\varepsilon_{i} } \right) = \bar{\mu }_{i} + \bar{\sigma }_{i} \left[ {\frac{{\phi \left( { - \bar{\mu }_{i} /\bar{\sigma }_{i} } \right)}}{{1 -\Phi \left( { - \bar{\mu }_{i} /\bar{\sigma }_{i} } \right)}}} \right] $$
(12.17)

where

$$ \begin{aligned} \sigma_{i}^{2} & = \sigma_{v}^{2} + h(z_{i} )^{2} \sigma_{u}^{2} \\ \bar{\mu }_{i} & = \frac{{\varepsilon_{i} h(z_{i} )^{2} \sigma_{u}^{2} }}{{\sigma_{i}^{2} }} \\ \bar{\sigma }_{i} & = \frac{{h(z_{i} )\sigma_{u} \sigma_{v} }}{{\sigma_{i} }} \\ \end{aligned} $$

To compute the conditional expectation (12.17) using the heteroskedastic model, we should replace the deterministic function \( h(z_{i} ) \) with our estimate of (12.16), while for the homoskedastic model we should assume that \( h(z_{i} ) = 1 \).

4 Data

The data we use come from a survey which formed part of a research project whose objective was to analyse the environmental performance of dairy farmers in the Spanish region of Asturias. Agricultural activity has well-documented adverse effects on the environment, and the increasing concerns among policymakers about environmental sustainability in the sector are reflected in the recent Common Agricultural Policy (CAP) reforms in Europe. Dairy farming, through the use of fertilizers and pesticides in the production of fodder, as well as the emission of greenhouse gases, has negative consequences for land, water, air, biodiversity and the landscape, so it is of interest to see whether there is scope for farmers to reduce environmental pressures without value added being reduced and identify any farmer characteristics that may influence their environmental performance.

A questionnaire was specifically designed to obtain information on individual pollutants, including nutrients balances and greenhouse gas emissions. These individual pollutants were then aggregated using standard conversion factors into a series of environmental pressures. Questions were included regarding farmers’ attitudes towards aspects of environmental management as well as a series of socioeconomic characteristics. The data collected correspond to the year 2010.

A total of 59 farmers responded to the questionnaire and the environmental and socioeconomic data were combined with economic data for these farmers which is gathered annually through a Dairy Cattle Management Program run by the regional government. Given that there were missing values for some of the variables we wished to consider, the final sample comprised 50 farms.

These data were used by Pérez-Urdiales et al. (2015) to measure the farmers’ eco-efficiency and relate it to attitudinal and socioeconomic factors. These authors used the two-stage DEA-based bootstrapped truncated regression technique proposed by Simar and Wilson (2007) to estimate eco-efficiency and its determinants, finding evidence of considerable eco-inefficiency. We will use the same variables as Pérez-Urdiales et al. (2015) to estimate eco-efficiency and its determinants using the SFA methods proposed in the previous section, which will permit us to see whether the SFA model yields similar results. We will use the results from Pérez-Urdiales et al. (2015) as a reference for comparison but it should be stressed that the dataset is far from ideal for using a SFA approach. In particular, the number of observations is relatively small and there are several determinants of eco-efficiency whose parameters have to be estimated.

The variables are described in detail in Pérez-Urdiales et al. (2015) but we will briefly discuss them here. For the numerator of the eco-efficiency index, we use the gross margin for our measure of economic value added (Econvalue). This is the difference between revenues from milk production (including milk sales and the value of in-farm milk consumption) and direct (variable) costs. These costs include expenditure on feed, the production of forage, expenses relate to the herd, and miscellaneous expenses. Costs related to the production of forage include purchases of seeds, fertilizers and fuel, machine hire and repairs, and casual labour, while herd-related costs include veterinary expenses, milking costs, water and electricity. The environmental pressures comprise nutrients balances and greenhouse gas emissions. The nutrients balances measure the extent to which a farm is releasing nutrients into the environment, defined as the difference between the inflows and outflows of nutrients. The nutrients balances used are nitrogen (SurplusN), phosphorous (SurplusP) and potassium (SurplusK), all measured in total kilograms. These environmental pressures are constructed using the farm gate balance approach and are calculated as the difference between the nutrient content of farm inputs (purchase of forage, concentrates, mineral fertilizers and animals, legume fixation of nitrogen in the soil and atmospheric deposition) and the nutrient content of outputs from the farm (milk sales and animal sales). The volume of greenhouse gas emissions captures the contribution of the farm to global warming and the dataset contains information on the emissions of carbon dioxide (CO2), methane (CH4) and nitrous oxide (N2O). Each of these greenhouse gases is converted into CO2 equivalents, so that the variable used is (thousands of) kilos of carbon dioxide released into the atmosphere (CO2).

The second set of variables are the potential determinants of eco-efficiency, which comprises socioeconomic characteristics and attitudes of farmers. The socioeconomic variables are the age of the farmer (Age); the number of hours of specific agricultural training that the farmer received during the year of the sample (Training); and a variable capturing the expected future prospects of the farm and which is defined as a dummy variable taking the value 1 if the farmer considered that the farm would continue to be in operation five years later, and 0 otherwise (Prospects). As explained in Pérez-Urdiales et al. (2015), eco-efficiency would be expected to be negatively related to age (i.e., older farmers should be less eco-efficient) and positively related to professional training and the expectation that the farm continue.

Three attitudinal variables were constructed from responses to a series of questions on farmers’ beliefs regarding their management of nutrients and greenhouse gas emissions as well as their attitudes towards environmental regulation. Thus, on a five-point Likert scale respondents had to state whether they strongly disagree (1), disagree (2), neither agree nor disagree (3), agree (4) or strongly agree (5), with a series of statements regarding their habits and attitudes towards environmental management. The variables HabitsCO 2 and HabitsNutrients are constructed as dummy variables that take the value 1 if respondents stated that they agreed or strongly agreed that management of grenhouse gases and nutrients was important, and 0 otherwise. The final variable measuring attitudes towards environmental regulation, defined as a dummy variable taking the value 1 if respondants agreed or strongly agreed that environmenatl regulation should be made more restrictive and 0 otherwise (Regulation).

Some descriptive statistics of the variables used for measuring eco-efficiency and the determinants of estimated eco-efficiency are presented in Table 12.1.

Table 12.1 Descriptive statistics of variables

5 Results

We focus initially on the results from the stochastic frontier models and then on the comparison of these with the DEA results.

Table 12.2 presents estimates from different specifications of the homoskedastic (ALS) and heteroskedastic (RSCFG) stochastic eco-efficiency frontier, with their corresponding eco-efficiency scores presented in Table 12.3. Columns (A) and (B) of Table 12.2 report estimates from the ALS model with all environmental pressures included and it can be seen that all the estimated coefficients on the pressures were highly significant. The parameter \( \delta \) corresponding to \( \ln \sigma_{u} \) was also highly significant, implying that the frontier specification is appropriate.

Table 12.2 Parameter estimates
Table 12.3 Eco-efficiency scores

As the pressure function parameters \( \beta_{k} \) enter the eco-efficiency specification exponentially rather than linearly (12.5), in the bottom part of Table 12.2 the exponents of the coefficients are presented. The t-statistics here correspond to the null that \( e^{{\beta_{k} }} \) is equal to zero for each of the k pressures, and this is rejected in all cases.

However, focusing on the magnitudes rather than the statistical significance, it can be seen that the marginal contribution of the phosphorous balance to value added is almost negligible. Also, the value of \( e^{{\beta_{k} }} \) for potassium is almost twice as large as that of nitrogen. Recalling our discussion of the interpretation of these parameters after Eq. (12.14) above, this implies that potassium contributes twice as much to value added as nitrogen and would therefore be more costly for the farmer to reduce. Similarly, if farmers were required to reduce nitrogen, this could in principle be substituted by potassium: for a given reduction in kilos of nitrogen, farmers could increase their use of potassium by half this number of kilos and maintain the same value added. In this particular application, such substitution could be achieved through changes in the composition of feed, fertilizers, and a change in the composition of forage crops. Reducing phosphorous, on the other hand, would be virtually costless.

In light of the negligible contribution of phosphorous to value added, we reestimate the ALS model eliminating the phosphorous balance from the pressure function, and the results are presented in columns (C) and (D). The parameters on the nutrients are not significantly differently from 0, implying that the \( e^{{\beta_{k} }} \) are not significantly different from 1. Note that the frontier specification is still appropriate and a comparison of the the efficiency scores from the two models in Table 12.3 shows that are practically identical.

We now turn to the heteroskedastic (RSCFG) specification of the stochastic frontier where we incorporate the determinants of eco-efficiency described in the previous section. Some of the farms had missing values for one or more of these determinants, and after eliminating these observations we were left with 40 farms with complete information. When estimating the model for these 40 observations with all nutrients balances included, it did not converge. We then eliminated the phosphorous balance as we had done in columns (C) and (D) for the homoskedastic (ALS) specification, but the model still did not converge. Following our earlier strategy of eliminating the nutrient balance with the lowest marginal contribution, from column (A) we see that the nitrogen balance has a far lower marginal contribution than the potassium balance. We therefore specified the model without the nitogen balance, keeping only the potassium balance and grenhouse gas emissions as pressures. With this specification the model converged successfully and the results are reported in columns (I) and (J). The homoskedastic specification with the potassium and greenhouse gases as the only pressures for both the complete sample of 50 observations (ALS-50C) and the reduced sample of 40 observations (ALS-40C) are reported in Columns (E)-(H), and a comparison of these estimates reveals that the coefficients on the pressures change very little across the three models.

To compare the eco-efficiencies estimated by DEA and SFA, a scatterplot of the DEA efficiency scores and the efficiency estimates from the SFA model is presented in Fig. 12.2. These DEA scores are based on a simple DEA calculation as opposed to the bootstrapped DEA scores reported in Pérez-Urdiales et al. (2015). As can be seen, the eco-efficiencies are almost identical. Also plotted on Fig. 12.2 is the regression line from the regression of the SFA estimates on the DEA scores, where the R2 is 0.9805. The Spearman Rank Correlation Coefficient (Spearman’s rho) was 0.996, showing that the models yielded virtually identical rankings of eco-efficiency levels. Even when the reduced sample of 40 observations is used, the eco-efficiencies are again very similar, with almost identical mean values and a Spearman Rank Correlation Coefficient of 0.959.

Fig. 12.2
figure 2

Comparison of Eco-efficiency scores

While the raw eco-efficiency scores between DEA and SFA are very similar, the questions remains as to whether the models yield similar results with regard to the determinants of eco-efficiency. The estimates of the efficiency determinants from the SFA model from Table 12.2 are presented in Table 12.4 alongside the parameter estimates reproduced from Pérez-Urdiales et al. (2015). While all the determinants in Pérez-Urdiales et al. (2015) were found to be significant at the 95% level, only two of the determinants—HabitsCO 2 and Prospects—are significant at this level in the heteroskadastic SFA model (though two other variables—Age and Regulation - were significant at the 90% level). Notably, however, the SFA model yields exactly the same signs on the eco-efficiency determinants as the bootstrapped DEA-based truncated regression used by Pérez-Urdiales et al. (2015).

Table 12.4 Estimated coefficients and significance of eco-efficiency determinants

6 Conclusions

Measurement of eco-efficiency has been carried out exclusively using non-parametric DEA techniques in the literature to date. In the present work we have proposed using a (parametric) stochastic frontier analysis (SFA) approach. While such models are highly non-linear when estimating eco-inefficiency, in an empirical application we find that such an approach is feasible even when the sample size is relatively small and determinants of eco-inefficiency - which increases the number of parameters to be estimated - are incorporated. Using data from a sample of 50 Spanish dairy farms previously used by Pérez-Urdiales et al. (2015), we begin by estimating a stochastic frontier model without eco-efficiency determinants, and find that our model yields virtually identical eco-efficiency scores to those calculated by DEA. Estimating eco-efficiency without determinants involves relatively few parameters, so sample size should not be a major obstacle to using SFA. Our results corroborate this.

We then estimated a heteroskedastic SFA model which incorporated determinants of eco-inefficiency. We use the same determinants used by Pérez-Urdiales et al. (2015), who carried out their analysis applying bootstrapped truncated regression techniques. As extra parameters have to be estimated, the small sample size became more of an issue for the stochastic frontier model. Indeed, in order for the model to converge we had to use fewer environmental pressures in our application than Pérez-Urdiales et al. (2015). Encouragingly, however, we found the exact same signs on the determinants of eco-efficiency as those found by Pérez-Urdiales et al. (2015). Thus, even with a small sample size and multiple determinants of eco-inefficiency, the stochastic frontier model yields similar conclusions to those obtained by truncated regression techniques based on DEA estimates of eco-efficiency.

Using stochastic frontier models for eco-efficiency measurement has some advantages over the bootstrapped truncated regression techniques that have been employed in the litrature to date. In particular, the stochastic frontier model can be carried out in one stage and the coefficients on the environmental pressures (‘technology’ parameters) have interesting interpretetations which shed light on the contribution of these pressures to firm economic value added. The estimated coefficients also uncover potentially useful information on the substitutability between environmental pressures. As such, we advocate the use of SFA for measuring eco-efficiency as a complement to or substitute for DEA-based approaches. When sample size is small and we wish to incorporate determinants of eco-efficiency, the DEA-based truncated regression techniques may permit more environmental pressures to be included in the analysis. However, with larger sample sizes, we would expect this advantage to disappear and the stochastic frontier models can provide extra valuable information for producers and policymakers, particularly with regard to substitutability between pressures.