1 Introduction

Nonparametric analysis of technology and productivity has been the subject of much interest (e.g., Afriat 1972; Färe et al. 1994; Varian 1984). It has provided the basis for data envelopment analysis (DEA) now commonly used in the investigation of productivity and firm efficiency (e.g., Banker 1984; Banker et al. 1984; Ray 2004; Cook and Seiford 2009). DEA has been seen as an attractive approach for three reasons: it allows for a flexible representation of multi-input multi-output technology; it involves solving simple linear programming problems; and it can provide firm-specific estimates of productivity and efficiency. Yet, it has one significant limitation: it assumes that the feasible set is always convex (where diminishing marginal productivity applies everywhere). As such, DEA is not appropriate in the investigation of non-convex technologies. How important are non-convexity issues in the analysis of productivity and firm efficiency? There are situations where non-convexity has significant implications for economics and management. For example, it is an important issue in the analysis of multi-product firms: non-convexity contributes to generating productivity benefits from specialization (e.g., Bogetoft 1996; Chavas and Kim 2007). This implies a need to develop empirical methods that can support the analysis of non-convex technology. Such methods are needed to examine empirically when and where non-convexity may arise.

The objective of this paper is to propose a refined nonparametric method for the analysis of technology under non-convexity. Note that nonparametric representations of technology under non-convexity are not new. Relaxing convexity assumptions in DEA has been explored by Deprins et al. (1984), Petersen (1990a, b), Bogetoft (1996), Chang (1999), Kerstens and Vanden Eeckaut (1999), Bogetoft et al. (2000), Briec et al. (2004), Podinovski (2005), Leleu (2006, 2009), De Witte and Marques (2011), Briec and Liang (2011) and others. The most common approach is the “free disposal hull” (FDH) representation investigated by Deprins et al. (1984), Tulkens (1993), Kerstens and Vanden Eeckaut (1999) and Agrell and Tind (2001). But while the FDH model allows for non-convexity, we argue that its representation is too extreme: it tends to find evidence of non-convexity “too often”. Other approaches have also been used to relax the convexity assumption in nonparametric analyses. They include Petersen (1990a, b), Bogetoft (1996), Agrell et al. (2005) and Podinovski (2005). Petersen (1990a, b) and Bogetoft (1996) have proposed to restrict convexity only to the input space or the output space. Agrell et al. (2005) have considered technology represented by unions of pairs of convex input and output sets. And Podinovski (2005) has put forward an approach where convexity is evaluated individually for each input or output.

In this paper, we propose a new nonparametric model that relies on a neighborhood-based assessment of technology. Our approach allows for non-convexity to arise in any part of the feasible set. It differs from the approaches proposed by Petersen (1990a, b), Bogetoft (1996), Agrell et al. (2005) or Podinovski (2005), who explored departures from convexity based on specific inputs and/or outputs. Our approach has three useful characteristics: it provides a flexible representation of non-convexity; it nests as (restrictive) special cases both the DEA model and the FDH model; and it is easy to implement empirically. As such, our new nonparametric approach extends the related literature both theoretically and empirically. Its usefulness is illustrated in an application to the analysis of technical and scale efficiency on Korean farms. The empirical results show how allowing for non-convexity reduces the extent of technical inefficiency. They report evidence that non-convexity is more common on large farms. Finally, they document how non-convexity matters in the analysis of scale effects.

The new model and its neighborhood-based assessment of technology are presented in Sect. 2. Its use in the evaluation of non-convex technologies is discussed in Sect. 3. Using a directional distance function, Sect. 4 presents productivity analysis under non-convexity and proposes a new measure to evaluate the extent of non-convexity. Section 5 examines the evaluation of returns to scale and scale efficiency under non-convexity. In Sect. 6, we show how our approach can be implemented easily by solving simple optimization problems. The usefulness of the method is illustrated in an application presented in Sect. 7. Finally, Sect. 8 concludes.

2 The model

Consider the observation of production activities on a set of N firms in an industry. Each firm produces m netputs z ∈ Rm and faces a production technology represented by the feasible set T ⊂ Rm. We use the netput notation where inputs are negative and outputs are positive. Let zi ≡ (z1i, …, zmi) ∈ Rm be the netput vector produced by the i-th firm, where zji is the j-th netput used/produced by the i-th firm, and zi ∈ T means that netputs zi are feasible, i ∈ N ≡ {1, …, N}. The technology T may exhibit different scale properties. It is said to exhibit \(\left\{ {\begin{array}{l} {\text{non-decreasing returns to scale (NDRS)}} \\ {\text{constant returns to scale (CRS)}} \\ {\text{non-increasing returns to scale (NIRS)}} \\ \end{array} } \right\}\) if T \(\left\{ {\begin{array}{*{20}c} \supset \\ { = } \\ \subset \\ \end{array} } \right\}\) δ T for any scalar δ > 1. And the technology is said to exhibit variable returns to scale (VRS) if no a priori restriction is imposed on returns to scale. Throughout the paper, we assume that the technology T satisfies free disposal, where free disposal means that T = T − \({\text{R}}_{ + }^{\text{m}}\).

First, consider the case where T is convex.Footnote 1 Then, under free disposal, a nonparametric representation of the technology is given by

$${\text{T}}_{\text{v}} = \, \{ {\text{z}}:{\text{ z}} \le \sum\limits_{{{\text{i}} \in {\rm N}}} {\lambda_{\text{i}} {\text{z}}_{\text{i}} } ;\lambda_{\text{i}} \in {\text{R}}_{ + } ,{\text{i}} \in {\rm N};\sum\limits_{{{\text{i}} \in {\rm N}}} {\lambda_{\text{i}} = { 1}} \}$$
(1)

Tv in (1) is the smallest convex set containing all data points {zi: i ∈ N} under free disposal and VRS (e.g., Afriat 1972; Varian 1984). It is the representation commonly used in DEA (e.g., Banker 1984; Banker et al. 1984; Ray 2004; Cook and Seiford 2009).

Alternative representations have been proposed depending on the scale properties of the technology. Following Färe et al. (1994) and Banker et al. (2004), they are

$${\text{T}}_{\text{s}} = \, \{ {\text{z}}:{\text{ z}} \le \sum\limits_{{{\text{i}} \in {\rm N}}} {\lambda_{\text{i}} {\text{z}}_{\text{i}} ;\lambda_{\text{i}} } \in {\text{R}}_{ + } ,{\text{i}} \in {\rm N},\sum\limits_{{{\text{i}} \in {\rm N}}} {\lambda_{\text{i}} \in {\text{S}}_{\text{s}} } \} ,$$
(2)

where s ∈ {v, c, ni, nd}, with Sv = 1 under VRS, Sc = [0, ∞] under constant returns to scale (CRS), Sni = [0, 1] under non-increasing returns to scale (NIRS), and Snd = [1, ∞] under non-decreasing returns to scale (NDRS). Indeed, when Sv = 1, Tv in (2) reduces to Eq. (1) under VRS. Alternatively, when Sc = [0, ∞], Tc in (2) provides a representation of a convex technology under CRS. Tc is the smallest convex cone containing all data points {zi: i ∈ N}. When Sni = [0, 1], Tni in (2) provides a representation of a convex technology under NIRS. Finally, when Snd = [1, ∞], Tnd in (2) represents a convex technology under NDRS. Since Sv ⊂ Sni ⊂ Sc and Sv ⊂ Snd ⊂ Sc, it follows from (2) that Tv ⊂ Tni ⊂ Tc and Tv ⊂ Tnd ⊂ Tc. Also, Sc = Sni ∪ Snd implies that Tc = Tni ∪ Tnd. Note that the sets Tv, Tni, Tnd and Tc are all convex.

Next, we want to introduce non-convexity in the analysis. For that purpose, consider the following nonparametric representation of technology

$${\text{T}}_{\text{FDHv}} = \, \{ {\text{z}}:{\text{ z}} \le \sum\limits_{{{\text{i}} \in {\rm N}}} {\lambda_{\text{i}} {\text{z}}_{\text{i}} } ;\lambda_{\text{i}} \in \left\{ {0,{ 1}} \right\},{\text{i}} \in {\rm N};\sum\limits_{{{\text{i}} \in {\rm N}}} {\lambda_{\text{i}} = { 1}} \} ,$$
(3)

where FDH stands for “free disposal hull” (Deprins et al. 1984; Tulkens 1993; Kerstens and Vanden Eeckaut 1999; Agrell and Tind 2001). Under free disposal, TFDHv is the smallest set containing all data points {zi: i ∈ N} under VRS. It provides a non-convex representation of the technology under VRS.
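To make the definition in (3) concrete, the following minimal Python sketch tests whether a netput vector belongs to TFDHv. Since exactly one λi equals 1 in (3), membership reduces to checking whether z is weakly dominated by at least one observed zi. The function name and the data are hypothetical and serve only as an illustration.

```python
from typing import Sequence


def in_fdh_vrs(z: Sequence[float], data: Sequence[Sequence[float]]) -> bool:
    """Return True if z lies in T_FDHv, i.e. z <= z_i componentwise for some firm i."""
    return any(all(zj <= zij for zj, zij in zip(z, zi)) for zi in data)


# Hypothetical netput data: (input, output), with inputs negative and outputs positive.
data = [(-1.0, 1.0), (-2.0, 3.0), (-4.0, 4.0)]

dominated = in_fdh_vrs((-2.0, 2.5), data)   # weakly dominated by firm (-2, 3)
midpoint = in_fdh_vrs((-3.0, 3.5), data)    # midpoint of the last two firms
print(dominated, midpoint)
```

In this example, the midpoint of two observed firms belongs to the convex set Tv in (1) but not to TFDHv, which is the source of the non-convexity of the FDH representation.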

Alternative non-convex representations have been proposed depending on the scale properties of the technology. Following Kerstens and Vanden Eeckaut (1999), they include

$${\text{T}}_{\text{FDHs}} = \, \{ {\text{z}}:{\text{ z}} \le \sum\limits_{{{\text{i}} \in {\rm N}}} {\lambda_{\text{i}} {\text{z}}_{\text{i}} } ;\lambda_{\text{i}} \in \{ 0,\delta \} ,{\text{i}} \in {\rm N};\sum\limits_{{{\text{i}} \in {\rm N}}} {\lambda_{\text{i}} = \delta ;\delta \in {\text{S}}_{\text{s}} } \} .$$
(4)

where s ∈ {v, c, ni, nd}, and the Ss’s are as defined above. When Sv = 1, TFDHv in (4) reduces to Eq. (3) under VRS. Alternatively, when Sc = [0, ∞], TFDHc in (4) provides a representation of an FDH technology under CRS. TFDHc is the smallest cone containing all data points {zi: i ∈ N}. When Sni = [0, 1], TFDHni in (4) provides a representation of an FDH technology under NIRS. Finally, when Snd = [1, ∞], TFDHnd in (4) represents an FDH technology under NDRS.Footnote 2 Since Sv ⊂ Sni ⊂ Sc and Sv ⊂ Snd ⊂ Sc, it follows from (4) that TFDHv ⊂ TFDHni ⊂ TFDHc and TFDHv ⊂ TFDHnd ⊂ TFDHc. Also, Sc = Sni ∪ Snd implies that TFDHc = TFDHni ∪ TFDHnd. Note that each of the sets TFDHv, TFDHni, TFDHnd and TFDHc is in general non-convex. Finally, note that the λ’s are restricted to take discrete values in (4) but not in (2). It follows that TFDHs ⊂ Ts, i.e., that TFDHs is a subset of Ts, for s ∈ {v, c, ni, nd}.

The sets Tv, Tc and TFDHv are illustrated in Fig. 1. Figure 1 shows that these sets satisfy TFDHv ⊂ Tv ⊂ Tc. Note that the sets Tv and Tc are convex, but that the set TFDHv is non-convex. This indicates that DEA is clearly inappropriate in the analysis of non-convexity. Indeed, since Tv is always convex, DEA offers no prospect to uncover any evidence of non-convexity and produces biased estimates of technical efficiency under a non-convex technology. In contrast, FDH can provide a basis to represent a non-convex technology. Yet, it has a rather undesirable characteristic: it has a tendency to find non-convexity at many places. This can be seen in Fig. 1, where the frontier technology is given by the line ABDHJ under Tv and by ABCDEFGHJ under TFDHv. While the frontier line ABDHJ is concave, the frontier line ABCDEFGHJ is not. The two lines coincide only along the segments AB and HJ, where marginal products are either zero or infinite under Tv. At all other points, the two lines differ. It means that, under FDH, the frontier technology would basically exhibit non-convexity at all points where marginal products are positive and bounded under Tv. Yet, we are usually interested in situations where marginal products are positive and bounded. The fact that FDH would always reveal non-convexity in these situations seems undesirable. In other words, while TFDHv can provide a representation of non-convexity, it may reveal it “too often”.Footnote 3 This indicates a need to develop alternative representations of technology that can capture non-convexity in a more useful and credible way. Below, we explore alternative formulations that allow for flexible representations of the technology T under non-convexity.

Fig. 1 Representations of Technology under Tv, Tc and TFDHv

Define a neighborhood of z ≡ (z1, …, zm) ∈ Rm as Br(z, σ) = {z′ ∈ Rm: Dp(z, z′) ≤ r} ⊂ Rm, where r > 0 and Dp(z, z′) ≡ \(\left[ {\sum\nolimits_{{{\text{j}} = 1}}^{\text{m}} {(|{\text{z}}_{\text{j}} - {\text{z}}_{\text{j}}^{\prime } |/\sigma_{\text{j}} )^{\text{p}} } } \right]^{{1/{\text{p}}}}\) is a weighted Minkowski distance between z and z′, with weights σ = (σ1, …, σm) ∈ \({\text{R}}_{ + + }^{\text{m}}\) and based on a p-norm with 1 ≤ p < ∞.Footnote 4 Let I(z, r) = {i ∈ N: zi ∈ Br(z, σ)} ⊂ N, where I(z, r) is the set of firms in N that are located in the neighborhood Br(z, σ) of z.Footnote 5 Define a local representation of the technology T in the neighborhood of point z as:

$${\text{T}}_{\text{rv}} \left( {\text{z}} \right) \, = \, \{ {\text{z}}:{\text{ z}} \le \sum\limits_{{{\text{i}} \in {\rm I}\left( {{\text{z}},{\text{r}}} \right)}} {\lambda_{\text{i}} {\text{z}}_{\text{i}} ;\lambda_{\text{i}} \in {\text{R}}_{ + } ,{\text{i}} \in {\rm I}\left( {{\text{z}},{\text{ r}}} \right);} \sum\limits_{{{\text{i}} \in {\rm I}\left( {{\text{z}},{\text{r}}} \right)}} {\lambda_{\text{i}} = { 1}\} } .$$
(5)
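The neighborhood Br(z, σ) and the index set I(z, r) that enter (5) are straightforward to compute. The following Python sketch is only an illustration under stated assumptions: the function names, the data and the parameter values are hypothetical, and firms are indexed from 0 rather than from 1.

```python
def minkowski(z, z_prime, sigma, p=2.0):
    """Weighted Minkowski distance D_p(z, z') with weights sigma and 1 <= p < infinity."""
    total = sum((abs(a - b) / s) ** p for a, b, s in zip(z, z_prime, sigma))
    return total ** (1.0 / p)


def neighbors(z, data, r, sigma, p=2.0):
    """Index set I(z, r): firms whose netput vector lies in the neighborhood B_r(z, sigma)."""
    return [i for i, zi in enumerate(data) if minkowski(z, zi, sigma, p) <= r]


# Hypothetical netput data: (input, output), with inputs negative and outputs positive.
data = [(-1.0, 1.0), (-2.0, 3.0), (-4.0, 4.0)]
sigma = (1.0, 1.0)

small = neighbors(data[1], data, r=2.5, sigma=sigma, p=1.0)
large = neighbors(data[1], data, r=3.0, sigma=sigma, p=1.0)
print(small, large)
```

With p = 1 and unit weights, the two other firms are each at distance 3 from the second firm, so the index set contains only that firm when r = 2.5 and all three firms when r = 3.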

Equation (5) corresponds to Eq. (1) except that it applies locally using information limited to points in the neighborhood Br(z, σ) of z under VRS. Using (2), alternative local representations of the technology can be obtained depending on its scale properties. They are

$${\text{T}}_{\text{rs}} \left( {\text{z}} \right) \, = \{ {\text{z}}:{\text{ z}} \le \sum\limits_{{{\text{i}} \in {\rm I}\left( {{\text{z}},{\text{r}}} \right)}} {\lambda_{\text{i}} {\text{z}}_{\text{i}} ;\lambda_{\text{i}} \in {\text{R}}_{ + } ,{\text{i}} \in {\rm I}\left( {{\text{z}},{\text{ r}}} \right);} \sum\limits_{{{\text{i}} \in {\rm I}\left( {{\text{z}},{\text{r}}} \right)}} {\lambda_{\text{i}} \in {\text{S}}_{\text{s}} \} } .$$
(6)

where s ∈ {v, c, ni, nd}, and the Ss’s are as defined above. When Sv = 1, Trv(z) in (6) reduces to Eq. (5) under VRS. Alternatively, when Sc = [0, ∞], Trc(z) in (6) provides a local representation of the technology under CRS. When Sni = [0, 1], Trni(z) in (6) is a local representation of the technology under NIRS. Finally, when Snd = [1, ∞], Trnd(z) in (6) gives a local representation of the technology under NDRS. Since Sv ⊂ Sni ⊂ Sc and Sv ⊂ Snd ⊂ Sc, it follows from (6) that Trv(z) ⊂ Trni(z) ⊂ Trc(z) and Trv(z) ⊂ Trnd(z) ⊂ Trc(z). Also, Sc = Sni ∪ Snd implies that Trc(z) = Trni(z) ∪ Trnd(z). Finally, note that, for a given z, the sets Trv(z), Trni(z), Trnd(z) and Trc(z) are all convex.

Definition 1

Consider the following neighborhood-based representation of the technology T:

$$T_{\text{rs}}^{*} = \cup_{{{\text{i}} \in {\text{N}}}} T_{\text{rs}} (z_{\text{i}} ),\quad {\text{for}}\;{\text{s}}\; \in \;\left\{ {{\text{v}},\,{\text{c}},\,{\text{ni}},\,{\text{nd}}} \right\}.$$
(7)

Equation (7) defines the set T *rs as the union of the sets Trs(zi), i ∈ N. In the neighborhood of point zi, the set Trs(zi) is convex and provides a local representation of the technology T under free disposal and returns to scale characterized by s ∈ {v, c, ni, nd}. Since the union of convex sets is not necessarily convex, it follows that T *rs defined in (7) is not necessarily convex for each s ∈ {v, c, ni, nd}. Since the sets Trs(zi) in (6) are convex, it means that the rise of non-convexity in T *rs necessarily comes from the union of the neighborhood-based sets Trs(zi). As discussed below, this provides useful flexibility in investigating a non-convex technology.

Equation (7) is our proposed neighborhood-based representation of technology. It extends previous literature by allowing for non-convexity to arise in any part of the feasible set. Our approach has two points in common with Agrell et al. (2005): 1/we both rely on the fact that unions of convex sets are not necessarily convex; and 2/like Agrell et al.’s approach, our approach can nest FDH as a special case (as shown below). But the convex pair approach proposed by Agrell et al. (2005) did not rely on the neighborhood-based measures used in (7). As such, the neighborhood-based sets Trs(zi) in (7) are specific to our approach. As argued below, our neighborhood-based characterization provides useful flexibility in the characterization of a non-convex technology.

Equation (7) differs from the approaches proposed by Petersen (1990a, b), Bogetoft (1996), or Podinovski (2005), who explored departures from convexity based on inputs and/or outputs. Petersen (1990a, b) and Bogetoft (1996) assume full convexity in the output set or the input set. The selective convexity approach proposed by Podinovski (2005) is more general in the sense that it allows for non-convexity to arise for specific inputs or outputs. By defining non-convexity for all values of selected sets of inputs or outputs, the approaches proposed by Petersen (1990a, b), Bogetoft (1996) or Podinovski (2005) focus on a global characterization of non-convexity. It means that they cannot examine the possible presence of non-convexity in particular subsets of feasible inputs/outputs. As such, they do not allow for a local specification of convexity (Podinovski 2005, p. 556). Our approach does. Indeed, our neighborhood-based approach is flexible enough to allow for non-convexity to arise in any region of the feasible set. As noted above, the non-convexity of T *rs in (7) comes from the union of the neighborhood-based convex sets Trs(zi). This provides useful guidance in the choice of neighborhoods: choose a neighborhood to be “large” in parts of the feasible region that are thought to be convex, but “small” in parts that are thought to be non-convex (see Sect. 6.3 below). Our proposed approach offers a flexible representation of parts of the feasible set that exhibit non-convexity. This local flexibility can apply to specific ranges of values taken by given inputs or outputs (as discussed in Sect. 6). Importantly, this useful property is not shared with the global approaches proposed by Petersen (1990a, b), Bogetoft (1996) or Podinovski (2005). The flexibility can also apply to all values taken by specific netputs (in a way similar to the approach proposed by Podinovski (2005)). To see that, given σ = (σ1, …, σm), note that choosing σj determines how large (or small) a neighborhood Br(z, σ) is for the j-th netput. In this context, the choice of σ = (σ1, …, σm) implies that convexity would apply for the inputs/outputs that have a “large” neighborhood while non-convexity can arise for inputs/outputs that have a “small” neighborhood.

As shown below, T *rs has three useful characteristics: 1/it provides a flexible representation of non-convexity; 2/it nests as (restrictive) special cases both the DEA model and the FDH model; and 3/it is easy to implement empirically.

3 Evaluating non-convexity

Our evaluation of non-convexity of the technology relies on the properties of the representations Ts and T *rs . The following properties will prove useful.

Lemma 1

For s ∈ {v, c, ni, nd}, the set T * rs satisfies

$${ \lim }_{{{\text{r}} \to \infty }} {\text{T}}_{\text{rs}}^{*} = {\text{T}}_{\text{s}} .$$
(8)

Proof

Note that limr→∞ I(z, r) = N for any finite z ∈ Rm. Using Eqs. (2), (6) and (7), it follows that Ts = limr→∞ Trs(zi) = limr→∞ T *rs for any i ∈ N and s ∈ {v, c, ni, nd}.

Lemma 2

For s ∈ {v, c, ni, nd}, the set T * rs satisfies

$${ \lim }_{{{\text{r}} \to 0}} {\text{T}}_{\text{rs}}^{*} = {\text{ T}}_{\text{FDHs}} .$$
(9)

Proof

Note that limr→0 Br(zi, σ) = {zi} and limr→0 I(zi, r) = {i} for any i ∈ N. Using Eq. (6), we have limr→0 Trs(zi) = {z: z ≤ γ zi, γ ∈ Ss}. Eq. (7) can be alternatively written as T *rs  = {Σi∈Ν αi Trs(zi): αi ∈ {0, 1}, i ∈ Ν; Σi∈Ν αi = 1}. Letting ηi = αi γ, this implies that limr→0 T *rs  = {z: z ≤ Σi∈Ν ηi zi; ηi ∈ {0, γ}, i ∈ Ν; Σi∈Ν ηi = γ, γ ∈ Ss}. Using Eq. (4), this gives (9).

Given s ∈ {v, c, ni, nd}, Eqs. (8) and (9) show that T *rs includes two important special cases. From Eq. (8), the set T *rs reduces to the set Ts when r → ∞, i.e., when the neighborhood Br(z, σ) of any z becomes “very large”. And from Eq. (9), the set T *rs reduces to the set TFDHs when r → 0, i.e., when the neighborhood Br(zi, σ) becomes “very small” for any i ∈ N.

Proposition 1

For s ∈ {v, c, ni, nd}, the sets satisfy

$$T_{\text{FDHs}} \subset {\text{T}}_{\text{rs}}^{*} \subset {\text{T}}_{\text{r's}}^{*} \subset {\text{T}}_{\text{s}} ,\quad {\text{for any r'}} > {\text{r}} > 0.$$
(10)

Proof

Note that limr→0 Br(zi, σ) ⊂ Br(zi, σ) ⊂ Br′(zi, σ) ⊂ limr→∞ Br(zi, σ) for any r′ > r > 0. Thus, for any r′ > r > 0, limr→0 I(zi, r) ⊂ I(zi, r) ⊂ I(zi, r′) ⊂ limr→∞ I(zi, r) = N. Then, Eq. (6) implies that limr→0 Trs(zi) ⊂ Trs(zi) ⊂ Tr’s(zi) ⊂ limr→∞ Trs(zi) for any r′ > r > 0 and any i ∈ N. Using Eqs. (7), (8) and (9), this proves (10).

Proposition 1 states that TFDHs is in general a subset of Ts: TFDHs ⊂ Ts, for s ∈ {v, c, ni, nd}. It also establishes that the set T *rs , our neighborhood-based representation of technology, is bounded between TFDHs and Ts, with TFDHs as lower bound and Ts as upper bound. Noting that the set Ts is convex, and the set TFDHs is in general non-convex, it means that T *rs provides a generic way of introducing non-convexity in production analysis. The sets Tv, TFDHv and T *rv are illustrated in Fig. 2 under VRS. Figure 2 shows that these sets satisfy TFDHv ⊂ T *rv  ⊂ Tv. Note that the set Tv is convex, but that the sets T *rv and TFDHv are non-convex. These representations apply under alternative scale properties: under VRS when s = v (with Sv = 1), under CRS when s = c (with Sc = [0, ∞]), under NIRS when s = ni (with Sni = [0, 1]), as well as under NDRS when s = nd (with Snd = [1, ∞]). Finally, Eq. (10) states that the set T *rs becomes larger when r increases, i.e., when the neighborhoods used to evaluate T *rs become larger. As further discussed below, this provides some flexibility in the empirical analysis of non-convexity issues.

Fig. 2 Representations of Technology under Tv, TFDHv and T *rv

4 Productivity under non-convexity

Let g ∈ \({\text{R}}_{ + }^{\text{m}}\) be a reference bundle satisfying g ≠ 0. Following Chambers et al. (1996), consider the directional distance functionFootnote 6

$$\begin{aligned} {\text{D}}\left( {{\text{z}},{\text{ T}}} \right) \, & = \, { \sup }_{\beta } \{ \beta : \, ({\text{z }} + \beta {\text{g}}) \in {\rm T}\} \;{\text{ if there is a scalar }}\beta {\text{ satisfying }}({\text{z }} + \beta {\text{g}}) \in {\rm T}, \\ & = - \infty \;{\text{ otherwise}}. \\ \end{aligned}$$
(11)

The directional distance function is the distance between point z and the upper bound of the technology T, measured in number of units of the reference bundle g. It provides a general measure of productivity. In general, D(z, T) = 0 means that point z is on the frontier of the technology T. Alternatively, D(z, T) > 0 implies that z is technically inefficient (as it is below the frontier),Footnote 7 while D(z, T) < 0 identifies z as being infeasible (as it is located above the frontier). Luenberger (1995) and Chambers et al. (1996) provide a detailed analysis of the properties of D(z, T). First, by definition in (11), z ∈ T implies that D(z, T) ≥ 0 (since β = 0 would then be feasible in (11)), meaning that T ⊂ {z: D(z, T) ≥ 0}. Second, D(z, T) ≥ 0 in (11) implies that (z + D(z, T) g) ∈ T. When the technology T exhibits free disposal, it follows that D(z, T) ≥ 0 implies that z ∈ T, meaning that T ⊃ {z: D(z, T) ≥ 0}. Combining these two properties, we obtain the following result: under free disposal, T = {z: D(z, T) ≥ 0} and D(z, T) provides a complete representation of the technology T. Importantly, besides being convenient, this result is general: it allows for an arbitrary multi-input multi-output technology; and it applies with or without convexity.
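As an illustration of (11), the directional distance function can be evaluated by simple enumeration when T = TFDHv in (3). For each firm i, the largest β satisfying z + β g ≤ zi is the minimum of (zji − zj)/gj over the netputs with gj > 0, provided that zj ≤ zji for every netput with gj = 0; D(z, TFDHv) is then the maximum of these values over i. The Python sketch below follows this reasoning. The function name and the data are hypothetical, and the sketch assumes g ≥ 0 with g ≠ 0.

```python
import math


def dd_fdh_vrs(z, g, data):
    """Directional distance D(z, T_FDHv) by enumeration over observed firms.

    Assumes g >= 0 with at least one strictly positive element.
    Returns -inf when no scalar beta satisfies z + beta * g <= z_i for any firm i.
    """
    best = -math.inf
    for zi in data:
        # Netputs with g_j = 0 cannot be adjusted, so z must already satisfy z_j <= z_ji.
        if any(gj == 0 and zj > zij for zj, gj, zij in zip(z, g, zi)):
            continue
        beta = min((zij - zj) / gj for zj, gj, zij in zip(z, g, zi) if gj > 0)
        best = max(best, beta)
    return best


# Hypothetical netput data: (input, output), with inputs negative and outputs positive.
data = [(-1.0, 1.0), (-2.0, 3.0), (-4.0, 4.0)]
g = (0.0, 1.0)  # reference bundle: one unit of output

d_inefficient = dd_fdh_vrs((-2.0, 2.0), g, data)   # positive: below the frontier
d_infeasible = dd_fdh_vrs((-3.0, 3.5), g, data)    # negative: above the FDH frontier
d_undefined = dd_fdh_vrs((-0.5, 1.0), g, data)     # no firm uses this little input
print(d_inefficient, d_infeasible, d_undefined)
```

The three evaluations illustrate the three cases discussed above: a positive value for a technically inefficient point, a negative value for a point located above the frontier, and −∞ when no scalar β is feasible.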

Using (10) and (11), we obtain the following key result.

Proposition 2

For any point z ∈ R m where D(z, T s ) > −∞, the directional distance function satisfies

$${\text{D}}({\text{z}},{\text{T}}_{\text{FDHs}} ) \le {\text{D}}\left( {{\text{z}},{\text{T}}_{\text{rs}}^{*} } \right) \le {\text{D}}\left( {{\text{z}},{\text{T}}_{\text{r's}}^{*} } \right) \le {\text{D}}\left( {{\text{z}},{\text{T}}_{\text{s}} } \right),\quad {\text{for any r}}' > r > 0,\quad {\text{for s}} \in \{ {\text{v, c, ni, nd}}\}$$
(12)

Proposition 2 shows that D(z, T *rs ) is bounded between D(z, TFDHs) and D(z, Ts), with D(z, TFDHs) as lower bound and D(z, Ts) as upper bound. When s = v, Eq. (12) implies that DEA (relying on Tv) is more likely to find evidence of technical inefficiency than FDH. This is illustrated in Fig. 1, which shows that the production frontier tends to be higher under DEA compared to FDH. With s ∈ {v, c, ni, nd}, Eq. (12) shows that this result applies under alternative characterizations of returns to scale. It also shows that D(z, T *rs ) tends to increase with r, where T *rs is our neighborhood-based representation of technology given in (7). Finally, as discussed next, Proposition 2 provides a basis to evaluate the role of non-convexity in productivity analysis.

Definition 2

At point z, define the following measure of non-convexity

$${\text{C}}_{\text{rs}} \left( {\text{z}} \right) \equiv {\text{D}}\left( {{\text{z}},{\text{ T}}_{\text{s}} } \right) \, - {\text{ D}}\left( {{\text{z}},{\text{ T}}_{\text{rs}}^{*} } \right),\quad {\text{for s}} \in \left\{ {{\text{v}},{\text{ c}},{\text{ ni}},{\text{ nd}}} \right\}.$$
(13)

Proposition 3

At point z where D(z, T v ) > −∞,

$$\lim_{{{\text{r}} \to 0}} {\text{C}}_{\text{rs}} ({\text{z}}) \ge {\text{C}}_{\text{rs}} ({\text{z}}) \ge {\text{C}}_{\text{r's}} ({\text{z}}) \ge \lim_{{{\text{r}} \to \infty }} {\text{C}}_{\text{rs}} ({\text{z}}) = 0,{\text{for any r'}} > {\text{r}} > 0,\quad {\text{for s}} \in \{ {\text{v, c, ni, nd}}\} .$$
(14)

Proof

The inequalities in (14) are obtained from combining (12) and (13), and using Eqs. (8) and (9).

Proposition 3 applies under alternative characterizations of returns to scale: under VRS (when s = v), CRS (when s = c), NIRS (when s = ni), as well as NDRS (when s = nd). Equation (13) defines Crs(z) as a measure of non-convexity, evaluated in number of units of the reference bundle g. From Eq. (14), this measure is always non-negative: Crs(z) ≥ 0. Equation (14) states that limr→∞ Crs(z) = 0. This is intuitive: DEA assumes convexity and does not provide any opportunity to uncover the presence of non-convexity. It means that the search for non-convexity must rely on the case where r < ∞. Then, for a given r < ∞, finding Crs(z) > 0 at some point z implies that the set T *rs is non-convex. In addition, (14) states that limr→0 Crs(z) is an upper bound measure for Crs(z). This reflects the fact that, under free disposal, FDH offers the greatest prospects to uncover non-convexity. Finally, Eq. (14) shows that Crs(z) tends to decrease with r, indicating that the opportunity to uncover non-convexity declines with the size of the neighborhoods used to evaluate T *rs . The effects of r on the evaluation of non-convexity are further discussed below.
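The two limiting cases in (14) can be illustrated numerically. The Python sketch below evaluates limr→0 Crv(z) = D(z, Tv) − D(z, TFDHv) for a single-input, single-output technology with g = (0, 1). It is only a sketch under stated assumptions: the data and function names are hypothetical, and the enumeration of pairs of firms used for D(z, Tv) is a shortcut that applies to this two-netput case only (the underlying linear program has two constraints, so an optimal basic solution puts positive weight on at most two firms).

```python
import math
from itertools import combinations_with_replacement


def dd_fdh_vrs(z, pts):
    """D(z, T_FDHv) for netputs (x, y) with g = (0, 1)."""
    x, y = z
    return max((yi - y for xi, yi in pts if xi >= x), default=-math.inf)


def dd_dea_vrs(z, pts, tol=1e-12):
    """D(z, T_v) for netputs (x, y) with g = (0, 1), by enumerating pairs of firms."""
    x, y = z
    best = -math.inf
    for (xa, ya), (xb, yb) in combinations_with_replacement(pts, 2):
        # Candidate weights on firm a: the two endpoints and the binding input constraint.
        cands = [0.0, 1.0]
        if xa != xb:
            cands.append((x - xb) / (xa - xb))
        for lam in cands:
            feasible = lam * xa + (1 - lam) * xb >= x - tol
            if -tol <= lam <= 1 + tol and feasible:
                best = max(best, lam * ya + (1 - lam) * yb - y)
    return best


# Hypothetical netput data: (input, output), with inputs negative and outputs positive.
data = [(-1.0, 1.0), (-2.0, 1.5), (-3.0, 3.0), (-5.0, 6.0)]
z = data[1]

d_dea = dd_dea_vrs(z, data)    # r -> infinity: convex (DEA) frontier
d_fdh = dd_fdh_vrs(z, data)    # r -> 0: FDH frontier
c_upper = d_dea - d_fdh        # upper bound on C_rv(z) from (14)
print(d_dea, d_fdh, c_upper)
```

For these hypothetical data, the second firm is on the FDH frontier (a distance of 0) but lies 0.75 units of output below the DEA frontier, so that Crv(z) can range from 0 to 0.75 depending on r.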

5 Evaluating returns to scale

Since our analysis applies under alternative scale characterization, it can also be used to investigate returns to scale. While evaluating scale efficiency is well known under convexity (e.g., Färe et al. 1994; Banker et al. 2004), this section explores how this can be done under non-convexity.

Proposition 4

The sets satisfy

$${\text{T}}_{\text{rv}}^{*} \subset {\text{T}}_{\text{rni}}^{*} \subset {\text{T}}_{\text{rc}}^{*} ,$$
(15a)
$${\text{T}}_{\text{rv}}^{*} \subset {\text{T}}_{\text{rnd}}^{*} \subset {\text{T}}_{\text{rc}}^{*} .$$
(15b)

Proof

We have seen that Trv(z) ⊂ Trni(z) ⊂ Trc(z) and Trv(z) ⊂ Trnd(z) ⊂ Trc(z). Then, (15a) and (15b) follow from (7).

Definition 3

At point z, define the following measure of scale efficiency

$${\text{SE}}_{\text{rs}} \left( {\text{z}} \right) \equiv {\text{D}}\left( {{\text{z}},{\text{ T}}_{\text{rc}}^{*} } \right) \, - {\text{ D}}\left( {{\text{z}},{\text{ T}}_{\text{rs}}^{*} } \right),\quad {\text{for s}} \in \left\{ {{\text{v}},{\text{ c}},{\text{ ni}},{\text{ nd}}} \right\}.$$
(16)

Proposition 5

At point z where D(z, T v ) > −∞, the scale efficiency measures SE rs (z) satisfy

$${\text{SE}}_{\text{rv}} \left( {\text{z}} \right) \, \ge {\text{ SE}}_{\text{rni}} \left( {\text{z}} \right) \, \ge \, 0,$$
(17a)
$${\text{SE}}_{\text{rv}} \left( {\text{z}} \right) \, \ge {\text{ SE}}_{\text{rnd}} \left( {\text{z}} \right) \, \ge \, 0.$$
(17b)

Proof

Equations (11), (15a) and (15b) imply that \({\text{D}}\left( {{\text{z}},{\text{ T}}_{\text{rc}}^{*} } \right) \, \ge {\text{ D}}\left( {{\text{z}},{\text{ T}}_{\text{rni}}^{*} } \right) \, \ge {\text{ D}}\left( {{\text{z}},{\text{ T}}_{\text{rv}}^{*} } \right),{\text{ and D}}\left( {{\text{z}},{\text{ T}}_{\text{rc}}^{*} } \right) \, \ge {\text{ D}}\left( {{\text{z}},{\text{ T}}_{\text{rnd}}^{*} } \right) \, \ge {\text{ D}}\left( {{\text{z}},{\text{ T}}_{\text{rv}}^{*} } \right)\). Using (16), this gives (17a) and (17b).

Equation (16) defines SErs(z) as a measure of departure from CRS, evaluated in number of units of the reference bundle g. From Eqs. (17a)–(17b), evaluated under VRS (with s = v), the measure is always non-negative: SErv(z) ≥ 0. This is intuitive: it follows from the fact that the set T *rc is always at least as large as T *rv , as stated in (15a)–(15b). In addition, (17a) states that, under NIRS (with s = ni), SErni(z) is also non-negative but has SErv(z) as an upper bound. This follows from the fact that the set T *rni is always at least as large as T *rv but never larger than T *rc , as stated in (15a). And (17b) establishes a similar result under NDRS (with s = nd): SErnd(z) is non-negative but has SErv(z) as an upper bound. This shows how SErs(z) in equation (16) provides a basis to measure scale efficiency under non-convexity. Indeed, finding SErs(z) > 0 at point z implies that the set T *rs exhibits a departure from CRS and that point z is scale inefficient. The effects of r on the evaluation of scale efficiency will be evaluated below.
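As a simple illustration of (16), consider the limiting case r → 0, where Lemma 2 implies that SErv(z) reduces to D(z, TFDHc) − D(z, TFDHv). With a single input, a single output and g = (0, 1), both terms have closed forms. The Python sketch below relies on these assumptions; the function names and the data are hypothetical, and the CRS formula requires strictly negative input netputs and non-negative outputs.

```python
import math


def dd_fdh_vrs(z, pts):
    """D(z, T_FDHv) for netputs (x, y) with g = (0, 1)."""
    x, y = z
    return max((yi - y for xi, yi in pts if xi >= x), default=-math.inf)


def dd_fdh_crs(z, pts):
    """D(z, T_FDHc) for netputs (x, y) with g = (0, 1).

    Each firm i is rescaled by delta = x / x_i, the largest factor keeping the
    input requirement satisfied (assumes x < 0, x_i < 0 and y_i >= 0).
    """
    x, y = z
    return max((x / xi) * yi - y for xi, yi in pts)


def scale_efficiency_fdh(z, pts):
    """Limiting case (r -> 0) of SE_rv(z) in (16)."""
    return dd_fdh_crs(z, pts) - dd_fdh_vrs(z, pts)


# Hypothetical netput data: (input, output), with inputs negative and outputs positive.
data = [(-1.0, 1.0), (-2.0, 3.0), (-4.0, 4.0)]

se_large = scale_efficiency_fdh(data[2], data)   # largest firm
se_mid = scale_efficiency_fdh(data[1], data)     # firm with the highest output per unit of input
print(se_large, se_mid)
```

In this example, the firm with the highest average product is scale efficient (a measure of 0), while the largest firm exhibits a positive departure from CRS.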

6 Empirical assessment

Consider a data set involving observations of m netputs chosen by N firms: {zi = (z1i, …, zmi): i ∈ N}, where zji is the j-th netput used by the i-th firm. As suggested in propositions 2–5, we want to find some convenient way to solve for the directional distance function D(z, T) under alternative representations of the technology T.

6.1 Empirical evaluation of directional distance functions

This section examines empirical applications using the data {zi = (z1i, …, zmi): i ∈ N}. First consider the optimization problem (11) under Ts in (2), where s ∈{v, c, ni, nd}, Sv = 1, Sc = [0, ∞], Sni = [0, 1] and Snd = [1, ∞]. For each s ∈{v, c, ni, nd} and assuming that a solution exists, this gives the standard linear programming (LP) problems: D(z, Ts) = maxβ {β: z + β g ≤ Σi∈Ν λi zi; λi ∈ R+, i ∈ Ν, Σi∈Ν λi ∈ Ss}. In all these cases, convexity is imposed. Second, consider the optimization problem (11) under TFDHs in (4) for s ∈{v, c, ni, nd}. Assuming that a solution exists, this gives D(z, TFDHs) = maxβ {β: z + β g ≤ Σi∈Ν λi zi; λi ∈ {0, δ}, i ∈ Ν; Σi∈Ν λi = δ; δ ∈ Ss}, which is a mixed integer linear programming (MILP) problem for s = v (where Sv = 1), but a mixed integer nonlinear programming (MINLP) problem for s ∈{c, ni, nd}.Footnote 8

Below, we explore how to solve (11) under T *rs , the neighborhood-based representation of technology given in (7). Note that Eq. (7) can be alternatively written as

$${\text{T}}_{\text{rs}}^{*} = \{ \sum\limits_{{{\text{j}} \in N}} {\upalpha_{\text{j}} {\text{T}}_{\text{rs}} \left( {{\text{z}}_{\text{j}} } \right);\upalpha_{\text{j}} \in \left\{ {0,{ 1}} \right\},{\text{j}} \in N;} \sum\limits_{{{\text{j}} \in N}} {\upalpha_{\text{j}} = { 1}\} } ,$$
(18)

for s ∈{v, c, ni, nd}. Let λij be the weight λi associated with z = zj in (7). Letting ηij = αj λij, it follows from (6), (11) and (18) that

$$\begin{aligned} {\text{D}}\left( {{\text{z}},{\text{ T}}_{\text{rs}}^{*} } \right) \, & = {\text{Max}}_{{\upbeta,\uplambda,\upeta,\upalpha}} \{\upbeta: \, ({\text{z}} +\upbeta{\text{g}}) \le \sum\limits_{{{\text{j}} \in {\rm N}}} {\sum\limits_{{{\text{i}} \in I({\text{zj}},{\text{r}})}} {\upeta_{\text{ij}} {\text{z}}_{\text{i}} :\upeta_{\text{ij}} =\upalpha_{\text{j}}\uplambda_{\text{ij}} ,\uplambda_{\text{ij}} \in {\text{R}}_{ + } ,} } \\ &\quad \sum\limits_{{{\text{i}} \in I({\text{zj}},{\text{r}})}} {\uplambda_{\text{ij}} \in {\text{S}}_{\text{s}} ,\upalpha_{\text{j}} \in \left\{ {0,{ 1}} \right\},} \sum\limits_{{{\text{j}} \in {\rm N}}} { \upalpha_{\text{j}} = { 1},{\text{i}} \in I\left( {{\text{z}}_{\text{j}} ,{\text{ r}}} \right),{\text{j}} \in N\} {\text{ if a solution exists}},} \\ & = \, - \infty {\text{otherwise}}, \\ \end{aligned}$$
(19)

for s ∈ {v, c, ni, nd}. Equation (19) is a MINLP problem. Solving it numerically can provide a way to assess the directional distance functions D(z, T *rs ) for s ∈ {v, c, ni, nd}.

Yet, dealing with non-linear constraints in (19) can be empirically challenging. In this context, alternative formulations that avoid non-linear constraints are of interest. One such formulation is the following optimization problem

$$\begin{aligned} D^{+}(z, T_{rs}^{*}) &= \max_{\beta,\eta,\alpha} \Big\{ \beta : z + \beta g \le \sum_{j \in N} \sum_{i \in I(z_j, r)} \eta_{ij} z_i ;\ \eta_{ij} \in \mathbb{R}_{+},\ \sum_{i \in I(z_j, r)} \eta_{ij} \in \alpha_j S_s, \\ &\quad \alpha_j \in \{0, 1\},\ \sum_{j \in N} \alpha_j = 1,\ i \in I(z_j, r),\ j \in N \Big\} \quad \text{if a solution exists}, \\ &= -\infty \quad \text{otherwise}, \end{aligned}$$
(20)

for s ∈ {v, c, ni, nd}. Equation (20) is a MILP problem. Because it does not include the nonlinear restrictions ηij = αj λij, solving (20) is simpler than solving (19). But the absence of the restrictions ηij = αj λij in (20) implies that D+(z, T *rs ) is in general an upper bound to D(z, T *rs ): D+(z, T *rs ) ≥ D(z, T *rs ). When would the two objective functions coincide? They would coincide (with D+(z, T *rs ) = D(z, T *rs )) when the solution to (20), (η*, α*), satisfies η *ij  = 0 for all i when α *j  = 0, j ∈ N. Otherwise, they would differ, and D+(z, T *rs ) would be strictly larger than D(z, T *rs ): D+(z, T *rs ) > D(z, T *rs ). In this latter case, solving the simpler problem (20) would provide upward biased estimates of D(z, T *rs ).
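Once a solution (η*, α*) of (20) is available, the coincidence condition stated above can be checked mechanically. The sketch below assumes the solution is stored in two dictionaries. The function name `relaxation_is_tight`, the storage layout and the numerical tolerance are illustrative assumptions.

```python
def relaxation_is_tight(eta, alpha, tol=1e-9):
    """Check whether eta*_ij = 0 for all i whenever alpha*_j = 0.

    eta   : dict mapping (i, j) -> eta*_ij from a solution of (20)
    alpha : dict mapping j -> alpha*_j, expected to be 0 or 1
    Hypothetical helper; tol absorbs solver round-off.
    """
    return all(abs(value) <= tol
               for (i, j), value in eta.items()
               if round(alpha[j]) == 0)

alpha = {1: 0, 2: 1}
tight = relaxation_is_tight({(1, 1): 0.0, (2, 2): 1.0}, alpha)
loose = relaxation_is_tight({(1, 1): 0.3, (2, 2): 0.7}, alpha)
```

When the check fails, the value obtained from (20) should be read as an upper bound on D(z, T *rs ) rather than as the distance itself.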

6.2 Linear programming formulation

Given the potential empirical difficulties in solving the nonlinear optimization problem (19), we now explore a simpler way to evaluate D(z, T *rs ) in (19). From (7), note that T *rs is defined from Trs(zj), j ∈ N. This suggests obtaining D(z, T *rs ) using the following two-step approach.

In step one, solve (11) under Trs(z′) in (6). For s ∈{v, c, ni, nd}, this corresponds to the (primal) linear programming (LP) problem

$$\begin{aligned} D(z, T_{rs}(z')) &= \max_{\beta,\lambda} \Big\{ \beta : z + \beta g \le \sum_{i \in I(z', r)} \lambda_i z_i ;\ \lambda_i \in \mathbb{R}_{+},\ i \in I(z', r);\ \sum_{i \in I(z', r)} \lambda_i \in S_s \Big\} \quad \text{if a solution exists}, \\ &= -\infty \quad \text{otherwise} \end{aligned}$$
(21)

or its dual LP formulation

$$\begin{aligned} D(z, T_{rs}(z')) &= \min_{u, v} \big\{ v - z^{T} u : z_j^{T} u \le v,\ j \in I(z', r);\ g^{T} u = 1;\ u \in \mathbb{R}_{+}^{m};\ v \in V_s \big\} \quad \text{if a solution exists}, \\ &= -\infty \quad \text{otherwise,} \end{aligned}$$
(21’)

where u and v are the Lagrange multipliers associated with the constraints [(z + β g) ≤ Σi∈I(z′,r) λi zi] and [Σi∈I(z′,r) λi ∈ Ss] in (21), with Vv = [−∞, ∞], Vc = 0, Vni = [0, ∞] and Vnd = [−∞, 0].
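To see where the dual comes from, a sketch of the Lagrangian argument for the VRS case (s = v, where Σi∈I(z′,r) λi = 1) may help. With u ≥ 0 attached to the netput constraint and v attached to the adding-up constraint, the Lagrangian of (21) can be written as

$$\begin{aligned} L(\beta, \lambda, u, v) &= \beta + u^{T}\Big(\sum_{i \in I(z', r)} \lambda_i z_i - z - \beta g\Big) + v\Big(1 - \sum_{i \in I(z', r)} \lambda_i\Big) \\ &= (v - z^{T} u) + \beta\,(1 - g^{T} u) + \sum_{i \in I(z', r)} \lambda_i\,(z_i^{T} u - v). \end{aligned}$$

Maximizing over β ∈ R is bounded only if gT u = 1, and maximizing over λi ≥ 0 is bounded only if ziT u ≤ v for all i ∈ I(z′, r). What remains is v − zT u, which is then minimized over (u, v). Under the other returns-to-scale assumptions, only the sign restriction on v changes: v is unrestricted when the adding-up constraint is an equality, v ≥ 0 when Σ λi ≤ 1, v ≤ 0 when Σ λi ≥ 1, and v = 0 when the constraint is absent.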

Then, in step two, assuming that D(z, Trs(zi)) > −∞ for some i ∈ N, and using (18), D(z, T *rs ) can be obtained as

$${\text{D}}\left( {{\text{z}},{\text{ T}}_{\text{rs}}^{*} } \right) \, = {\text{Max}}_{\text{i}} \{ {\text{D}}\left( {{\text{z}},{\text{ T}}_{\text{rs}} \left( {{\text{z}}_{\text{i}} } \right)} \right):{\text{ i}} \in N\} .$$
(22)

In this two-step approach, step one involves solving the linear programming (LP) problems in (21) or (21’), one for each zj, j ∈ N. Step two, stated in (22), is a simple maximization over the resulting values. This shows how (21)–(22) can be used to obtain D(z, T *rs ) by solving simple linear programming problems. This provides a convenient way to solve (11) under T *rs , our neighborhood-based representation of technology given in (7).
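The two-step procedure can be sketched in a few lines. The example below is a minimal illustration that assumes NumPy and SciPy are available and uses `scipy.optimize.linprog` as the LP solver. The function names, the box-shaped neighborhoods passed as `half_width`, and the toy data are assumptions made for illustration and are not the authors' implementation.

```python
import numpy as np
from scipy.optimize import linprog

# (lower, upper) bounds on sum(lambda) implied by S_v, S_c, S_ni, S_nd.
S = {"v": (1.0, 1.0), "c": (0.0, np.inf), "ni": (0.0, 1.0), "nd": (1.0, np.inf)}

def local_distance(z, g, Zi, s="v"):
    """Step one, LP (21): D(z, T_rs(z')) from the neighbors Zi (n x m)."""
    n = Zi.shape[0]
    # Variables x = (beta, lambda_1, ..., lambda_n); linprog minimizes -beta.
    c = np.r_[-1.0, np.zeros(n)]
    # z + beta*g <= sum_i lambda_i z_i  <=>  beta*g - Zi^T lambda <= -z
    A = [np.c_[np.asarray(g, float), -Zi.T]]
    b = [-np.asarray(z, float)]
    lo, hi = S[s]
    ones = np.r_[0.0, np.ones(n)][None, :]
    if np.isfinite(hi):            # sum(lambda) <= hi
        A.append(ones); b.append([hi])
    if lo > 0:                     # sum(lambda) >= lo
        A.append(-ones); b.append([-lo])
    res = linprog(c, A_ub=np.vstack(A), b_ub=np.hstack(b),
                  bounds=[(None, None)] + [(0, None)] * n, method="highs")
    # Any non-optimal status is treated here as "no solution".
    return -res.fun if res.status == 0 else -np.inf

def neighborhood_distance(z, g, Z, half_width, s="v"):
    """Step two, (22): maximum of the local distances over z_j, j in N."""
    best = -np.inf
    for zj in Z:
        # I(z_j, r): observations inside the box |z_i - z_j| <= half_width
        idx = np.all(np.abs(Z - zj) <= half_width, axis=1)
        best = max(best, local_distance(z, g, Z[idx], s))
    return best

# Toy netputs (output, -input) with a non-convex frontier.
Z = np.array([[1., -1.], [2., -2.], [6., -3.], [7., -4.]])
g = np.array([1., 0.])
d_small = neighborhood_distance(Z[1], g, Z, np.array([1.5, 1.0]))
d_large = neighborhood_distance(Z[1], g, Z, np.array([10., 10.]))
```

With the large box every neighborhood contains all four points, so `d_large` reproduces the DEA value under VRS (1.5 for the second observation). With the small box the second observation is only compared with its immediate neighbors and is found efficient (`d_small` is 0).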

6.3 Defining the neighborhood Br(z, σ)

As discussed in Sect. 2, our analysis relies on the definition of a neighborhood Br(z, σ) = {z′: Dp(z, z′) ≤ r; z′ ∈ Rm} ⊂ Rm, where Dp(z, z′) is a weighted Minkowski distance with 1 ≤ p < ∞. Below, it will be convenient to rely on a weighted Chebyshev distance defined as limp→∞ Dp(z, z′) = Maxj {|zj − zj′|/σj: j = 1, …, m}. In this context, Br(z, σ) can be written as Br(z, σ) = {z′: −r σj ≤ zj − zj′ ≤ r σj; j = 1, …, m; z′ ∈ Rm} and I(z, r) can be written as I(z, r) = {i: −r σj ≤ zj − zji ≤ r σj; j = 1, …, m; i ∈ N}.
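The definitions above translate directly into code. The sketch below is illustrative only. The function names are hypothetical, and the Minkowski form, in which each gap is divided by σj before being raised to the power p, is an assumption chosen to be consistent with the Chebyshev limit stated above.

```python
def chebyshev(z, z_prime, sigma):
    """Weighted Chebyshev distance max_j |z_j - z'_j| / sigma_j."""
    return max(abs(a - b) / s for a, b, s in zip(z, z_prime, sigma))

def minkowski(z, z_prime, sigma, p):
    """Weighted Minkowski distance (assumed form, see the lead-in)."""
    total = sum((abs(a - b) / s) ** p for a, b, s in zip(z, z_prime, sigma))
    return total ** (1.0 / p)

def index_set(z, data, r, sigma):
    """I(z, r): indices i of observations z_i with chebyshev(z, z_i) <= r."""
    return [i for i, zi in enumerate(data) if chebyshev(z, zi, sigma) <= r]

# Toy netputs (output, -input).
data = [(1.0, -1.0), (2.0, -2.0), (6.0, -3.0), (7.0, -4.0)]
sigma = (1.5, 1.0)
near_second = index_set(data[1], data, r=1.0, sigma=sigma)
gap = abs(minkowski(data[1], data[2], sigma, p=50)
          - chebyshev(data[1], data[2], sigma))
```

Here `near_second` lists the observations that fall in the box around the second data point, and `gap` illustrates how quickly the Minkowski distance approaches the Chebyshev distance as p grows.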

Below, we discuss general rules that can be used in choosing this neighborhood. Sometimes, we may have a priori information about the regions where non-convexity is likely to arise. Assume that one of these regions is region A(z) around point z. In general we want to choose the neighborhood Br(z, σ) to be no larger than A(z). Indeed, choosing Br(z, σ) ⊃ A(z) may just “hide” the non-convexity in A(z) within the larger region Br(z, σ). This generates the following rule:

Rule R1

Around point z, choose a neighborhood Br(z, σ) that is no larger than the region A(z) where non-convexity is suspected: Br(z, σ) ⊂ A(z).

Rule R1 assumes that we do have a priori information about the presence of non-convexity. This a priori information can come from theoretical considerations. For example, the presence of fixed cost is a well-known source of non-convexity. It means that non-convexity can be expected in any region of the feasible set where “fixed resources” are being used. This can include labor or management (e.g., “fixed” labor or management wasted in the process of switching between tasks) as well as capital (e.g., “fixed” machinery, equipment or infrastructure used in the production process). This could also include “resource fixity” on the output side (e.g., for perishable products).

What if we do not have the a priori information stipulated in rule R1? Then we need to find other ways to identify the neighborhood Br(z, σ). In this context, we can use the data to help choose these neighborhoods. To see that, let Mj ≡ [Maxi∈N {zji} − Mini∈N {zji}] be the range of observations for zj, j = 1, …, m. For the j-th netput, consider partitioning the line [Mini∈N {zji}, Maxi∈N {zji}] into k intervals, j = 1, …, m, where k is an integer satisfying 1 ≤ k ≤ N. One way is to choose these intervals to be equally spaced. Footnote 9 Then, for the j-th netput, the width of an interval is Mj/k. Given Br(z, σ) = {z′: −r σj ≤ zj − zj′ ≤ r σj; j = 1, …, m; z′ ∈ Rm}, associate these intervals with a neighborhood of point z by letting r σj = Mj/k, k being a positive integer, j = 1, …, m. For a given k, it follows that the neighborhood of z can be written as Br(z, ·) = {z′: −Mj/k ≤ zj − zj′ ≤ Mj/k; j = 1, …, m; z′ ∈ Rm}. When z and z′ are points within the range of the data, then choosing k = 1 implies that Br(z, ·) is a “large neighborhood” of z which includes all data points. And choosing k > 1 means that we partition the range of each netput into k equally spaced intervals, the neighborhood Br(z, ·) of z becoming smaller as k becomes larger.
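The data-driven construction r σj = Mj/k is easy to compute. The following sketch, with hypothetical function names and toy data, shows how the number of observations in each neighborhood shrinks as k increases.

```python
def half_widths(data, k):
    """r * sigma_j = M_j / k, where M_j is the observed range of netput j."""
    columns = list(zip(*data))
    return [(max(col) - min(col)) / k for col in columns]

def neighbors(z, data, widths):
    """Indices i with |z_j - z_ji| <= M_j / k for every netput j."""
    return [i for i, zi in enumerate(data)
            if all(abs(a - b) <= w for a, b, w in zip(z, zi, widths))]

# Toy netputs (output, -input).
data = [(1.0, -1.0), (2.0, -2.0), (6.0, -3.0), (7.0, -4.0)]

# Number of observations in the neighborhood of each data point, by k.
sizes = {k: [len(neighbors(z, data, half_widths(data, k))) for z in data]
         for k in (1, 2, 4)}
```

With k = 1 every neighborhood contains all four observations. In this toy data set each neighborhood shrinks to two observations at k = 2 and to the data point itself at k = 4.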

Next, we propose the following rule to guide us in the choice of neighborhoods.

Rule R2

Around point z, choose a neighborhood Br(z, σ) that includes more than one data point.

R2 has important implications. First, it implies that point z cannot be outside the range of the data. That is intuitive: in any analysis, we should always try to avoid extrapolating beyond the data. Second, Rule R2 requires that there are sufficient data points to support the analysis. It hints that the number of observations N should be “large enough” to provide credible evidence on non-convexity in the neighborhood of point z. Third, R2 rules out FDH. Indeed, from Eq. (9) in Lemma 2, FDH is obtained when r → 0, implying that the neighborhood of any point zj would include just the point zj. This would be inconsistent with R2. As discussed in Sect. 2, the FDH approach seems undesirable as it can find evidence of non-convexity “too often”. Intuitively, R2 stresses the importance of having a minimal number of observations (more than 1) to evaluate the characteristics of technology in any neighborhood within the data. As such, R2 can help improve the credibility of finding evidence that a technology is non-convex. Fourth, Rule R2 puts some upper bound on the number of intervals k discussed above. Indeed, increasing k would also reduce the number of observations in each interval. Again, to be credible, evidence of non-convexity in the neighborhood of point z should rely on a sufficient number of data points. Overall, Rule R2 implies that the number of observations N should be “large enough” while the number of intervals should “not be too large”. As such, it provides useful guidance to support productivity analysis under non-convexity.
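One possible way to operationalize Rule R2 is to keep only those values of k for which every data point has more than one observation in its neighborhood. The sketch below follows that reading. It is an assumption about how the rule might be applied, not a procedure stated in the paper, and the function names and the `min_points` threshold are hypothetical.

```python
import numpy as np

def min_neighborhood_size(Z, k):
    """Smallest number of observations in any box |z_i - z_j| <= M / k."""
    widths = (Z.max(axis=0) - Z.min(axis=0)) / k
    # inside[j, i] is True when observation i lies in the box around z_j.
    inside = np.all(np.abs(Z[:, None, :] - Z[None, :, :]) <= widths, axis=2)
    return int(inside.sum(axis=1).min())

def largest_k_under_r2(Z, candidates, min_points=2):
    """Largest candidate k whose neighborhoods all hold >= min_points points."""
    ok = [k for k in candidates if min_neighborhood_size(Z, k) >= min_points]
    return max(ok) if ok else None

# Toy netputs (output, -input).
Z = np.array([[1., -1.], [2., -2.], [6., -3.], [7., -4.]])
k_max = largest_k_under_r2(Z, candidates=(1, 2, 4, 6))
```

Such a bound only rules out values of k that are clearly too large. Choosing among the remaining values still requires judgment about how many observations are needed for credible evidence.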

7 Empirical illustration

To illustrate the usefulness of our proposed approach, we apply it to a data set on production activities from a sample of Korean farm households.

7.1 Data

The data were collected in 2007 in a Farm Household Economy Survey conducted by the Korean National Statistical Office. Our analysis focuses on a sample of farms classified as paddy rice farms located in the Jeon-Nam province, a rice-producing province in the southern part of Korea. Being in the same region, all farms face similar agro-climatic conditions. The sample includes 122 rice farms. It provides data on ten outputs: rice, vegetable, soybean, fruit, potato, barley, miscellaneous, specialty, livestock, and others; and four inputs: labor, size of paddy land, size of upland, and other inputs. Labor input is measured in hours, and land inputs are measured in hectares (ha). Other netputs are measured in value, assuming that all farmers face the same prices.

Descriptive statistics on the variables used in our analysis are presented in Table 1. The average revenue from rice production is 15,398.81 (measured in 1,000 won; Footnote 10), accounting for 62.7 % of total farm revenue. The second largest source of revenue is vegetable production: 3,608.15 (measured in 1,000 won), accounting for 14.7 % of total farm revenue. The average size of a farm is 1.31 ha (including both paddy land and upland).

Table 1 Descriptive statistics

7.2 Results

Our analysis uses data on production activities from our sample of 122 Korean farms. It covers 14 netputs: 10 outputs treated as positive, and 4 inputs treated as negative. For the i-th farm, the netputs are zi = (zji: j = 1, …, 14), i ∈ N ≡ {1, 2, …, 122}.

The estimation of the directional distance function in (11), (19) or (21)–(22) produces a nonparametric estimate of the distance between point z and the boundary of the feasible set, as measured by the number of units of the reference bundle g. When z is the netput vector for the i-th farm, then the distance function D(zi, T) ≥ 0 provides a measure of technical inefficiency for the i-th farm, with D(zi, T) > 0 when the i-th farm is technically inefficient. The reference bundle g = (g1, …, g14) is chosen as follows. We let gj = 0 when j is an input, and gj = sample mean for the j-th output when j is an output. Thus, our reference bundle g = (g1, …, g14) is the typical bundle associated with the outputs of an average farm. This choice leads to a simple interpretation of our directional distance estimates. For example, for a given T, finding that D(zi, T) = 0.2 would mean that the i-th farm is technically inefficient: by becoming technically efficient, it could move to the production frontier and increase its outputs by a maximum of 20 percent of the average outputs in our sample. Note that this interpretation remains valid under alternative characterizations of the technology T.
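The construction of the reference bundle and the interpretation of a distance estimate can be illustrated with a few lines of code. The data below are made up for illustration (two outputs and one input for three hypothetical farms), and the function name `reference_bundle` is an assumption.

```python
def reference_bundle(data, is_output):
    """g_j = sample mean of netput j if j is an output, and 0 if j is an input."""
    n = len(data)
    return [sum(row[j] for row in data) / n if out else 0.0
            for j, out in enumerate(is_output)]

# Hypothetical netputs: two outputs (positive) and one input (negative).
data = [(10.0, 4.0, -3.0), (20.0, 0.0, -5.0), (30.0, 2.0, -4.0)]
g = reference_bundle(data, is_output=(True, True, False))

# A distance estimate of 0.2 corresponds to an output expansion of 0.2 * g,
# i.e. 20 percent of the sample-average level of each output.
expansion = [0.2 * gj for gj in g]
```

Because gj = 0 for inputs, the expansion leaves input use unchanged and scales with the average output levels, which is what makes the estimates comparable across farms.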

We evaluate the directional distance function D(zj, T) in (11) for each farm under alternative representations of the technology. First, we start with DEA analysis and solve for D(zj, T) under technologies Tv under VRS and Tc under CRS (as given in Eqs. (1) and (2)). Second, using TFDHv in (3), we obtain FDH measures D(zj, TFDHv) under VRS technology by solving the corresponding mixed integer programming problems. The results are reported in the “Appendix” for each farm. Since our neighborhood-based representation of technology allows for non-convexity to arise in any part of the feasible set, it can provide a basis to evaluate productivity and non-convexity for different firm types. We investigate this issue for three categories of farms: small farms, medium farms, and large farms. Footnote 11 The results are summarized in Table 2. Table 2 presents the average technical inefficiency estimates D(zj, T) for each group of farms under alternative representations of the technology. It shows that DEA finds evidence of technical inefficiency across all farm sizes. The mean value of D(zj, Tv) is 0.063 for small farms, 0.159 for medium farms, and 0.119 for large farms. Table 2 also reports that FDH finds that all farms are technically efficient, with D(zj, TFDHv) = 0 for all j = 1, …, 122. Note that this is consistent with Proposition 2, which showed that DEA (relying on Tv) is more likely to find evidence of technical inefficiency than FDH (as the production frontier tends to be higher under DEA compared to FDH). But in this case, allowing for non-convexity under FDH eliminates all evidence of technical inefficiency. This has two implications. First, there can be a large difference between the DEA measure of technical inefficiency D(zj, Tv), and its FDH counterpart D(zj, TFDHv). Second, this difference is due entirely to relaxing the convexity assumption. One must wonder whether this difference is “credible”. As discussed in Sect. 2, this raises the question: Does the FDH approach find non-convexity “too often”? We believe that it does (as further discussed below).

Table 2 Average technical inefficiency D(z, T) and non-convexity C(z) under alternative representations of the technology, by farm size

Next, using the neighborhood-based representation of technology T *rs in (7) or (18), we obtain estimates of the directional distance D(zj, T *rs ) by solving the linear programming problems in (21)–(22). In the absence of strong a priori information about where non-convexity may arise, we define the neighborhoods Br(z, σ) as follows. Assuming equally spaced intervals, we let r σj = Mj/k, and define Br(z, ·) = {z′: −Mj/k ≤ zj − zj′ ≤ Mj/k; j = 1, …, m; z′ ∈ Rm} as the neighborhood of z, where Mj ≡ [Maxi∈N {zji} − Mini∈N {zji}] and k denotes the number of intervals within the data range. The set T *rv in (7) is then defined accordingly. The analysis is repeated for alternative numbers of intervals k: k = 1, 2, 4, 6, 8, 10, 12. The distances D(zj, T *rs ) are estimated under VRS (with s = v) for each farm. The results are reported in the “Appendix” for each farm. Summary measures are presented in Table 2 for our three farm sizes: small farms, medium farms, and large farms. The results are consistent with Proposition 2. First, as expected, D(z, T *rv ) is bounded between D(z, TFDHv) and D(z, Tv), with D(z, TFDHv) as lower bound and D(z, Tv) as upper bound. Second, D(z, T *rv ) tends to increase with the size of the neighborhood r, or equivalently decrease with the number of intervals k (given r σj = Mj/k). Third, Table 2 shows that our estimates D(z, T *rv ) nest DEA estimates and FDH estimates as special cases. Indeed, D(z, T *rv ) becomes equal to D(z, Tv) when neighborhoods become “large” (in our case, when k = 1), and it becomes equal to D(z, TFDHv) when neighborhoods become “small” (in our case, when k = 12). Yet, neither case seems realistic. Indeed, choosing k = 1 imposes a convex technology and prevents any possibility of uncovering evidence of non-convexity. Alternatively, choosing k = 12 likely finds non-convexity “too often”. As noted above, FDH does not satisfy our Rule R2. In this case, 12 intervals are “too many” as there are not enough points in each neighborhood to obtain a reliable estimate of marginal productivity around each data point. And this has adverse effects on the ability to find evidence of technical inefficiency. Indeed, in this case FDH or k = 12 fails to find any evidence of technical inefficiency. Footnote 12 These results help document why FDH does not provide a reasonable approach in the analysis of non-convexity.

One advantage of our approach is that it allows us to choose neighborhoods that satisfy our Rules R1 and R2. These rules seek a balance between finding evidence of technical inefficiency and finding evidence of non-convexity. In our application, we believe that choosing k = 4 is a good choice: it is between k = 1 (corresponding to DEA) and k = 12 (corresponding to FDH). It identifies neighborhoods that are “not too large” to allow us to uncover evidence of non-convexity, and “not too small” to generate a more reliable estimate of the production technology around any data point. Interestingly, when k = 4, we still find evidence of technical inefficiency. Indeed, Table 2 reports mean estimates of technical inefficiency of 0.025 for small farms (with 62.2 % of small farms being technically efficient), 0.035 for medium farms (with 75.5 % of medium farms being technically efficient), and 0.003 for large farms (with most large farms being technically efficient).

In addition, Table 2 reports estimates of the non-convexity measure Crv(z) given in Eq. (13). When k = 4, the mean estimates of Crv(z) are 0.039 for small farms, 0.123 for medium farms, and 0.116 for large farms. For example, it means that, for medium farms, the effects of non-convexity amount to a 12.3 percent change in average outputs. These estimates indicate that the technology facing Korean farmers exhibits significant non-convexity. They also show that the extent of non-convexity is larger on medium and large farms (compared to small farms). As analyzed by Chavas and Kim (2007), non-convexity contributes to increasing the productivity benefits of specialization. This would indicate that large farms have stronger incentives to specialize than smaller farms. To our knowledge, this is the first evidence that non-convexity appears to vary with firm size.

Finally, we evaluate returns to scale under non-convexity. Using (16), we use our neighborhood-based representation T *rv under VRS to evaluate scale efficiency SErv(z). The results are summarized in Table 3 for our three farm sizes. Recall that SErv(z) = 0 when point z is scale efficient, and SErv(z) > 0 implies a departure from CRS and measures the magnitude of scale inefficiency. The evidence against CRS is in general modest. Under DEA (obtained when r is large and k = 1), the average SE is 0.026 for small farms, 0.024 for medium farms, and 0.13 for large farms. Alternatively, under FDH (obtained when r is small and k = 12), all farms are found to be scale efficient (with all SE = 0). Using our neighborhood-based representation of technology with k = 4, the average SE is 0.02 for small farms, 0.041 for medium farms, and 0.030 for large farms.

Table 3 Scale efficiency SErs(z) under alternative representations of the technology, by farm size

These results have several implications. First, Korean farms exhibit a high level of scale efficiency. This is consistent with the dominant small-scale rice farming system commonly found in Korea. Second, introducing non-convexity affects the estimate of scale effects. Table 3 shows that the relationship between SE and k is not always monotonic. For example, in the case of medium farms, the average SE first rises then declines with k. This indicates that there is no general relationship between non-convexity and returns to scale. Yet, our results indicate that non-convexity matters in the analysis of scale effects. Indeed, Table 3 suggests that neglecting non-convexity (by using DEA) would generate “upward-biased” estimates of SE, while relying on FDH would likely generate “downward-biased” estimates of SE.Footnote 13 Finally, Table 3 indicates that these biases vary with farm size. In particular, the estimate of SE is found to be more sensitive to the choice of k for large farms. This is likely due to the fact that non-convexity effects are more important on large farms. This stresses the need to account for non-convexity in the evaluation of returns to scale. This also illustrates the usefulness of our approach in understanding and evaluating the technical and scale efficiency of firms under non-convexity.

8 Concluding remarks

This paper has presented a new nonparametric approach to the analysis of technology and productivity under non-convexity. Our approach relies on a neighborhood-based representation of technology. We investigate the general properties of our model and its use in the evaluation of technology and productivity under non-convexity. Our approach nests two well-known approaches as special cases: the DEA and FDH models. Yet each of these two approaches is overly restrictive: DEA because it does not allow for any non-convexity; and FDH because it allows for “too much” non-convexity. We argue that our new nonparametric model allows for non-convexity in a more flexible way. Its neighborhood-based representation of technology allows for non-convexity to arise in any part of the feasible set. In this context, we propose a measure capturing the extent of non-convexity. We also use our approach to evaluate scale efficiency under non-convexity. We show how our approach can be applied by solving simple optimization problems. Finally, we illustrate its usefulness through an empirical application to Korean farms. The empirical analysis shows how non-convexity can reduce the extent of technical inefficiency. It finds evidence that non-convexity is more common on large farms. Finally, it documents how non-convexity matters in the analysis of scale effects.

Note that our analysis could be extended in a number of directions. First, while our neighborhood-based approach provides a flexible way to investigate the presence of non-convex technology, there is a need for additional research exploring the implications of neighborhood choice for productivity and efficiency analysis. Second, exploring the statistical properties of our proposed efficiency estimator and investigating linkages with stochastic frontier analysis (e.g., Kumbhakar et al. 2007; Simar and Zelenyuk 2011) are good topics for further investigation. Third, the economics and management implications of non-convexity need to be examined in more detail. For example, evaluating the productivity effects of firm specialization is a good topic for further research. Fourth, there is a need for additional studies of the economic implications of non-convex technologies in a market equilibrium context (e.g., Chavas and Briec 2012). Finally, empirical applications to different industries are needed to uncover evidence of situations where non-convexity may be important.