Introduction

Performance assessments of water utilities have received much attention from policy makers, utility managers, and researchers (Guerrini et al. 2013) because such assessments can lead to improved efficiency and customer service quality, as well as savings on the operational costs of supplying water (Molinos-Senante et al. 2014a). Benchmarking methodologies can be used to identify the best operational practices. By adopting these benchmarks, weak performers may increase the efficiency of their utilities (Faust and Baranzini 2014). Hence, benchmarking exercises have been carried out at the international level by some regulators and water companies, such as the South East Asian Water Utility Network, the Union of African Water Suppliers’ Water Utilities Partnership, the Association of Water and Sanitation Regulatory Entities of the Americas, and the World Bank’s International Benchmarking Network for Water and Sanitation Utilities (IBNET) (Marques and De Witte 2010). Nevertheless, these initiatives have focused mainly on defining and developing a common language, rather than on comparing performances at the water company level (Worthington 2014).

Cross-national comparisons in the water sector have two main advantages. First, using a larger database enables identification of international best practices (Estache and Rossi 2002), providing incentives to water utilities to support greater efficiency and innovation (Da Cruz et al. 2012). Second, the results of these comparisons can guide decision making by national policy makers and regulators. From a methodological standpoint, cross-comparisons offer the possibility to escape the unsteady equilibrium between economies of scale and number of comparators (De Witte and Marques 2010). Despite these advantages, cross-national benchmarking studies are very rare. To the best of our knowledge, only a few exceptions exist. Clarke et al. (2009) evaluated the effect of introducing private sector participation into the water sector in Argentina, Bolivia, and Brazil. Corton and Berg (2009) assessed the efficiency of water service providers in six countries in the Central American region. A similar methodological approach was followed by Ferro and Romero (2011) to evaluate the efficiency of a sample of Latin American water utilities. Mbuvi et al. (2012) analyzed both the efficiency and effectiveness of a sample of African water utilities; however, these authors did not differentiate the utilities by country. Estache and Rossi (2002) compared the performance of public and private water companies in Asia, while See (2015) focused on Southeast Asian water utilities. Two studies have evaluated the efficiency of water companies from different countries including European ones: Da Cruz et al. (2012) compared the efficiency of Italian and Portuguese water utilities, and De Witte and Marques (2010) used Data Envelopment Analysis (DEA) to compare the efficiency of the drinking water sector in the Netherlands, England and Wales, Australia, Portugal, and Belgium.

International comparisons of the efficiencies of water utilities struggle with several problems, especially regarding data comparability. Previous studies assumed a common production frontier when comparing the efficiencies of water companies. That is, they assumed that water companies from different countries were directly comparable because they faced the same production frontier (Chen and Yang 2011). However, production units in different countries face different production opportunities and regulations, which force them to make choices from different sets of feasible input–output combinations (Chen et al. 2009). This issue is especially relevant when countries differ substantially in their economic development and water supply characteristics, which can influence the performance results.

To deal with the incomparability of performances of units from different countries, Hayami (1969) introduced the “metafrontier.” This concept reflects the envelopment of all possible frontiers that might arise from heterogeneous units (Wang et al. 2013). A metafrontier may be considered as an umbrella (upper or lower, depending on the model orientation) of all possible frontiers that might arise as a result of heterogeneity between units (Chen et al. 2009). Since its introduction, the metafrontier model has been used to compare the efficiencies of units in diverse fields, including the urban water industry. De Witte and Marques (2009) used the metafrontier concept to capture the operational environment in the efficiency of drinking water companies from England and Wales, Australia, the Netherlands, Belgium, and Portugal. Recently, Molinos-Senante et al. (2015) used the metafrontier concept to analyze the presence of economies of scope in the English and Welsh water industry. Thus, the metafrontier approach is a well-established method for evaluating and comparing efficiency among nonhomogeneous units.

Against this background, the main objective of this paper was to assess and compare the efficiencies of a sample of water utilities from several European and Asian countries. To account for the lack of data comparability, we applied the metafrontier model. The technological gap ratio (TGR) was used to estimate the proximity of each country frontier to the metafrontier. To the best of our knowledge, this paper presents the first comparison of the efficiencies of water utilities from different countries, including non-European nations, using a robust and reliable methodology (i.e., the metafrontier model). Although our empirical application focused on 11 countries, the proposed approach could be used to compare the performances of a larger sample of countries because it provides a solution to the problem of comparing the efficiencies of units operating under different production frontiers. From a policy perspective, the methodology and results of this study should be of great interest to managers and regulators of water companies. Efficiency comparisons at the international level should enable the identification of best practices, provide meaningful information for cost containment and reduction, and support innovation.

Methodology

The metafrontier concept is based on the premise that units from different countries do not have access to the same production technologies (Zhang and Choi 2013), and that direct comparisons of the efficiencies of units operating under different technologies are not possible (Latruffe et al. 2012). Figure 1 illustrates the metafrontier concept. The frontiers of countries A and B represent country-specific best practice frontiers, estimated by two separate efficiency analyses (Du et al. 2014). The all-encompassing metafrontier was obtained by pooling the data from the two countries and repeating the efficiency assessment.

Fig. 1 Nonconcave metafrontier. Source: own elaboration

We followed the nonconcave metafrontier approach introduced by Tiedemann et al. (2011), which includes only those input–output combinations that are part of the delineated set of evaluated countries. This approach avoids the problem of infeasible input–output combinations (Sala-Garrido et al. 2011). Consider unit \(U\) that belongs to country \(A\). The ratio of distance \(\overline{{Y_{U} U^{*} }}\) to distance \(\overline{{Y_{U} U}}\) reflects the input-oriented efficiency score of \(U\) relative to its own production frontier, whereas the ratio of distance \(\overline{{Y_{U} U^{**} }}\) to \(\overline{{Y_{U} U}}\) reflects the efficiency score of \(U\) relative to the metafrontier (Fig. 1).

Either parametric or nonparametric methods can be used to evaluate the efficiency of units. Parametric methods are based on the computation of a stochastic frontier, whereas nonparametric methods do not impose any assumptions regarding production conditions a priori (Guerrini et al. 2015). Compared to parametric techniques, nonparametric techniques demand fewer requirements from the data (Ananda 2014). In this study, the DEA method was used to compute efficiency scores using the country production frontiers and the metafrontier.

Assume that units use an input vector \(x \in \Re_{ + }^{M}\) to produce an output vector \(y \in \Re_{ + }^{L}\). Production technology is defined as the capability of transforming inputs into outputs. Suppose that there are K technology sets (countries) in total, with \(k = 1, 2, \ldots , K\). Then the group technology is defined as all feasible input–output combinations for a unit of group \(k\):

$$T^{k} = \left\{ {(x,y)\, \in \,\Re_{ + }^{M + L} ;x\,{\text{can}}\,{\text{produce}}\,y\,{\text{in}}\,{\text{the}}\,{\text{group}}\,k} \right\}.$$
(1)

Based on the metafrontier concept, if an output vector \(y\) can be produced using an input vector \(x\) in one group, then (x, y) belong to set T, defined as

$$T = \left\{ {(x,y) \in \Re_{ + }^{M + L} ;x\,{\text{can}}\,{\text{produce}}\,y\,{\text{in}}\,{\text{some}}\,{\text{group}}\,T^{k} \,(k = 1,2, \ldots ,K)} \right\},$$
(2)

where \(T = \left\{ {T^{1} \cup \,T^{2} \cup \ldots \cup \,T^{K} } \right\}.\)

Input sets associated with \(T^{k}\) and \(T\) are defined as

$$L^{k} \left( y \right) = \{ x:\left( {x,y} \right) \in T^{k} \}$$
(3)
$$L\left( y \right) = \left\{ {x:\left( {x,y} \right) \in T} \right\}.$$
(4)

The input-oriented distance function for each group \(k\) can be expressed as

$$D^{k} \,\left( {x,\,y} \right)\, = \,\min_{\theta } \,\left\{ {\theta > 0:x\theta \in \,L^{k} \left( y \right)} \right\}\,\left( {k = 1,2, \ldots ,K} \right).$$
(5)

Analogously, the metafrontier input-oriented distance function is

$$D\left( {x,y} \right) = \min_{\theta } \left\{ {\theta > 0:x\theta \in L\left( y \right)} \right\}.$$
(6)

Both the group distance function and the metafrontier distance function (input-oriented) indicate the maximum radial contraction of inputs that a unit can achieve while keeping the vector of outputs constant (Mellah and Ben Amor 2016). The value of the distance function is lower than, or equal to, unity as long as the input vector of the evaluated unit is a feasible element of the production set. In particular, the distance function is equal to unity if the vector of inputs is located on the production frontier.

DEA models can be characterized as demonstrating constant (Charnes et al. 1978) or variable (Banker et al. 1984) returns to scale (CRS or VRS, respectively). If the outputs increase in the same proportion as the inputs, then the model demonstrates CRS. On the other hand, if the outputs increase in a larger or smaller percentage than the inputs, then the model demonstrates VRS. Inefficiency scores estimated by a CRS approach are the products of the scale inefficiency and the pure technical inefficiency. By contrast, inefficiency scores estimated through a VRS approach only integrate technical inefficiency. Previous studies have applied both CRS and VRS approaches to estimate the efficiencies of water utilities (Kulshrestha and Vishwakarma 2013). A CRS approach means that producers can linearly scale the inputs and outputs without increasing or decreasing efficiency, which is a substantial assumption (García-Sánchez 2006). Hence, we considered the most appropriate model to be a DEA model based on the assumption of VRS. Building on previous studies (Berg and Marques 2011), we used an input-oriented model because the aim of water utilities is to reduce resource consumption, rather than to increase market share.
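The decomposition described above can be written compactly in terms of efficiency scores. The symbols \(\theta_{CRS}\), \(\theta_{VRS}\), and \(SE\) are notation assumed here for the CRS score, the VRS score, and scale efficiency; they are not used elsewhere in this paper, and the numbers in the example are purely illustrative.

$$\theta_{CRS} = \theta_{VRS} \times SE,\qquad SE = \frac{{\theta_{CRS} }}{{\theta_{VRS} }} \le 1.$$

For instance, a hypothetical unit with \(\theta_{VRS} = 0.90\) and \(\theta_{CRS} = 0.72\) would have \(SE = 0.72/0.90 = 0.80\), so part of its CRS inefficiency would be attributable to operating at a nonoptimal scale rather than to pure technical inefficiency.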

To estimate the efficiency scores with respect to the group-\(k\) technology (\(E^{k}\)) and the metafrontier (\(E\)), the following linear program must be solved for each water company:

$$\min \,\theta^{k}$$
$${\text{s.t.}}$$
$$\sum\limits_{j = 1}^{{n_{k} }} \lambda_{j} x_{ij} \le \theta^{k} x_{i0} \quad i = 1, \ldots ,m$$
$$\sum\limits_{j = 1}^{{n_{k} }} \lambda_{j} y_{rj} \ge y_{r0} \quad r = 1, \ldots ,s$$
(7)
$$\sum\limits_{j = 1}^{{n_{k} }} \lambda_{j} = 1$$
$$\lambda_{j} \ge 0\quad j = 1, \ldots ,n_{k},$$

where \(x_{ij}\) and \(y_{rj}\) represent the quantities of the inputs \((i = 1, \ldots ,m)\) and outputs \((r = 1, \ldots ,s)\) for each water company \((j = 1, \ldots ,n_{k} )\); \(x_{i0}\) and \(y_{r0}\) are the values of the unit being evaluated; \(\lambda_{j}\) is the weight of inputs and outputs for each water company; \(n_{k}\) is the number of units in group \(k\); and \(\theta^{k}\) is a scalar whose value indicates the efficiency score of the considered unit. As \(\theta^{k} \in (0,1]\), a unit is efficient if and only if \(\theta^{k} = 1\) and the slacks of all restrictions in Eq. (7) are equal to zero (Sala-Garrido et al. 2012). The difference between an efficiency score and unity represents the potential to reduce the quantity of inputs consumed to produce the same quantity of outputs (Maziotis et al. 2015).
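As an illustration only, the input-oriented VRS program can be sketched with SciPy's `linprog`. The function name `dea_input_vrs`, its arguments, and the three toy units are assumptions made for this sketch; they are not the software or data used in the study, and slacks are not computed.

```python
import numpy as np
from scipy.optimize import linprog


def dea_input_vrs(X, Y, X_ref=None, Y_ref=None):
    """Input-oriented VRS DEA scores (hypothetical helper, radial score only).

    X, Y         : inputs (n x m) and outputs (n x s) of the units to evaluate.
    X_ref, Y_ref : units spanning the frontier; defaults to X, Y themselves.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    X_ref = X if X_ref is None else np.asarray(X_ref, dtype=float)
    Y_ref = Y if Y_ref is None else np.asarray(Y_ref, dtype=float)
    n_ref, m = X_ref.shape
    s = Y_ref.shape[1]

    scores = []
    for x0, y0 in zip(X, Y):
        # Decision variables: [theta, lambda_1, ..., lambda_n_ref].
        c = np.r_[1.0, np.zeros(n_ref)]  # minimise theta
        # Inputs:  sum_j lambda_j x_ij - theta * x_i0 <= 0
        A_in = np.c_[-x0.reshape(-1, 1), X_ref.T]
        # Outputs: -sum_j lambda_j y_rj <= -y_r0
        A_out = np.c_[np.zeros((s, 1)), -Y_ref.T]
        A_ub = np.r_[A_in, A_out]
        b_ub = np.r_[np.zeros(m), -y0]
        # VRS convexity constraint: sum_j lambda_j = 1
        A_eq = np.r_[0.0, np.ones(n_ref)].reshape(1, -1)
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                      bounds=[(0, None)] * (n_ref + 1), method="highs")
        scores.append(res.fun if res.status == 0 else np.nan)
    return np.array(scores)


# Toy group: three units, two inputs, one output (made-up numbers).
X_toy = [[2.0, 4.0], [4.0, 2.0], [4.0, 4.0]]
Y_toy = [[1.0], [1.0], [1.0]]
scores = dea_input_vrs(X_toy, Y_toy)
print(scores)
```

In the toy group, the first two units lie on the frontier, while the third could contract both inputs to 75 % of their current level by imitating an equal mix of the other two. Passing a different reference set through `X_ref` and `Y_ref` evaluates the same units against another frontier.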

Restrictions of the problems of the different groups \(k\) are subsets of the constraints of the metafrontier problem. Hence, the efficiency of each group \((E^{k} )\) cannot be smaller than the efficiency with respect to the metafrontier \((E)\) (Medal-Bartual et al. 2012). Whenever there is a strict inequality between the group-k distance function and the distance to the metafrontier function, a measure of the proximity of the group-\(k\) frontier to the metafrontier can be estimated (Yu and Choi 2015). In doing so, Battese et al. (2004) defined the TGR for the group-\(k\) firms as follows:

$$TGR^{k} = \frac{D (x,y)}{{D^{k} (x,y)}} = \frac{E}{{E^{k} }} \le 1.$$
(8)

An increase in the TGR involves a decrease in the gap between the group frontier and the metafrontier (Fang and Rubin 2014).
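The ratio in Eq. (8) is simple arithmetic once both scores are available. The sketch below uses a hypothetical helper, `technology_gap_ratio`, and made-up scores that are not taken from the results of this study.

```python
def technology_gap_ratio(e_meta, e_group):
    """TGR = E / E^k for one unit (hypothetical helper).

    e_meta  : efficiency score relative to the metafrontier (E).
    e_group : efficiency score relative to the unit's own group frontier (E^k).
    """
    if not (0.0 < e_meta <= e_group <= 1.0):
        raise ValueError("expected 0 < E <= E^k <= 1")
    return e_meta / e_group


# Illustrative values only: a unit that is 80 % efficient against its own
# country frontier but 40 % efficient against the metafrontier.
tgr = technology_gap_ratio(e_meta=0.4, e_group=0.8)
print(tgr)
```

A value of 0.5 here would mean that, even when operating on its group frontier, such a unit would use twice the inputs required by the metafrontier technology.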

Sample description

In the present study, we calculated the efficiencies (\(E^{k}\) and \(E\)) of a sample of 393 water utilities from 11 countries: Belarus, Bangladesh, China, Egypt, Kazakhstan, Lithuania, Former Yugoslav Republic of Macedonia (FYROM), Moldova, Romania, Russia, and Vietnam. Data were obtained from the IBNET, which provides options for the standardized measurement of the operational and financial performances of water utilities (IBNET 2016). The IBNET database provides information on the performances of more than 2500 water and wastewater utilities from 110 countries. However, not all utilities are reported consistently. Rather than collect the data itself, IBNET sets up a mechanism by which many different organizations collect data, under the assumption that organizations closest to the water utilities and most knowledgeable about local conditions are best suited to compile data. IBNET provides instruments to support the data collection process (e.g., checks of data quality to ensure internal consistency) and gives feedback on collected data (e.g., aids in data analysis). The quality of the IBNET data depends on the quality of data submitted by individual utilities and associations. The IBNET data collection tool contains ranges and built-in filters that prevent the assembly of incorrect information. After data are collected, the IBNET team creates a data collection report that indicates and describes the sources and origins of the data, according to specific criteria for value and quality.

Two criteria were used to select the countries and water utilities for evaluation. First, the number of water utilities within each country was required to be higher than 15, to meet “Cooper’s rule” (Cooper et al. 2007). According to this rule, the number of units must be greater than or equal to \(\max \{ m \times s;\,3(m + s)\}\), where \(m\) and \(s\) are the numbers of inputs and outputs, respectively, involved in the DEA study. Second, the inputs and outputs integrated into the DEA model had to be available for 2010, which was the year with the largest amount of available information.
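As a quick check of the first criterion, the rule can be evaluated directly, assuming its usual form \(\max \{ m \times s;\,3(m + s)\}\). The function name `cooper_min_units` is hypothetical and used only for this sketch.

```python
def cooper_min_units(m, s):
    """Minimum number of units suggested by Cooper's rule (hypothetical helper).

    m : number of inputs in the DEA model.
    s : number of outputs in the DEA model.
    """
    if m < 1 or s < 1:
        raise ValueError("m and s must be positive integers")
    return max(m * s, 3 * (m + s))


# Two inputs and two outputs, as in the DEA model of this study.
min_units = cooper_min_units(2, 2)
print(min_units)  # max(4, 12)
```

With two inputs and two outputs the rule requires at least 12 units, so a threshold of more than 15 utilities per country satisfies it with some margin.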

Water utilities can provide water supply and/or sewerage and wastewater treatment services. The potential existence of economies of scope could distort the efficiency scores because water and sewerage companies would be favored. To avoid this distortion, this study focused on water services, excluding sewerage services from the analysis. Such an approach was followed by Molinos-Senante et al. (2014b) and Portela et al. (2011), among others.

There is no consensus about the variables that should be used to evaluate the efficiency of water utilities. Selection of inputs and outputs depends on the availability of data, the reliability of available data, and the expertise and experience of the analyst. Two inputs were integrated into the DEA model: operational cost of the water supply service (\(x_{1}\)) and staff (\(x_{2}\)). Operational cost of the water supply service (in US$/m3) is the annual water service operational expense (excluding staff cost) divided by the total annual volume of water sold. Staff is expressed as the total number of staff providing the water supply service per 1000 people served. In accordance with previous works (Lannier and Porcher 2014), two outputs were considered: water produced \(\left( {y_{1} } \right)\) and average revenue \(\left( {y_{2} } \right)\). Water produced is the total volume of water supplied annually to the distribution system, expressed in liters per person per day. The average revenue (in US$/m3) is the operating revenue of the water supply service divided by the annual amount of water sold.

Table 1 reports the descriptive statistics of the data. The 11 analyzed countries have an average of 1.5 workers per 1000 people served. The average volume of water supplied per person per day varies substantially across the 11 countries. Five countries (Belarus, China, Lithuania, Moldova, and Vietnam) have water utilities that produce less than 200 l/person/d, with Belarus supplying the lowest volume of water (100.9 l/person/d). Water utilities in Kazakhstan and Russia supply remarkably large water volumes of 356.5 and 404.2 l/person/d, respectively. Cross-country comparisons of variables expressed in monetary units (i.e., operational costs and revenue) are even more complex. Hence, the use of the metafrontier concept is needed to compare the efficiencies of water utilities from the analyzed countries.

Table 1 Sample description

Results and discussion

Before estimating the efficiency with respect to the group-k frontier (\(E^{k}\)) and the metafrontier (E), it is necessary to validate whether the water utilities from the 11 countries assessed operate under different production frontiers. The hypotheses to be tested are as follows:

$$H_{0} : {\text{the }}k{\text{ samples operate under the same production frontier}},$$
$$H_{1} : {\text{some samples operate under other production frontiers}}.$$

The data are not normally distributed: the p value of the Kolmogorov–Smirnov test with the Lilliefors correction is less than 0.001 for the four variables in the efficiency assessment. To test the hypotheses, the nonparametric Kruskal–Wallis test is therefore used (Tsagarakis 2013). When the p value of this test exceeds 0.05, the null hypothesis cannot be rejected (Molinos-Senante et al. 2014a); that is, there is no evidence that the samples operate under different production frontiers.

Four Kruskal–Wallis tests were performed, one for each variable, revealing p values of less than 0.05 in all cases (Table 2). Thus, the differences in all variables among the 11 countries evaluated are statistically significant. This finding supports the theory and assumption that a single production frontier cannot be used to compare the efficiencies of water utilities from different countries.
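For readers who wish to reproduce this kind of check, a minimal sketch using `scipy.stats.kruskal` follows. The three groups and their values are invented for illustration and are not the IBNET data; in the study, one such test would be run per variable across the 11 countries.

```python
from scipy.stats import kruskal

# Hypothetical values of one variable (e.g., staff per 1000 people served)
# for utilities in three made-up countries.
country_a = [1.0, 2.0, 3.0, 4.0, 5.0]
country_b = [11.0, 12.0, 13.0, 14.0, 15.0]
country_c = [21.0, 22.0, 23.0, 24.0, 25.0]

# kruskal returns the H statistic and the p value of the test.
h_stat, p_value = kruskal(country_a, country_b, country_c)

# Reject H0 (same distribution across groups) at the 5 % level if p < 0.05.
reject_h0 = p_value < 0.05
print(h_stat, p_value, reject_h0)
```

Because the toy groups do not overlap at all, the test rejects the null hypothesis; with real data the decision depends on how strongly the rank distributions differ between countries.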

Table 2 Kruskal–Wallis test statistics

Having verified that the 11 groups of water utilities operate under different technological frontiers, we compute the efficiencies with respect to the group frontier \((E^{k} )\) and the metafrontier \((E)\) for each of the 393 water utilities in our sample. First, we discuss the efficiency scores obtained with respect to the country frontiers. Average efficiency scores range from a minimum of 0.694 for Moldova to a maximum of 0.936 for Romania (Table 3). The standard deviation of efficiency scores for water utilities from Romania is low, indicating a high degree of homogeneity. The performances of water companies from Moldova, China, and Belarus are more heterogeneous, as indicated by the variability in efficiency scores. On average, water utilities evaluated in this study have high efficiencies relative to their respective frontiers. For example, the mean efficiency score for water utilities from Russia is 0.842, indicating that inputs could be reduced by about 16 % given the country frontier.

Table 3 DEA estimates of technical efficiencies with respect to group frontiers and to the metafrontier

To the best of our knowledge, assessments of the efficiencies of water companies have been performed for China and Vietnam, but not for the other countries evaluated in this paper. Liu et al. (2013) assessed the efficiencies of water systems in 33 administrative regions of China, assuming both CRS and VRS. The mean efficiency score obtained by the CRS-DEA model was 0.852, with 11 regions being efficient. Under the VRS-DEA model, the mean efficiency score was 0.978, with 19 efficient regions. See (2015) obtained a mean efficiency score of 0.64 for 10 state-owned water utilities in Vietnam by applying a double bootstrap DEA model. Our mean efficiency scores for Chinese and Vietnamese water utilities, 0.770 and 0.845, respectively, differ somewhat from those of previous studies. One likely reason is that our sample includes a larger number of water utilities.

Considering the water utilities that constitute best practices within each country (i.e., with efficiency scores equal to unity), we find differences among the 11 analyzed countries (Table 3). More than 50 % of water utilities from Egypt, Lithuania, and FYROM are identified as efficient, indicating that inputs to produce the vector of outputs are already minimized. On the other hand, only 13.16 and 13.48 % of water utilities from Moldova and Russia, respectively, are efficient. This finding suggests that around 87 % of water utilities in both countries have the potential to reduce the use of inputs while maintaining outputs.

Next, we compare the performances of water utilities from the 11 countries by computing their efficiency scores with respect to the metafrontier (i.e., envelope of all country frontiers). As expected, for all countries, the efficiency scores with respect to the metafrontier are smaller than those computed on the group frontiers (Table 3; Fig. 2). Nevertheless, this reduction in efficiency does not affect all countries equally. Water utilities from Romania, FYROM, Russia, and Lithuania show the largest declines in efficiency, whereas those from Bangladesh and Moldova have the largest mean efficiency and perform the best. The lowest mean score (worst performance) is associated with water utilities from Russia. The efficiency scores and relative position of each country are divergent when efficiency is calculated with respect to individual frontiers or with the metafrontier (Fig. 2). These results highlight the importance of using the metafrontier concept when seeking to analyze efficiency differences between water utilities from several countries.

Fig. 2 Efficiency scores with respect to group frontiers and to the metafrontier

When efficiency is calculated using the metafrontier as a reference point, the number of efficient water companies decreases for all countries. Four countries (Lithuania, FYROM, Romania, and Russia) no longer have efficient water companies under this analysis (Table 3). Remarkably, the share of efficient water utilities in FYROM decreases from 58 to 0 % when efficiency is computed with respect to the metafrontier rather than individual frontiers. This result reiterates the idea that the performances of water utilities evaluated on a common frontier are not comparable to those evaluated on different frontiers. The metafrontier is needed to solve the problem of performance incomparability among companies in countries with different technologies (Wang et al. 2013).

The TGR measures how close each country frontier is to the metafrontier, thus disentangling technological differences in efficiency (Latruffe et al. 2012). Average TGR values vary from 0.344 in Russia to 0.969 in Bangladesh (Table 4; Fig. 3). This finding suggests that the group frontier of water utilities from Bangladesh is closest to the metafrontier. Around 60 % of water utilities from Bangladesh have a TGR equal to unity, indicating that the country frontier coincides with the metafrontier for these water utilities. In addition, the large variability of the TGR values for water utilities from Belarus and Kazakhstan suggests that technological development differs considerably among utilities within these countries.

Table 4 Technological gap ratios (TGRs) for the countries evaluated
Fig. 3 Technological gap ratios (TGRs) for the countries evaluated

The results indicate that water utilities from Bangladesh have the best efficiency performance compared to other countries in the study. From an institutional point of view, at national level, the Local Government Division of the Ministry of Local Government, Rural Development and Cooperatives is responsible for the overall development of the water supply and sanitation. Nevertheless, large urban areas have their own water authorities and in other urban areas water supply is maintained by city corporations. In addition to government institutions, nongovernment organizations (NGOs) and the private sector are involved in water services. However, private sector participation in this sector remains limited to small businesses (Asian Development Bank 2009).

Conclusions

In this paper, and for the first time, we assessed and compared the efficiencies of water utilities from different countries using the metafrontier concept. Hence, the issue of the lack of data comparability was considered in the benchmarking process. The metafrontier can be considered as an umbrella of all possible frontiers that might arise as a result of heterogeneity between water utilities. The TGR was computed as a measure of how close each country frontier is to the metafrontier.

An empirical application was carried out to assess and compare the efficiencies of 393 water utilities from 11 countries: Belarus, Bangladesh, China, Egypt, Kazakhstan, Lithuania, FYROM, Moldova, Romania, Russia, and Vietnam. The following primary findings were observed. (i) Water utilities from different countries operate under different production frontiers; hence, direct cross-country comparison would involve biased performance estimations. (ii) On average, water utilities evaluated in this study have high efficiencies relative to their respective frontiers. (iii) Water utilities from Bangladesh have the best efficiency performance among the analyzed countries.

From a policy perspective, the methodology and empirical application of this study are of great interest for regulators and company managers. Benchmarking at the international level would contribute to improving the performance of water utilities, by saving operational costs and increasing competition in the water sector. Identifying international best practices could guide the implementation of water reforms by authorities and policy makers. This study supports the importance of not using the same production frontier when comparing the efficiencies of water utilities from different countries or operating under different conditions. Finally, the metafrontier approach could potentially be used as a baseline to develop assessments of a wider range of water utilities from different countries. Such information would contribute toward improving the efficiency of water supply services.