1 Introduction

Meaningful criteria for sustainability measurement have been debated for more than two decades. The lack of common ground is apparent from the sheer number of indicators, more than 500, that aim to quantify sustainable development (Böhringer and Jochem 2007; Parris and Kates 2003). At the same time, this abundance of indicators underscores the importance of sustainability measurement. The motivations for measuring sustainability are multiple: policy and decision making, environmental management, advocacy, participation, consensus building, research and analysis (Parris and Kates 2003:559). The desire to monitor and quantify sustainability in a meaningful way is well-founded and widespread.

Yet the ambiguous nature of sustainability in general and its measurement in particular (Van de Kerk and Manuel 2008:228) is problematic, especially for composite indicators of sustainability that incorporate multiple indicators. Composite indicators have often been at the centre of controversy. It has been argued that composite indicators suffer from a lack of consistency (Pillarisetti and van den Bergh 2010) and methodological flaws (York 2009; Siche et al. 2008; Niemeijer 2002). In short, they may even try to “measure the immeasurable” (Böhringer and Jochem 2007:1). Notwithstanding the critics, composite sustainability measures are widely used in environmental management and decision making at all levels (Niemeijer and de Groot 2008).

Since sustainability indicators are increasingly recognised as important instruments for policy making and public communications (Singh et al. 2009), it is becoming increasingly important to assess the meaningfulness of these indicators. The foremost objective of sustainability measurement is to provide decision-makers with well-structured data “in order to assist them to determine which actions should or should not be taken in an attempt to make society sustainable” (Singh et al. 2009:191). If this purpose is to be met, it is necessary to evaluate the grounds for inclusion of the indicators used, critically assess underlying foundations, uncover inconsistencies and eventually overcome limitations.

This paper aims to contribute to the literature on sustainability measurement by assessing one of the most comprehensive approaches, the Environmental Sustainability Index (ESI). The ESI (Whitford and Wong 2009:191) is known as a well established composite sustainability measure that factors in not only ecological but also socio-economic and political foundations of sustainability. After a brief introduction to sustainability measurement the architecture of the ESI is assessed using the Pressure-State-Response (PSR) framework, exploratory factor analysis, and a series of regression models. The results obtained from these analyses serve as a means to construct an equivalised version of the ESI based on a statistically derived weighting scheme, which is then used to further explore the measurement properties of the ESI. The paper concludes with implications for the application of sustainability indicators in policy making and recommendations for future environmental research.

2 Indicators of Environmental Sustainability

National and global environmental discourse reveals persistent controversies and diverging views on how sustainability should be measured. Current methods of sustainability measurement include single environmental indicators (e.g. carbon dioxide emissions, methane emissions, water pollution, deforestation) and composite indices (e.g. Ecological Footprint, Environmental Sustainability Index). Whereas the former reflect one particular aspect of the environment, the latter include not only ecological but also socio-economic and political dimensions to account for a more comprehensive picture of sustainability.

2.1 Single Indicators

The purpose of single environmental indicators is to track changes to the quality and condition of air, water, land, ecological systems and their resident fauna. These measures report geographical and temporal trends of specific environmental conditions and situations. A number of such indicators are used in environmental and economic research to arrive at a better understanding of how political, social and economic development of a given place affects its environment.

In the last two decades a large literature has been published on the driving forces of environmental sustainability. Many of these empirical studies include single environmental indicators to quantify the state of the environment on a national scale. Single indicators such as sulphur dioxide, smoke and heavy particles are employed in a study by Grossman and Krueger (1995) to examine the impact of affluence on air pollution and on the contamination of river basins. Single indicators such as water and air quality are used to examine how they are related to different levels of literacy, income inequality or rights (Torras and Boyce 1998). National well-being and export flows are also considered to have a strong impact on the state of the environment. It is suggested that increasing deforestation is largely caused by forestry export flows from poor to rich nations (Shandra et al. 2009). Similarly, Longo and York (2008) use single indicators (fertilizer and pesticide consumption in agricultural production) to show how they vary with economic development and export intensity. Also, welfare analyses fall back on single environmental indicators which seem to be relevant to human well-being and health. The pollutant nitrogen dioxide, for example, has been identified by Welsch (2007) to play an important role in the determination of subjective well-being.

With the growing concern about climate change and global warming, a large body of literature has investigated the driving forces of greenhouse gas emissions. The most popular single environmental indicator of this sort is carbon dioxide (CO2) emissions. Carbon dioxide is the main anthropogenic greenhouse gas accounting for 77 % of the total greenhouse gas emissions (Baumer et al. 2005:5). Hence, an increasing number of studies attempt to identify the underlying social, political and economic causes of CO2 emission rates. Carbon dioxide is often used as a single indicator in analyses aiming to demonstrate a relationship between emission rates and affluence (York 2003; York et al. 2003b), income inequality (Ravallion et al. 2000), technology (Dietz and Rosa 1997) or with the position of a country in the world-system (Prew 2010). Population levels are also seen to play a key role in the determination of CO2 emission rates. Several attempts have been made to elucidate the relationship between population and national-level emissions of carbon dioxide (Dietz and Rosa 1997; Rosa et al. 2004; York 2003; York et al. 2003b).

Besides carbon dioxide, a lot of attention has been directed to methane (CH4) as a single environmental indicator. This highly potent greenhouse gas is the second largest contributor to global warming, accounting for 14 % of all greenhouse gases (Baumer et al. 2005:5). A number of studies attempt to identify the social structural causes of methane emission intensity. Jorgenson’s (2006) cross-country comparison includes methane emissions as a single indicator to show how emission rates are related to the production of beef and veal, oil and natural gas, the use of biomass for energy and political, economic and social factors. Other greenhouse gases like nitrous oxide, chlorofluorocarbons (CFCs) and halons have received only minor attention in environmental research.

The downside of single indicators is that they cover relatively few dimensions of environmental quality and do not provide a broader picture of sustainability in a social, economic and political sense. Critics argue that it is more important to focus on the balance between the natural and human environment and to accept that these systems are multidimensional and characterised by different economic, social and environmental dimensions (Cabezas and Fath 2002; Mayer et al. 2004; Pezzoli 1997). Single environmental indicators do not fully meet these requirements as they usually reflect only one specific characteristic of the system (Mayer 2008).

2.2 Composite Indices

To account for the deficiencies of single indicators an increasing number of composite indices have been developed over the last two decades. The basic assumption is that when a broader variety of indicators and variables are aggregated into an index, the final figure shows at a glance a “simplified, coherent, multidimensional view of a system” (Mayer 2008:279). The major objectives of composite indices are to (Mayer 2008; Pillarisetti and van den Bergh 2010; Singh et al. 2009):

  • Monitor and evaluate sustainable development and environmental pressure.

  • Aggregate complex or multi-dimensional issues to support policy making.

  • Track the development of environmental states on geographical and temporal scales.

  • Highlight factors which are most responsible for driving the system.

  • Anticipate and assess conditions and trends.

  • Provide early warning information to prevent economic, social and environmental damage.

  • Formulate strategies and communicate ideas.

  • Facilitate the ranking of countries.

  • Attract public interest and awareness.

Composite indices were heavily promoted by international organisations such as the United Nations (Agenda 21) and the World Bank, which even proposed its very own indicator—the Genuine Savings Index (GSI)—to assess environment/economic interactions. Many studies have examined the strengths and weaknesses of the GSI, concluding that there are serious doubts as to the applicability of the measure. Issues include missing data, resources which had been depleted in the past due to direct importation of resources from other countries, changing consumption patterns and mixing up natural capital with physical and human capital (Asheim 2003; Dietz and Neumayer 2007; Hamilton and Dixon 2003; Hueting and Reijnders 2004; Lawn 2007; Pezzey et al. 2006; Pillarisetti 2005).

Another attempt to develop a more market and economy-based index is the Index of Sustainable Economic Welfare (ISEW), which was later modified and renamed the Genuine Progress Indicator (GPI) (Böhringer and Jochem 2007). This composite index adjusts net national product for loss of welfare caused by environmental and social issues. The final index is aggregated from 20 sub-indicators (Singh et al. 2009), of which seven indicators reflect a growth in welfare and 13 indicators reflect a reduction of welfare. Similar to the GSI, this index received criticism for methodological flaws in valuation and normalisation (see Böhringer and Jochem 2007; Lawn 2007).

On the other end of the composite indicator spectrum, there are myriad eco-system indices such as the Ecological Footprint (EF). This index by Wackernagel and Rees (1996) is one of the most popular composite indicators in both environmental research and public debate. In short, the Ecological Footprint measures human demand for natural resources and provides information on whether “nations are living within or beyond their biological capacity” (Pillarisetti and van den Bergh 2010:52). The index allows the measurement of sustainability on different levels (individuals, cities, countries, regions, humanity) by comparing ecological demands against the available supply of natural resources (Siche et al. 2008). After calculating the footprint and biocapacity, the final step implies the calculation of an ecological balance (biocapacity minus footprint). One essential feature of the index is that “resources used for the production of goods and services that are exported are counted in the Ecological Footprint of the country where the goods and services are ultimately consumed” (Kitzes 2007:384).
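The balance calculation described above reduces to a single subtraction. The following minimal sketch illustrates it; the function name and the numeric values are hypothetical and serve only as an illustration, and both inputs are assumed to be expressed in the same unit (for example global hectares per capita).

```python
def ecological_balance(biocapacity: float, footprint: float) -> float:
    """Return biocapacity minus footprint (both in the same unit)."""
    return biocapacity - footprint


# Hypothetical values, for illustration only (not taken from any data set).
reserve = ecological_balance(biocapacity=3.0, footprint=2.0)  # positive balance
deficit = ecological_balance(biocapacity=1.5, footprint=4.0)  # negative balance
```

A positive balance indicates demand within the available biological capacity, a negative balance indicates demand beyond it.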

The Ecological Footprint has undergone several refinements since its introduction and is now a notable index in environmental research. A number of studies use the Ecological Footprint to determine relationships between social structural dimensions and the demand for natural resources. There is a large volume of published studies describing the effect of socio-economic factors such as affluence or population pressure on Ecological Footprints of nations (Jorgenson 2003, 2005; Jorgenson and Burns 2007; Jorgenson and Clark 2009; Rosa et al. 2004; York et al. 2003a). Similar studies investigate the relationship between income inequality and natural resource consumption by employing the Ecological Footprint as an indicator of the latter (White 2007).

A much broader conception of sustainability measurement is proposed by the Environmental Sustainability Index (ESI), which includes socio-economic, environmental and institutional dimensions. Since its introduction the ESI has aroused considerable interest in the domains of environmental research and policy making (Jha and Murthy 2003; Morse 2004; Sutton and Costanza 2002). Similar to the Ecological Footprint, the ESI is an important instrument for the evaluation and measurement of sustainability on a country scale. In contrast to the Ecological Footprint, the ESI does not solely focus on a pure ecological dimension but attempts to include ecological, economic and social dimensions (Siche et al. 2008), thereby providing policymakers with a more inclusive account of sustainability. It should be noted that the ESI does not attempt to make judgements about global sustainability but rather intends to facilitate relative cross-country comparisons of environmental progress.

Since its introduction the ESI has been employed in a considerable number of cross-national analyses to explore how economic and social factors determine different sustainability outcomes. Whitford and Wong (2009) include the ESI to identify the political and social foundations for environmental sustainability and to test hypotheses related to economic development, religion and demographics; York (2009) provides some critical comments on this study. Similar to other studies that include sustainability indicators, the ESI is used in analyses that investigate the relationship between economic development (GDP per capita) and sustainability outcomes (Morse 2008; Park et al. 2007). Others examine the effects of more socio-economic dimensions on environmental sustainability. Social and structural forces such as environmental actions (Freymeyer and Johnson 2010), education and national culture (Park et al. 2007), happiness and quality of life (Zidansek 2007) and life satisfaction (Bonini 2008) are included in comparative analyses to investigate potential effects on ESI levels. On the political end some studies have attempted to prove a relationship between political regimes (Bush 2009), corruption (Morse 2008) and the ESI.

3 Constructing and Deconstructing the Environmental Sustainability Index (ESI)

The ESI was created by the Global Leaders of Tomorrow of the World Economic Forum in collaboration with the Yale Center for Environmental Law and Policy (YCELP) and the Center for International Earth Science Information Network (CIESIN) of Columbia University. After an initial pilot study in 2000, the ESI was first published in 2001 and subsequently refined in 2002 and 2005 (Esty et al. 2005). The purpose of the ESI is to inform decision makers who “wish to compare nations’ long-term environmental trajectories” (Morse and Fraser 2005:628). According to the originators of the ESI (Esty et al. 2005:1) it provides: (1) a powerful tool for putting environmental decision making on firmer analytical footing, (2) an alternative to GDP and the Human Development Index for gauging country progress, and (3) a useful mechanism for benchmarking environmental performance.

The ESI has become a popular indicator in comparative cross-national analyses. Its underlying methodology and construction have attracted much scholarly attention. Siche et al. (2008) and Pillarisetti and van den Bergh (2010) offer comprehensive evaluations of environmental sustainability indices including the ESI. Niemeijer (2002) has categorised the ESI as a data-driven index and compares its strengths and weaknesses to more theory-driven environmental indicators. Others have evaluated the selection, substitution and imputation of missing data (Niemeijer 2002), normalisation, weighting and aggregation methods (Böhringer and Jochem 2007; Ebert and Welsch 2004; Morse and Fraser 2005; Niemeijer 2002; Siche et al. 2008; York 2009) and the conception of the components, sub-indicators and variables that lie at the heart of the ESI (Morse 2003; Pillarisetti and van den Bergh 2010:57; Siche et al. 2008; York 2009).

Unfortunately, most of these analyses remain at the level of methodological issues (e.g. aggregation, normalisation) or solely focus on the composition of the ESI. Besides that, most studies were conducted on early-released data before the ESI had undergone major adjustments and revisions regarding methodology and composition. What remains to be addressed is a more in-depth evaluation and determination of its measurement quality based on the most recent and widely used ESI release (2005).

This paper fills this gap by answering three major questions: (a) How coherent and meaningful is the architecture of the ESI? (b) How can sustainability be modelled using the ESI? (c) How would an equivalised version of the index affect the country-ranking and the measurement qualities of the index? Finally, the paper concludes with implications for policy making and future environmental research.

3.1 The ESI Architecture

The ESI scores range from a theoretical low of 0 (least sustainable) to a theoretical high of 100 (most sustainable). With a single measure for each country included in the index, the ESI allows the ranking of nations in respect of their environmental sustainability. Further, the ranking permits cross-national comparisons of environmental progress among countries in a systematic and quantitative way. Even though the final measure is a single number, the ESI can be disaggregated into its indicators, components and variables to conduct in-depth analyses on the respective levels. The most recent edition of the ESI (2005) covers 146 countries and consists of 76 underlying variables which are aggregated into 21 indicators and five components.

The underlying ESI variables were chosen by the ESI originators through a review of the environmental literature, surveys, analyses and consultations with policymakers, scientists and specialists (Esty et al. 2005). The statistical techniques and methods used to calculate the ESI include the following steps: (a) selection of countries based on country size and indicator and variable coverage, (b) standardisation of the variables to comparable scales, (c) data transformation and preparation for imputation and aggregation, (d) multiple imputations of missing data, (e) data winsorisation to avoid negative effects of extreme values, (f) data aggregation and weighting (Esty et al. 2005:53).
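Three of the steps listed above (standardisation, winsorisation and aggregation) can be illustrated with a minimal sketch. It is not the ESI originators' implementation: the function names, the z-score standardisation, the percentile cut-offs and the equal weights are all assumptions made for illustration, and country selection, data transformation and imputation are omitted.

```python
import numpy as np


def zscore(x):
    """Standardise one variable to mean 0 and unit standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=0)


def winsorise(x, lower_pct=2.5, upper_pct=97.5):
    """Clip extreme values to the given percentiles (cut-offs are assumed)."""
    x = np.asarray(x, dtype=float)
    low, high = np.percentile(x, [lower_pct, upper_pct])
    return np.clip(x, low, high)


def aggregate(values, weights=None):
    """Weighted average across columns; equal weights unless specified."""
    values = np.asarray(values, dtype=float)
    if weights is None:
        weights = np.full(values.shape[1], 1.0 / values.shape[1])
    return values @ np.asarray(weights, dtype=float)


def composite_score(raw):
    """Rows are countries, columns are variables on arbitrary scales."""
    raw = np.asarray(raw, dtype=float)
    columns = [winsorise(zscore(raw[:, j])) for j in range(raw.shape[1])]
    return aggregate(np.column_stack(columns))


# Hypothetical data: four countries, two variables, one extreme value.
raw = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 1000.0]])
scores = composite_score(raw)
```

Standardising before aggregation keeps variables measured in different units from dominating the average, and winsorising limits the influence of outliers such as the last value in the second column.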

The 21 underlying indicators include various socio-economic, environmental and institutional dimensions. The indicators are categorised and aggregated into five components which are (1) environmental systems (air, water, land and biodiversity), (2) environmental stresses (pollution and resource consumption), (3) human vulnerability (nutrition and illnesses associated with the environment), (4) social and institutional capacity (potential to handle environmental challenges and problems) and (5) global stewardship (efforts of global environmental responsibility).

The five ESI components offer five distinctive ways to look at sustainability and environmental quality. Even if the categorisation of the 21 indicators into these five specific components seems reasonable at a first glance, it is important to ask if this composition actually reflects the models used in environmental decision making. The next section sets out to answer this question by validating the composition of the ESI against one of the most widely used cause-effect frameworks in environmental policy making.

3.2 Validating the ESI Against a Cause-Effect Logic

First, an in-depth analysis of the ESI may start by disaggregating the index into its underlying components and indicators. This is a relatively simple step given the transparency of the methodology used by the ESI originators to aggregate the index. The purpose of this analysis is to determine whether the 21 indicators which are aggregated into five components are actually organised in a meaningful way. A well established cause-effect model for environmental policy making—the Pressure-State-Response (PSR) framework—is used for this purpose. By validating the ESI against this model, judgements about the coherence of the index and its actual applicability in environmental decision making can be inferred.

The PSR framework was published and promoted by the OECD (Giannetti et al. 2009; Niemeijer 2002) for the development of environmental indicators and is based on a simple cause-effect-response logic. The underlying questions reflecting this logic are: What is happening to the state of the environment or natural resources? Why is it happening? What are we doing about it? (Hammond et al. 1995:11). The first question is related to indicators that reflect changes or trends in the state of the environment (state indicators), the second question is associated with indicators that describe stresses caused by humans (pressure indicators) and the third question is concerned with indicators that reflect adopted policies (response indicators). Therefore, state indicators in general measure the quality of the state of the environment (e.g. air quality, water quality, stocks of fish, stratospheric ozone concentrations), pressure indicators reflect the causes of environmental problems (e.g. air pollution, water pollution, natural resource consumption, deforestation) and response indicators show the efforts taken to improve the situation of the environment (e.g. policy measures, research, treaty ratifications, budget commitments).

The validation of the ESI against the PSR framework reveals whether or not the aggregation into the five pre-defined ESI components corresponds to either pressure, state or response categories. If a single ESI component consists of different types of PSR indicators, the consistency and applicability of the index is challenged, at least from the perspective of the PSR framework. Ideally, every single ESI component should consist of only pressure, state or response indicators.

The PSR categorisation of the 21 ESI indicators is shown in Table 1. It is important to note that while some indicators can clearly be placed into one of the PSR categories, the process of categorisation is to some extent based on subjective judgement. However, the results indicate that most of the ESI indicators that are aggregated into the same component also belong to a common PSR category. The highest level of consistency is found for the components Environmental Systems, Reducing Environmental Stresses and Reducing Human Vulnerability, where all indicators of a component fall into a single PSR category (state, pressure or response).

Table 1 The ESI under the Pressure-State-Response Framework

In contrast, the other two ESI components, Social and Institutional Capacity and Global Stewardship, seem to lump together indicators of different PSR categories. For these two components not all of the ESI indicators reflect the same PSR category. In particular, it seems that the two variables energy efficiency and hydropower and renewable energy production do not allow for a clear classification. Both variables reflect environmental pressure but could also be interpreted as a country’s policy attitude towards energy consumption. Since the production of goods and services in a country is usually less energy demanding in an efficient economy and high levels of energy consumption are usually associated with high levels of natural resource consumption, the indicator eco-efficiency reflects more the pressures a country puts onto the environment rather than the actual response to environmental problems.

A similar tension between pressure and response categories emerges when the Global Stewardship component is broken down into its indicators and variables. On the one hand, this component measures the participation in international collaborative efforts as responses. On the other hand the component includes pressure-related measures such as greenhouse gas emissions (carbon emission intensity and efficiency) and trans-boundary environmental pressures (sulphur dioxide exports and import of polluting goods and raw materials). As a consequence the Global Stewardship component fails to describe one clearly distinguishable PSR category but rather turns out to be a combination of both pressure and response indicators.

The validation of the 21 ESI indicators against the PSR model provides initial evidence of some inconsistencies in the composition of the index. While three ESI components reflect either the state of the environment, human pressure or responses to environmental problems, two components cannot be assigned to distinctive PSR categories. This ambiguity gives reason for a more detailed analysis of the ESI.
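The consistency check used in this section can be expressed as a simple rule: a component is consistent if all of its indicators share one PSR category. The sketch below illustrates that rule. The dictionary contains only an illustrative subset of indicators whose labels follow the discussion in the text; it is not a reproduction of Table 1, and the function names are hypothetical.

```python
from collections import defaultdict

# Illustrative subset only; labels follow the discussion in the text.
psr_labels = {
    ("Environmental Systems", "Land"): "state",
    ("Environmental Systems", "Water quality"): "state",
    ("Environmental Systems", "Water quantity"): "state",
    ("Global Stewardship",
     "Participation in international collaborative efforts"): "response",
    ("Global Stewardship", "Greenhouse gas emissions"): "pressure",
    ("Global Stewardship", "Trans-boundary environmental pressures"): "pressure",
}


def psr_categories_by_component(labels):
    """Collect the set of PSR categories found within each component."""
    categories = defaultdict(set)
    for (component, _indicator), category in labels.items():
        categories[component].add(category)
    return dict(categories)


def is_consistent(categories):
    """A component is consistent if it contains exactly one PSR category."""
    return len(categories) == 1


by_component = psr_categories_by_component(psr_labels)
consistency = {name: is_consistent(cats) for name, cats in by_component.items()}
```

With these example labels, Environmental Systems is flagged as consistent and Global Stewardship as mixed, which mirrors the argument made above for these two components.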

3.3 Assessing the Factorial Structure of the ESI

The validation of the ESI against the PSR model reveals some inconsistencies in the structure of the underlying components. To further explore the underlying structure of the ESI an exploratory factor analysis is conducted. This method is employed as an exploratory tool to reduce data dimensionality and to finally combine the ESI indicators into a smaller number of factors. The question to be considered is: how many different factors are required to explain the pattern of the relationships among the indicators of the ESI?

The results of the factor analysis on the 21 ESI indicators are shown in Table 2. A varimax rotation could be used for rotating the matrix. However, the majority of the indicators already sort well on the factors and the unrotated loading matrix allows for a good interpretation. The Kaiser criterion suggests that factors with Eigenvalues of one or greater explain an adequate amount of the data variance. The factor analysis reveals that only six factors (Eigenvalues ≥ 1) would suffice to explain 76.6 % of the total variance in the ESI indicators. The first factor accounts for 36.1 % of the total variance, while the first and second factor together explain 50.2 % of the variance. The largest three extracted factors account for 60.8 % of the total variance.
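The logic of the Kaiser criterion and of the cumulative variance shares can be sketched as follows. The sketch assumes a principal-component style extraction from the correlation matrix, which may differ from the extraction method actually used here; the function name is hypothetical and the input is synthetic random data, not ESI data.

```python
import numpy as np


def kaiser_summary(data):
    """Eigenvalues of the correlation matrix and the variance they explain.

    data: 2D array with observations in rows and indicators in columns.
    """
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]   # sorted in descending order
    explained = eigenvalues / eigenvalues.sum()    # share of total variance
    cumulative = np.cumsum(explained)
    retained = int(np.sum(eigenvalues >= 1.0))     # Kaiser criterion
    return eigenvalues, explained, cumulative, retained


# Synthetic data: 146 observations of 6 indicators driven by 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(146, 2))
loadings = rng.normal(size=(2, 6))
data = latent @ loadings + rng.normal(scale=0.5, size=(146, 6))

eigenvalues, explained, cumulative, retained = kaiser_summary(data)
```

Because the eigenvalues of a correlation matrix sum to the number of indicators, an eigenvalue of one corresponds to a factor that explains as much variance as a single standardised indicator. The descending eigenvalues are also the values a scree plot displays.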

Table 2 Results of the factor analysis on the 21 ESI indicators

Figure 1 depicts the scree plot of the factor analysis which shows the magnitude of the Eigenvalues in a descending order. This graphical method is used to determine which of the factors to retain and which factors to omit (Cattell 1966). The scree plot reveals that a sharp drop of the magnitude of the Eigenvalues occurs between the third and fourth factor. Since factors four, five and six contribute only little to the explanatory power, the interpretation below focuses only on the first three factors, which together account for more than 60 % of the total variance in the ESI indicators.

Fig. 1 Scree plot of factors and Eigenvalues

The component matrix in Table 3 reveals that the first and largest factor is largely determined by the indicators reducing population growth (0.823), environmental health (0.818), basic human sustenance (0.874), and science and technology (0.923). Underlying variables for these indicators feature total fertility rate, death rate from infectious diseases, mortality rate, deaths from floods, cyclones and droughts, innovation index, digital access index and enrolment rates. All of these variables are characteristics of affluent countries and the first factor is therefore taken as a proxy for economic development. Since the nature of the first factor is about affluence and economic development, which are generally related to social capacity and reduction of vulnerability, the label ‘social robustness’ is assigned to the first factor. This evidence suggests that such social robustness plays a key role in determining ESI levels.

Table 3 Component matrix for the 21 ESI indicators

The second factor accounts for 14 % of the total variance in the ESI indicators and shows high loadings from three ESI indicators: eco-efficiency (0.780), participation in international collaborative efforts (0.653) and greenhouse gas emissions (0.663). The underlying variables of these ESI indicators include energy efficiency, renewable energy production, participation in international environmental agreements, carbon efficiency and carbon intensity. The extracted factor is neither associated with any of the systems nor with any of the stresses or vulnerability variables. However, considering the nature of the indicators that load highest on this factor, it seems appropriate to interpret it as a proxy for ‘environmental consciousness’.

The third factor, which accounts for 10.6 % of the total variance in the ESI indicators, is largely correlated with indicators of the Environmental Systems component: land (0.600), water quality (0.587) and water quantity (0.649). Some underlying variables for these indicators include percentage of total land area, dissolved oxygen concentration, phosphorus concentration, and freshwater and groundwater availability per capita. This factor seems to be driven by the states of the environmental systems and is therefore interpreted as a proxy for ‘natural endowment’.

The ESI is based on five components whereas the factor analysis suggests the existence of only three empirical factors in the ESI. Similar to the original components, these three factors allow policy-makers to look at sustainability issues on a more detailed level and from three distinctive perspectives. The reduction from five to three components, however, raises some questions of redundancy and unnecessary complexity in the ESI architecture.

A validation of the three factors against the PSR framework reveals a fairly consistent pattern. The first factor, social robustness, seems to be associated with the response category of the framework. Its underlying indicators such as science and technology, basic human sustenance, and reducing population growth are all related to response measures society takes to either prepare against environmental threats or to reduce pressure on the environmental habitat. The second factor, environmental consciousness, features indicators that describe stresses caused by humans and thus belongs to the pressure category. Underlying indicators include greenhouse gas emissions and eco-efficiency, which both determine how much pressure a country puts on the environment. The third factor, natural endowment, is closely tied to the state category of the PSR model, featuring indicators such as water quantity and quality, land, biodiversity and air quality. In short, this validation shows that the altered aggregation of the components seems to match much better with the PSR framework.

4 Modelling Sustainability Using the ESI

In the recent past, a large and growing body of literature has investigated the various driving forces of environmental sustainability. Many of these studies are based on cross-national comparisons that include sustainability indicators as dependent variables (e.g. CO2, CH4, Ecological Footprint). Some of them have also employed the ESI. The attempts to model sustainability using the ESI have included predictors like economic development (Morse 2008; Park et al. 2007), political institutional arrangements, demographics, population (Whitford and Wong 2009) and environmental actions (Freymeyer and Johnson 2010).

None of these studies has developed a comprehensive set of socio-economic and political predictors. The most comprehensive attempt so far has been offered by Whitford and Wong (2009), who set out to identify the social and political foundations of environmental sustainability using the ESI as a dependent variable. Still, there are some important predictors that have not been included in their models. In this paper, an attempt is made to model sustainability more comprehensively by including a broader set of predictors.

Here, a series of models employing multiple socio-economic and political predictors is deployed to investigate the driving forces behind different ESI outcomes. The purpose of this analysis is twofold. First, it is designed to determine the socio-economic foundations of sustainability. A number of driving forces are identified and tested in regression models to reveal their impact on the ESI. The second purpose of this analysis is to test the measurement qualities of the ESI.

4.1 Data

This study employs the most recent ESI data which were released in 2005. The sample size is largely determined by the countries for which ESI data are available. The final sample for which data for all of the variables were available consists of 120 countries. All models employ the Environmental Sustainability Index (ESI) 2005 as the dependent variable. The ESI 2005 data are gathered from Esty et al. (2005). Ten independent variables are included in the analysis. All of the selected variables are widely used in sustainability research (Babcicky 2012). The predictors are aggregated into five regression models that reflect economic, social and political dimensions. The specification of the models is as follows.

4.1.1 Base Model (Model 1)

GDP per capita, 2005 (logged and centred) is used as a proxy for a nation’s relative level of economic development and capital intensity. The data are taken from the World Bank (2010) and are measured in constant 2000 US dollars. The measure is first logged to correct for its skewed distribution. It is then centred by subtracting the mean of log GDP per capita to account for problems with potential collinearity among the independent variables.

GDP squared, 2005 (logged, centred and squared) is the quadratic term for per capita GDP. This term is included to test the hypothesis of a non-linear relationship between economic development and environmental sustainability as commonly proposed by ecological modernisation theorists. The coefficient of the quadratic term tests for the existence of an Environmental Kuznets Curve (EKC).
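
The two transformations described for the base model can be illustrated with a minimal sketch. The function name and the GDP values below are hypothetical and serve only to show the order of operations (log, then centre, then square); they are not the data used in this study.

```python
import numpy as np

def log_centre_square(gdp_per_capita):
    """Return (centred log GDP, squared centred log GDP)."""
    log_gdp = np.log(np.asarray(gdp_per_capita, dtype=float))
    # Centre by subtracting the sample mean of log GDP per capita.
    centred = log_gdp - log_gdp.mean()
    # The quadratic term is the square of the centred log measure.
    return centred, centred ** 2

# Hypothetical GDP per capita values, for illustration only.
gdp = [500.0, 2000.0, 8000.0, 32000.0]
centred, squared = log_centre_square(gdp)

# Correlation between the linear and quadratic terms, without and with centring.
raw_log = np.log(gdp)
corr_uncentred = np.corrcoef(raw_log, raw_log ** 2)[0, 1]
corr_centred = np.corrcoef(centred, squared)[0, 1]
```

In this illustrative sample, the uncentred linear and quadratic terms are almost perfectly correlated, whereas the centred terms are essentially uncorrelated, which is the collinearity problem that centring is meant to reduce.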

4.1.2 Economic Dimensions (Model 2)

Industrialisation, 2005 is measured as industry as percentage of Total Gross Domestic Product and measures the extent to which an economy is based on industry. These data are taken from the World Bank (2010).

Agriculture as Percentage of Total Gross Domestic Product, 2005 controls for the extent to which an economy is based on agricultural production. These data are gathered from the World Bank (2010).

Exports as Percentage of Total Gross Domestic Product (logged), 2005 represents the value of all goods and other market services provided to the rest of the world and measures the extent to which a nation is integrated into the international trading system. The variable is logged to account for a skewed distribution. These data are taken from the World Bank (2010).

4.1.3 Social Dimensions (Model 3)

Urbanisation, 2005 is measured as Urban Population as Percentage of Total Population and controls for relative levels of urbanisation. The data are taken from the World Bank (2010).

Gender equality, 2007 is measured by the Gender Equity Index (GEI) taken from Social Watch (2007). This measure represents the evolution of the situation of women around the world.

Dependent population, 2005 measures the percentage of the population younger than 15 or older than 64. This measure represents the percentage of the population dependent on the working-age population (15–64). The data are taken from the World Bank (2010).

4.1.4 Political Dimensions (Model 4)

Democratisation, 2005 is measured as the combined average of the political rights and civil liberties scales from Freedom House (2010). Political rights reflect the degree to which a nation is governed by democratically elected representatives and has fair, open and inclusive elections. Civil liberties reflect whether within a nation there is freedom of press, freedom of assembly, general personal freedom, freedom of private organisations and freedom of private property (Freedom House 1997).

Control of corruption, 2005 captures the extent to which public power is exercised for private gain, including both petty and grand forms of corruption as well as ‘capture’ of the state by elites and private interests (Kaufmann et al. 2009). The data are taken from the Worldwide Governance Indicators (WGI) published by the World Bank (2009).

4.1.5 Fully Saturated Model (Model 5)

This model includes all predictors and is referred to as the full model.
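
The structure of the five models can be sketched as a set of nested ordinary least squares regressions. The sketch below assumes that Models 2 to 4 each add one block of predictors to the base model, as the discussion of the results implies. All data are randomly generated placeholders, and the helper `r_squared` is a hypothetical name, so the resulting figures bear no relation to Table 4.

```python
import numpy as np

def r_squared(X, y):
    """Fit OLS with an intercept by least squares and return R-squared."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
n = 120  # sample size reported in Sect. 4.1; the values themselves are synthetic

# Synthetic stand-ins for the predictor blocks (not the real variables).
base = rng.normal(size=(n, 2))       # log GDP per capita and its square
economic = rng.normal(size=(n, 3))   # industrialisation, agriculture, exports
social = rng.normal(size=(n, 3))     # urbanisation, gender equality, dependents
political = rng.normal(size=(n, 2))  # democratisation, control of corruption
y = base[:, 0] + 0.5 * social[:, 1] + rng.normal(size=n)  # synthetic outcome

models = {
    "Model 1": [base],
    "Model 2": [base, economic],
    "Model 3": [base, social],
    "Model 4": [base, political],
    "Model 5": [base, economic, social, political],
}
r2 = {name: r_squared(np.hstack(parts), y) for name, parts in models.items()}
```

Because each of Models 2 to 5 nests Model 1, its R-squared can never fall below that of the base model; differences in R-squared therefore indicate how much variance the additional block accounts for.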

4.2 Models and Interpretation

The results of all models are shown in Table 4. Economic development (GDP per capita) is found to have a highly significant and positive impact on the ESI both in the baseline model and in Model 2. Although this effect does not hold for Models 3 and 4, the results show that economic development does have a significant and positive effect in the fully saturated Model 5 where all predictors are included. This suggests that affluent countries perform significantly better in terms of environmental sustainability (as measured by the ESI). The quadratic term of GDP per capita is not significant in any of the models, suggesting that the Environmental Kuznets Curve is not reflected in the ESI.

Table 4 Regression results for the ESI

As expected, industrialisation is significant and negatively related to sustainability in Model 2 but not when social and political dimensions are included in Model 5. Agriculture is only significant in the full model and has a positive effect on sustainability. The effect of export dependence is not significant in any of the models.

Urbanisation, gender equality and dependent population are found to be positively related to sustainability in Model 3 and in Model 5. Urbanisation is generally considered as characteristic of modernity and previous studies have shown that urbanisation is a significant driving force of ecological degradation and increases environmental impacts (York 2003; York et al. 2003a). The positive impact of urbanisation on the ESI, however, is not consistent with this view and may be explained by higher levels of environmental citizenship among people living in urban areas (Barkan 2004). The positive effect of gender equality on the ESI suggests that nations with greater gender equality are more supportive of environmental protection (Norgaard and York 2005). The positive relationship of dependent population with ESI indicates that countries with greater numbers of dependents (generally children) tend to be more sustainable. This finding is in line with other studies which suggest that the presence of large dependent populations is negatively correlated with ecological depletion (York et al. 2003a).

Model 4 fails to offer robust evidence that control of corruption is associated with higher levels of the ESI. However, the related measure of democracy is found to be positively related to the ESI both in Model 4 and in Model 5. A possible explanation for this might be that democratic governments may be more accountable to the public and therefore are more responsive to environmental interests (Asafu-Adjaye 2003).

The explanatory power of the models is much lower than might be expected. While Model 1 of course explains only a fraction of the variance in ESI (R-squared = 0.227), even the fully-saturated model explains less than half of the total variance in ESI (R-squared = 0.455). Overall, the goodness of fit of the models is low to moderate. All eight economic, social, and political variables together explain only 22.8 % of the variance in ESI once national income has been accounted for.

The poor performance of the ESI is somewhat surprising since the predictors were carefully chosen on the basis of similar studies in the field of sustainability research. All of the variables are widely used as predictors, usually explaining a much higher proportion of the variance in environmental indicators. However, when used to explain the ESI as a dependent variable, these variables fail to achieve similar explanatory power. One possible explanation might be that the measurement quality of the ESI as a composite indicator of environmental sustainability is much lower than that of the single indicators that dominate the literature.

5 Improving the Quality of the ESI as a Measure of Sustainability

The previous regressions have shown that the ESI delivers results that are largely in line with similar studies in the sustainability literature. This provides some evidence for the credibility of the measurement concept. However, the measurement qualities of the index have turned out to be surprisingly poor. One way of addressing this deficiency is to alter the weighting scheme of the underlying components.

A new weighting scheme could make use of the results of the factor analysis presented in Sect. 3.3 above. This analysis shows that there are three major components that determine ESI sustainability levels (social robustness, natural endowment and environmental consciousness). For each of the three components, factor scores were obtained which can now be used to create a re-weighted version of the ESI. Such an approach should improve the reliability of the ESI, though without necessarily making the ESI more meaningful as an indicator of environmental sustainability. In other words, a re-weighting based on the previously obtained factor scores would not alter the conceptualisation of the ESI but could result in significantly higher reliability.

After adding up the factor scores of the three largest factors in one single measure, the outcome is a modified, re-weighted ESI, the ‘Equivalised ESI’. The computed scores and the new country ranking for the Equivalised ESI are presented in Table 5. Countries are listed from highest to lowest sustainability.
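
The aggregation step can be sketched as follows, assuming that the factor scores are held in a matrix with one row per country and one column per factor. The country labels and scores below are hypothetical and are not taken from Table 5.

```python
import numpy as np

def equivalised_index(factor_scores):
    """Sum the factor scores of each country (rows = countries, columns = factors)."""
    return np.asarray(factor_scores, dtype=float).sum(axis=1)

def rank_descending(scores):
    """Return ranks where 1 marks the highest score (ties keep input order)."""
    order = np.argsort(-scores, kind="stable")
    ranks = np.empty(len(scores), dtype=int)
    ranks[order] = np.arange(1, len(scores) + 1)
    return ranks

# Hypothetical factor scores: social robustness, natural endowment,
# environmental consciousness.
countries = ["Country A", "Country B", "Country C"]
factor_scores = [
    [1.2, 0.3, -0.1],
    [-0.5, 0.8, 0.2],
    [0.1, -1.0, -0.4],
]
index = equivalised_index(factor_scores)
ranks = rank_descending(index)
```

Since the factor scores are simply added, each factor enters the sum with equal weight in units of its own standardised score, which is one way of reading the term "equivalised".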

Table 5 Ranking table for the equivalised ESI

As with the original ESI, higher Equivalised ESI scores suggest better environmental stewardship (social robustness, natural endowment and environmental consciousness). The three highest-ranking countries are Iceland, Norway and Canada. These countries are seen to be in a good position to maintain favourable environmental conditions into the future. Iceland, now ranking highest on sustainability, moved from rank 5 to the first rank. The original ESI ranks Norway second and so does the Equivalised ESI. Canada is now on rank 3 compared to the original ESI, where the country was on rank 6.

Many other rank changes are also relatively minor. The lowest-ranking countries on the Equivalised ESI are North Korea, Iraq and Haiti. These countries have not sufficiently managed the challenges of sustainable development. Haiti ranks last on the Equivalised ESI, while its rank on the original ESI was 141. Iraq dropped by two ranks and is now on rank 145 compared to 143 on the original ESI. North Korea, which originally ranked last, has moved up by two ranks to rank 144.

A major change is that in the new ranking the United States made a huge step towards the top of the list. Its ranking changed from the 45th to the 8th rank. Bearing in mind that the US is one of the greatest polluters on earth, one might wonder how the US can be portrayed as one of the most sustainable countries in the world. Technically, this somewhat surprising finding can be explained by the different weights that are given to the underlying indicators based on the outcomes of the previous factor analysis, which are highly skewed toward social capacities. Other indicators associated with environmental systems and stresses seem to be given less weight by the index. The US and other OECD countries perform best on vulnerability and social capacity indicators but rather badly on indicators reflecting the current state of environmental systems or environmental pressures. Therefore, the ESI and the Equivalised ESI both fail to account accurately for environmental deterioration and impacts in the present and the past.

Both indices take a prospective (capabilities) rather than a retrospective (actions) approach and seem to portray sustainability as a “weapon to be armed for the future”. The past and present externalisation of environmental costs imposed by the developed on the developing countries only play a minor role in the indices. As the conceptualisation of the ESI seems to “forget” about the historical and transboundary environmental costs, it pushes major developed countries to the top of the ranking. In short, the ESI portrays sustainability not so much as a historical and international responsibility but more as an intra-national and prospective concept limiting itself almost to an “intra-country-sustainability” measure. The Equivalised ESI is a means to bring to the fore these issues that are implicitly embedded in the logic of the ESI.

5.1 Modelling Sustainability Using the Equivalised ESI

The results of the Equivalised ESI rankings show that the top-ranks are largely dominated by highly developed and politically stable countries. This domination of affluent nations in the highest ranks is true for both the Equivalised ESI and the original index. It has been suggested above that this is a fundamental flaw in the logic of the ESI, not an artefact of the reconstitution of the ESI into the Equivalised ESI. Further light can be shed on this question by re-running the regression analyses reported in Table 4 using Equivalised ESI in place of ESI as the dependent variable. By doing so, one important question can be answered: has the statistical reweighting of the ESI into the Equivalised ESI improved the reliability of the index?

The results for the Equivalised ESI are presented in Table 6. As expected, economic development is found to have a highly significant and positive impact on the Equivalised ESI (Model 1). Since this positive effect holds for all models, the results suggest that economic development has a significant and positive effect on sustainability (as measured here). Affluence seems to be an even stronger determinant for the Equivalised ESI than for the original ESI, since GDP per capita is highly significant in all five models—compared to three models in the regressions using the original ESI. The significance levels of economic development for the Equivalised ESI are also higher compared to the original ESI. As with ESI, the quadratic term of GDP per capita (Environmental Kuznets Curve) is not significant in the full model for Equivalised ESI (Model 5). However, the term is significant in Models 1 and 3. Although these results fail to fully support the existence of an Environmental Kuznets Curve, they are suggestive.

Table 6 Regression results for the Equivalised ESI

As with the original ESI, industrialisation is not significant in Model 5 but shows a negative impact on sustainability in Model 2, which focuses on economic structure. Higher levels of industrialisation are related to lower levels of sustainability. Agriculture is only significant in Model 5 and is found to have a positive effect on sustainability. This finding is similar to the results obtained for the original ESI and indicates that economies based on agriculture tend to be more sustainable. The effect of export dependency is not significant in any of the models.

The social dimensions urbanisation, gender equality and dependent population are positively related to sustainability in Model 3 as well as in the full Model 5. This echoes the results found with the original ESI.

The democracy variable is only significant in Model 4. The regressions using the original ESI show that democracy is significant in Model 4 and in the full model. This is the one place where the results reported in Table 6 are less robust than those reported in Table 4. As with the original ESI, no robust evidence can be found for the measure of control of corruption.

In the fully-saturated Model 5, most of the predictors that are identified as driving forces of sustainability remain significant. While industrialisation changes from significant to non-significant, agriculture changes from non-significant to significant. The same pattern is observed for the original ESI. Democratisation changes from highly significant to non-significant, which suggests that democratisation plays no direct role in determining levels of environmental sustainability.

All of the models reported in Table 6 have greater explanatory power (in terms of their R-squared scores) than the parallel models reported in Table 4. The increased levels of variance explained in the Equivalised ESI models seem to be driven mainly by an increased explanatory power of national income for Equivalised ESI versus raw ESI. Thus, the R-squared for Model 1 rises from 0.227 in Table 4 to 0.553 in Table 6. This increase in explanatory power carries right through all five models to Model 5, where the explanatory power of the full model (R-squared = 0.699) is still substantially higher than in the regressions involving the original version of the ESI (R-squared = 0.455). In short, the Equivalised ESI is even more closely tied to national income per capita than is the ESI itself.
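
The comparison can be made concrete with a short calculation based solely on the R-squared values quoted in the text. The sketch below takes the simple difference between Model 5 and Model 1 as the variance added by the eight non-income predictors; this is a descriptive difference, not a formal test, and the function name is hypothetical.

```python
# R-squared values as reported in the text for Tables 4 and 6.
reported = {
    "ESI": {"model_1": 0.227, "model_5": 0.455},
    "Equivalised ESI": {"model_1": 0.553, "model_5": 0.699},
}

def incremental_r2(r2_base, r2_full):
    """Variance explained by the full model beyond the base model."""
    return round(r2_full - r2_base, 3)

gains = {
    name: incremental_r2(values["model_1"], values["model_5"])
    for name, values in reported.items()
}
for name, gain in gains.items():
    print(f"{name}: {gain:.3f}")
```

On these figures, the eight additional predictors add 0.228 to the R-squared for the original ESI but only 0.146 for the Equivalised ESI, which is consistent with the reading that the higher overall fit stems mainly from national income.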

5.2 Comparing Indices: Raw Versus Equivalised

The main objective of disaggregating the ESI and creating an equivalised version based on a statistical weighting scheme derived from factor analysis was to critically examine the architecture and measurement qualities of the index. The original architecture based on five components is fundamentally challenged in this study. The findings suggest that in fact there are only three major factors determining sustainability levels, with a development-related social robustness factor predominating. In order to assess the measurement qualities of these factors, an Equivalised ESI was computed by adding up the scores of three largest factors into one single measure. A comparison of the two indices’ performance in a series of regression models showed that both indices—ESI and Equivalised ESI—lead to remarkably similar results, though with a much stronger influence of national income on Equivalised ESI than on raw ESI.

Overall, this comparison leads to two conclusions. First, the findings provide evidence of redundancy in the composition of the ESI. Since the Equivalised ESI—based on just three components—leads to similar results as the original version, the original architecture based on five components is perhaps suspect. It is shown that only three factors suffice to compute a sustainability measure with significantly better measurement qualities. Furthermore, a simplified and clearer architecture based on only three components may provide a better understanding of environmental sustainability when the index is applied in environmental decision making and public communication.

Second, the makeup of Equivalised ESI, its changes in the ranking of countries compared to their rankings on raw ESI and the fact that the (positive) relationship between national income and ESI is only reinforced when using Equivalised ESI all suggest a fundamental flaw in the way ESI is constructed. At the bottom line, the ESI can be interpreted as an effort to accommodate the political desire for cross-country comparison and benchmarking sustainability levels between countries. The result, however, is that sustainability ends up being defined as a national rather than as a global concept.

The downside of this approach is that the ESI does not then measure global sustainability. By making wealthy nations look good in the ESI, the index suggests that economic development is the key to sustainability. The truth is that economic development is a significant driver of greenhouse gas emissions (Jorgenson 2005) and ecological depletion (York et al. 2004). Today, the global population requires the equivalent of 1.5 planets to provide the necessary resources and absorb the generated waste (Ewing et al. 2010:18). Some economically developed countries, such as the United States, are using the equivalent of 8.0 planets to meet their ecological demands (Ewing et al. 2010:74). In short, economic development is generally associated with serious ecological overshoot. Negative impacts like these are largely neglected by the ESI, or at least are not given the priority they merit. Even if the ESI does not claim to measure global sustainability as such, it still needs to give appropriate weight to the detrimental effects of economic development in order to make benchmarking against it more meaningful.

6 Conclusion

This paper makes several contributions to the field of sustainability measurement. The main purpose was to assess the Environmental Sustainability Index (ESI) from a number of different perspectives. First, the index was validated against the PSR model, one of the most widely used frameworks for environmental decision making. Two out of five ESI components are made up of indicators of different PSR categories. As a consequence, two components are not consistent with the PSR model, which renders the ESI partially incompatible with this widespread policy-making framework. Such inconsistencies could cause certain challenges in policy-making practices that have adopted the PSR model as proposed by the OECD, UN and other international organisations.

One way to address this shortcoming is to re-arrange the indicators and to adjust their categorisation. A factor analysis was employed to reduce the number of components from five to three. These three components were identified as social robustness, environmental consciousness and natural endowment. The factor scores of the new components were summed to create a statistically re-weighted version of the index, the Equivalised ESI.

Both the ESI and the Equivalised ESI were employed in regression analyses to validate their performance with reference to a large body of environmental research. Both indices yield similar regression outcomes, but the Equivalised ESI delivers a substantially higher measurement quality. The results of the regression models reported here confirm the initial conclusion based on factor analysis that the ESI is inappropriately weighted in a way that flatters the environmental performance of rich countries.

The findings from the regression analyses suggest that economically developed countries perform better on environmental sustainability. Still, these countries are the greatest polluters, emitting enormous amounts of CO2 and more aggressive greenhouse gases such as CH4, besides other externalities which benefit the economically developed countries and harm less developed countries. Although CO2 and CH4 emissions are included as variables in the ESI, they only play a minor role.

Transferring to environmental issues the Marxist perspective that the powerful and wealthy maintain their position by exploiting labour, it may be argued that the economically developed countries maintain their privileged positions by exploiting natural resources and in particular by externalising their environmental costs onto less developed countries. Making the wealthy nations look good in the ESI is a way of promoting the simplified notion that economic development would eventually lead to more sustainable outcomes. In fact, while GDP growth may lead to better performance on national “sustainability” indicators like literacy and education, it is clearly overwhelmingly destructive for the sustainability of the world as a whole.

This paper has revealed that the architecture and measurement qualities of the ESI are far from perfect. However, the need for sustainability indicators is real and every tool that is added to the toolbox of sustainability research inspires debate. This paper aims to contribute to this debate in two ways: first, by assessing the architecture of the ESI and second, by demonstrating how an equivalised version of the index could improve its measurement qualities. There are serious validity problems with defining the ESI as a sustainability measure, but whatever ESI does measure, an argument can be made that (for better or worse) the Equivalised ESI measures it better.

For future research and environmental decision making the broader question is which dimensions actually play key roles for sustainable development. The ESI clearly over-weights the development-related variables that are captured in the social robustness dimension of Equivalised ESI. Developed and developing countries may have rather different conceptions of what sustainable development might be and index construction should address both. Two themes are important to keep in mind. First, appropriate variables must be selected according to definitions of sustainability and broad common ground. Second, these variables must be weighted according to meaningful conceptualisations of sustainability.

This paper clearly demonstrates the importance of index construction and weighting mechanisms. If the debate in sustainable development is to be moved forward, a much broader common ground on the foundations of sustainability measurement needs to be developed. Studies like this are not a call to abandon the creation of sustainability indicators. Instead, this and related work should be interpreted as a wake-up call for index constructors to take various perspectives on variable selection, re-consider weighting schemes and re-think aggregation methods to improve instrument quality.