1 Introduction

Since 1995, the United Nations Development Programme (UNDP) has published in its Human Development Reports (HDR) two indices that reflect gender disparities in basic capabilities at the world level: the Gender-related Development Index (GDI) and the Gender Empowerment Measure (GEM). These two measures have been highly influential in both academic and non-academic milieux, and they have been widely used to assess disparities between women and men all over the world. Despite their relevance, the GDI and GEM have been criticized for their conceptual and methodological limitations, as has been widely acknowledged elsewhere (see Bardhan and Klasen 1999; Dijkstra and Hanmer 2000; Dijkstra 2002, 2006; Klasen 2006). As a result of the many limitations of the GDI and GEM, a number of gender-related well-being indices have appeared in the literature.Footnote 1 In turn, these alternative indices suffer from different shortcomings that limit their usefulness and appropriateness as global gender inequality indices (Hawken and Munck 2009, Permanyer 2010 and Bericat 2011 provide extensive and critical reviews of that literature). In an attempt to overcome some of the problems identified by different researchers during the past 15 years, the 2010 HDR presented a new measure: the Gender Inequality Index (GII). This index has been designed to capture women’s disadvantage in three dimensions (empowerment, economic activity and reproductive health) for 138 countries around the world.

Assessing gender inequality at the global level with the same set of indicators across the world inevitably raises questions about their meaning and validity: are those indicators equally relevant and meaningful in all countries? One of the main purposes of this paper is to examine the appropriateness of UNDP gender-related indices and variables for assessing gender inequality levels in “highly developed” countries. More specifically, we want to explore the extent to which the assessment of gender inequality levels in Europe is altered when shifting from the “old” GDI variables to the new GII variables. A critical assessment of this kind is an important exercise because of the strong and global impact that UNDP reports have in academic and policy-making circles. The fact that the GDI has been replaced by a completely new measure makes it necessary to critically compare the performance of the different indicators in order to understand their strengths and weaknesses. In the first part of the paper we use the “old” variables included in the GDI, and in the second part we critically review the construction and performance of the new GII (whenever possible, we offer constructive alternatives to the identified problems).

At the risk of oversimplification, the attempts to critically analyze and eventually improve on existing gender inequality indices can be classified into the following non-mutually exclusive and non-exhaustive groups. On the one hand, there is an important strand of the literature that proposes to introduce alternative dimensions that are typically not incorporated in the existing gender inequality indices, e.g.: gender-based violence, time spent on care activities, leisure time, decision-making power and so on (see, for instance, Dijkstra 2002, 2006; Folbre 2006; Beneria 2008; Esquivel et al. 2008; Bericat 2011). On the other hand, a different strand of the literature, which is of a technical nature, focuses on the ways in which alternative dimensions should be aggregated to generate a one-dimensional index with “reasonable” properties that can be compared across time and space (e.g.: Bardhan and Klasen 1999; Klasen 2006; Dijkstra and Hanmer 2000; Permanyer 2010; Beneria and Permanyer 2010; Klasen and Schüler 2011). The critical appraisal of UNDP gender-related indices that we present in this paper is based on both approaches: we discuss the advantages and disadvantages of choosing the specific dimensions/variables that are included in those indices, as well as the ways in which these dimensions are aggregated to generate a composite index.

Composite indices are increasingly popular instruments that are typically used to synthesize complex information into a simplified format (see Nardo et al. 2005). However, there is an ongoing debate on the necessity of working with composite indices rather than with their individual subcomponents. The use of composite indices has been criticized on grounds of complexity and arbitrariness by different authors, who argue that what composite indices basically do is lump together apples and oranges to generate meaningless measures (e.g.: Ravallion 2010, 2011). While we acknowledge part of this criticism, we still contend that: (1) composite indices are relevant instruments that capture information which cannot be obtained by working with each of the components separately, and (2) the information provided by a composite index should be treated as a complement to, rather than a substitute for, the information provided by its different subcomponents.Footnote 2 For this reason, in order to complement the information provided by UNDP gender-related composite indices, we will also explore the behavior of their different subcomponents separately. Among other things, this will allow us to disentangle the extent to which the different subcomponents contribute to the aggregate values of the corresponding indices.

The measurement of gender inequality is important in itself: the fact that women have pervasively been discriminated against in many well-being dimensions is a long-lasting state of affairs for which some remedy must be sought. However, as warned by some scholars (see Smyth 1996), the adoption of feminist language and concerns also has other, more instrumental motivations. One reason why gender inequality has attracted many economists’ attention is its presumed link with countries’ economic growth. On the theoretical side, neoclassical economic theorists argue that economic equality is desirable on grounds of fairness and equity but might reduce efficiency by distorting incentives (Forbes (2000), for instance, concludes that increases in income inequality are actually good for growth). Alternatively, the World Bank and other institutions foster the empowerment of women for its allegedly beneficial impact on economic growth. On the empirical side, different papers on this topic offer conflicting evidence. Klasen (1999) and Dollar and Gatti (1999) find empirical evidence suggesting that higher gender equality levels are associated with higher economic growth rates. Alternatively, Seguino (2000) argues that in many East Asian economies between 1975 and 1995, low gender equality and low wages for women were accompanied by high economic growth. Forsythe et al. (2000) find mixed evidence suggesting that the neoclassical Women in Development (WID) and the more liberal Gender and Development (GAD) approaches are not incompatible and that both contain elements of truth. From a broader perspective, Easterly (1999) reports that increases in income are weakly related, if at all, to improvements in a broad spectrum of gender gaps in quality-of-life indicators.
In an attempt to shed some light and make a further empirical contribution to the debate, we perform exploratory cross-sectional and cross-time analyses to examine the extent to which the levels of gender equality are at odds with overall economic performance during the last decade in the European context.

2 Gender Inequality in Europe Using GDI Variables

We start with a brief methodological review of the gender inequality indices that will be used in this paper.

In order to reflect gender inequalities at the world level, UNDP presented the GDI in its 1995 Human Development Report. The GDI is a composite index that measures the absolute development levels in a given country corrected downward by the existing gender inequalities in three well-being dimensions: a long and healthy life, knowledge and a decent standard of living. However, the GDI is not a measure of gender inequality in itself, but rather an efficiency indicator discounted for gender disparities in its components. Moreover, the values of the GDI are largely influenced by an earned income component estimated in many countries under vastly oversimplifying assumptions (see Bardhan and Klasen 1999). As noted in Dijkstra and Hanmer (2000), GDI values are highly correlated with GDP per capita, so both indices are essentially conveying the same information.

Among the proposals presented in the literature to improve and simplify the functional form of the index so that it captures gender inequality per se, the Gender Relative Status (GRS) index proposed by Permanyer (2010), Beneria and Permanyer (2010) and Klasen and Schüler (2011)Footnote 3 is perhaps the simplest and most appealing one. It has the following functional form

$$ GRS := \left( \prod_{i = 1}^{n} \frac{x_{i}}{y_{i}} \right)^{1/n} $$
(1)

where \(x_{i}\) and \(y_{i}\) are the average achievement levels in dimension ‘i’ for women and men respectively and ‘n’ is the number of dimensions we are considering. The country-specific value of GRS should be interpreted as the average gender gap that is observed in that country. The normatively desirable value for GRS is 1: when GRS < 1, men are on average better off than women, and when GRS > 1, women are on average better off than men, so the interpretation of the index is simple and clear. Using GRS, gender gaps running in opposite directions can offset each other, resulting in “artificially high” levels of gender equality. Because of this possibility, Beneria and Permanyer (2010) propose the creation of another gender inequality index that only takes into account gender gaps favoring men. However, since this offsetting between dimensions hardly takes place with our data, we simply work with GRS.Footnote 4
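As an illustration of Eq. 1 and of the offsetting effect just described, the following minimal Python sketch computes GRS as the geometric mean of the women-to-men achievement ratios. The function name `grs` and the input values are hypothetical and are not taken from the data used in this paper.

```python
import math


def grs(women, men):
    """Geometric mean of the women-to-men achievement ratios (Eq. 1)."""
    if len(women) != len(men) or not women:
        raise ValueError("both lists need the same, non-zero number of dimensions")
    ratios = [x / y for x, y in zip(women, men)]
    return math.prod(ratios) ** (1.0 / len(ratios))


# Hypothetical achievements in two dimensions: the first gap favors
# women (0.75 / 0.60 = 1.25), the second favors men (0.40 / 0.50 = 0.8).
offsetting = grs([0.75, 0.40], [0.60, 0.50])

# Only one gap, favoring men, and no gap in the other dimension.
men_favored = grs([0.40, 0.60], [0.50, 0.60])
```

In the first case the two gaps cancel out and the index sits at the normatively desirable value of 1 even though neither dimension is equal, which is the “artificially high” equality discussed above. In the second case the index falls below 1.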

Note that Eq. 1 is still underspecified, as its values depend on the variables we incorporate in our analysis. In the first part of this paper, to ensure methodological consistency and comparability with UNDP indices, we simply work with the dimensions included in the GDI, defined as follows.

1. The health component is computed using life expectancy at birth. Given the fact that women outlive men, gender-specific life expectancies \((LE_{f}, LE_{m})\) are not strictly comparable. For this reason, UNDP suggests constructing gender-specific life expectancy indices as follows

$$ \left. \begin{array}{l} LEI_{f} := \dfrac{LE_{f} - 27.5}{60} \\ LEI_{m} := \dfrac{LE_{m} - 22.5}{60} \end{array} \right\} $$
(2)

This way, both \(LEI_{f}\) and \(LEI_{m}\) lie within the [0,1] interval and are comparable.

2. The knowledge component is computed by averaging adult literacy rate (ALR) and combined secondary and tertiary gross enrolment ratios (GER). According to the GDI methodology, ALR receives a weight of 2/3 and GER a weight of 1/3. Therefore, the multiplicative formFootnote 5 gender-specific education components that will be used in this paper are defined as:

$$ \begin{aligned} EDUI_{f} : = & (ALR_{f} )^{2/3} (GER_{f} )^{1/3} \\ EDUI_{m} : = & (ALR_{m} )^{2/3} (GER_{m} )^{1/3} \\ \end{aligned} $$

3. The decent standard of living component is computed with the estimated earned income in PPP US$. The GDI methodology used to compute the gender-specific earned income components \((GDPI_{f}, GDPI_{m})\) is somewhat involved, so it will not be presented here to avoid burdening the text too much (see technical notes in 1995 HDR and Bardhan and Klasen 1999).

Using the previous variables, the GRS can be rewritten as

$$ GRS1: = \left( {\frac{{LEI_{f} }}{{LEI_{m} }}\frac{{EDUI_{f} }}{{EDUI_{m} }}\frac{{GDPI_{f} }}{{GDPI_{m} }}} \right)^{1/3} $$
(3)

While GRS is only an underspecified functional form that could be used to assess gender inequality levels using any set of gender-specific variables, GRS1 is a particular case of the previous index where certain specific variables have been chosen. In the second part of the paper we also work with the GRS functional form but picking different variables—those included in the GII—that lead to the definition of an alternative index that will be called GRS2.
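The construction of GRS1 from Eqs. 2 and 3 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the literacy and enrolment rates are assumed to be expressed as fractions, the earned-income components are taken as given inputs (their estimation is not reproduced here), and the example country is invented.

```python
def life_expectancy_index(le, sex):
    """Eq. 2: floor of 27.5 years for women and 22.5 for men, range of 60 years."""
    floor = 27.5 if sex == "f" else 22.5
    return (le - floor) / 60.0


def education_index(alr, ger):
    """Multiplicative form: literacy weighted 2/3, enrolment weighted 1/3."""
    return alr ** (2 / 3) * ger ** (1 / 3)


def grs1(le_f, le_m, alr_f, alr_m, ger_f, ger_m, gdpi_f, gdpi_m):
    """Eq. 3: geometric mean of the health, education and income gaps."""
    health = life_expectancy_index(le_f, "f") / life_expectancy_index(le_m, "m")
    education = education_index(alr_f, ger_f) / education_index(alr_m, ger_m)
    income = gdpi_f / gdpi_m
    return (health * education * income) ** (1 / 3)


# Hypothetical country: women live 5 years longer than men, education is
# equal, and the (already computed) income components are 0.8 and 0.9.
example = grs1(le_f=82.5, le_m=77.5,
               alr_f=0.99, alr_m=0.99,
               ger_f=0.90, ger_m=0.90,
               gdpi_f=0.8, gdpi_m=0.9)
```

With these inputs the five-year female advantage in raw life expectancy translates into a health gap of exactly 1 because of the sex-specific floors in Eq. 2, so the value of the index (roughly 0.96) is driven entirely by the income gap.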

2.1 Gender Inequality Levels and Trends in Europe

Using data from UNDP Human Development Reports we explore the evolution of GRS1 values in Europe.Footnote 6 Table 1 shows those values for most European countries from 1999 to 2009. The reason for not including data from the 1995 to 1998 Human Development Reports is that the methodology used to compute the earned-income component in that period is not the same as the one used in the 1999–2009 period,Footnote 7 so the two series are not strictly comparable. In 1999, the top five performers with their GRS1 values were Latvia (0.947), Sweden (0.938), Russia (0.927), Belarus (0.922) and Estonia (0.921), while the bottom five performers were Italy (0.771), Spain (0.762), Luxembourg (0.735), Ireland (0.735) and Cyprus (0.729). One decade later, the top five performers were Lithuania (0.943), Hungary (0.935), Russia (0.93), Latvia (0.925) and Norway (0.923), while the bottom five performers were Albania (0.821), Spain (0.818), Italy (0.797), Greece (0.792) and Austria (0.739). The range of values in which GRS1 varies has not changed very much during the 1999–2009 period. As shown in Table 1, the levels of gender equality in Europe have barely increased overall during the last decade: in 1999, the average value of GRS1 among all European countries with available data was 0.85; one decade later, that average had increased only slightly, to 0.86. In other words: women’s average achievement level represented 85% of men’s in 1999 and 86% in 2009. This is in contrast with the remarkable trends presented in Beneria and Permanyer (2010) and Dorius and Firebaugh (2010), where the reported values of similar indices of gender inequality showed much larger improvement at the world level.Footnote 8 On this point, it is worth mentioning that these large improvements have taken place at the lower tail of the gender equality distribution, that is: in different African and Asian countries that at the beginning of the period had very low levels of gender equality and plenty of room for improvement. In contrast, European countries are located at the upper tail of the gender equality distribution, with much less room for improvement. Nevertheless, it is remarkable that the values of GRS1 have stagnated in Europe for such a long period of time.

Table 1 GRS1 and logged GDP per capita evolution in Europe during the last decade

Looking in more detail at the time trends shown in Table 1, we can observe that the gender equality distribution in Europe becomes less dispersed at the end of the period. The standard deviation of the GRS1 values for European countries with available data equals 0.064 in 1999 and 0.047 in 2009, so gender equality levels in Europe are apparently converging. One possible explanation for this convergence might be related to the world’s greater interconnectedness, which is observed as the flow of information, communication, goods, services and people expands with economic globalization. The breadth of this flow exposes the world’s citizens to new ideas, institutions and practices that affect the status of women vis-à-vis men in increasingly homogeneous ways (Dorius and Firebaugh 2010).

Despite this increasing homogeneity, one can still distinguish clear regional patterns. Southern European countries like Greece, Italy, Portugal or Spain have lower levels of gender equality than other European countries during the whole period. At the other extreme of the distribution, Nordic countries like Norway, Sweden, Finland or Denmark are among those with the highest gender equality levels during the last decade. These figures reflect a well-documented phenomenon that has been consistently reported in recent European gender inequality analyses (e.g.: Plantenga et al. 2009; Bericat 2011). Interestingly, some former Soviet republics like Belarus, Estonia, Latvia, Lithuania and Russia have very high levels of gender equality according to the values of GRS1. However, as will be explored in more detail in the following section, this result is largely driven by the bad average performance of men rather than by improvements in women’s achievements.

2.2 Component-Wise Evolution

Given the fact that GRS1 is a composite index that averages gaps across dimensions, it is particularly important to explore the values of its different components separately. It turns out that the behavior of each of those components over time is clearly differentiated from one another.

2.2.1 Health

Concerning the gaps in life expectancy, it is interesting to observe that: (1) virtually all of them are slightly above the normatively desirable value of 1, and (2) the variations over time are very small (see Table 2). This suggests that the overall health conditions for European women and men have not experienced important sex-specific changes during the last decade. However, the former Soviet republics of Belarus, Estonia, Latvia, Lithuania and Russia constitute a clear exception to that rule: the gaps are much larger than the ones observed in all other countries and they systematically favor women. As reported elsewhere (e.g.: Notzon et al. 1998), these results are largely driven by the declining health status of men (which is related, among other things, to men’s higher alcohol and drug consumption and to increases in mortality from violent injuries) rather than by improvements in women’s health. This example clearly illustrates the necessity of looking at the different subcomponents before rushing to premature conclusions based on the values of the composite indices alone.

Table 2 Values of the gender gap life expectancy index in Europe in 1999, 2001, 2003, 2005, 2007 and 2009

2.2.2 Education

The gaps in education are very close to the normatively desirable value of 1 during the entire period, indicating that gender-specific literacy rates and school enrolment ratios are extremely similar for women and men across European countriesFootnote 9 (see Table 3).

Table 3 Values of the gender gap education index in Europe in 1999, 2001, 2003, 2005, 2007 and 2009

2.2.3 Income

The gaps in the income component are—by far—the largest ones that are plugged into GRS1 and they systematically favor men (see Table 4). There is a slight tendency over time to reduce the size of the gap, but there are many exceptions to that rule. Some countries have quite constant trajectories, others have monotonically increasing or decreasing trajectories while many others have an erratic behavior that does not seem to follow any particular pattern.

Table 4 Values of the gender gap GDP index in Europe in 1999, 2001, 2003, 2005, 2007 and 2009

As an illustration of the previous general results we briefly explore the case of Spain, a country with one of the highest gender inequality levels in Europe. Let us examine each subcomponent separately (see Fig. 1). Regarding the health subcomponent, the life expectancy gap remains fairly stable over time with values around 1.04. The fact that this gap is slightly above 1 does not necessarily imply that the health condition of Spanish women is better than that of men; rather, it reflects the crude normalization methodology that was used in the construction of \(LEI_{f}\) and \(LEI_{m}\) (equally reasonable normalization procedures could well have led to health gaps below 1). Regardless of whether the gap in the health subcomponent is above or below 1, its stability over time suggests that the health conditions of the Spanish population have not suffered sex-specific shocks during the last decade.

Fig. 1 Gender gap in labor force participation, GRS1 values and its subcomponents in Spain, from 1999 to 2009. Source: Author’s calculations using 1999 to 2009 Human Development Reports and Spanish National Statistical Institute (INE) data

With respect to the education subcomponent there are no concerns regarding the normalization methodology, so Fig. 1 shows an indisputable, if slight, advantage of Spanish women vis-à-vis Spanish men. These results are in line with the gender-specific worldwide trends in education that have been reported elsewhere: in the last decades there has been an education expansion at all levels, and in many countries the school enrolment of girls is now larger than that of boys (Lutz et al. 2007). Therefore, neither life expectancy nor educational attainment seems to offer interesting insights for capturing gender inequalities in Spain. Sadly enough, the larger investments in education made by Spanish women are not reflected in their earned income, which is much lower than men’s. Something similar happens when we examine the gender gap in labor force participation rates during the same period (see Fig. 1): even if women’s education levels tend to be higher than men’s, the latter have much greater access to remunerated jobs than the former. The general patterns (but not the levels) observed in Spain are repeated with negligible variations in almost all European countries.

Fig. 2 Log of GDP per capita vs GRS1 in Europe from 1999 to 2009. Source: Author’s calculations using data from 1999, 2004 and 2009 Human Development Reports

2.3 Gender Equality and Economic Performance

The association between gender equality and economic performance in European countries, as measured with GRS1 and logged GDP per capita, is illustrated in Fig. 2 for the years 1999, 2004 and 2009: the cross-sectional relationship between these variables is very weak. The corresponding correlation coefficients equal −0.34, −0.27 and −0.3 respectively, but they are never statistically significant at the 5% level (the values needed to compute these coefficients are shown in Table 1). If anything, countries with larger GDP per capita tend to have slightly lower GRS1 values. This result is in contrast with the results that are found cross-sectionally at the world level, where larger GDP per capita values tend to be associated with larger GRS1 values (see Dijkstra and Hanmer 2000:52–53), an issue that will be further discussed later.

As is well-known, cross-sectional relationships alone can be very misleading unless they are complemented with cross-time data. We define cross-time variables for time moments t 1 and t 2 in the following way:

$$ \left. \begin{array}{l} \Delta GDP_{t_{1},t_{2}} := \ln \left( GDPpc_{t_{1}} \right) - \ln \left( GDPpc_{t_{2}} \right) \\ \Delta GRS_{t_{1},t_{2}} := (GRS1)_{t_{1}} - (GRS1)_{t_{2}} \end{array} \right\} $$
(4)

Interestingly, the correlation coefficient between \( \Delta GDP_{1999,2004} \) and \( \Delta GRS_{1999,2004} \) is −0.18 and is highly non-significant (p-value = 0.34). Analogously, the correlation coefficient between \( \Delta GDP_{2004,2009} \) and \( \Delta GRS_{2004,2009} \) is −0.23 and is also non-significant (p-value = 0.22). These results are in line with the results reported in Easterly (1999) and Dorius and Firebaugh (2010:1958).
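The cross-time exercise can be sketched in Python as follows: the changes defined in Eq. 4 are computed country by country and then correlated with a plain Pearson coefficient. The function names and the four-country data set are hypothetical and serve only to show the mechanics. They are not the values of Table 1, and the significance test is not reproduced.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def cross_time_changes(gdp_t1, gdp_t2, grs_t1, grs_t2):
    """Eq. 4: value at t1 minus value at t2, with GDP per capita in logs."""
    d_gdp = [math.log(a) - math.log(b) for a, b in zip(gdp_t1, gdp_t2)]
    d_grs = [a - b for a, b in zip(grs_t1, grs_t2)]
    return d_gdp, d_grs


# Hypothetical GDP per capita and GRS1 values for four countries.
gdp_t1 = [20000.0, 25000.0, 30000.0, 15000.0]
gdp_t2 = [24000.0, 27000.0, 33000.0, 21000.0]
grs_t1 = [0.85, 0.90, 0.80, 0.88]
grs_t2 = [0.86, 0.90, 0.83, 0.87]

d_gdp, d_grs = cross_time_changes(gdp_t1, gdp_t2, grs_t1, grs_t2)
r = pearson(d_gdp, d_grs)
```

Because both differences in Eq. 4 are taken in the same direction (t1 minus t2), reversing the order of subtraction in both variables would leave the correlation coefficient unchanged.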

3 Gender Inequality in Europe Using GII Variables

UNDP’s new Gender Inequality Index is an important contribution to the debate on gender inequality measurement. Introduced in the 2010 Human Development Report, the new index has been designed to overcome the many limitations of its predecessor, the GDI. The GII is intended to capture women’s disadvantage in three dimensions (empowerment, economic activity and reproductive health) and uses the following variables: maternal mortality ratio (MMR) and adolescent fertility rate (AFR) for the reproductive health dimension, educational attainment (proportion of population with secondary education and above, SE) and parliamentary representation (PR) for the empowerment dimension, and labor force participation rate (LFPR) for the economic activity dimension. While MMR and AFR are women-specific variables, the variables SE, PR and LFPR are computed for women and men (the gender-specific variables are denoted as \(SE_{f}\), \(SE_{m}\), \(PR_{f}\), \(PR_{m}\), \(LFPR_{f}\), \(LFPR_{m}\), where the subscripts f and m refer to female and male respectively). The publication of a new global index of gender inequality by UNDP is good news for at least two reasons. On the one hand, the GII brings fresh air by replacing a pair of indices that have been criticized on many fronts. On the other hand, it will further contribute to the debate on gender inequality measurement by incorporating concepts and dimensions that had not been used before in that context at the global level.

Regarding the choice of variables, at least two issues should be pointed out. First, the fact that the earned-income component is not present in the GII is to be welcomed because of its unreliability (Bardhan and Klasen 1999:992–993). In contrast, the labor force participation rate is a much more reliable estimate of economic participation that, unfortunately, fails to capture the informal and care economy sectors in which women are typically overrepresented (see Folbre 2006 and Beneria 2008). Second, it is also remarkable that some reproductive health indicators have been introduced for the first time in a global gender inequality index. Hopefully, this initiative will help to raise further awareness of the poor reproductive health conditions experienced by men and women all over the world.

The GII formula is displayed in the following equation:

$$ GII = 1 - 4\,\frac{\left[ \left( \sqrt[3]{\left( \frac{1}{MMR}\frac{1}{AFR} \right)^{1/2} \left( PR_{f} SE_{f} \right)^{1/2} LFPR_{f}} \right)^{-1} + \left( \sqrt[3]{\left( PR_{m} SE_{m} \right)^{1/2} LFPR_{m}} \right)^{-1} \right]^{-1}}{\sqrt[3]{\left( \sqrt{\frac{1}{MMR}\frac{1}{AFR}} + 1 \right)\left( \sqrt{PR_{f} SE_{f}} + \sqrt{PR_{m} SE_{m}} \right)\left( LFPR_{f} + LFPR_{m} \right)}} $$
(5)

According to its designers, the values of GII should lie between 0 and 1 and should be interpreted as “the loss in human development due to gender inequality accounting for association—or overlap—between dimensions” (Gaye et al. 2010:8).

At this point, several comments on the GII formula are in order.

1. The GII formula is extremely complicated and will be particularly difficult to understand for practitioners, analysts and policy-makers that want to make use of it. The GII does not fulfill a basic requirement that could be expected from a global index of this kind: simplicity in order to convey clear messages to the general public. Conceptually, the meaning of the GII is not entirely clear: it is purported to measure “losses in human development”, but losses from what maximal standard? This has not been specified anywhere.

2. The only justification given by its designers for introducing such a complex formulation is that they want the GII to be an “association sensitive” indexFootnote 10 (Gaye et al. 2010:14). While this is an interesting property that makes the index responsive to those redistributions that systematically discriminate against one gender, alternative gender inequality indices that are much less sophisticated than the GII also satisfy that normative property. In conclusion, we contend that the GII has been unnecessarily complicated and that other, much simpler, alternatives are also available.

3. The fact that GII mixes women-specific indicators together with indicators that are computed for women and men does lead to conceptual and methodological problems. If all indicators were women-specific we could be talking about something like a “women status index” (i.e.: an index that could be used to assess women’s absolute achievement levels). Analogously, if all indicators were computed for women and men, we could construct something like a “gender inequality index” (i.e.: an index that could be used to assess the relative position of women vis-à-vis men). Including the two kinds of indicators in the same formula, the GII becomes an odd mixture that is halfway between both concepts, thus obscuring the interpretation of an already complicated index. Moreover, there are important problems that are derived from this mixture of absolute and relative indicators.

First, when women and men fare equally well in all dimensions, the values of the GII are not equal to zero as its designers claim (see Gaye et al. 2010:34). Consider a hypothetical country with \(PR_{f} = PR_{m}\), \(SE_{f} = SE_{m}\), \(LFPR_{f} = LFPR_{m}\) and with the lowest MMR and AFR observed in the sample of countries for which data is available \( (MMR = 10, AFR = 3.8) \). In that case, that hypothetical country would have a GII value well above 0 (GII ≃ 0.15).

Second, it is not absolutely clear that the GII values should depend on the MMR and AFR variables in the way they actually do. While the proponents of the index might rightly argue that it makes sense to “penalize” those countries with bad reproductive health conditions for women, it is fair to say that countries’ performance in those areas is influenced by a myriad of factors other than gender-related issues alone. Risks associated with childbearing vary tremendously, both globally and locally within countries, reflecting differences in access to and use of health services, social and cultural practices affecting access to healthcare, socio-economic levels and public health policies. Therefore, while it is true that gender norms and practices exert an important influence on MMR and AFR values, these are by no means the only influencing factors. As constructed, the GII penalizes less-developed countries for poor performances in reproductive health indicators that are not entirely explained by gender-related norms or discriminatory practices against women.
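The first of the two problems above can be checked numerically. The Python sketch below transcribes Eq. 5 directly and evaluates it for a hypothetical country with identical female and male values of PR, SE and LFPR and with \(MMR = 10\), \(AFR = 3.8\). The function name `gii` and the particular PR, SE and LFPR levels are illustrative assumptions; MMR and AFR enter the formula in the raw units used in the text.

```python
def gii(mmr, afr, pr_f, pr_m, se_f, se_m, lfpr_f, lfpr_m):
    """Direct transcription of Eq. 5."""
    health_f = (1.0 / (mmr * afr)) ** 0.5      # women-specific health term
    emp_f = (pr_f * se_f) ** 0.5               # female empowerment term
    emp_m = (pr_m * se_m) ** 0.5               # male empowerment term
    g_f = (health_f * emp_f * lfpr_f) ** (1 / 3)
    g_m = (emp_m * lfpr_m) ** (1 / 3)          # male health term is fixed at 1
    numerator = 1.0 / (1.0 / g_f + 1.0 / g_m)
    denominator = ((health_f + 1.0) * (emp_f + emp_m) * (lfpr_f + lfpr_m)) ** (1 / 3)
    return 1.0 - 4.0 * numerator / denominator


# Hypothetical country with perfect equality in PR, SE and LFPR.
equal_case = gii(mmr=10, afr=3.8,
                 pr_f=0.5, pr_m=0.5,
                 se_f=0.8, se_m=0.8,
                 lfpr_f=0.7, lfpr_m=0.7)

# The same exercise with different (still equal) achievement levels.
equal_case_other_levels = gii(mmr=10, afr=3.8,
                              pr_f=0.3, pr_m=0.3,
                              se_f=0.6, se_m=0.6,
                              lfpr_f=0.5, lfpr_m=0.5)
```

Both calls return a value of about 0.15 rather than 0: under full equality in the gender-comparable variables, the common PR, SE and LFPR levels cancel out of Eq. 5 and the result depends only on MMR and AFR.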

3.1 A New Gender Inequality Index

For these reasons, we propose to complement our analysis by constructing an alternative gender inequality index that incorporates the GII variables but relies on a much simpler aggregation methodology. Rather than using the complicated formula of the GII shown in Eq. 5, we suggest using the much simpler GRS formula shown in Eq. 1, which generates an index that is also “association sensitive”. However, recall that in the GRS formulation we only use variables that are defined for women and men: the problem then is what to do with the women-specific components of the GII. Ideally, it would be desirable to have meaningful reproductive health indicators for men. However, the ways in which reproductive health issues affect the lives of women and men are completely different, so it is not entirely clear how such an indicator should be constructed for men in a way that would be meaningfully comparable with a women’s reproductive health indicator. While acknowledging its limitations, we incorporate life expectancy as an imperfect substitute for MMR and AFR in the definition of the new index. Therefore, our new index of gender inequality is defined as follows:

$$ GRS2: = \left( {\left( {\frac{{SE_{f} }}{{SE_{m} }}\frac{{PR_{f} }}{{PR_{m} }}} \right)^{1/2} \frac{{LFPR_{f} }}{{LFPR_{m} }}\frac{{LEI_{f} }}{{LEI_{m} }}} \right)^{1/3} $$
(6)

The powers introduced in this formula reflect the original weighting scheme proposed in the definition of the GII (alternative weights can be chosen as well). Having introduced these new measures we revisit our assessment of gender inequality in Europe, focusing on the existing similarities or dissimilarities with the previous results.
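A minimal Python sketch of Eq. 6 is given below. The function name `grs2` and the example values are hypothetical; the life expectancy indices are assumed to have been computed beforehand as in Eq. 2.

```python
def grs2(se_f, se_m, pr_f, pr_m, lfpr_f, lfpr_m, lei_f, lei_m):
    """Eq. 6: the SE and PR gaps share the empowerment dimension
    (exponent 1/2 each); the three dimensions are then weighted equally."""
    empowerment = ((se_f / se_m) * (pr_f / pr_m)) ** 0.5
    economic = lfpr_f / lfpr_m
    health = lei_f / lei_m
    return (empowerment * economic * health) ** (1 / 3)


# Hypothetical country: equal education and health, but women hold one
# quarter of the parliamentary seats and participate less in the labor force.
example = grs2(se_f=0.70, se_m=0.70,
               pr_f=0.25, pr_m=0.75,
               lfpr_f=0.50, lfpr_m=0.70,
               lei_f=0.90, lei_m=0.90)

# Full equality in every variable gives the normatively desirable value of 1.
parity = grs2(0.7, 0.7, 0.5, 0.5, 0.6, 0.6, 0.9, 0.9)
```

Because parliamentary representation only carries half of the empowerment dimension, even a large gap in that variable is dampened in the aggregate, which is one consequence of retaining the GII weighting scheme.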

3.2 Gender Inequality in Europe Revisited

Table 5 shows the gender inequality levels in Europe as measured by the new indices GII and GRS2. According to the 2010 HDR, the top five performers in Europe together with their GII values are The Netherlands (0.174), Denmark (0.209), Sweden (0.212), Switzerland (0.228) and Norway (0.234), while the bottom five performers are Bulgaria (0.399), Estonia (0.409), Russia (0.442), Romania (0.478) and Albania (0.545). Roughly speaking, these GII values suggest that Northern European countries suffer smaller welfare losses because of gender inequalities while these losses are much larger in Eastern European countries. Interestingly, countries like Switzerland or The Netherlands ranked much lower when assessing gender inequality levels with GRS1, while Eastern European countries like Hungary, Russia, Estonia or Romania were very well ranked according to that indicator. Moreover, Southern European countries like Spain, Italy or Greece perform much better with the new GII indicator than with GRS1. In particular, Spain jumps from the 29th position in the GRS1 ranking to the 12th position in the GII ranking, a remarkable improvement. It seems that the inclusion of new variables and a new aggregation methodology draws a completely different picture of gender inequality levels in Europe. This is confirmed in Fig. 3, where we plot the joint distribution of GII and GRS1 values. The correlation coefficient between these two variables is only 0.12 and is not statistically significant.

Table 5 GII, GRS1 and GRS2 values and country rankings in Europe

Fig. 3 Matrix scatterplot comparing the pair-wise distributions of GII, GRS1 and GRS2 for European countries in 2010. Source: Author’s calculations using 2010 Human Development Report data for European countries

Given the conceptual and methodological flaws of GII, we complement our analysis by examining the distribution of GRS2. As shown in Table 5, the top five performers are Sweden (0.959), Finland (0.939), Norway (0.887), Denmark (0.877) and The Netherlands (0.875), while the bottom five performers are Hungary (0.669), Greece (0.648), Slovenia (0.642), Romania (0.637) and Albania (0.585). Recall that GRS2 should be interpreted as an average gender gap, so values close to 1 can be seen as normatively desirable. There are no large disparities between the top performers of GRS2 and GII: Northern European countries consistently appear as the countries with the lowest gender inequality levels. On the other hand, some Southern and Eastern European countries appear to be the worst performers according to the new GRS2. Somewhat surprisingly, Portugal and Spain fare quite well according to GRS2 values (they rank in the 9th and 10th positions respectively), a remarkable improvement with respect to their previous GRS1 values. Figure 3 shows the pair-wise joint distribution of the three indicators. Because GII measures welfare losses due to gender inequality and GRS2 measures average gender gaps, the two indices run in opposite directions and are negatively related; the corresponding correlation coefficient equals −0.65 and is highly statistically significant. By contrast, GRS1 and GRS2 both measure average gender gaps using the same aggregation methodology but different variables, so there is a positive relationship between them. However, the corresponding correlation coefficient is not very large: it equals 0.31 and is only statistically significant at the 10% level.
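The direction of the GII–GRS2 relationship can be illustrated with the six countries for which both values are quoted in the text. This is only a sketch: the subset consists of countries from the top and bottom of the rankings, so its correlation is not expected to match the coefficient computed on the full European sample, and the helper name `pearson` is ours.

```python
import math

# (GII, GRS2) pairs quoted in the text for countries named in both lists.
VALUES = {
    "Netherlands": (0.174, 0.875),
    "Denmark": (0.209, 0.877),
    "Sweden": (0.212, 0.959),
    "Norway": (0.234, 0.887),
    "Romania": (0.478, 0.637),
    "Albania": (0.545, 0.585),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


gii_values = [pair[0] for pair in VALUES.values()]
grs2_values = [pair[1] for pair in VALUES.values()]
r_subset = pearson(gii_values, grs2_values)
print(f"Pearson r on the six-country subset: {r_subset:.2f}")
```

Because the subset contains only extreme cases, it yields a much stronger negative correlation than the full sample; it shows the sign of the relationship, not its strength.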

To sum up, these results suggest that the assessment of gender inequality levels in Europe is not very robust to alternative specifications of gender-related indicators and aggregation methodology, so only very loose statements can be made. This forces analysts and policy-makers to be cautious when making global appraisals of gender inequality levels and trends in Europe and to be very explicit about the advantages and disadvantages of the methodology used to produce their results.

3.3 Component-Wise Behavior

As in the previous section, it is particularly important to explore the behavior of the different subcomponents before rushing to premature conclusions based on the values of the composite indices alone. A great advantage of using simple functional forms such as the one used in GRS2 is that it is relatively easy to know the percent contribution of the different subcomponents to the aggregate value of the index (Footnote 11). Figure 4 shows the values of GRS2 for all European countries included in our dataset together with the percent contributions of its different subcomponents. Since the gender gaps favoring women and men run in opposite directions, the percent contributions shown in Fig. 4 are negative and positive respectively (see Footnote 11). Unsurprisingly, most gender gaps included in GRS2 favor men, except for the health dimension. The gaps in life expectancy tend to favor women, in spite of the normalization procedure shown in Eq. 2, particularly in the former Soviet republics of Belarus, Estonia, Latvia, Lithuania and Russia; in the few countries where that gap favors men, its contribution is very small. However, the overall contribution of the health dimension to the values of GRS2 is quite small. The same conclusion can be reached concerning the education component. As shown in Fig. 4, the gaps in education are relatively small and tend to favor men (except for the cases of Sweden, Norway, Iceland, Portugal, Russia, Great Britain and Ireland), as opposed to what was observed in the education component of GRS1 shown in the previous section, which tended to favor women.
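One way to obtain signed percent contributions from a multiplicative index such as GRS2 is a logarithmic decomposition, sketched below. This is an assumption on our part and may differ from the exact procedure described in Footnote 11. The weights follow from the powers in Eq. 6 (1/6 each for education and parliamentary representation, 1/3 each for labor force participation and health), and the sign convention follows the text: gaps favoring men contribute positively, gaps favoring women negatively. The gap values in the example are hypothetical.

```python
import math

# Effective exponents of each female-to-male ratio in Eq. 6.
WEIGHTS = {
    "education": 1.0 / 6.0,
    "parliament": 1.0 / 6.0,
    "labour": 1.0 / 3.0,
    "health": 1.0 / 3.0,
}


def percent_contributions(gaps, weights=WEIGHTS):
    """Signed percent contribution of each gap (assumed log decomposition).

    gaps maps each component to its female-to-male ratio. A ratio below 1
    (gap favoring men) yields a positive share, a ratio above 1 (gap
    favoring women) a negative one. Absolute shares sum to 100.
    """
    terms = {k: -weights[k] * math.log(gaps[k]) for k in weights}
    total = sum(abs(t) for t in terms.values())
    if total == 0.0:
        # Perfect equality in every component: nothing to attribute.
        return {k: 0.0 for k in terms}
    return {k: 100.0 * t / total for k, t in terms.items()}


# Hypothetical female-to-male ratios, for illustration only.
shares = percent_contributions(
    {"education": 0.95, "parliament": 0.40, "labour": 0.80, "health": 1.03}
)
for component, share in shares.items():
    print(f"{component}: {share:+.1f}%")
```

With these illustrative ratios the parliamentary component receives the largest positive share and the health component a small negative one, mirroring the qualitative pattern described for Fig. 4.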

Fig. 4 Values of GRS2 and percent contribution of its subcomponents for European countries in 2010. The values of the percent contributions are shown on the left-hand vertical axis and the values of GRS2 on the right-hand vertical axis. Abbreviations of country names follow the ISO 3166 Country Codes. Source: Author’s calculations based on 2010 Human Development Report data

Interestingly, the gaps in labor force participation rates tend to be much larger than those found in the education and health components. The large gaps in sex-specific labor force participation rates were illustrated for the case of Spain in Fig. 1, indicating a large disadvantage for women that has been diminishing during the last decade (similar trends are found for most European countries). Finally, the parliamentary representation component is the one that contributes the most to the aggregate values of GRS2. This is because the representation of women in European parliaments is still far below that of men, even though women and men make up roughly equal shares of the population. Sweden is the only European country where the share of women in parliament approaches 50%; the other countries tend to have substantially larger gaps. This largely explains the relatively good performance of countries like Portugal and Spain in terms of GRS2 values: even though the parliamentary representation of women in those countries is only around 30%, they occupy the surprisingly high 7th and 8th positions on that variable among European countries.

3.4 Gender Equality and Economic Performance Revisited

When the cross-sectional relationship between gender equality and economic efficiency is revisited using the new GII and GRS2 indices we obtain substantially different results. When GII is used as a measure of welfare loss due to gender inequalities, a strong negative relationship is observed, suggesting that richer countries clearly suffer smaller welfare losses due to gender inequalities. The correlation coefficient between these variables is relatively large in absolute terms (−0.76) and is statistically significant at the 1% level. However, the relevance of this strong relationship should be heavily qualified—to say the least—because of the peculiar way in which the GII has been designed.

On the other hand, when GRS2 is substituted for GII, the cross-sectional relationship between gender equality and economic efficiency turns out to be positive. The correlation coefficient between the two variables equals 0.45 and is statistically significant at the 5% level, but there is considerable data dispersion (see Fig. 5). The relatively small value of the correlation coefficient suggests that the information conveyed by GRS2 is substantially different from that conveyed by GDP per capita. Therefore, it is possible to find richer countries with relatively low gender equality levels and vice versa. Interestingly, the cross-sectional relationship shown in Fig. 5 at the European level is similar to the one reported in Dijkstra and Hanmer (2000:53) at the world level. Since GII and GRS2 values are only available for the year 2010, it is not yet possible to explore cross-time relationships.

Fig. 5 Log of GDP per capita vs GRS2 values for European countries in 2010. Abbreviations of country names follow the ISO 3166 Country Codes. Source: Author’s calculations based on 2010 Human Development Report data

4 Summary and Concluding Remarks

In this paper we have presented an assessment of gender inequality levels in Europe using variables from UNDP indices. A critical assessment of this kind is an important exercise because of the strong impact that UNDP reports have at the global level, both in academic and policy-making circles. More importantly, the 2010 HDR has replaced its flagship indicators on gender-related issues (the GDI and GEM) with a completely new measure (the GII), so a critical comparison of the performance of the different indicators can be particularly useful to understand their strengths and weaknesses.

The new GII turns out to be an interesting but highly complicated measure with conceptual and methodological flaws. Conceptually, the mixture of absolute and relative indicators of women’s well-being obscures the interpretation of an already complicated index. This choice produces an index that, among other things: (1) penalizes less-developed countries for poor performances in reproductive health indicators that are not entirely explained by the gender-related norms or discriminatory practices against women that the GII purports to measure, and (2) does not reach the expected or normatively desirable value whenever women and men fare equally in all indicators. Because of the complicated methodology involved in the construction of the GDI and GII, we have introduced an alternative aggregation methodology, the GRS index, that simply averages the gender gaps in the dimensions we are taking into account. In the first part of the paper we have worked with the GDI indicators and plugged them into the GRS aggregation methodology to create a first version of our index: GRS1. In the second part of the paper we have used GII variables to create a second version of GRS: GRS2.

The average values of GRS1 in Europe have stagnated around 0.85 during the last decade; that is, women’s average achievement levels have represented roughly 85% of men’s throughout the 1999–2009 period. It is remarkable, and somewhat surprising, that the average gender gaps have not decreased substantially during these 10 years. A partial explanation is that most of the gender gaps included in GRS1 (in life expectancy, literacy rates and school attendance ratios) had approximately reached their normatively desirable values long ago, so they have no room for further improvement. On the other hand, the gender gaps in the earned income component have remained stubbornly high during the last decade and there are no clear signs of improvement, which results in stagnant GRS1 values. The fact that the gaps in health and education have almost vanished (and even reversed) in Europe is a highly relevant historical trend that has been reported elsewhere (e.g.: UNDP Human Development Reports). Nevertheless, it is debatable whether these variables should be included whenever one aims to capture those dimensions where women are disadvantaged in the European context.

Interestingly, the dispersion in gender equality values between European countries has decreased during the last decade, a result that is in line with many studies that report global convergence on a broad range of quality of life indicators (see Dorius and Firebaugh 2010). In spite of this convergence, the country ranking in terms of GRS1 values has remained relatively stable over time, with Northern and Southern European countries basically occupying the highest and lowest positions respectively.

The incorporation of the new GII variables into GRS2 brings a new perspective into our assessment of gender inequality levels in Europe. The substitution of the problematic earned-income component by labor force participation rates is good news: the latter is a more reliable measure of economic participation that is available in many countries across the world (Footnote 12). Concerning the education component, focusing on secondary education and above is much more appropriate in the European context than working with literacy and gross enrolment rates, the latter being more meaningful in low-income countries. GRS2 also incorporates the shares of parliamentary seats held by women and men. While this indicator has been criticized for being too simplistic and not truly reflecting the power relations among women and men, it focuses on an interesting dimension that still reflects huge gaps in favor of men in virtually all countries of the world. Finally, GII incorporates reproductive health indicators that are exclusively measured for women: Maternal Mortality Ratios (MMR) and Adolescent Fertility Rates (AFR). Ideally, it would be desirable to have meaningful reproductive health indicators for men too, but it is not entirely clear how such an indicator should be constructed so that it would be meaningfully comparable with a women’s reproductive health indicator. For the sake of simplicity, GRS2 incorporates gender-specific life expectancies at birth as imperfect substitutes for MMR and AFR.

The gender gaps included in GRS2 turn out to behave quite systematically for the different European countries. While the gaps in education and health tend to be relatively small, the gaps in labor force participation rates and—especially—in parliamentary representation tend to be particularly large. The mismatch between the relatively small gaps in educational achievement and the relatively large gaps in labor force participation rates is a classical indicator of the underlying gender-specific norms and practices that still discourage female participation in the labor market in favor of domestic-related activities.

To sum up, the empirical results shown in this paper indicate that while GDI variables might be appropriate to capture gender inequalities for low and middle-income countries, they are nowadays not very useful for most European countries, where most gender gaps have either vanished or are measured on shaky grounds. This suggests that, for certain purposes, it might be more meaningful to define region-specific measures at the European level only, an approach that has already been attempted in other recent papers (e.g.: Plantenga et al. 2009; Bericat 2011). In contrast, the indicators included in the GII constitute a promising, if imperfect, alternative that hints at new research directions towards more insightful gender inequality indicators.

Another important issue that has been addressed in the different sections of the paper is the relationship between economic growth and gender equality. GRS1 turns out to be weakly and even negatively related to logged GDP per capita levels in Europe during the last decade, using both cross-sectional and cross-time variables. This is basically due to the aforementioned stagnation of European GRS1 values, which have little room for further improvement. In the light of the current economic crisis, and because of the somber growth prospects faced by the European economies, it is doubtful that the sign of that relationship can be reversed in the short run. When the same relation is revisited substituting GRS2 for GRS1, we obtain substantially different results. At the European level, GRS2 and logged GDP per capita are weakly but positively related using cross-sectional data. This finding is in line with the results presented in Easterly (1999) and Dijkstra and Hanmer (2000), where the authors report a positive cross-sectional relationship between economic efficiency and gender equality at the world level. Given the larger room for improvement that European countries have in terms of GRS2 levels, the positive relationship between both variables might still continue for some time, as long as economic growth permits. Summarizing, we conclude that the relationship between economic efficiency and gender equality in Europe is at most weakly positive when cross-sectional variables are used and “appropriate” gender inequality indicators (i.e.: those truly identifying the dimensions where women are disadvantaged) are brought to the fore. In line with the existing literature, none of the observed relationships is statistically significant when a cross-time perspective is used.

It must be stressed that the values of the indices presented in this paper alone are not enough to guide policy-makers; the fact that a country scores high does not necessarily mean that women’s status is good or that men’s problems should not be addressed. Quite the contrary, the analysis of these equality indicators should be complemented by looking at: (1) the different subcomponents separately, and (2) country-specific, qualitative and quantitative information on both women and men. Despite this limitation—which is intrinsic to any composite index and is well acknowledged in the literature (Dijkstra and Hanmer 2000:63; Beneria and Permanyer 2010:396)—we contend that such indices are very useful to raise awareness about gender-related issues and to provide crucial information for policy-makers.

In this paper we have exclusively relied on the values of UNDP indicators. The main disadvantage of this approach is that the analysis is restricted to a limited set of variables that leaves aside extremely important dimensions like violence against women or the informal and care economy sectors, to mention just a few. In this respect, the inclusion of other variables reflecting alternative aspects of women’s and men’s lives would be particularly appropriate to reach a better assessment of gender inequality levels. Unfortunately, surveys containing this information (e.g.: time use surveys) are typically available for a reduced set of countries only. On the positive side, working with UNDP indicators allows for a huge geographical and temporal coverage that is unmatched by other data sources. Moreover, the conceptual foundation upon which UNDP indices are grounded is solid and well established (Hawken and Munck 2009). Last but not least, UNDP’s Human Development Reports are important documents that have a great impact in an increasingly interconnected and globalized world, so it is particularly important to assess them critically, as has been done here. We acknowledge the limitations of our approach and by no means claim that it constitutes an exhaustive assessment of gender inequality levels. Other approaches, working with different datasets that capture alternative well-being dimensions, are greatly needed to complement the analysis presented in this paper.