Introduction

The ranking of higher education institutions (HEIs) has drawn a lot of attention of late. Many different stakeholders, especially students, use rankings as an indicator of a university’s reputation and performance (Agasisti and Perez-Esparrells 2010; Bowman and Bastedo 2010; Hien 2010; Stolz et al. 2010). Moreover, projects dealing with the transformation of higher education systems often rely on HEI ranking results. For instance, the present French Minister of Research and Higher Education was tasked by the French President “to have two institutions among the world’s top 20 and ten among the world’s top 100” (Billaut et al. 2010). “This is putting huge pressure on schools to better meet the criteria relevant to rankings in order to attract both students and funds. However, despite growing popularity, the ranking of universities remains a controversial issue and has been widely debated” (Dehon et al. 2010). Probably the most cited ranking list is the Academic Ranking of World Universities (ARWU), which has been the focus of researchers since its creation in 2003 (Aguillo et al. 2010; Docampo 2008, 2011; Florian 2007; Lukman et al. 2010).

The Shanghai (ARWU) ranking is based on six different criteria and aims to measure academic performance. Within each category, the best performing university is given a score of 100 and becomes the benchmark against which the scores of all other universities are measured. Universities are then ranked according to the overall score they obtain, which is simply a weighted average of their individual category scores (Dehon et al. 2010). The variables “Alumni” and “Award” measure the number of Nobel Prizes and Fields Medals won by a university’s alumni (“Alumni”) or current faculty members (“Award”). The next three variables, “HiCi”, “N&S” and “PUB”, reflect research output. “HiCi” is the number of highly cited researchers, “N&S” is the number of articles published in the journals “Nature” and “Science”, and “PUB” is the number of articles indexed in the Science Citation Index Expanded and the Social Science Citation Index. The sixth and final variable, “PCP”, is a weighted average of the scores obtained in the previous five categories, divided by the number of current full-time equivalent academic staff members. The variables “Award”, “HiCi”, “N&S”, and “PUB” each make up 20% of the final score, while “Alumni” and “PCP” are each given a lower weight of 10% (Dehon et al. 2010).
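The scoring scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `scale_to_best` and `overall_score` and the sample figures are hypothetical, and the official ARWU procedure may include adjustments that are not described here.

```python
# Weights as stated in the text (Dehon et al. 2010).
WEIGHTS = {
    "Alumni": 0.10, "Award": 0.20, "HiCi": 0.20,
    "N&S": 0.20, "PUB": 0.20, "PCP": 0.10,
}

def scale_to_best(raw):
    """Rescale raw values of one criterion so the best university scores 100."""
    best = max(raw.values())
    return {name: 100.0 * value / best for name, value in raw.items()}

def overall_score(scores):
    """Weighted average of the six category scores of one university."""
    return sum(WEIGHTS[c] * scores[c] for c in WEIGHTS)

# Hypothetical category scores for a single university (not real data).
example = {"Alumni": 40.0, "Award": 30.0, "HiCi": 50.0,
           "N&S": 45.0, "PUB": 60.0, "PCP": 35.0}
total = overall_score(example)
```

Because the weights are fixed in advance, any change to them changes `total` and hence possibly the ordering, which is the sensitivity this article examines.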

Yet, almost immediately after the release of its first ranking, ARWU attracted a lot of criticism. One of its first detractors was van Raan (2005a), whose comments started a vigorous debate with the authors of the Shanghai ranking (Liu and Cheng 2005; Liu et al. 2005; van Raan 2005b). Since then, the attacks on the rankings in academic circles have been both widespread and strong. Several aspects of the choice of ranking variables are open to discussion. For instance, the criteria “Award” and “Alumni” consider only two subjectively chosen awards. As pointed out by a referee, these two criteria are based on prizes and medals that are far from covering all important scientific fields. Distinctions such as the “A. M. Turing Award” in the area of computer science or the “Bruce Gold Medal” in the area of astronomy are among the many examples of highly prestigious awards that are ignored in the Shanghai ranking (Billaut et al. 2010). The criterion “N&S” likewise considers only two subjectively chosen journals. Moreover, a paper signed by several co-authors will have greater weight than a paper signed by a single person, which seems paradoxical (Billaut et al. 2010).

In addition, the relative weight attributed by the ARWU ranking to each variable has been shown to be one of its potential weaknesses (Dehon et al. 2010). In this article, this potential weakness is examined by applying the statistical I-distance method.

I-distance method

Quite often, the way in which entities are ranked can seriously affect processes such as taking exams, entering competitions, UN participation, medicine selection, and many others (Ivanovic 1973; Ivanovic and Fanchette 1973; Jeremic and Radojicic 2010).

I-distance is a metric distance in an n-dimensional space. It was proposed and defined by B. Ivanovic (Ivanovic 1977) in various publications that have appeared since 1963. Ivanovic devised this method to rank countries according to their level of development on the basis of several indicators. Many socio-economic development indicators were considered and the problem was how to use all of them in order to calculate a single synthetic indicator which will thereafter represent the rank.

For a selected set of variables $X^{T} = (X_{1}, X_{2}, \ldots, X_{k})$ chosen to characterize the entities, the I-distance between the two entities $e_{r} = (x_{1r}, x_{2r}, \ldots, x_{kr})$ and $e_{s} = (x_{1s}, x_{2s}, \ldots, x_{ks})$ is defined as

$$ D\left( {r,s} \right)\, = \,\sum\limits_{i = 1}^{k} {{\frac{{|d_{i} \left( {r,s} \right)}|}{{\sigma_{i} }}}} \prod\limits_{j = 1}^{i - 1} {\left( {1 - r_{ji.12 \ldots j - 1} } \right)} $$
(1)

where $d_{i}(r,s)$ is the distance between the values of variable $X_{i}$ for $e_{r}$ and $e_{s}$, i.e., the discriminate effect,

$$ d_{i} \left( {r,s} \right) = x_{ir} - x_{is} ,\quad {i} \in \left\{ {1, \ldots ,k} \right\} $$
(2)

$\sigma_{i}$ is the standard deviation of $X_{i}$, and $r_{ji.12 \ldots j-1}$ is the partial correlation coefficient between $X_{i}$ and $X_{j}$ ($j < i$) (Ivanovic 1973).

The construction of the I-distance is iterative; it is calculated through the following steps:

  • Calculate the value of the discriminate effect of the variable $X_{1}$ (the most significant variable, that which provides the largest amount of information on the phenomena that are to be ranked).

  • Add the value of the discriminate effect of $X_{2}$ which is not covered by $X_{1}$.

  • Add the value of the discriminate effect of $X_{3}$ which is not covered by $X_{1}$ and $X_{2}$.

  • Repeat the procedure for all variables (Mihailovic et al. 2009).
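The iterative construction above can be sketched in Python with NumPy. This is a minimal illustration rather than the implementation used for the results in this article. It assumes that the columns of `X` are already ordered by significance, that the sample standard deviation (`ddof=1`) is used, and that partial correlations are obtained from the inverse of the relevant correlation sub-matrix. The names `partial_corr` and `i_distance` are hypothetical.

```python
import numpy as np

def partial_corr(R, a, b, given):
    """Partial correlation of variables a and b, controlling for `given`.

    R is the full correlation matrix; uses the inverse of the sub-matrix.
    """
    if len(given) == 0:
        return R[a, b]
    idx = [a, b] + list(given)
    P = np.linalg.inv(R[np.ix_(idx, idx)])
    return -P[0, 1] / np.sqrt(P[0, 0] * P[1, 1])

def i_distance(X, ref=None, squared=False):
    """I-distance of every row of X from the referent entity `ref`.

    Assumption: columns are ordered from most to least significant.
    By default the referent is the entity of column-wise minima.
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    ref = X.min(axis=0) if ref is None else np.asarray(ref, dtype=float)
    sigma = X.std(axis=0, ddof=1)          # assumed: sample std. deviation
    R = np.corrcoef(X, rowvar=False)
    D = np.zeros(n)
    for i in range(k):
        # Share of X_i's discriminate effect not covered by X_1..X_{i-1}.
        weight = 1.0
        for j in range(i):
            r = partial_corr(R, j, i, list(range(j)))
            weight *= (1.0 - r**2) if squared else (1.0 - r)
        d = X[:, i] - ref[i]
        term = d**2 / sigma[i]**2 if squared else np.abs(d) / sigma[i]
        D += term * weight
    return D
```

With two uncorrelated variables both contribute in full, whereas a second variable that is perfectly correlated with the first contributes nothing, since its information is already covered.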

This I-distance fulfils all 13 conditions for defining the measures of distances. It is essential to point out that the I-distance method requires a standardization of all data, as it proves useful in overcoming the differences in measures.

Sometimes, it is not possible to achieve the same sign for all variables in all sets and, as a result, a negative correlation coefficient and a negative coefficient of partial correlation may occur. This makes the use of the square I-distance even more desirable. The square I-distance is given as:

$$ D^{2} \left( {r,s} \right)\, = \,\sum\limits_{i = 1}^{k} {{\frac{{d_{i}^{2} \left( {r,s} \right)}}{{\sigma_{i}^{2} }}}} \prod\limits_{j = 1}^{i - 1} {\left( {1 - r_{ji.12 \ldots j - 1}^{2} } \right)} . $$
(3)
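As an illustration only, writing out the sum and product in (3) for the case of $k = 3$ variables gives

$$ D^{2}(r,s) = \frac{d_{1}^{2}(r,s)}{\sigma_{1}^{2}} + \frac{d_{2}^{2}(r,s)}{\sigma_{2}^{2}}\left(1 - r_{12}^{2}\right) + \frac{d_{3}^{2}(r,s)}{\sigma_{3}^{2}}\left(1 - r_{13}^{2}\right)\left(1 - r_{23.1}^{2}\right) $$

where, by the standard formula for a first-order partial correlation, $r_{23.1} = (r_{23} - r_{12} r_{13}) / \sqrt{(1 - r_{12}^{2})(1 - r_{13}^{2})}$. Each successive variable thus contributes only the part of its discriminate effect that is not already explained by the preceding variables.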

In order to rank the entities (in this case, universities) using the I-distance methodology, it is necessary to fix one entity in the observed set as a referent. The referent can be the entity with the minimal value for each indicator, or a fictive entity made up of the maximal or average values. The ranking of entities in the set is based on the calculated distance from the referent entity.
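A brief sketch of how a referent entity might be constructed and how distances could be turned into ranks is given below. The helper names `referent_entity` and `rank_by_distance` are hypothetical, and the distances in the example are made up. Note that the direction of the ranking depends on the referent: with a minimal referent a larger distance is better, while with a fictive maximal referent a smaller distance is better.

```python
import numpy as np

def referent_entity(X, kind="min"):
    """Build a referent entity from the column-wise min, max or mean of X."""
    X = np.asarray(X, dtype=float)
    if kind == "min":
        return X.min(axis=0)
    if kind == "max":
        return X.max(axis=0)
    if kind == "mean":
        return X.mean(axis=0)
    raise ValueError("kind must be 'min', 'max' or 'mean'")

def rank_by_distance(distances, larger_is_better=True):
    """Return 1-based ranks; rank 1 is the best entity."""
    d = np.asarray(distances, dtype=float)
    order = np.argsort(-d if larger_is_better else d, kind="stable")
    ranks = np.empty(len(d), dtype=int)
    ranks[order] = np.arange(1, len(d) + 1)
    return ranks

# Hypothetical distances from a minimal referent for three entities.
ranks = rank_by_distance([0.5, 2.0, 1.0])
```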

Results of the I-distance method

The results achieved by means of the I-distance method are shown in Table 1.

Table 1 Results of the I-distance method, the I-distance value and rank for the year 2008

As can be seen from Table 1, Harvard University tops the I-distance list as well. The correlation between the calculated I-distance values and the ARWU total score is significant, r = 0.921, p < 0.01. Interestingly, the top 12 HEIs are the same for both methods. Overall, when the I-distance ranking is compared with the ARWU ranking, the ranks are quite similar: Spearman’s rho is significant, with r s  = 0.772, p < 0.01. For the European HEIs alone, the correlation is also significant, with r s  = 0.689, p < 0.01. For the US HEIs alone, the correlation is stronger still, with r s  = 0.879, p < 0.01. The difference between the correlation coefficients of the European and US HEIs is also statistically significant (p = 0.0235).
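The text does not state which test was used to compare the European and US coefficients. One common choice for two independent correlation coefficients is Fisher's z-transformation, sketched below using only the Python standard library. Applying it to Spearman coefficients is an approximation, the function name `fisher_z_test` is hypothetical, and the group sizes in the example are placeholders, since the actual numbers of European and US institutions are not given here.

```python
import math

def fisher_z_test(r1, n1, r2, n2):
    """Two-sided test for the difference of two independent correlations.

    Returns the z statistic and the two-sided p-value.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)      # Fisher transformation
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))       # equals 2 * (1 - Phi(|z|))
    return z, p

# Coefficients from the text; the sample sizes are placeholders only.
z_stat, p_value = fisher_z_test(0.879, 60, 0.689, 40)
```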

However, the alternative rankings for some of the European universities are quite different from the official ARWU list. Ecole Normale Super Paris and Moscow State University are the most drastic examples. They are ranked quite low in the ARWU list (73rd and 70th place), whereas the I-distance method puts them in 19th and 23rd place, respectively. It is interesting to note that this inconsistency was also mentioned by Dehon et al. (2010). On closer analysis, the “weakest link” for these two HEIs is the variable “HiCi”, the number of highly cited researchers. Indeed, almost all European universities are “suffering from the same illness”. US universities, by contrast, have no such problem: they have significantly higher “HiCi” values than their European counterparts (p < 0.01). The same conclusion applies to the variables “PUB” and “N&S”; in both cases, US universities obtain significantly higher values (p < 0.01).

This data set has been further examined and a correlation coefficient of each variable with an I-distance value and ARWU total score has been determined. This is crucial as it provides information on how significant each of the six input variables is. The results are shown in Table 2.

Table 2 The correlation between input variables and I-distance and the ARWU total score

Table 2 shows that the most significant variable for the calculated I-distance value, that which provides the largest amount of information, is the score on Alumni. It correlates highly with the I-distance value (r = 0.948). This raises reasonable concern as to whether the current weighting factors for each of the six variables are appropriate. The variables differ greatly in this respect; for instance, the score on PUB correlates far less strongly, with r = 0.508. This is precisely the reason why the relative weight attributed to each variable must be very carefully determined.

On the other hand, the score on Alumni is far less significant when determining the ARWU total score. As a matter of fact, for this indicator the difference between its correlation with the I-distance value and its correlation with the ARWU total score is significant, p < 0.001. This essentially means that the score on Alumni affects the final rankings differently under the two classification methods. The I-distance method regards Alumni as the most significant (correlated) variable, while the ARWU method puts it in fifth place.

In addition, when the ranking is done according to the ARWU method, the score on N&S correlates most strongly with the ARWU total score. N&S is thus a crucial variable for determining the rankings according to the official ARWU methodology. According to the I-distance methodology, however, “N&S” is only the third most significant variable. Another interesting finding is that “Alumni”, “N&S” and “HiCi”, which are the most controversial criteria in contemporary literature on this matter (Billaut et al. 2010), are the variables that affect the final results differently under the I-distance and ARWU methodologies. The other three variables are equally significant for calculating the final results, whether by the I-distance or the ARWU method.

Further, attention needs to be focused on the most drastic inconsistencies between the ARWU and I-distance rankings. Table 3 presents an overview of three European and three US HEIs which have shown a great disparity in their rankings.

Table 3 A comparison of three European and three US universities

As can be seen, the three US universities mentioned (all from California) come in at the bottom of the I-distance ranking list. According to the official ARWU list, on the other hand, they belong to the “golden mean”. As can also be seen, these US universities have no score for the variable “Alumni”. This is precisely the reason why the I-distance method gave these three Californian universities quite low rankings. According to the I-distance method, the variable “Alumni” is the most significant, and the fact that a particular university has no score for this variable is reflected in its position at the bottom of the list. Conversely, the three European universities mentioned have a high score for the same variable; as a consequence, the I-distance method gave them far better rankings than those obtained in the official ARWU list.

Nonetheless, the US universities have significantly higher values of the variables “N&S” and “HiCi”. Therefore, the difference in the ARWU rankings between the US and European universities is mostly associated with these two variables. It appears that the academic staff of these US universities have been far more successful in publishing their papers in “Nature” and “Science” than their European colleagues. The disparity is most obvious for the variable “HiCi”, where a vast majority of highly cited authors come from US universities. These two variables are crucial for the relatively high rank of the above-mentioned Californian universities.

Conclusions

With a growing worldwide interest in university rankings, the academic world is becoming ever more concerned with the assessment of higher education. These rankings are very often used as a marketing tool for universities to show their educational or research excellence. This is precisely the reason why it is exceptionally important to provide rankings that are as accurate as possible. The aim of this study has been to point out that the ordering of the world’s universities is subject to continuous improvement. To this end, the analysis presented here has stressed a potential weakness of the ARWU rankings: changing the relative weight placed upon each of the six factors significantly alters the ranking.

As a way to overcome this issue, the use of the statistical I-distance method has been proposed here. The results from Table 1 suggest that the I-distance ranking is quite similar to the official ARWU list. However, it is essential to emphasize that European universities are, on average, ranked lower in the official ranking than in the alternative one. The variables “N&S” and “HiCi” can be singled out as one of the identified reasons. On the other hand, the I-distance method has clearly shown that certain variables are far more important than others for the ranking process. Table 2 presents the essential underlying differences in dynamics between the I-distance and ARWU methodologies. These findings should be incorporated into future research, as the weight placed on each variable is crucial for the ranking process. It is hoped that this will encourage debate on how to determine the criteria with which best to conduct and analyse university rankings.