Introduction

The “scientific wealth” of nations is often interpreted in terms of publication and citation data. Early studies along these lines were done by May (1997) and King (2004). Nations with larger R&D investments had larger shares of papers and citations (Klavans and Boyack 2017). Indeed, there is a strong relationship between economic and scientific wealth. Leydesdorff and Zhou (2005) further demonstrated that newly emerging powers in science, which start from a lower base, have relatively higher growth rates. Cole and Phelan (1999) showed that economic forces do not fully account for scientific productivity; social and cultural forces like religion, decentralization and competitiveness were also factors. Cimini et al. (2014) used citation data for scientific articles to show that the scientific fitness of a nation, that is, the competitiveness of its research system, depends on how far it diversifies that system across as many scientific domains as possible.

So far, no one has looked at how the concentration of science output in a few premier institutions within each country (i.e. the inequality in scientific wealth production) is related to the overall scientific wealth of a nation. In this paper, we interpret the “scientific wealth” of a nation in terms of the citation data of its academic institutions, as harvested by Google Scholar Citations for profiled institutions from all countries in the world. By examining data from three cohorts of countries, we show that the “richer” a country is, the more likely it is that its scientific excellence will come from a highly concentrated group of premier institutions.

The transparent ranking of universities

The Third Edition of TRANSPARENT RANKING: Top Universities by Google Scholar Citations (http://www.webometrics.info/en/node/169) is now available. It uses the institutional profiles introduced by Google Scholar Citations (GSC) to rank universities, drawing on the information provided for groups of scholars who share the same standardized name and email address of an institution. There are close to one million individual profiles and over 5000 university profiles in GSC. This covers most of the leading academic organizations from nearly 200 countries. The methodology used is described in http://www.webometrics.info/en/node/169. Institutions are ranked, within each country and globally, in descending order of total citations. Since setting up a personal profile in GSC is voluntary, and some effort is required from each individual to ensure correctness, there will be many errors of omission and commission (i.e. intended or unintended fake, incorrect or duplicate records). Even so, we can obtain an indicative understanding of the scientific wealth of each country as a count of the citations of the organizations that make it to the list, and also of the unevenness or variance in the distribution of this wealth within a country.

The methodology of the present exercise and results

There are 4447 academic institutions in the world which had more than 1000 citations when the data for TRANSPARENT RANKING were collected (around 20th December 2016). The largest number of institutions is found in the United States of America, with 930 institutions (20.9% of the global total). Many small countries have only one institution each, and many countries do not appear at all because none of their institutions makes the cut. The data for China and Russia seem unreliable, and these two countries are excluded from the analysis that follows.

Let us first focus our attention on the records from the United States of America. Let N be the number of institutions in a country that have more than 1000 citations and C be the total number of citations of these institutions. The 930 institutions account for a total of 74,852,741 citations. Note that N is a size-dependent or extensive parameter. Then one can think of an average impact term i = C/N as a size-independent measure of the average excellence of the institutions in the country. For the USA, this is 80486.82. If N is a zeroth-order measure of performance, then C is a first-order measure of scientific output or performance. Following Prathap (2011, 2014), it is possible to define second-order measures of performance such as Exergy X and Energy E. The ratio η = X/E is a very simple size-independent measure of the degree of unevenness, inequality or concentration in the distribution. A value of η = 1 implies absolute equality or evenness of distribution, and this is also the default value for this parameter when there is only one institution in the country; the lower the value of η, the more uneven the distribution. For the USA, the corresponding values are X = 6.02E+12, E = 3.02E+13 and η = 0.200. That is, excellence is distributed in the USA in a very highly skewed or uneven manner.
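The indicators defined above can be illustrated with a minimal Python sketch. The text does not spell out the formulas for X and E; the sketch assumes the definitions usually given in the cited Prathap framework, X = iC = C²/N and E = Σ c_k², where c_k is the citation count of institution k. The function name `indicators` and the toy citation lists are hypothetical and are not taken from TRANSPARENT RANKING. Only the USA aggregates N = 930 and C = 74,852,741 come from the text.

```python
from typing import Dict, Sequence


def indicators(citations: Sequence[float]) -> Dict[str, float]:
    """Compute N, C, i, X, E and eta from per-institution citation counts.

    Assumed definitions (not stated explicitly in the text):
      X = i * C = C^2 / N   (exergy)
      E = sum of c_k^2      (energy)
    """
    n = len(citations)
    if n == 0:
        raise ValueError("need at least one institution")
    c = sum(citations)                      # first-order measure C
    i = c / n                               # average impact i = C/N
    x = i * c                               # second-order measure X
    e = sum(ck * ck for ck in citations)    # second-order measure E
    return {"N": n, "C": c, "i": i, "X": x, "E": e, "eta": x / e}


# Hypothetical toy countries with the same N and C but different spread.
even = indicators([5000, 5000, 5000, 5000])
skewed = indicators([17000, 1000, 1000, 1000])
print(even["eta"])               # perfectly even distribution gives 1.0
print(round(skewed["eta"], 3))   # concentration in one institution lowers eta

# USA aggregates quoted in the text; E needs per-institution data,
# so only i and X can be recomputed from N and C alone.
usa_n, usa_c = 930, 74_852_741
usa_i = usa_c / usa_n
usa_x = usa_i * usa_c
print(round(usa_i, 2), f"{usa_x:.2e}")
```

Under these assumed definitions, the recomputed values of i and X for the USA agree with the figures quoted above, and a single-institution country gives η = 1, as the text requires.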

In Table 1 we compare the size-dependent and size-independent indicators for the world and the United States of America as indexed in TRANSPARENT RANKING. It is seen that the USA maintains an average impact that is nearly twice as high as the global average impact. The distribution is more unequal globally than within the USA. That is, globally, excellence is concentrated in an even more highly skewed or uneven manner than in the USA.

Table 1 The size-dependent and size-independent indicators for the world and the United States of America

Following the intuition of Cole and Phelan (1999) that social and cultural forces are significant factors in determining the scientific competitiveness of nations, we look at three cohorts as described in Table 2. Altogether some 52 countries are covered. In the first column we have some leading countries as measured by size-dependent measures of performance. China and Russia are omitted from this list as the data from profiled institutions, which in turn depend on the authenticity of data from profiled individuals, seem unreliable. In the second column we look at major Islamic countries (Sarwar and Hassan 2015) to see how social and cultural determinants may affect performance. In the third column we have an agglomeration based on language, where the Iberian Peninsula countries of Spain and Portugal are taken together with many Latin American countries. In all cases, nominal GDP in billions of US dollars is taken as a measure of the size of the economy. GDP values are those reported by the International Monetary Fund.

Table 2 Three cohorts taken up for examining the nature of relationships between size-dependent and size-independent indicators for various countries

Table 3 shows the Pearson correlations between the size-dependent and size-independent indicators for the 52 countries covered in Table 2. We see a very strong correlation between nominal GDP and the size-dependent research performance indicators. Average impact, i, is modestly correlated with GDP; richer countries produce research of higher quality or impact. The size-independent inequality measure is consistently negatively correlated with all the size-dependent indicators. Figure 1 shows scatter plots illustrating how the size-dependent performance indicators are related to nominal GDP. Indicative lines are also shown with slopes of 1.0, 1.5 and 2.0 respectively. As GDP increases, scientific performance increases, with the higher-order indicators emphasizing the compounding role that impact or quality plays. The zeroth-order indicator, N, varies directly with GDP, i.e. richer countries have a larger number of institutions above the threshold of 1000 citations. In Fig. 2 we have scatter plots showing that the size-independent inequality indicator is negatively correlated with the second-order performance indicators for the three cohorts considered. As nations move towards higher degrees of total excellence, inequality also increases (that is, η falls), showing that growth takes place in a concentrated fashion in a few elite institutions.
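The correlation analysis described above can be sketched as follows. This is a minimal illustration only: the function name `pearson` and the country-level values are hypothetical, and the text does not state whether the correlations were computed on raw or log-transformed values, so the sketch uses the values as given.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for five countries (NOT data from Table 2 or Table 3):
# nominal GDP in billions of US dollars, institution count N, inequality eta.
gdp = [200.0, 500.0, 1200.0, 2500.0, 4000.0]
n_inst = [5, 14, 30, 70, 95]
eta = [0.80, 0.65, 0.45, 0.30, 0.25]

r_size = pearson(gdp, n_inst)   # size-dependent indicator vs GDP
r_eta = pearson(gdp, eta)       # inequality indicator vs GDP
print(r_size > 0, r_eta < 0)
```

With these made-up numbers the size-dependent indicator rises with GDP while η falls, which is the sign pattern the text reports for the real data.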

Table 3 Pearson correlations between the size-dependent and size-independent indicators for the 52 countries covered in Table 2
Fig. 1 Scatter plots showing how the size-dependent performance indicators are related to nominal GDP

Fig. 2 Scatter plots showing how the size-independent inequality indicator is negatively correlated with the second-order performance indicators for the three cohorts considered

Concluding remarks

We have used citation data harvested by Google Scholar Citations for profiled institutions from all countries in the world as a proxy for the “scientific wealth” of each nation. It is seen that this wealth is very unevenly distributed among the institutions in each country. From correlation analysis and scatter plots we see that the greater the scientific wealth of a nation, the more likely it is that this excellence will be concentrated in a few premier institutions. That is, great wealth implies great inequality of distribution.