Introduction

Scientometrics is a distinct discipline that performs quantitative studies of science and technology using mathematical, statistical, and data-analytical methods and techniques for gathering, handling, interpreting, and predicting a variety of features of the science and technology enterprise, including scholarly communication, performance, development, and dynamics. In practice, scientometrics often requires the use of bibliometrics, the measurement of texts and information, and results might be presented as science maps (Börner 2010; Börner et al. 2003). After decades of research (Rip 1997), more robust and better-validated techniques and tools are available.

The study presented here uses papers that appeared in Scientometrics, the flagship journal of the field (Chen et al. 2002), which has published a major share of the work in scientometrics and informetrics (Bar-Ilan 2008) over the last 33 years. Although we are fully aware that some scientometrics research is published in other journals, in books, or in theses, we use the 33-year Scientometrics dataset to study the field. This is in line with a number of prior studies. For example, Schoepflin and Glänzel (2001) used papers published in Scientometrics in 1980, 1989, and 1997 to identify a decrease in the percentage of articles related to science policy and to the sociology of science. Peritz and Bar-Ilan (2002) used papers published in Scientometrics in 1990 and 2000 and confirmed that Research Policy and Social Studies of Science are the third and fourth most frequently referenced journals in articles published in Scientometrics. Hou et al. (2008) analyzed the structure of scientific collaboration networks in scientometrics at the micro level (individuals) using bibliographic data of all papers published in Scientometrics in 2002–2004. They found that although half the authors had co-authored with each other, the network was not strongly connected and collaboration in the field was very loose. Dutt et al. (2003) analyzed Scientometrics papers published during 1978–2001, examining the distribution of countries and themes and comparing institutions and co-authors. They showed that the research output is highly scattered, as indicated by the average number of papers per institution, and dominated by single-authored papers, although multi-authored papers are gaining momentum.
To our knowledge, none of the existing studies has used the set of all 2,541 papers published in Scientometrics from 1978 to 2010, and no study has yet attempted a multi-level analysis that aims to improve our understanding of the structure and evolution of collaboration networks at the country (macro), institution (meso), and author (micro) levels. Different from, yet relevant to, the work presented here are studies on evolving citation, co-citation, or collaboration networks. For example, Chen et al. (2010) introduced a multiple-perspective co-citation analysis for characterizing and interpreting the structure and dynamics of co-citation clusters in the field of information science between 1996 and 2008. They showed that the multiple-perspective method increases the interpretability and accountability of both author co-citation analysis (ACA) and document co-citation analysis (DCA) networks. Wagner and Leydesdorff (2005) applied network analysis to map the growth of international co-authorships and found that these can be explained by the organizing principle of preferential attachment, although the attachment mechanism deviates from an ideal power law. Samoylenko et al. (2006) visualized the scientific world and its evolution by constructing minimum spanning trees (MSTs) and a two-dimensional map of scientific journals using the Science Citation Index from the Web of Science database for 1994–2001. They showed a linear structure of the scientific world with three major domains: physical sciences, life sciences, and medical sciences. Perc (2010) studied the evolution of Slovenia’s scientist collaboration network from 1960 to 2010 with a yearly resolution and showed that the network had a “small world” pattern and that its growth was governed by near-linear preferential attachment. This paper advances the existing work by studying the evolution of scientometrics at three different network levels.

Data source and data unification

Data was acquired from the Web of Science in December 2010. All 2,541 publications (articles, proceedings papers, and reviews) published in the journal Scientometrics in 1978–2010 (publication year) were downloaded. Subsequently, the Thomson Data Analyzer (TDA) was used to extract the number of countries, institutions, and authors per year. The names of countries had to be cleaned to make sure that each country had only one unique name. Institutions (extracted from author affiliations) needed to be pre-processed by hand to unify different names and abbreviations and to correct misspellings. Institution names given in different languages were particularly challenging. Author names were cleaned using the following process: if two names differed only by the presence of a middle name but had the same first and family name and were from the same institutions, then the two names were merged. For example, authors Meyer M and Meyer MS, both working at the University of Sussex, were assumed to be one person. Next, three collaboration networks were extracted based on the co-occurrence of authors, institutions, and countries, respectively. Then, the weights of collaboration links were calculated by counting the number of publications on which two authors, institutions, or countries co-occurred. Each publication counts once per pair: even if two authors on a publication have institution/country X and three others have institution/country Y, this publication contributes a weight of one to the total X–Y collaboration link. Microsoft Excel was then used to compute the statistics shown in Figs. 1 and 2. Finally, the Science of Science Tool (Sci2 Team 2009) was used to analyze the network parameters and visualize evolving collaboration networks at all three levels.
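The name-merging rule and the link-weight rule described above can be illustrated with a minimal Python sketch. It is not the TDA or Sci2 workflow used in this study; the function names, the "Family Initials" name format, and the toy records are assumptions made only for illustration.

```python
from collections import Counter
from itertools import combinations


def same_author(name_a, insts_a, name_b, insts_b):
    """Merge rule sketch: same family name, same first initial, names differ at
    most by the presence of a middle initial, and identical institution sets.
    Names are assumed to look like 'Meyer MS' (family name, then initials)."""
    family_a, initials_a = name_a.rsplit(" ", 1)
    family_b, initials_b = name_b.rsplit(" ", 1)
    if family_a != family_b or initials_a[0] != initials_b[0]:
        return False
    # Either the initials are identical, or one name has no middle initial.
    middle_ok = initials_a == initials_b or min(len(initials_a), len(initials_b)) == 1
    return middle_ok and set(insts_a) == set(insts_b)


def collaboration_weights(papers):
    """For each unordered pair of entities, count the papers on which both occur.
    Each paper adds at most one to a pair, however many authors share an entity."""
    weights = Counter()
    for entities in papers:
        unique = sorted(set(entities))  # deduplicate within a paper
        for a, b in combinations(unique, 2):
            weights[(a, b)] += 1
    return weights


# Hypothetical toy data: one list per paper, one entry per author affiliation.
papers = [
    ["X", "X", "Y", "Y", "Y"],  # two authors at X, three at Y: X-Y gets weight 1
    ["X", "Y", "Z"],
    ["Z"],                      # single-entity paper: no link
]
weights = collaboration_weights(papers)
merged = same_author("Meyer M", ["Univ Sussex"], "Meyer MS", ["Univ Sussex"])
```

In this toy example the X–Y link ends with weight 2, one from each of the first two papers, and the two Meyer records are treated as one person.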

Fig. 1
figure 1

Growth of papers, countries, institutions, and authors for 1978–2010

Fig. 2
figure 2

Densification of collaboration networks on macro, meso and micro levels

Results and analysis

Growth of countries, institutions, and authors

The 2,541 papers published in Scientometrics were contributed by 78 unique countries (or regions), 1,275 unique institutions, and 2,697 unique authors. Figure 1 shows the growth (annual and cumulative) of the number of papers, countries (or regions), institutions and authors from 1978 to 2010. By counting the annual numbers in each figure, we obtain average annual growth rates, which are 20.4 % (papers), 9.4 % (countries), 19.6 % (institutions), and 20.1 % (authors).
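The text does not state the exact formula behind the average annual growth rates. One common reading, assumed here, is the arithmetic mean of the year-over-year percentage changes in the annual counts. The sketch below uses made-up counts, not the Scientometrics data.

```python
def average_annual_growth(counts):
    """Mean of year-over-year relative changes, in percent.
    Years whose previous count is zero are skipped (change undefined)."""
    rates = [
        (curr - prev) / prev
        for prev, curr in zip(counts, counts[1:])
        if prev > 0
    ]
    if not rates:
        raise ValueError("need at least two consecutive years with a nonzero first count")
    return 100.0 * sum(rates) / len(rates)


# Hypothetical annual paper counts for five consecutive years.
annual_papers = [10, 12, 15, 15, 18]
growth_pct = average_annual_growth(annual_papers)
```

Other definitions, such as a compound annual growth rate computed from the first and last year only, would give different values for the same series.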

As can be seen in Fig. 1, there are intrinsic differences in the growth patterns of countries, institutions, and authors when compared to the growth of papers. Figure 1e shows that the average number of papers per country grows rapidly. The average number of papers per author increased in the first 10 years, then stayed constant at about one paper per author for nearly 15 years, and has been decreasing slowly in the most recent 5 years. The average number of papers per institution increased slowly over the 33 years (as did the number of authors, institutions, and countries per paper). This suggests that the scientometrics research community expanded dramatically: more and more institutions and authors are publishing in scientometrics. Figure 1f shows a small annual increase of about 2.34 % in the average number of authors per institution. The number of countries that publish in Scientometrics grows linearly over the 33-year period, but in two phases separated at 1994.

Densification and growth

As Bettencourt et al. (2009) pointed out, when fields grow, their collaboration networks densify—i.e., the average number of edges per node increases over time. They found that the relation between the number of nodes and edges followed a simple scaling law with scaling exponent (α > 1):

$$ {\text{edges}} = A\left( {\text{nodes}} \right)^{\alpha } , $$
(1)

They assumed that A and α are constants and showed that the scaling exponent α correctly captured the densification independent of scale (here, the number of nodes).

In our work, we construct collaboration networks at three levels: macro-countries, meso-institutions, and micro-authors. As can be seen in Fig. 1, all three entity types grow in number. Figure 2 shows that the scaling exponent α equals 2.9533 at the macro-country, 1.5222 at the meso-institution, and 1.2353 at the micro-author levels. It has the highest value for countries—i.e., the country collaboration networks densify rather quickly, which is also due to the fact that this is the network with the fewest nodes. However, a large number of within-country or within-institution collaborations or an increase in single-authored papers would also result in smaller α values.
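The paper does not describe how the exponents were estimated. A common approach, assumed here, is ordinary least squares on the log-transformed form of Eq. 1, log(edges) = log A + α log(nodes). The sketch below uses synthetic node and edge counts rather than the values behind Fig. 2.

```python
import math


def fit_scaling_law(nodes, edges):
    """Least-squares fit of log(edges) = log(A) + alpha * log(nodes).
    Returns (A, alpha). All counts must be positive."""
    xs = [math.log(n) for n in nodes]
    ys = [math.log(e) for e in edges]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx                      # slope in log-log space
    A = math.exp(mean_y - alpha * mean_x)  # intercept mapped back to linear scale
    return A, alpha


# Synthetic cumulative counts that follow edges = 0.5 * nodes**1.5 exactly.
nodes = [10, 20, 40, 80]
edges = [0.5 * n ** 1.5 for n in nodes]
A, alpha = fit_scaling_law(nodes, edges)
```

On these noise-free data the fit recovers α = 1.5 up to rounding. A fitted α above 1 means that edges grow faster than nodes, which is the densification described above.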

Network diameter

Leskovec et al. (2005) found that as networks grow and more nodes and edges are added, their effective diameter (the 90th percentile of the shortest-path lengths) tends to decrease. They confirmed this for citation and affiliation graphs extracted for patents registered with the United States Patent and Trademark Office. In contrast, Bettencourt et al. (2009) showed that collaboration graphs in several scientific and technological fields exhibit initial rapid growth in their diameter, which then tends to stabilize and stay approximately constant at 12–14. This might be because, when a new field emerges, authors are not yet aware of all relevant experts and works; as the field matures, important collaborations come into existence and lines of research are interlinked via co-author and citation linkages. The diameter of a collaboration network has major implications for information diffusion: the shorter the pathway of co-author linkages that connects an author pair, the more likely knowledge is to diffuse between them.

Over the 33 years, the diameter of the country collaboration network grew from 1989 to 1998 (there were no edges before 1989), reached its highest value in 1998, and decreased over the last 10 years. This might be due to the rather limited number of countries that perform scientometrics research. The diameters of the institution and author collaboration networks increased continually, and both reached d = 15 in 2010. This is consistent with the results reported by Bettencourt et al. (2009). A closer look at the density of the three networks (the ratio of the number of actual edges to the number of possible edges in a fully connected graph with the same number of nodes) shows that the densities of both the meso and micro networks decrease over time, while the macro network, which experienced a topological transition from increasing to decreasing diameter, shows an increase in density.
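The diameter, effective diameter, and density discussed above can be computed as in the following sketch, which uses breadth-first search on an unweighted, undirected graph. The network values in this study were obtained with the Sci2 Tool; this code, the integer (non-interpolated) 90th-percentile rule, and the toy path graph are assumptions for illustration only.

```python
import math
from collections import deque


def shortest_path_lengths(adj):
    """Lengths of all shortest paths between connected ordered node pairs (BFS)."""
    lengths = []
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        lengths.extend(d for d in dist.values() if d > 0)
    return lengths


def diameter_and_effective_diameter(adj, q=0.9):
    """Return (diameter, smallest d such that a fraction q of connected pairs
    are within distance d). Unconnected pairs are ignored."""
    lengths = sorted(shortest_path_lengths(adj))
    if not lengths:
        return 0, 0
    index = math.ceil(q * len(lengths)) - 1
    return lengths[-1], lengths[index]


def density(adj):
    """Actual edges divided by the n*(n-1)/2 edges of a complete graph."""
    n = len(adj)
    m = sum(len(neighbors) for neighbors in adj.values()) // 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0


# Toy path graph a-b-c-d-e, stored as an adjacency dictionary.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}
diam, eff_diam = diameter_and_effective_diameter(adj)
dens = density(adj)
```

For the five-node path the diameter is 4, while the effective diameter is 3 because 90 % of the connected pairs are at most three steps apart.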

Evolving node centrality and betweenness and network visualizations

To understand the structure of the 1978–2010 networks, the degree of each node was determined and the node degree distribution p(k) plotted in Fig. 3. The x-axis runs from low-degree nodes on the left to high-degree nodes on the right; the y-axis gives the probability that a node has the corresponding degree. The right-most data point reveals that the institution network contains the node with the highest degree: Katholieke Univ Leuven, which has 51 collaboration links to other institutions. All three networks exhibit power-law degree distributions.

Fig. 3
figure 3

Node degree distribution plots for 1978–2010 networks
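As a minimal sketch of how a distribution such as the one in Fig. 3 can be derived, the following code computes p(k) as the fraction of nodes with degree k. The adjacency-dictionary representation and the toy star network are assumptions; the figure itself was produced from the Scientometrics networks.

```python
from collections import Counter


def degree_distribution(adj):
    """Map each degree k to p(k), the fraction of nodes that have degree k."""
    degrees = [len(neighbors) for neighbors in adj.values()]
    counts = Counter(degrees)
    total = len(degrees)
    return {k: count / total for k, count in sorted(counts.items())}


# Toy star network: one hub linked to four otherwise unconnected nodes.
adj = {
    "hub": {"n1", "n2", "n3", "n4"},
    "n1": {"hub"},
    "n2": {"hub"},
    "n3": {"hub"},
    "n4": {"hub"},
}
p = degree_distribution(adj)
```

Plotting p(k) against k on log-log axes gives an approximately straight line when the distribution follows a power law.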

To understand which countries, institutions, and authors play key roles in the three networks, two values were calculated for each node: the degree centrality (the number of links a node has) and the betweenness centrality (Freeman 1977). A node has high betweenness if it has a high probability of lying on a randomly chosen shortest path between two randomly chosen nodes. The resulting TOP-5 countries, TOP-10 institutions, and TOP-10 authors, calculated every 6 years (cumulatively from 1978), are listed in Tables 1 and 2.
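The two centrality measures can be sketched as follows for an unweighted, undirected network. Betweenness is computed with Brandes' algorithm, a standard method, and is left unnormalized. The values in Tables 1 and 2 were computed with the Sci2 Tool, whose exact settings are not stated, so the code and the toy network are illustrative assumptions.

```python
from collections import deque


def degree_centrality(adj):
    """Number of links per node."""
    return {node: len(neighbors) for node, neighbors in adj.items()}


def betweenness_centrality(adj):
    """Unnormalized betweenness for an undirected, unweighted graph (Brandes)."""
    betweenness = {v: 0.0 for v in adj}
    for s in adj:
        order = []                      # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}    # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}     # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}   # dependency of s on each node
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                betweenness[w] += delta[w]
    # Every unordered pair was visited from both endpoints, so halve the sums.
    return {v: b / 2 for v, b in betweenness.items()}


# Toy network with edges A-B, B-C, B-D, D-E.
adj = {"A": {"B"}, "B": {"A", "C", "D"}, "C": {"B"}, "D": {"B", "E"}, "E": {"D"}}
degree = degree_centrality(adj)
between = betweenness_centrality(adj)
top_by_betweenness = sorted(between, key=between.get, reverse=True)[:2]
```

In the toy network, node B has the highest degree and lies on five of the shortest paths between other node pairs, while node D lies on three, so both rankings put B first.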

Table 1 TOP-5 countries, TOP-10 institutions, and TOP-10 authors by degree centrality
Table 2 TOP-5 countries, TOP-10 institutions, and TOP-10 authors by betweenness centrality

In addition, the last table column shows the TOP-10 countries, institutions, and authors if only 2001–2010 data is considered. While the differences are minimal for countries and institutions, the list of TOP-10 authors changes considerably if only recent works are considered.

Multi-level analysis

A closer inspection of Tables 1 and 2 and the associated network layouts in Figs. 4, 5, 6 (nodes are size coded by number of papers, edges are width coded by number of collaborations) reveals the following:

Fig. 4
figure 4

Country collaboration network (1978–2010). Top-10 countries with the highest number of papers have been labelled

Countries

USA, Belgium, and England are the lead countries throughout the 33 years: they are among the TOP-5 by degree centrality and betweenness centrality. They are followed by the Netherlands, Spain, Germany, France, China, and India. A change occurred between 1992 and 1998, when China and France appeared in the TOP-5 countries in terms of degree centrality. However, they never returned after 1998. Instead, the Netherlands and Germany made the TOP-5 in 1998 and 2004, and the Netherlands and Spain joined in 2010. For betweenness centrality, China, together with India, shows the same pattern as for degree centrality. They were replaced by the Netherlands, France, and Spain in 1998, 2004, and 2010. Interestingly, France had a high ranking in betweenness in 1998 and 2010 but is not present in the degree centrality TOP-5 list during those years. Figure 4 shows that, by the end of 2010, Belgium, USA, England, Germany, the Netherlands, China, and France are central network nodes with a large number of papers. These countries link not only to each other but also to outside countries; for example, Belgium and Germany have strong links to Hungary, and Belgium and England have strong links to Finland. Comparing 1978–2010 to 2001–2010, the TOP-5 countries were the same for degree centrality and slightly different for betweenness centrality (the Netherlands replaced Belgium in the fifth spot). This might be because the collaboration pattern among key countries in the most recent 10 years largely determines the overall pattern for 1978–2010.

Institutions

When analyzing the evolving institution collaboration networks, it becomes clear that a few key institutions manage to stay in the TOP-10 list—among them are the Univ Sussex, KHBO, Katholieke Univ Leuven, Hungarian Acad Sci, and Leiden Univ. Other institutions come in and out—possibly even taking the top place. Such is the case with Univ Instelling Antwerp, ranked first in 1992 and 1998 counted by both centrality and betweenness. Leading authors were Rousseau R and Egghe L. Rousseau R’s degree and betweenness centrality increased steadily after 1998. His curriculum vitae showed that he is an associate professor at KHBO (Catholic School for Higher Education Bruges-Ostend, Belgium), a professor associated with the K.U. Leuven, and a guest professor at UA’s School for Library and Information Science. In 2004, Univ Instelling Antwerp experienced a lower ranking and then disappeared from the TOP-10 list in 2010. One possible reason might be the fact that it was merged into the Univ Antwerp in 2003. However, the Univ Antwerp never appeared in the TOP-10 lists again. This is most likely due to the fact that Rousseau R left the Univ Antwerp—only six of his papers listed Univ Antwerp in the address field after 2004. The centralization pattern of key institutions can also be seen in Fig. 5, together with the fact that high-ranking institutions also have more papers and are in the core of the largest network component.

Fig. 5
figure 5

Institution collaboration network (1978–2010). Top-50 institutions with the highest number of papers have been labelled

Comparing 1978–2010 to 2001–2010, the TOP-10 institutions were similar both for degree centrality (9 of the TOP-10 were the same) and for betweenness centrality (8 of the TOP-10 were the same). This might be because the collaboration pattern among key institutions in the most recent 10 years largely determines the overall pattern for 1978–2010.

Authors

During the evolution of the co-author networks, early authors are replaced by current authors. Most TOP-10 authors from 1980 and 1986 are missing in the later years. Key authors listed in the TOP-10 lists around 1986 decline in ranking or are replaced by other authors. For instance, Lancaster FW ranks second in 1986 and 1992, drops to third in 1998 and tenth in 2004, and falls off the list in 2010. Authors with a similar pattern include Braun T, Courtial JP, Narin F, VanRaan AFJ, and others. On the other hand, authors that rank highly in 2004 and 2010, such as Rousseau R and Moed HF, had never appeared in the TOP-10 lists before. Figure 6 shows the co-author network with a giant component in the middle surrounded by many smaller, unconnected networks. While most TOP-10 authors are part of the giant component, other authors such as Lancaster FW, Sullivan D, and White DH are key nodes in smaller networks. Comparing 1978–2010 to 2001–2010, the TOP-10 authors changed more than the top countries and institutions, both in degree centrality and in betweenness centrality. This might be because, over time, new authors enter the field and begin to play a key role. Some new key players in the TOP-10 list of the most recent 10 years had never appeared before, and some of them might become new leaders in the coming 10 or 20 years.

Fig. 6
figure 6

Author collaboration network (1978–2010). Top-50 authors with highest number of papers have been labelled

Comparing country, institution and author levels

The micro, meso, and macro levels discussed in this paper are intrinsically interlinked. Authors work at institutions, institutions have their geospatial home in specific countries, and subsets of authors, institutions, and countries co-occur on each Scientometrics paper. One might assume that rankings on the author (micro) level impact the rankings on the institution (meso) and country (macro) levels. While author rankings do impact institution rankings, institution rankings are less predictive of country rankings, as exemplified below.

Country and institution levels

Tables 1a and 2a list the USA in the Top-3 in terms of degree and betweenness centrality for each of the five time frames considered. However, no institution or author in the USA is listed in the Top-10 lists (Tables 1b,c and 2b,c). The reason is the large total number of institutions in the USA and of papers published from the USA. As can be seen in Table 3, the USA ranks first in the number of institutions and the number of papers over the 33-year time span. However, the average number of papers per institution was low for the USA, especially when compared with Belgium, the Netherlands, and Hungary. Moreover, no institution in the USA had a large number of papers; see also Fig. 5, which size-codes institution nodes by their number of papers. The USA institution with the most papers (22) is Inst Sci Informat, which ranked 11th. This is a rather low number compared with the TOP-10 institutions: Hungarian Acad Sci (155 papers), Katholieke Univ Leuven (93), Leiden Univ (88), Natl Inst Sci Technol and Dev Studies (75), CSIC (68), Univ Sussex (49), Univ Granada (36), Univ Amsterdam (35), Univ Instelling Antwerp (29), and KHBO (26), located in Hungary, Belgium, the Netherlands, India, Spain, and England. Other USA institutions with a relatively large number of papers included Drexel Univ, Indiana Univ, and Georgia Inst Technol. Similarly, while no single author in the USA appears in the TOP-10 lists, the number of all authors combined and the number of their papers result in a high country ranking.

Table 3 Number of institutions and papers of the TOP-10 countries with the most papers

Institution and author levels

Can one single author impact the ranking of an entire institution or country? The answer is yes. An example is Glänzel W, who authored 89 papers in Scientometrics and ranks first in the number of papers per author. As he is a professor at the Katholieke Universiteit Leuven in Belgium and a senior scientist at the Hungarian Academy of Sciences, 79 of his papers list the Hungarian Academy of Sciences and 41 list the Katholieke Universiteit Leuven as institution, while 7 papers list neither of the two institutions; consequently, 38 papers list both institutions. His papers constitute half of the total number of papers published by these two institutions. Glänzel W collaborated with Schubert A and Braun T, and together they published 82 % of the papers of the Hungarian Academy of Sciences. The latter two authors also appear in the TOP-10 centrality lists, but only in the early years, and both of them co-authored many more papers with Glänzel W than with anyone else. At the institution level, only collaborations between institutions count; collaborations within one institution are omitted because they do not affect an institution’s degree or betweenness centrality. The 155 papers of the Hungarian Academy of Sciences were co-authored with 30 institutions, 22 of which were contributed by papers authored by Glänzel W. As for the 93 papers by the Katholieke Universiteit Leuven, 13 of 51 institution links were added by Glänzel W. Another key player is Rousseau R, who contributed 19 collaborating institutions to the Katholieke Universiteit Leuven.
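The paper counts reported above for Glänzel W can be checked by inclusion–exclusion. Writing H for the set of his papers that list the Hungarian Academy of Sciences and K for those that list the Katholieke Universiteit Leuven (symbols introduced here only for this check), the 89 − 7 = 82 papers that list at least one of the two institutions satisfy:

$$ \left| {H \cup K} \right| = \left| H \right| + \left| K \right| - \left| {H \cap K} \right| \quad \Rightarrow \quad 82 = 79 + 41 - \left| {H \cap K} \right| \quad \Rightarrow \quad \left| {H \cap K} \right| = 38. $$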

Note that for some countries the productivity of one institution can have a major impact on the position and interconnectedness of an entire country in the global collaboration network, e.g., Katholieke Universiteit Leuven contributes 15 collaborating countries to Belgium’s total degree centrality of 16.

Conclusions and discussion

This paper analyzed the evolution of collaboration networks of scientometrics on three levels: macro (countries), meso (institutions) and micro (authors) based on all 2,541 publications in the international journal Scientometrics from 1978 to 2010.

Over the 33 years, the number of countries grew steadily and roughly linearly, with USA, Belgium, and England leading in terms of centrality and betweenness. According to Chen et al. (2011), more and more papers published in Scientometrics were contributed by the TOP-10 countries: USA, Belgium, Spain, China, the Netherlands, England, Hungary, India, Germany, and France. As their share increases, they have a stronger impact on the evolution of scientometrics. Over time, more and more collaboration links are generated, and the average node degree and network density increase as well (see Table 4). Given the trajectory of the past 33 years and the strength of the collaboration network, it seems likely that these top countries will be predominant in future years as well. It is important to point out that some top-ranking countries have a small number of top-ranking institutions (e.g., Katholieke Univ Leuven in Belgium) while other countries (USA) have a large number of contributing institutions. Similarly, some top-ranking institutions have one or two top-ranking authors, e.g., Glänzel W and Rousseau R. That is, single authors can have a major impact on the ranking not only of their institution but also of their country.

Table 4 Network attributes for macro, meso and micro level for 3-year time spans

The study of the institution collaboration networks indicated that the average number of papers per institution increased slowly with the development of scientometrics. At the same time, the annual growth rates of institutions, authors, and papers were similar, at about 20 %. This suggests that the field has been attracting more and more institutions and authors. The scaling exponent analysis showed that as new nodes were added, the number of edges increased faster (α > 1) and the average degree increased yearly. Unlike at the country level, we could not find institutions that ranked at the top throughout the whole period by degree centrality and betweenness. This might be due to changes in affiliations by key authors, but a closer examination is needed to confirm this.

The co-author network analysis showed that many new authors joined the field of scientometrics, especially in the most recent 8 years. The diameter, average degree, and density of the network show the same trends as those calculated for institutions. The replacement of early central authors by later central authors might reflect a generational change, given the limited career span of each author. This is expected to lead to differences when comparing co-author networks of mortal authors with collaboration networks of less mortal institutions and countries.

In sum, authors of Scientometrics articles seem to have effectively linked collaboration networks at the micro to macro levels. While co-author networks experience the departure of senior and the arrival of young researchers, the institution and country networks seem to have a comparatively stable structure of key nodes. New authors might bring changes in topic coverage, and future work will analyze the evolving topical coverage (expertise profiles) of authors, institutions, and countries. We are aware that research in scientometrics is also published in other journals such as JASIST, Journal of Informetrics, and PLoS ONE. By using citation networks at the paper or journal level, the dataset can be enlarged to provide a more comprehensive coverage of research on scientometrics and the collaboration networks at the author, institution, and country levels. Nevertheless, Scientometrics is a flagship journal in the field of scientometrics, and it was used here to demonstrate a novel approach to study evolving collaboration networks at the micro, meso and macro levels.