Introduction

A number of recent studies have attempted to measure the impact of research more broadly than citations allow, either to meet the needs of scholarly communities or to investigate opportunities for research evaluation procedures. Altmetric indicators provide a perspective that extends beyond the impact of research on research itself to the impact of research on other segments of society (Bornmann 2015), drawing scholars’ attention to categorizing metrics according to different types of communication. Bornmann reported a negligible correlation between citations and micro-blogging counts, a weak correlation between citations and blog counts, and medium to large correlations between citations and bookmark counts from online reference managers. Altmetric methods have also been seen as alternative approaches that solve several problems associated with the use of bibliometrics in the humanities (Hammarfelt 2014). While altmetric indicators are oriented more towards communication with the general public, usage counts focus more on “scholarly communication” among scholars (Pringle 2015).

According to Brody et al. (2006), the reading-citing cycle of a scholarly publication, from the moment an article is accepted for publication until it is published, read or cited, may range from 3 months to 1–2 years or even longer. Usage impact, which is measured early in the reading-citing cycle, is significant and may be predictive of the later stage of the cycle, i.e. the citation impact, which can be measured only after the articles citing a given article have themselves been published. Among the 39 scientific impact measures tested by Bollen et al. (2009), usage-based measures proved to be even stronger indicators of scientific prestige than many citation measures.

Recent research on usage indicators is based on data retrieved from a single publisher or repository and lacks comprehensive, large-scale sources (Wang et al. 2016). Since September 2015, the Web of Science Core Collection (WoS), one of the most extensive citation index databases in the world, has provided daily-updated usage counts of indexed publications on its platform to measure the level of interest in a specific item. The counts show the number of times the full text of a record has been accessed or a record has been saved, either in the last 180 days or since 1 February 2013. They can be thought of as a supplement to citations on the platform, accounting for early discovery as opposed to formal recognition by other scholars. Even though the definition of ‘usage’ in the WoS is more peripheral than, e.g., that of citations, views or full-text downloads, the WoS usage counts, which measure the saving of bibliographic records and clicks through to the full text, are potential indicators of the interests and motivations of scholars. This new measure provides a different perspective on usage data, that of the users of an interdisciplinary database, and is worthy of study.

The WoS usage counts were analysed for highly and lowly cited papers by Wang et al. (2016), who found that highly cited papers accumulate more usage counts. However, the citation window and the usage window are asymmetric in that study, which may distort the comparison for the samples published between the 1940s and 1970s. Therefore, in the present study we analyse only publications from 2013 to ensure that citations and usage counts were accumulated over a similar period.

Beyond the reading-citing cycle, scientific collaboration may increase the quality of research. Scientific collaboration is defined by Sonnenwald (2007, p. 645) as “interaction taking place within a social context among two or more scientists that facilitates the sharing of meaning and completion of tasks with respect to a mutually shared, superordinate goal”. Collaboration should then influence both measures of productivity, because the work is shared, and the citation impact of scholarly publications, insofar as citation impact partially measures quality. Persson et al. (2004) demonstrated that co-authored publications are cited more frequently than single-authored papers. The number of authors is often used as a measure of scientific collaboration. For example, increases in the number of co-authored papers were observed across different scientific disciplines and countries in bibliometric studies (Cronin et al. 2004; Grossman 2002; Moody 2004). Furthermore, Peters and van Raan (1994) and Glänzel and Thijs (2004) detected a general positive correlation between citation counts and the number of co-authors. Therefore, the number of co-authors is taken into account in the present study to represent the factor of scientific collaboration.

In this study, we include the factor of scientific collaboration in the discussion of citations and usage counts, and therefore first investigate the relation among citations, usage counts and the number of authors per paper. Apart from this relation, we also analyse the distributions of citations and usage counts using the method of Characteristic Scores and Scales (CSS), which was introduced by Glänzel and Schubert (1988). CSS has proved to be stable with respect to the underlying subject field, the publication year and the citation window. In this study, CSS is applied to usage counts for the first time to test its stability and to extend the analysis of the relation between citations and usage counts through a comparison of the distributions of the two metrics. This approach provides a more complete discussion of the association between usage and citation impact.

Methodology

We collected usage counts and numbers of citations from the WoS Core Collection (SCIE, SSCI & AHCI) for three countries with similar publication output as pars pro toto examples. Relevant data were extracted for each article published in 2013 from two developed countries (Belgium and Israel) and one developing country (Iran). We selected Belgium as a medium-sized developed European country, where we carried out the research, along with Iran and Israel, which showed interesting and quite characteristic features in terms of downloads compared to citations in a previous study (Glänzel and Heeffer 2014). In particular, Iran is among the most active downloading countries but has rather weak citation links with other countries. Researchers in this country frequently download information from many other countries, but their publications are not significantly downloaded by others. In contrast, Israel has few downloading activities from and to other countries.

In total, 26,886 papers with at least one Belgian address each, along with 16,618 Israeli papers and 28,203 Iranian papers were downloaded with citation and usage counts from the WoS on 29 November, 2016. The usage count reflects the number of times WoS users clicked links to the full-length article at the publisher’s website or saved the article for use in a bibliographic management tool. It is counted starting from 1 February 2013 till the date of data download. Automated download artefacts were removed from the data by WoS.

All items extracted from the WoS have been assigned to the 74 individual subfields and 16 major fields of the modified Leuven-Budapest classification system (see Glänzel et al. 2016). The original scheme was introduced by Glänzel and Schubert (2003) and has recently been modified to provide a better categorisation for the social sciences and humanities based on the WoS and Journal Citation Reports (JCR) subject categories. In this study, five major fields in this scheme, representing the natural sciences, life sciences and social sciences, were selected to analyse the relations between usage, citation impact and scientific collaboration (see Table 1). The fields are Chemistry, Clinical and Experimental Medicine II (non-internal medicine specialties), Mathematics, Neuroscience & Behavior, and Social Sciences II (economic, political & legal studies). Table 1 shows that the usage figures in the social sciences and chemistry are larger relative to citations than in the other fields. In contrast, WoS records in clinical medicine are not used as much as those in other fields, possibly because of the availability of other popular medical data sources such as PubMed.

Table 1 Statistics of sample data sets.

All the samples are further analysed with respect to the correlation coefficients among citation counts, usage counts and the number of co-authors per paper, and with respect to the distributions of citations and usage counts by Characteristic Scores and Scales (CSS). CSS is a parameter-free solution for analysing the citation distribution of papers over a small number of essential performance classes and for creating reference standards for the comparison of citation impact across research fields. In particular, four classes are defined on the basis of an underlying population or a given reference sample by applying an iterative procedure with three thresholds (characteristic scores) $b_k$ ($k = 1, 2, 3$): $b_k = E(X(t) \mid X(t) \ge b_{k-1})$, where $b_0 := 0$ by definition. The conditional expectation $E(\cdot \mid \cdot)$ is empirically replaced by the corresponding mean values. The four classes are then formed by the elements taking values in the intervals $[b_{k-1}, b_k)$ ($k = 1, 2, 3$) and $[b_3, \infty)$. As explained by Glänzel et al. (2014), Classes 1 and 2 represent the “head” and “trunk” of the underlying citation distribution over individual papers, usually referring to 90% or a slightly larger share of all documents. The upper two classes, Class 3 and Class 4, represent nearly 10% of all publications and contain the highly cited publications. In particular, Class 4 covers the top 2–3% of the corresponding sample and forms the most interesting category. This is the first study to apply CSS to usage data in order to test the stability of the usage distribution of papers alongside citations.
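The iterative procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this study: the function names (`characteristic_scores`, `css_class`, `class_shares`) and the toy counts are hypothetical, and the conditional expectation is replaced by the sample mean as described in the text.

```python
from statistics import mean


def characteristic_scores(values, n_scores=3):
    """Return [b_1, ..., b_n], where b_k is the mean of all values >= b_{k-1}
    and b_0 = 0 (hypothetical helper, not taken from the study)."""
    scores = []
    threshold = 0.0  # b_0 := 0 by definition
    for _ in range(n_scores):
        # Empirical stand-in for E(X | X >= b_{k-1}): mean of the upper tail.
        tail = [v for v in values if v >= threshold]
        threshold = mean(tail)
        scores.append(threshold)
    return scores


def css_class(value, scores):
    """Class k (k = 1, 2, 3) covers [b_{k-1}, b_k); Class 4 covers [b_3, inf)."""
    return 1 + sum(value >= b for b in scores)


def class_shares(values, scores):
    """Share of papers falling into each CSS class, lowest class first."""
    counts = [0] * (len(scores) + 1)
    for v in values:
        counts[css_class(v, scores) - 1] += 1
    return [c / len(values) for c in counts]


if __name__ == "__main__":
    # Invented citation (or usage) counts for ten papers, for illustration only.
    toy_counts = [0, 0, 1, 1, 2, 3, 4, 10, 20, 59]
    b = characteristic_scores(toy_counts)
    print("characteristic scores:", b)
    print("class shares:", class_shares(toy_counts, b))
```

Because the thresholds are derived from the sample itself, the same functions can be applied unchanged to citation counts and to usage counts, which is what makes the class shares of the two metrics comparable.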

Results

Correlation analysis

Table 2 shows that the correlations between citations and usage counts in the different fields are of intermediate magnitude but much higher than those of the other two combinations, especially in the social sciences and chemistry. In contrast, an increase in the number of authors is associated most strongly with an increase in usage counts in mathematics and with an increase in citations in neuroscience.

Table 2 Pearson correlation coefficients of publications in Belgium, Israel and Iran.
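For readers who wish to reproduce coefficients of this kind on their own data, the following is a minimal sketch of the Pearson correlation coefficient for paired per-paper indicators. The function name `pearson_r` and the toy values are assumptions made for illustration; they are not the data or the software behind Table 2.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences
    (hypothetical helper for illustration)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Invented per-paper indicators, for illustration only.
    citations = [0, 2, 5, 9, 14]
    usage = [1, 4, 9, 15, 30]
    authors = [1, 3, 2, 6, 4]
    print("citations vs usage:  ", round(pearson_r(citations, usage), 3))
    print("citations vs authors:", round(pearson_r(citations, authors), 3))
    print("usage vs authors:    ", round(pearson_r(usage, authors), 3))
```

Since citation and usage counts are highly skewed, a few extreme papers can dominate a Pearson coefficient, which is one reason to read the coefficients alongside the scatter plots.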

To provide a closer look at the stronger relations between citations and the other indicators, Figs. 1, 2 and 3 present scatter plots of publications in the social sciences, chemistry and clinical medicine based on the regression of usage on citations and of citations on the number of co-authors. In general, the relation between citations and usage per paper is positive and similar in all the countries and all the fields. The relation between the number of co-authors and citations per paper is more diverse. In the social sciences, Fig. 1 shows distinct correlations between the number of co-authors and citations per paper; the corresponding variables proved practically uncorrelated for Belgian and Iranian publications. In chemistry, Fig. 2 shows that only for Belgium are the number of co-authors and citations per paper uncorrelated. In clinical medicine, Fig. 3 shows that these two variables are uncorrelated for Belgium and Israel, whereas Iran shows a strong positive correlation between them.
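The regression lines underlying such scatter plots can be illustrated by a simple least-squares fit. The sketch below assumes an ordinary linear regression of one indicator on another; the text does not specify the exact regression model used for the figures, so the function `ols_fit` and the toy data are hypothetical.

```python
def ols_fit(x, y):
    """Least-squares line y = intercept + slope * x (hypothetical helper).
    Returns (slope, intercept)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Invented per-paper values, for illustration only.
    citations = [0, 2, 5, 9, 14]
    usage = [1, 4, 9, 15, 30]
    slope, intercept = ols_fit(citations, usage)  # regression of usage on citations
    print(f"usage = {intercept:.2f} + {slope:.2f} * citations")
```

A slope close to zero corresponds to the practically uncorrelated cases mentioned above, while a clearly positive slope corresponds to the positive relation between citations and usage.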

Fig. 1

[Data sourced from Thomson Reuters Web of Science Core Collection]

Scatter plots of publications in Social Sciences II. Belgium (top), Israel (center) and Iran (bottom). Left usage versus citations. Right citations per paper versus authors per paper.

Fig. 2

[Data sourced from Thomson Reuters Web of Science Core Collection]

Scatter plots of publications in Chemistry. Belgium (top), Israel (center) and Iran (bottom). Left usage versus citations. Right citations per paper versus authors per paper.

Fig. 3

[Data sourced from Thomson Reuters Web of Science Core Collection]

Scatter plots of publications in Clinical Medicine II. Belgium (top), Israel (center) and Iran (bottom). Left usage versus citations. Right citations per paper versus authors per paper.

Characteristic Scores and Scales

For citation data, the distribution of papers over CSS classes roughly obeys the 70–21–6.5–2.5% regularity (from the lowest to the highest class; see Glänzel et al. 2014). The stability of this property can also be observed in Table 3 for the samples from specific disciplines. Table 4 illustrates the applicability of CSS in characterising usage distributions. Apart from the distribution over classes, the scores themselves also reveal interesting aspects. On the one hand, we observe distinctly different patterns in citations and usage: as expected, usage generally has higher scores and a higher share of papers in the top class than citations. On the other hand, we see some variation among the three countries, although the similarities within citations and within usage in these fields are somewhat surprising. For example, publications in the social sciences and chemistry have the highest usage numbers compared to other fields. This points to the role of WoS in these fields as a search tool rather than as a means of defining the set of target journals in which to publish. Compared to the other countries, Iran has relatively lower scores for highly cited papers in chemistry and clinical medicine, but relatively higher scores for highly used papers in the social sciences.

Table 3 Characteristic scores and CSS-class shares of citations for three countries: Belgium (top), Israel (center) and Iran (bottom).
Table 4 Characteristic scores and CSS-class shares of usage counts for three countries: Belgium (top), Israel (center) and Iran (bottom).

Conclusions

We found a significant moderate correlation between citations and usage counts in WoS. It is interesting to see that the correlation between citations and usage counts is stronger in chemistry and the social sciences. However, similarly to earlier observations concerning the correlation between downloads and citations, no causality in one particular direction should be assumed (Glänzel and Heeffer 2014). As mentioned by Wang et al. (2016), the “clicking and saving” WoS usage counts may not perfectly represent the usage data of “views and downloads” provided by publishers’ platforms. We will tackle the analysis of the limitations of the relation between “usage” and “download” as a future research task.

On the other hand, an increase in the number of authors is not associated with an increase in usage counts or citations to the same degree as usage counts and citations are associated with each other. This is especially notable in the case of Belgian publications in the social sciences, chemistry and mathematics. Israeli publications in the social sciences have the most balanced associations among the three indicators. The three countries in our sample set show different patterns across the three relations. In short, Iran generally has stronger correlations between these three indicators than the other two countries.

CSS was shown to work for usage counts as well as for citation counts, with the class distributions remaining stable across citations and usage. Additionally, distinctly different patterns in citations and usage are observed, but the similarities within citations and within usage in these fields are somewhat unexpected. Chemistry and the social sciences have the most distinct patterns, with much higher usage relative to citations than other fields, revealing the function of WoS in these fields as a search tool rather than as a set of target journals in which to publish.

This study observes potential motivations of the users of an interdisciplinary database. Usage and citations in WoS reveal scholarly interest and recognition within an interdisciplinary but limited coverage. The three examples substantiate that, in general, there is no clear relationship between WoS usage and citation counts; the sometimes contradictory relationship between the number of co-authors and citation impact, however, is surprising. The moderate correlation between citation impact and usage in this study indicates a mild link between using a bibliographic record in a database and using it in a publication. The stronger correlations in the fields with higher mean usage rates, e.g. chemistry and the social sciences, imply that the more diverse usage patterns across fields affect the results more than citations do. Similarly, fields with more authors per paper have stronger correlations between citation impact and the number of co-authors.