Introduction

Collaboration among scientists and its effect on the citation impact of papers in different scientific fields has been the focus of a number of investigations (Bornmann 2017). No agreement has yet been reached on the advantage that multi-authored Information Science & Library Science (ISLS) papers may have in attracting citations. The hypothesis that collaboration is a driver fostering higher impact ISLS articles has been supported by Levitt and Thelwall (2009, 2010, 2016) and Sin (2011). Conversely, Hart (2007) found no evidence to support this hypothesis in two important ISLS journals. Previous studies left an important question unanswered: How much is the citation impact of ISLS articles expected to change with a change in collaboration patterns? The answer to this question is important for the formulation of publication strategies by authors, research evaluation organizations, and scholars of the ISLS scientific community. We will attempt to give a partial answer to this question using power law correlations.

The number of papers published in journals of a scientific discipline and the citations to these papers have a skewed-right distribution (de Solla-Price 1965; Egghe 2007; Egghe and Rousseau 2013; Lotka 1926; Sutter and Kocher 2001). Recent studies using the power law approach supported the idea that collaboration is a driver to foster citation impact (Coccia and Bozeman 2016; Ronda-Pupo and Katz 2016a, b, 2017). These studies show that the power law regression approach accurately assesses the non-linear relationship between collaboration and citation impact.

The study of the power law relationship between citation impact and collaboration patterns of articles of ISLS journals will add a new perspective to previous findings on this bivariate relationship. It will be shown that scale-independent indicators derived from the power law correlation can be used to inform decision makers about the effect that different types of co-authorship can have on research impact in ISLS.

The aim of the present study is two-fold: to explore the distributions of citations to ISLS papers, and to investigate the power law relationship between citation impact and ISLS journal size according to the number of papers of each collaboration type published: non-collaborative (i.e. single-authored) papers and collaborative papers, the latter subdivided into internationally and domestically multi-authored papers. Two differences between the present study and previous ones are: (1) a scale-independent approach was used to account for the nonlinear relationship between collaboration and citation impact, and (2) citations were counted using a fixed citation window of five years (\(t_{0 + 4}\)).

The research questions for the study are:

Do the citation distributions of ISLS papers (overall, single-authored, multi-authored, domestic multi-authored, international multi-authored) follow a power law distribution?

Does the citation impact of ISLS articles increase when the number of multi-authored articles increases?

Which type of article, multi-authored or single-authored, shows a stronger Matthew Effect?

Which type of multi-authored article shows a greater Matthew Effect, one with international collaboration or one with domestic collaboration?

To answer these research questions, an analysis was performed on articles and reviews published between 1985 and 2012, inclusive, in 86 journals of the Clarivate Analytics Web of Science (WOS) ISLS category indexed in the Journal Citation Reports (JCR).

Background

The synonymous terms cumulative advantage, Matthew Effect, rich get richer and preferential attachment have their roots in Gibrat’s rule of proportionate growth, or the Law of proportionate effect (Gibrat 1931), also known as Gibrat’s Law. Gibrat stated that the proportional rate of growth of a firm is independent of its absolute size. Simon (1955) described this behavior as the rich get richer while the poor get comparatively poorer, and Merton (1968, 1988) called this success-breeds-success phenomenon the “Matthew Effect”, after a well-known verse in the Gospel according to Matthew (Matthew 25:29, King James version). Newman (2005) pointed out that this behavior also goes by the names of the Matthew Effect (Merton 1968), cumulative advantage (de Solla-Price 1976), or preferential attachment (Barabási and Albert 1999) in network science.

In bibliometric and informetric studies, the study of scaling behavior dates back to the work of Alfred Lotka (Lotka 1926) and Price (de Solla-Price 1965). Later, Naranan (1971) introduced the power law as an estimation method to study Bradford’s Law in scientific journals. These studies were the theoretical foundations of Egghe’s (2005) Lotkaian informetrics. The Lotkaian approach focuses mainly on characterizing distributions that satisfy Price’s Law and the consequences of Zipf’s Law and Mandelbrot fractals (Egghe and Rousseau 1986).

The power law approach has been systematically used to analyze citation networks (Mayernik 2010; Milojevic 2010; Zhao and Ye 2013), citation distribution probabilities (Brzezinski 2015; Milojevic 2010; Thelwall 2016a, b, c), citation-based indicators (Egghe et al. 2009; Ye and Rousseau 2008) and research output (Sutter and Kocher 2001). It has been used to analyze research output at the level of countries (Katz 1999), metropolitan areas (Nomaler et al. 2014), universities (van Raan 2008) and research groups (van Raan 2006). Also, scaling laws have been used to study the historical evolution of scientific fields (Bettencourt et al. 2008). There are two general types of scale-invariant relationships: the power law probability distributions with the form of \(f\left( x \right) = kx^{ - \alpha }\) for \(x \ge x_{ \hbox{min} }\) and the power law correlations with the form \(f\left( x \right) = kx^{\alpha }\) where \(k\) is a constant and \(\alpha\) is a constant called the scaling factor (Ronda-Pupo and Katz 2017). Generally speaking, power law distributions are more frequently studied than power law correlations in informetric and scientometric studies.

Previous studies on the relationship between collaboration patterns and the citation impact of articles published in ISLS journals differ from each other in one, some or all of the following aspects: (a) the time frame being analyzed, (b) the sources used, (c) the research methods applied and (d) the results reported. Table 1 summarizes these aspects for each of the previous studies.

Table 1 Summary of results presented in previous studies on the relationship between collaboration and citations in ISLS journals

Previous studies covered the time frame from 1976 through 2009, inclusive. The Hart (2007), Levitt and Thelwall (2009) and Sin (2011) time frames overlap from 1986 through 2003 but the data sources differed. Each study added years to the analysis looking for a change in the patterns reported in the preceding studies.

The sources of data for analysis differed in the five studies. While Levitt and Thelwall (2009, 2016) included all ISLS journals, Hart (2007) only included two journals. Sin (2011) included only top-tier journals of the field and Han et al. (2013) included articles from 15 core LIS journals.

The findings reported in the five studies also differed. Hart (2007) found no relationship between citation impact and collaboration patterns. Conversely, Levitt and Thelwall (2009, 2016) and Sin (2011) found support for a positive relationship between citations and collaboration. In particular, Levitt and Thelwall (2009) reported that collaboration has apparently not tended to be important to the success of current and former individual elite information science scientists.

The correlation between international collaboration patterns in ISLS and their citation impact has been studied by Sin (2011) and Han et al. (2013). Sin (2011) concluded that international co-authorship is related to higher citation counts and that papers co-authored with Northern European authors or authors in high-income nations had higher odds of being cited. Han et al. (2013) found the USA to be the core of the international collaborative research network in ISLS. Nevertheless, the institutions located in the USA are more inclined to domestic collaborations.

The literature analyzing the scaling correlation between citation impact and collaboration patterns has increased in recent years (Coccia and Bozeman 2016; Ronda-Pupo and Katz 2016a, b, 2017). Generally speaking, these studies support the hypothesis that collaboration is a driver of citation impact. Additionally, they illustrate that the scaling approach is an accurate way to interpret the nonlinear relationship between co-authorship patterns and citation impact. Furthermore, the cumulative advantage of types of multi-authorship can be quantified by their scaling exponents and used for comparative purposes. The present study will provide novel information on this subject matter to complement findings reported in previous studies on ISLS.

Methods

In the present study, we will analyze the probability distributions of citations to ISLS articles published from 1985 to 2012 by multi-authorship type and the effect that the type of collaborative activity has on two scaling correlations: (1) the scaling correlation between the relative exponential growth of citations over time and the exponential growth of papers and (2) the scaling correlation between citation impact and journal size, measured using the number of papers (P) published in each ISLS journal.

Definition of variables in the model

Dependent variable

Citation impact (I): is the number of citations received by papers published in each ISLS journal. Citations to papers were counted using a fixed 5-year time window comprising the year of publication plus four additional years. This way the results are not biased by fluctuations in the citation counts of journals with different Impact Factors, and the effect of the age of papers is reduced, since all papers have the same amount of time to accumulate citations.
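The fixed-window count can be illustrated with a minimal sketch. It assumes that each citation is recorded as the year in which the citing paper appeared; the function name and the sample data are hypothetical and are not taken from the study.

```python
def citations_in_window(pub_year, citing_years, window=5):
    """Count citations received from pub_year through pub_year + window - 1."""
    last_year = pub_year + window - 1  # t0 + 4 for a five-year window
    return sum(1 for year in citing_years if pub_year <= year <= last_year)


# Hypothetical paper published in 2000 and the years in which it was cited.
example_citing_years = [2000, 2001, 2001, 2003, 2004, 2005, 2009]
example_count = citations_in_window(2000, example_citing_years)
print(example_count)  # citations from 2005 onwards fall outside the window
```

Summing such counts over all papers of a journal would give that journal's value of I for the collaboration type in question.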

Independent variables

Collaboration (P): the independent variable is the number of papers of a specific collaboration type which were published in a fixed time interval in a specific ISLS journal.

The independent variables were defined and coded using the authorship pattern observed in each ISLS article published in the time span analyzed. The first classification consisted of the number of signing authors of the paper. A paper signed by more than one author was coded as a multi-authored paper. These multi-authored papers were also classified according to the number of countries represented in the paper as either domestic (only one country) or international (more than one country). The papers published by a single author were coded as single-authored papers.

International collaboration: is the number of multi-authored ISLS articles published by scientists from institutions of different countries in a time interval in a specific journal.

Domestic collaboration: is the number of multi-authored ISLS articles published by scientists all working in institutions of the same country in a time interval in a specific journal.
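The coding rules above can be sketched as follows. This is an illustration only: it assumes one country entry per signing author, and the function name, journal names and sample records are hypothetical.

```python
from collections import Counter


def collaboration_type(author_countries):
    """Classify a paper from the list of its authors' countries (one per author)."""
    if not author_countries:
        raise ValueError("a paper needs at least one author")
    if len(author_countries) == 1:
        return "single-authored"
    # Normalize spelling so that 'USA' and 'usa ' count as the same country.
    countries = {c.strip().lower() for c in author_countries}
    return "domestic" if len(countries) == 1 else "international"


# Hypothetical records: (journal, countries of the signing authors).
papers = [
    ("Journal A", ["USA"]),
    ("Journal A", ["USA", "USA"]),
    ("Journal A", ["USA", "Canada"]),
    ("Journal B", ["Chile", "Chile", "Chile"]),
]

# P for each (journal, collaboration type) pair.
paper_counts = Counter((journal, collaboration_type(c)) for journal, c in papers)
print(paper_counts)
```

Papers with incomplete address data would have to be filtered out before this step, since they cannot be coded as domestic or international.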

The dependent variable citation impact (I) is a partial measure of a paper’s impact. The independent variable papers (P) is the number of papers published in a specific journal in a time interval depending on multi-authorship type.

The model posed for the study is:

$$I = kP^{\alpha }$$
(1)

The logarithmic transformation of the above equation is a simple linear relationship where \(\alpha\), the scaling parameter, is given by the slope of the log–log regression line.

$$\log (I) = \alpha \log (P) + \log (k)$$
(2)

The parameters of the correlation, k and α, are calculated using the Ordinary Least Squares (OLS) method because it produces fitted values with the smallest error (Legendre and Legendre 2012) and because OLS is asymmetric (Smith 2009): we are interested in predicting citation impact from the independent variable, the number of peer-reviewed ISLS papers. The exponent of the power law correlation is a measure of the magnitude of the Matthew Effect (Katz 1999; Ronda-Pupo and Katz 2017; van Raan 2013), mathematically described as follows. Given a scaling correlation \(f\left( x \right) = kx^{\alpha }\), then for \(\alpha > 1\), \(f\left( x \right)\) increases non-linearly with \(x\), indicating that the correlation is super-linear and its magnitude is a measure of the Matthew Effect or cumulative advantage (Katz 2006; van Raan 2013).
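A minimal sketch of the log–log OLS fit of Eq. (2), using NumPy. The function name and the synthetic data are illustrative assumptions; the study's own data and software are not reproduced here. Both variables must be strictly positive for the logarithms to exist.

```python
import numpy as np


def fit_power_law_correlation(papers, citations):
    """Estimate k and alpha in I = k * P**alpha by OLS on log10-transformed data."""
    log_p = np.log10(np.asarray(papers, dtype=float))
    log_i = np.log10(np.asarray(citations, dtype=float))
    # polyfit returns the coefficients highest degree first: slope, then intercept.
    alpha, log_k = np.polyfit(log_p, log_i, deg=1)
    return 10.0 ** log_k, alpha


# Synthetic, noise-free data generated with k = 2 and alpha = 1.3 (illustrative only).
p_values = np.array([10.0, 20.0, 40.0, 80.0, 160.0])
i_values = 2.0 * p_values ** 1.3
k_hat, alpha_hat = fit_power_law_correlation(p_values, i_values)
print(round(k_hat, 3), round(alpha_hat, 3))
```

The slope of the fitted line is the scaling exponent, and the intercept gives \(\log (k)\).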

When \(\alpha = 1\), both citation impact and the size of the system grow linearly at the same rate. There is no cumulative advantage of one variable over the other. If the scaling exponent is \(\alpha < 1\) the correlation is sub-linear and its magnitude is a measure of the inverse Matthew Effect or cumulative disadvantage (Katz and Cothey 2006). The dependent variable scales negatively with the size of the system.

To determine the collaboration type with the stronger Matthew Effect, it is assumed that if we find \(f\left( x \right)' = kx^{\alpha '}\) where \(f\left( x \right)' > f\left( x \right)\), the Matthew Effect for \(f\left( x \right)'\) is stronger than the Matthew Effect for \(f\left( x \right)\). This can be true if and only if \(\alpha ' > \alpha\); the magnitude of the exponent is proportional to the increase in the Matthew Effect (Ronda-Pupo and Katz 2017).
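The interpretation of the exponent can be summarized in a short sketch. For \(f\left( x \right) = kx^{\alpha }\), doubling \(x\) multiplies \(f\left( x \right)\) by \(2^{\alpha }\), whatever the value of \(k\). The function names and the example exponents below are hypothetical.

```python
def scaling_regime(alpha, tol=1e-9):
    """Label the scaling correlation according to its exponent."""
    if alpha > 1.0 + tol:
        return "super-linear (Matthew Effect)"
    if alpha < 1.0 - tol:
        return "sub-linear (inverse Matthew Effect)"
    return "linear"


def doubling_factor(alpha):
    """Factor by which k * x**alpha grows when x doubles: f(2x) / f(x) = 2**alpha."""
    return 2.0 ** alpha


# Hypothetical exponents for two collaboration types.
for label, alpha in [("type A", 1.2), ("type B", 0.8)]:
    print(label, scaling_regime(alpha), round(doubling_factor(alpha), 2))
```

Comparing two collaboration types then amounts to comparing their exponents: the larger exponent gives the larger doubling factor.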

Data source

The data for the study consist of publications in Information Science & Library Science journals listed in the Journal Citation Reports, 2015 edition. We used the document types article (including proceedings papers published in journals) and review. We used these publication types for two reasons: (1) they are peer-reviewed and (2) they are a primary route for disseminating new knowledge in most scientific disciplines (Adams and Gurney 2013).

To retrieve the data, we used the Advanced Search of the Web of Science™ Core Collection with the field tag SO = ‘journal names separated by the Boolean operator OR’, refined by Document Types: (Article OR Review). Timespan: 1985–2012, inclusive. Indexes: SCI-EXPANDED, SSCI, A&HCI.
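As an illustration of how such a query string might be assembled, the sketch below joins journal titles with OR inside an SO clause. The helper name is hypothetical and the two titles are placeholders; the study's actual list of 86 journals is not reproduced here.

```python
def build_so_query(journal_names):
    """Join journal titles into an Advanced Search SO=(... OR ...) clause."""
    quoted = ['"{}"'.format(name) for name in journal_names]
    return "SO=(" + " OR ".join(quoted) + ")"


# Placeholder titles for illustration only.
example_query = build_so_query(["Scientometrics", "Journal of Informetrics"])
print(example_query)
```

The document type and timespan refinements described above would still be applied in the search interface.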

Results

A search on the Web of Science Core Collection with the criteria posed for the query retrieved 207,711 documents, yielding 54,656 articles and reviews. A total of 17,409 papers with incomplete address fields were excluded because it was not possible to code them as either international or domestic. Also, 9,116 papers that did not receive citations in the selected citation window were removed. Table 2 shows the number of articles published in ISLS journals between 1985 and 2012, and the number of citations they received according to collaboration type.

Table 2 Number of documents and citations according to authorship patterns

Multi-authored documents accounted for 69% of all ISLS papers published in the time frame analyzed. Han et al. (2013) reported a similar result: 66% of LIS articles are multi-authored. This result also supports the idea of Ardanuy (2011) that academic collaboration in the social sciences is relatively low compared with the experimental sciences.

In 81% of multi-authored papers, collaboration is domestic, that is, between institutions within the same country. This is due to the USA accounting for 48.26% of the overall scientific production of the ISLS discipline and, as Han et al. (2013) reported, the institutions located in the USA being more devoted to domestic collaborations, which suggests that institutions in the USA have a low propensity towards international collaboration in ISLS. Multi-authored papers accounted for 77% of overall citations. Domestic collaborative papers accounted for 78% of the citations to multi-authored papers.

The distribution of citations to articles in ISLS journals

The distributions of citations to different authorship patterns were analyzed using the Clauset et al. (2009, p. 3, see Box 1) routine. The procedure is summarized in two steps: (1) estimating the parameters \(x_{ \hbox{min} }\), \(\alpha\) and \(p\), and (2) comparing the power law with alternative distributions, as explained below.

Estimating parameters \(x_{ \hbox{min} }\) and \(\alpha\).

The exponents, α, of the power law distributions were determined using the Maximum Likelihood Estimator and the Kolmogorov–Smirnov (KS) test for estimating \(x_{ \hbox{min} }\).
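A minimal sketch of this estimation step, under stated assumptions: it uses the continuous approximation of the maximum-likelihood estimator, \(\hat{\alpha } = 1 + n/\sum \ln (x_{i} /x_{ \hbox{min} })\), and selects the \(x_{ \hbox{min} }\) that minimizes the KS distance. Citation counts are discrete, and Clauset et al. (2009) also describe a discrete estimator, so this is an illustration of the idea rather than the routine used in the study. All function names and the synthetic sample are hypothetical.

```python
import numpy as np


def fit_alpha(tail, x_min):
    """Continuous maximum-likelihood estimate of alpha for values >= x_min."""
    return 1.0 + len(tail) / np.sum(np.log(tail / x_min))


def ks_distance(tail, x_min, alpha):
    """Largest gap between the empirical CDF of the tail and the fitted CDF."""
    x = np.sort(tail)
    n = len(x)
    model_cdf = 1.0 - (x / x_min) ** (1.0 - alpha)
    d_plus = np.max(np.arange(1, n + 1) / n - model_cdf)
    d_minus = np.max(model_cdf - np.arange(0, n) / n)
    return float(max(d_plus, d_minus))


def fit_power_law(data, min_tail=10):
    """Scan candidate x_min values; keep the one with the smallest KS distance."""
    data = np.asarray(data, dtype=float)
    best = None
    for x_min in np.unique(data):
        tail = data[data >= x_min]
        # Skip candidates that are non-positive, too short, or degenerate.
        if x_min <= 0 or len(tail) < min_tail or np.all(tail == x_min):
            continue
        alpha = fit_alpha(tail, x_min)
        d = ks_distance(tail, x_min, alpha)
        if best is None or d < best[2]:
            best = (float(x_min), float(alpha), d)
    return best  # (x_min, alpha, KS distance), or None if no candidate qualified


# Synthetic sample drawn from a continuous power law with alpha = 2.5, x_min = 1.
rng = np.random.default_rng(0)
sample = (1.0 - rng.random(2000)) ** (-1.0 / 1.5)
x_min_hat, alpha_hat, ks = fit_power_law(sample)
print(x_min_hat, round(alpha_hat, 2), round(ks, 3))
```

The `min_tail` threshold is a design choice of this sketch: it prevents a very short tail from being selected merely because few points are easy to fit.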

Table 3 shows the results of fitting a power law distribution, using 5000 iterations of Monte Carlo bootstrapping, to the distribution of citations for the following five data sets: overall papers, multi-authored papers, international multi-authored papers, domestic multi-authored papers and single-authored papers. All exponents lie within the range 2–3, suggesting the presence of a possible power law distribution. The p value is significant for the distributions of citations to single-authored and international collaborative papers. To corroborate whether the distributions follow a pure power law, a comparison with possible alternative distributions was carried out.
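The bootstrap goodness-of-fit idea can be sketched as follows. This is a simplified illustration, not the study's procedure: it keeps \(x_{ \hbox{min} }\) fixed and resamples only the tail from a continuous power law, whereas the full routine of Clauset et al. (2009) also resamples the values below \(x_{ \hbox{min} }\) and re-estimates \(x_{ \hbox{min} }\) for every synthetic data set. Under their convention, a large p value (they suggest p > 0.1) means the power law cannot be ruled out. Function names and data are hypothetical.

```python
import numpy as np


def alpha_mle(tail, x_min):
    """Continuous maximum-likelihood estimate of the exponent."""
    return 1.0 + len(tail) / np.sum(np.log(tail / x_min))


def ks_stat(tail, x_min, alpha):
    """KS distance between the empirical and the fitted power law CDF."""
    x = np.sort(tail)
    n = len(x)
    cdf = 1.0 - (x / x_min) ** (1.0 - alpha)
    return float(max(np.max(np.arange(1, n + 1) / n - cdf),
                     np.max(cdf - np.arange(0, n) / n)))


def bootstrap_p_value(tail, x_min, n_iter=1000, seed=0):
    """Share of synthetic power law samples whose KS distance is at least
    as large as the one observed for the data."""
    rng = np.random.default_rng(seed)
    tail = np.asarray(tail, dtype=float)
    alpha = alpha_mle(tail, x_min)
    d_obs = ks_stat(tail, x_min, alpha)
    exceed = 0
    for _ in range(n_iter):
        # Inverse-transform sampling from the fitted power law.
        u = rng.random(len(tail))
        synth = x_min * (1.0 - u) ** (-1.0 / (alpha - 1.0))
        d_syn = ks_stat(synth, x_min, alpha_mle(synth, x_min))
        exceed += int(d_syn >= d_obs)
    return exceed / n_iter


# Two synthetic tails with x_min = 1: a true power law and a shifted exponential.
rng = np.random.default_rng(1)
power_law_tail = (1.0 - rng.random(1000)) ** (-1.0 / 1.5)
exponential_tail = 1.0 + rng.exponential(1.0, 1000)
p_power_law = bootstrap_p_value(power_law_tail, 1.0, n_iter=200)
p_exponential = bootstrap_p_value(exponential_tail, 1.0, n_iter=200)
print(p_power_law, p_exponential)
```

The number of iterations controls the resolution of the p value; the example uses 200 only to keep the run short.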

Table 3 Results of fitting the power-law to the datasets

Comparing the power law with alternative distributions

The hypothesis for the power law distribution was compared to exponential, log-normal, Poisson, and power law with exponential cut-off as possible alternative distributions using log-likelihood ratios (LR) and p values. Positive values of LR indicate that the power law model is favored over the alternative. The distribution with the most negative LR and significant p value is favored (Clauset et al. 2009).
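A sketch of the log-likelihood ratio comparison for one alternative, the exponential distribution. It assumes continuous densities above a known \(x_{ \hbox{min} }\) and uses the normal approximation for the p value, \(p = {\text{erfc}}\left( {\left| {LR} \right|/\sqrt {2n} \sigma } \right)\), where \(\sigma\) is the standard deviation of the pointwise log-likelihood differences. Function names and data are hypothetical, and the other alternatives (log-normal, Poisson, power law with exponential cut-off) would need their own likelihoods.

```python
import math

import numpy as np


def loglik_power_law(x, x_min):
    """Pointwise log-likelihood under the fitted continuous power law."""
    alpha = 1.0 + len(x) / np.sum(np.log(x / x_min))
    return np.log((alpha - 1.0) / x_min) - alpha * np.log(x / x_min)


def loglik_exponential(x, x_min):
    """Pointwise log-likelihood under the fitted exponential shifted to x_min."""
    lam = 1.0 / (np.mean(x) - x_min)
    return np.log(lam) - lam * (x - x_min)


def likelihood_ratio_test(data, x_min):
    """Return (LR, p); LR > 0 favors the power law, LR < 0 the exponential."""
    x = np.asarray(data, dtype=float)
    x = x[x >= x_min]
    diff = loglik_power_law(x, x_min) - loglik_exponential(x, x_min)
    lr = float(np.sum(diff))
    sigma = float(np.std(diff))
    p = math.erfc(abs(lr) / (math.sqrt(2.0 * len(x)) * sigma))
    return lr, p


# Synthetic data: a heavy-tailed power law sample and a light-tailed exponential one.
rng = np.random.default_rng(2)
heavy = (1.0 - rng.random(2000)) ** (-1.0 / 1.5)
light = 1.0 + rng.exponential(1.0, 2000)
lr_heavy, p_heavy = likelihood_ratio_test(heavy, 1.0)
lr_light, p_light = likelihood_ratio_test(light, 1.0)
print(lr_heavy > 0, lr_light < 0)
```

The sign of LR indicates which model is favored, and the p value indicates whether that sign can be trusted.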

Table 4 shows the results of comparing the power law hypothesis to other competing distributions. The results suggest that the power law fit for citations to single-authored and for international multi-authored papers cannot be ruled out by competing distributions so the power law seems to be a good fit for these distributions.

Table 4 Results of comparing the power law to competing distributions

For the overall and domestic multi-authored citation distributions, the p value for the log-normal is significant, but the log-likelihood ratio for the power law with exponential cut-off is higher in both cases. Since the p value of the power law alternative is < 0.10, the power law with exponential cut-off is likely a better fit. According to Clauset et al. (2009): “In order to make a firm choice between distributions we need a log-likelihood ratio that is sufficiently positive or negative that it could not plausibly be the result of a chance fluctuation from a true result that is close to zero”. For the citations to multi-authored papers, the power law with cut-off fit has a significant p value and a negative LR; therefore, we conclude that it is the best fit. The power law with exponential cut-off is a degenerate form of a pure power law distribution described by the equation \(f\left( x \right) = kx^{ - \alpha } e^{ - \lambda x}\). Its characteristic behavior is that some entities in the far right-hand tail of the distribution do not occur with as high a probability as would be expected of a pure power law distribution.

Collaboration type and scaling correlations

The effects that the type of collaborative activity had on two scaling correlations were examined: (1) the scaling correlation between the relative exponential growth of citations over time and the exponential growth of papers and (2) the scaling correlation between citation impact and journal size, measured using the number of papers.

It has been shown for any pair of coupled exponentially growing or decaying processes, \(x = me^{ - at}\) and \(y = ne^{ - bt}\), that a scaling correlation exists between \(y\) and \(x\) with a scaling exponent given by the ratio of the exponents of the underlying exponential processes. In other words, \(y \approx x^{\gamma }\), where \(\gamma = \frac{b}{a}\) (Katz 2005; Sahal 1981). This principle is illustrated in Figs. 1 and 2. Figure 1 shows the exponential growth of multi-authored ISLS papers from 1985 to 2012, and of the citations to these papers. Figure 2 shows the scaling correlation between the growth of citations and of multi-authored papers over the time interval. The ratio of the exponential exponents from Fig. 1 is 0.180/0.134 = 1.34 and the exponent of the power law correlation is 1.34 (Fig. 2). Since the scaling exponent is > 1.0, citations to multi-authored ISLS papers tended to increase \(2^{1.34}\) or 2.5 times with a doubling in the number of papers in journals.
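The step from two exponential processes to a power law can be made explicit by eliminating \(t\). A short derivation using the two processes defined above, assuming \(a \ne 0\) and \(x/m > 0\):

$$x = me^{ - at} \;\Rightarrow\; e^{ - t} = \left( {\frac{x}{m}} \right)^{1/a}$$

$$y = ne^{ - bt} = n\left( {e^{ - t} } \right)^{b} = n\left( {\frac{x}{m}} \right)^{b/a} = \frac{n}{{m^{b/a} }}\,x^{b/a}$$

Hence \(y\) is proportional to \(x^{b/a}\): the prefactors \(m\) and \(n\) affect only the constant of proportionality, while the exponent depends only on the ratio of the two rates.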

Fig. 1 The exponential growth of citation impact (open dots) and multi-authored ISLS papers (full dots) (1985–2012)

Fig. 2 Power law correlation over the time interval 1985–2012

The log-transformed variables passed the Shapiro–Wilk normality test, and the constant-variance test was passed for all datasets. Like Coccia and Bozeman (2016), we used Student’s t-distribution to verify whether the scaling exponents indicate a power law correlation between the variables under analysis. Table 5 shows the values with the associated probability.

Table 5 Relative growth of I and P for each collaboration type (1985–2012)

Table 5 gives the scaling exponents for the relationship between citation impact and numbers of publications of various types of authorship. The Matthew Effect for multi-authored articles is higher than it is for single-authored papers. The result contradicts the finding of Hart (2007) of no evidence to support the superiority of multi-authored articles and supports the conclusion of Levitt and Thelwall (2009, 2016) that collaborative research is conducive to higher citations.

It can also be seen that the Matthew Effect for domestic multi-authored articles is stronger than for international multi-authored ones. The citation impact of domestic multi-authored articles increased \(2^{1.35}\) or 2.55 times when the number of domestic multi-authored articles doubled. On the other hand, the citation impact of international multi-authored articles increased \(2^{1.26}\) or 2.40 times. These results do not support the conclusion of Sin (2011) that papers that included international collaboration had higher odds of being cited more.

Now let us turn to the scaling correlation between citation impact and ISLS journal size. It is difficult to determine how the scaling relationship between citation impact and journal size changes for each multi-authorship type because of the relatively small number of papers published annually. To gain a sense of how it might have changed, the data were divided into two time intervals: pre-2000 and post-2000 publications. The pre-2000 interval contains papers published between 1985 and 1999. The post-2000 interval contains papers published between 2000 and 2012.

Table 6 gives the exponents for the scaling correlations between citation impact and journal size, measured as the number of papers in the journal, for different collaboration types in three time intervals: 1985–2012, pre-2000 and post-2000. The Matthew Effect for collaborative papers (multi-authored, international and domestic collaboration) published from 1985 to 2012 was stronger than for non-collaborative (single-authored) papers. Also, the Matthew Effect tended to increase from the pre-2000 to the post-2000 interval for collaborative papers, whereas it declined for non-collaborative papers. It is worth noting that the Matthew Effect for international collaboration was stronger than for domestic collaboration for 1985–2012. However, the percentage increase in the doubling effect from the pre-2000 interval to the post-2000 interval was greater for domestic collaboration than for international collaboration. In other words, the doubling effect of domestic collaboration grew from \(2^{1.16}\) to \(2^{1.36}\), or 15%, while that of international collaboration grew from \(2^{1.29}\) to \(2^{1.40}\), a 7% increase.
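The comparison of doubling effects between the two intervals can be reproduced with a few lines. The helper name is hypothetical, and the inputs are the exponents quoted in the text; because those exponents are rounded to two decimals, the output may differ slightly from the percentages reported above.

```python
def doubling_effect_change(alpha_before, alpha_after):
    """Percentage change in the doubling factor 2**alpha between two periods."""
    before = 2.0 ** alpha_before
    after = 2.0 ** alpha_after
    return 100.0 * (after - before) / before


# Exponents quoted in the text for the pre-2000 and post-2000 intervals.
domestic_change = doubling_effect_change(1.16, 1.36)
international_change = doubling_effect_change(1.29, 1.40)
print(round(domestic_change, 1), round(international_change, 1))
```

The change depends only on the difference between the two exponents, since \(2^{\alpha_{2}}/2^{\alpha_{1}} = 2^{\alpha_{2} - \alpha_{1}}\).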

Table 6 The scaling correlation across journals for each collaboration type in three time intervals

Discussion and conclusion

The distributional analysis showed that all of the distributions could be modeled by a power law or power law with an exponential cut-off. These distributions exhibit scale-invariant properties, although the upper limit of the scale-invariant tail of a power law distribution with exponential cut-off is limited (Katz 2016).

The exponents for all of the distributions had values \(<\, 3.0\). A power law distribution with a scaling exponent \(< \,3.0\) cannot be characterized by its mean and standard deviation (Newman 2005). Therefore, indicators based on averages derived from these distributions cannot be reliably used for comparative purposes (Katz 2005). Scale-invariant citation distributions appear to be a common attribute of the ISLS scientific community.

The examination of the relative growth of citation impact with respect to the number of papers over time for the various collaboration types gives novel insights. The citation impact of domestic multi-authored ISLS articles increased 2.55 times with a doubling in the number of such papers, while the citation impact of single-authored papers increased 2.41 times. Similar findings were reported for the Natural Sciences (Ronda-Pupo and Katz 2017) and in the field of Management (Ronda-Pupo and Katz 2016a). The impact of international multi-authored articles increased 2.40 times with a doubling in the number of internationally multi-authored papers. Contrary to the findings of Sin (2011), citation impact grew faster relative to the number of papers for domestic multi-authored articles than for international multi-authored articles.

In general, the results show that for both types of multi-authored articles, domestic and international, the scaling exponents increased over time while the exponents of single-authored papers decreased. This behavior is similar to a finding previously reported (Ronda-Pupo and Katz 2016a, 2017) supporting the idea that collaboration is a driver fostering greater citation impact in scientific fields.

An effect that is important to note is that when the power law approach is used to assess the bivariate correlation between citation impact and authorship patterns over time or at points in time, the hypothesis that collaboration is a driver to foster citation impact is supported. However, studies that have analyzed this relationship using nonparametric statistics do not always support this finding.

An examination of how citation impact increases with journal size by collaboration type produced additional novel insights. The exponents of the scaling correlations between citation impact and journal size were \(> 1.0\), irrespective of collaboration type, indicative of a Matthew Effect. Multi-authored articles increased their citation impact 2.53 times when the number of multi-authored articles published in a journal in a year doubled. The citation impact of single-authored articles increased 2.41 times when their number doubled in a year.

The present study supports the notion of greater cumulative advantage of citation impact of multi-authored articles over single-authored ones. These findings are similar to previous results reported for articles in management journals (Ronda-Pupo and Katz 2016a) and the domain of Natural Sciences (Ronda-Pupo and Katz 2017).

The practical implication for ISLS policy makers and research evaluation bodies is that the results support previous findings that research collaboration is a driver to enhance productivity and citation impact. Currently, the ISLS scientific community gains more citation impact through domestic than international collaboration. Two questions remain: (1) will domestic collaboration continue to have a stronger Matthew Effect than international collaboration in the future and (2) will the Matthew Effect of single-author papers continue to decline?