Introduction

A major recent goal of research policy is to encourage collaboration. One reason for encouraging collaboration is that the findings of numerous studies (described below) show that collaborative research tends to be more highly cited than non-collaborative research. Building on the assumption that higher citation reflects higher research quality, the reasoning is that encouraging behaviour associated with high citation could be conducive to higher research quality. But to what extent does the higher citation of collaborative research vary from region to region? Many studies of collaboration have examined articles by authors from a single country. An advantage of comparing regions is that the comparison can indicate whether findings in studies on a single country are likely to apply to countries in general. Moreover, establishing the extent to which the relationship between citation and collaboration varies from region to region provides further understanding of that relationship.

This research investigates the variation between countries and American states in the extent to which collaborative research is more highly cited than non-collaborative research. The rationale is that major variations between regions would indicate that it cannot be assumed that findings on collaboration in one region would apply to another region. The definition of collaboration used here is that an article is collaborative if, and only if, the database from which the data is collected lists it as having more than one author; thus the authors of a collaborative article can be in the same institution or even in the same department.

This research also introduces a method that can be applied to other studies: It examines the extent to which findings for the countries are mirrored by findings for the US states. This provides a broader picture of the phenomenon investigated.

Related research

Several macro-level studies in science have convincingly shown an association between collaboration and citation. In their investigation of all papers indexed by the Science Citation Index (SCI) in 1992, Persson et al. (2004) found that the mean citation rate excluding self-citations increased, on average, by about .58 citations for each additional author. In an investigation of Spanish SCI articles published between 1991 and 1993, Bordons et al. (1996) found statistically significant correlations for Gastroenterology, Cardiovascular Systems and Neuroscience between the productivity of authors and their international and domestic collaboration, but for local collaboration found a statistically significant correlation only in the case of Gastroenterology. This indicates that some types of collaboration may not be conducive to productivity and also suggests that national collaboration may be less strongly associated with higher quality research than international collaboration is. An alternative explanation may be that more capable researchers are more able to establish international collaboration or that higher quality research projects are more likely to attract international teams.

As regards international collaboration, an investigation of nearly half a million U.K. SCI publications from 1981 to 1994 (Katz and Hicks 1997) found that articles by authors from two countries, on average, received about 50% more citations than articles by authors from a single country. Positive associations between international collaboration and citation rates have also been found for Chilean physics (Vogel 1997), Scandinavian science (Glänzel 2000), chemistry (Glänzel and Schubert 2001), Brazilian science (Leta and Chaimovich 2002), New Zealand science (Goldfinch et al. 2003), Danish industry (Frederiksen 2004), HIV/AIDS in Nigeria (Uthman 2008) and wood preservative chemical research (Yi et al. 2008). Ma and Guan (2006) found a correlation between collaboration and citation for Chinese molecular biology, although the earlier study of Herbertz (1995) did not find this correlation amongst well-known research institutes. A link has also been found between higher citation and collaboration for Brazilian Management Science (Pereira et al. 2000) and library and information science (Levitt and Thelwall 2009). Glänzel (2002) found that the relationship between collaboration and citation was field dependent.

In social science, some studies that use SSCI data have not found statistically significant correlations between collaboration and citation. These include investigations of sociology (Crase and Rosato 1992), finance (Avkiran 1997), ecology in the US and Europe (Leimu 2005) and two library science journals (Hart 2007).

A few studies of collaboration have compared multiple countries: Latin American countries (Gómez et al. 1999), US and Europe (Leimu 2005), Scandinavian countries (Glänzel 2000), 36 countries worldwide (Glänzel and Schubert 2001) and 50 countries worldwide (Glänzel 2001). Gómez et al. (1999) found considerable variation in the collaborative rates of Latin American countries; for 1991–1995 the internal collaboration rate was 23–40% for the five countries that published 2,000 papers whereas it was 60–74% for the five countries that published fewer than 100 papers. In an investigation of 837 papers published in Oecologia between 1998 and 2000, Leimu (2005) found that the citation advantage of collaborative articles was higher for US authors than for European authors. Glänzel (2000) found that for each of the 31 countries that published the largest number of science papers in 1995, on average internationally collaborative articles were more highly cited than non-international collaborative articles. Glänzel and Schubert (2001), in an investigation of chemistry papers published in 1995, found that for 33 of the 36 countries international collaborative papers were on average more highly cited than non-international collaborative papers. Glänzel (2001), in a comparison of collaboration for the 50 countries that had at least 1,000 WoS publications in 1995 and 1996, found large differences: for Switzerland 47.5% of publications were internationally collaborative, whereas for Japan the figure was only 14.4%.

Research questions

Glänzel (2000) and Glänzel and Schubert (2001) found that for most countries investigated (31 out of 31 and 33 out of 36) international collaborative papers were on average more highly cited than non-international collaborative papers, but this citation differential varies considerably from country to country. However, no study has investigated whether the citation advantage of all types of collaborative articles over non-collaborative articles applies to all countries or varies considerably from country to country. In addition, no study has investigated for US states the citation advantage of collaborative over non-collaborative articles.

In order to fill these gaps, this study compares collaboration in 18 countries and 17 US states with at least 70 SSCI Economics articles published in 2000. It addresses the following questions:

  1. Is there a citation advantage of collaborative over non-collaborative articles for every country and every state?

  2. Does the citation advantage of collaborative over non-collaborative articles vary considerably from country to country or from state to state?

  3. To what extent do the findings depend on the indicator of citation level used?

The rationale for investigating the first two questions is that, as most investigations of collaboration examine a single country or all countries taken together, it is important to establish the extent to which findings on collaboration for one country apply to other countries. The rationale for investigating the third question is that most investigations of collaboration have used a single indicator of citation, the mean number of citations per article. One disadvantage of this indicator is that mean citations can be heavily skewed by a few very highly cited articles. Because of the large number of countries and states examined, this study does not investigate the reasons why collaborative articles are more highly cited than non-collaborative articles or why the difference varies between regions.

Data collection

This research examines collaboration in the 18 countries and 17 US states for which the SSCI has at least 70 Economics articles published in 2000. Norris et al. (2008) have compared the inter-country citation levels of Economics articles, but did not examine collaboration. Seventy was used as the cut-off point in order to include as many countries and US states as possible, whilst avoiding a large proportion of the values for one indicator (the 90th percentile) being less than five. The percentiles were obtained by using the ‘Times cited’ facility of the SSCI to rank articles in decreasing order of citation. The 90th percentile is defined as the number of citations of the article with the fewest citations that is ranked in the top 10%; two other indicators (the 75th and 50th percentiles) are defined in a similar way.
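The percentile definition above can be illustrated with a short Python sketch. It is a minimal illustration only: the function name and the sample citation counts are hypothetical, and it assumes that the size of the top group is rounded up when it is not a whole number, a detail the definition does not specify.

```python
import math


def citation_percentile(citations, percentile):
    """Citations of the least-cited article ranked in the top (100 - percentile)%.

    Assumption: the size of the top group is rounded up, so that at least
    one article is always included.
    """
    if not citations:
        raise ValueError("citations must not be empty")
    ranked = sorted(citations, reverse=True)  # decreasing order of citation
    top_count = max(1, math.ceil(len(ranked) * (100 - percentile) / 100))
    return ranked[top_count - 1]


# Hypothetical citation counts for 20 articles (not data from this study)
sample = [40, 25, 18, 12, 9, 9, 7, 6, 5, 5, 4, 3, 3, 2, 2, 1, 1, 0, 0, 0]
p90 = citation_percentile(sample, 90)  # least-cited of the top 2 articles
p75 = citation_percentile(sample, 75)  # least-cited of the top 5 articles
p50 = citation_percentile(sample, 50)  # least-cited of the top 10 articles
```

Because the indicator is the citation count of a single article, it always takes an integer value, which is relevant to the precision issues discussed later for small groups of articles.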

Economics was chosen as it is the subject category of the SSCI with the most articles published in 2000; this year was chosen to allow a long citation window whilst providing information on recent collaboration. For technical reasons the citation window is citation-to-date (November 2008).

The data collection from the SSCI used a search to isolate the 6,636 Economics articles published in 2000. The articles were extracted by isolating all articles in the SSCI in a year, by specifying in the General Search that ‘A* OR B* OR C* OR D* OR E* OR F* OR G* OR H* OR I* OR J* OR K* OR L* OR M* OR N* OR O* OR P* OR Q* OR R* OR S* OR T* OR U* OR V* OR W* OR X* OR Y* OR Z* OR 0* OR 1* OR 2* OR 3* OR 4* OR 5* OR 6* OR 7* OR 8* OR 9*’ in the Publication name. This method was used extensively in Levitt and Thelwall (2008, 2009). In the investigation of countries the ‘Analyze’ facility was used to identify the countries with over 70 articles and to isolate the articles in each country. For each country the number of citations, a list of the authors and the language used were obtained from the record of each article. The rationale for examining the language is that this data is used to identify the extent to which the findings are affected when the study is limited to articles in English. In the case of US states, searches were conducted to isolate all Economics articles that had the two-letter state identifier (e.g., NY for New York) in the address, and a sample of the data was checked for false matches. For example, a search for ‘IL’ in the address yields articles not only by authors from Illinois, but also articles by authors from Israel; the false matches were eliminated by hand processing.
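The wildcard expression quoted above follows a regular pattern (one prefix term for each letter and each digit), so it can be generated rather than typed by hand. The following Python sketch is illustrative only; the function name is hypothetical, and no claim is made that the query was originally produced in this way.

```python
import string


def build_wildcard_query():
    """Join one 'X*' term per upper-case letter and digit with ' OR '."""
    prefixes = list(string.ascii_uppercase) + list(string.digits)
    return " OR ".join(f"{prefix}*" for prefix in prefixes)


query = build_wildcard_query()
print(query)
```

The resulting string has 36 terms, beginning with ‘A*’ and ending with ‘9*’, in the same order as the expression given in the text.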

For technical reasons this study does not exclude self-citations. However, Glänzel and Thijs (2004), in an investigation of all WoS articles, letters, notes and reviews published during 1992–2002, found that “at the macro level multi-authorship does not result in any exaggerated extent of self-citations.” In addition, a previous study indicated that higher rates of self-citation in international collaboration do not play any significant role in increasing a nation’s impact (Van Raan 1998). However, the proportion of publications with self-citation was found to be over 30% for China, Spain, Japan and Belgium (Glänzel et al. 2004). China had the highest percentage of self-citing publications and therefore findings on self-citation in the cited articles by authors from China are presented in the limitations section.

Data analysis

Four indicators of citation level are used: mean citations per article, and the number of citations of the articles in the 90th, 75th and 50th percentiles. Numerous studies of collaboration (e.g., Katz and Hicks 1997; Goldfinch et al. 2003; Frederiksen 2004; Yi et al. 2008; Levitt and Thelwall 2009) have used mean citation and several recent studies (Lewison 2007; Schwartz and Fang 2007; Levitt and Thelwall 2009) have used percentiles. Some indicators, other than mean citation and percentiles, were also considered.

Relative Expected Citation Rate (RECR), introduced in Schubert and Braun (1986) and used in Glänzel (2000), is a normalised mean citation rate per article. For example the RECR for US Economics articles is the mean citation rate for US Economics articles divided by the mean citation rate for all Economics articles in the world. It was decided to use the mean citation rate rather than the RECR, as the mean provides data on the actual citation level and this investigation does not compare subjects.
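The normalisation described in the example above can be sketched in Python as a ratio of two means. This is a minimal illustration that follows only the description given in this section; the function names and citation counts are hypothetical, and the sketch is not intended as a full implementation of the indicator as specified by Schubert and Braun (1986).

```python
def mean_citations(citations):
    """Mean number of citations per article."""
    return sum(citations) / len(citations)


def normalised_mean_citation_rate(region_citations, world_citations):
    """Regional mean citation rate divided by the world mean citation rate."""
    return mean_citations(region_citations) / mean_citations(world_citations)


# Hypothetical data: the world set includes the regional articles
region = [10, 6, 2]            # mean 6.0
world = [10, 6, 2, 1, 1, 4]    # mean 4.0
ratio = normalised_mean_citation_rate(region, world)  # 6.0 / 4.0
```

A value above one indicates that the region's articles are cited more than the world average for the subject; the unnormalised mean, by contrast, reports the actual citation level.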

Two indicators were considered as alternatives to percentiles. One indicator considered was multiples of the world average, used in the analysis of the pilot data of the UK’s Research Framework Exercise (HEFCE 2009, p. 14). This is similar to our citation level indicator, because it is based upon a specific cut-off level for citation, but the percentile indicator used here gives additional flexibility. Another indicator considered was the Hirsch k-Frequency. Levitt and Thelwall (2007) defined the Hirsch k-Frequency as the number of documents in the collection that are cited at least k * h times, where h is the h-index of the collection. Percentiles were preferred in this study, as these seem more suited to comparisons between subjects.
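The definition of the Hirsch k-Frequency given above can be sketched in Python as follows. The function names and the sample data are hypothetical, and the h-index is computed in the standard way (the largest h such that h documents have at least h citations each).

```python
def h_index(citations):
    """Largest h such that h documents each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def hirsch_k_frequency(citations, k):
    """Number of documents cited at least k * h times (h = h-index)."""
    threshold = k * h_index(citations)
    return sum(1 for cites in citations if cites >= threshold)


# Hypothetical collection of ten documents (h-index is 5)
collection = [30, 12, 9, 8, 5, 4, 4, 2, 1, 0]
freq_1 = hirsch_k_frequency(collection, 1)  # documents with >= 5 citations
freq_2 = hirsch_k_frequency(collection, 2)  # documents with >= 10 citations
```

Note that the cut-off in this indicator is derived from the collection itself, whereas a percentile fixes the proportion of documents above the cut-off.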

The choice of the 90th, 75th and 50th percentiles was shaped by the data. It was decided not to use a percentile higher than the 90th percentile, as in the case of Norway the findings for the 95th percentile were based on only four articles. It was decided not to use a percentile lower than the 50th percentile, as in the case of Japan the 50th percentile was only two (and low values resulted in less precise findings). The decision to have increasingly large gaps between the percentiles (15 and 25) reflected that the difference in citation level between the 90th and the 75th percentiles, in general, was substantially larger than the difference in citation level between the 75th and the 50th percentiles.

Mean citation is used because it is a standard indicator in bibliometrics. It might be argued that an advantage of using the mean citation as an indicator is based on the central limit theorem and the notion that the mean of the citation distribution is presumed to characterize the whole citation distribution for a cohort of papers. Whilst percentiles do not have this advantage, unlike the mean they cannot be distorted by one or two very highly cited articles. Moreover, the use of percentiles in bibliometric investigation is well established, in the sense that the median (50th percentile) is often used. Using both the mean and a range of percentiles thus provides a fuller picture of the distribution of citations than using the mean alone.
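The sensitivity of the mean to a single very highly cited article, and the robustness of the median, can be illustrated with a small Python sketch. The citation counts are hypothetical, and the median here is the conventional one computed by Python's statistics module rather than the rank-based percentile defined in the data collection section.

```python
from statistics import mean, median

# Hypothetical citation counts for ten articles
typical = [0, 1, 1, 2, 2, 3, 3, 4, 5, 9]
# The same cohort, except that the most cited article now has 309 citations
with_outlier = typical[:-1] + [309]

mean_before, mean_after = mean(typical), mean(with_outlier)
median_before, median_after = median(typical), median(with_outlier)

# The mean rises from 3 to 33, while the median stays at 2.5
print(mean_before, mean_after, median_before, median_after)
```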

Findings

This study uses the 75th percentile as its key indicator, because the mean citation can be highly skewed by a few very highly cited articles, the 90th percentile sometimes covers too few articles, and the 50th percentile is too crude an indicator because the numbers in some cases do not vary much. First it uses the 75th percentile to seek to identify trends and then uses the other indicators to examine the extent to which the trends apply to the other indicators.

Table 1 presents descriptive statistics and indicators for each country for all SSCI Economics articles published in 2000, including the percentage of articles that are collaborative.

Table 1 Comparison of the number of articles, the percentage of collaborative articles, and the citation level of the 75th percentile for all articles (SSCI Economics articles in 2000)

From Table 1, every country published at least 53 articles and each state at least 81 articles. The range and median of the percentage of collaborative articles for countries are 228% and 91% of those for US states. The range of the citation level for countries is 50% of that for US states; a comparison of the median of citation levels is of limited interest, as the citation level for the US is a close approximation to the median of US states. In the case of the three countries with the highest percentage of articles written in a language other than English (France, 17.5%; Denmark, 12.5%; Germany, 10.2%), the citation levels are depressed by including articles not written in English; the mean citation of all articles not in English is only .50 whereas for each of these countries the mean citation of all articles is between eight and nine. For the countries the correlation between the number of articles and citation level is negative (the antithesis of the Matthew Effect) but not statistically significant, whereas for US states the .65 correlation is positive (Matthew Effect) and highly statistically significant. Although the country with the lowest percentage of collaborative articles (Japan) also had the lowest 75th percentile, for the countries in general there is not a statistically significant correlation between a low percentage of collaborative articles and a low 75th percentile.

Table 2 uses the 75th percentile to compare for the countries and for the US states the citation advantage of collaborative over non-collaborative (solo) articles.

Table 2 Comparison of the values of solo and the citation advantage of collaboration, 75th percentile as indicator (SSCI Economics articles in 2000)

In Table 2, the range and median of non-collaborative (solo) articles for countries are 31 and 67% of those for US states. The range and median of the citation advantage of collaboration for countries are 89 and 92% of those for US states. Although for all 18 countries this citation advantage is greater than or equal to one, for two of the 17 US states it is less than .78 (indicating a considerable citation disadvantage).
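The citation advantage of collaboration can be sketched in Python as a ratio of two citation levels. This is an illustration under stated assumptions: the advantage is taken to be the collaborative citation level divided by the solo citation level (consistent with values below one indicating a disadvantage), the top group is rounded up when computing the percentile, and all names and data are hypothetical.

```python
import math


def percentile_citations(citations, percentile):
    """Citations of the least-cited article in the top (100 - percentile)%."""
    ranked = sorted(citations, reverse=True)
    top_count = max(1, math.ceil(len(ranked) * (100 - percentile) / 100))
    return ranked[top_count - 1]


def citation_advantage(collaborative, solo, percentile=75):
    """Collaborative citation level divided by solo citation level.

    Returns None when the solo level is zero, as the ratio is undefined.
    """
    solo_level = percentile_citations(solo, percentile)
    if solo_level == 0:
        return None
    return percentile_citations(collaborative, percentile) / solo_level


# Hypothetical citation counts for one region
collaborative = [20, 15, 10, 8, 6, 4, 2, 1]  # 75th percentile: 15
solo = [12, 10, 7, 5, 3, 2, 1, 0]            # 75th percentile: 10
advantage = citation_advantage(collaborative, solo)  # 15 / 10
substantial_disadvantage = advantage is not None and advantage < 0.8
```

Returning None for a zero denominator is a design choice of this sketch; with small integer percentiles such cases need explicit handling rather than a silent division error.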

Tables 1 and 2 yielded the following findings on the relative citation levels of countries and of US states: (a) The range of the citation level for countries is half of that for states, (b) The range of solo for countries is less than a third of that for states, (c) The median of solo for countries is two-thirds of that for states, (d) The range and median of the citation advantage of collaboration for countries is about 10% smaller than those for states, and (e) For every country and all except two states there is not a citation disadvantage in collaboration.

The remainder of this section presents data on the other citation indicators that indicate the extent to which these findings are dependent on the choice of citation indicator. Table 3 compares, for each secondary indicator, the citation levels of the countries and US states.

Table 3 Comparison of country and state citation level, using the mean, 90th percentile and 50th percentile (all SSCI Economics articles in 2000)

In Table 3, for each indicator the range of the citation level is lower for countries than for US states; the comparison of the median of citation levels is of limited interest, as the citation level for the US is a close approximation to the median of the US states. Table 6 compares the extent to which the lower range of the citation level for countries depends on the choice of citation indicator. Table 4 compares, for each secondary indicator, the values of solo of the countries and the US states.

Table 4 Comparison of the values of solo, using the mean, 90th percentile and 50th percentile (all SSCI Economics articles in 2000)

In Table 4, for each indicator the range and median of the values of solo are lower for countries than for US states. Table 6 compares the extent to which these findings depend on the choice of citation indicator. Table 5 compares, for each secondary indicator, country and state, the citation advantage of collaboration.

Table 5 Comparison of the citation advantage of collaboration, using the mean, 90th percentile and 50th percentile (all SSCI Economics articles in 2000)

In Table 5, using the mean, the range of the citation advantage of collaboration for countries is less than the range for US states, whereas using the other indicators the range of citation advantage is greater for countries. For each indicator the median citation advantage of collaboration for countries is greater than or equal to that for states. Table 6 compares the extent to which these findings depend on the choice of citation indicator. For each indicator, for all 18 countries the citation advantage of collaboration is greater than or equal to one. However, when using the mean for two states this citation advantage is less than .8 and when using the 90th percentile for four states this citation advantage is less than .8 (indicating a substantial citation disadvantage).

Table 6 Comparison of the values for country expressed as a percentage of the values for state, for all indicators (all SSCI Economics articles in 2000)

The findings frequently compared the range and median value for the countries with the range and median values for the states. The main findings are summarised in Table 6.
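The comparison summarised in Table 6 expresses a summary statistic for the countries as a percentage of the same statistic for the states. A minimal Python sketch of this calculation follows; it assumes that the range is the maximum minus the minimum, and the function names and values are hypothetical rather than taken from the tables.

```python
from statistics import median


def value_range(values):
    """Range, taken here as the maximum minus the minimum."""
    return max(values) - min(values)


def as_percentage_of(country_value, state_value):
    """Country statistic expressed as a percentage of the state statistic."""
    return 100 * country_value / state_value


# Hypothetical indicator values (one per country or state)
country_values = [1.0, 1.2, 1.5, 1.8, 2.0]
state_values = [0.5, 1.0, 1.5, 2.0, 2.5]

range_pct = as_percentage_of(value_range(country_values),
                             value_range(state_values))
median_pct = as_percentage_of(median(country_values),
                              median(state_values))
```

In this hypothetical example the range for countries is half that for states while the medians coincide, which shows why the two statistics are reported separately.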

Discussion

The first research question asks: Is there a citation advantage of collaborative over non-collaborative articles for every country and every state? This question is addressed simultaneously with the third research question: To what extent do the findings depend on the indicator of citation level used?

The findings indicate that for every country examined and every indicator used there is either a citation advantage in collaboration or at least not a disadvantage in collaboration. In addition, when the 50th percentile is used as the indicator of citation, for every state examined there is either a citation advantage in collaboration or at least not a disadvantage. However, when the 75th percentile or mean is used as the indicator of citation two states experience a substantial citation disadvantage (defined here as citation advantage <.8) and when the 90th percentile is used four states experience a substantial citation disadvantage. These findings indicate that in the case of the states, the findings for question one depend on the choice of indicator. Five of the 17 states experienced a substantial citation disadvantage: one for the 75th percentile, mean, and 90th percentile, one for the 75th percentile and mean, and three for the 90th percentile.
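The per-indicator counts and the overall total of five states can be reconciled with a short Python sketch using sets. The state labels S1 to S5 are placeholders, not actual states; only the pattern of overlap is taken from the text.

```python
# Placeholder labels: S1 falls below .8 on three indicators, S2 on two,
# and S3 to S5 on the 90th percentile only.
below_threshold = {
    "75th percentile": {"S1", "S2"},
    "mean": {"S1", "S2"},
    "90th percentile": {"S1", "S3", "S4", "S5"},
    "50th percentile": set(),
}

# A state counts once, however many indicators it falls below .8 on
affected_states = set().union(*below_threshold.values())
counts_by_indicator = {name: len(states)
                       for name, states in below_threshold.items()}
```

The union contains five states, although no single indicator identifies more than four of them.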

The second research question asks: Does the citation advantage of collaborative over non-collaborative articles vary considerably from country to country or from state to state? This question is also addressed simultaneously with the third research question.

In the case of all indicators, for both countries and states the range of the citation advantage exceeds the median citation advantage, indicating that the citation advantage varies substantially from country to country and from state to state. Table 6 indicates that for the key indicator, the 75th percentile, the range and median of the citation advantage for the countries are similar to the range and median for the states. However, in the case of the range, the findings vary considerably from indicator to indicator; for the 75th percentile the range of the countries is 89% that of the states, whereas for the 90th percentile the range of the countries is 166% that of the states. This finding indicates that even at the global level findings on citation advantage can vary considerably from indicator to indicator.
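The criterion used above (the range of the citation advantage exceeding its median) can be expressed as a simple check. The Python sketch below is illustrative; the function name and the citation advantage values are hypothetical, and the range is assumed to be the maximum minus the minimum.

```python
from statistics import median


def range_exceeds_median(advantages):
    """True if (max - min) of the values is larger than their median."""
    return (max(advantages) - min(advantages)) > median(advantages)


# Hypothetical citation advantages for two groups of regions
widely_spread = [1.0, 1.1, 1.3, 1.6, 2.5]     # range 1.5, median 1.3
tightly_grouped = [1.0, 1.1, 1.2, 1.3, 1.4]   # range about 0.4, median 1.2

spread_flag = range_exceeds_median(widely_spread)
grouped_flag = range_exceeds_median(tightly_grouped)
```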

As mentioned in the final sentence of the research questions section, the scope does not extend to investigating the reasons why collaborative articles are more highly cited than non-collaborative articles or why the difference varies between regions. There are many factors, such as the reputation of the journal, the readership of the journal, the topic of the paper, and the reputation of the authors, that could contribute to differing citation rates. It could be, for example, that more reputable economists are more likely to co-author, and thus the author reputation is driving the result and not simply the fact that the paper is co-authored. Perhaps the most likely explanation, however, is that there are different dominant field specialisms across regions, and that these field specialisms may have differing citation cultures. Further research is needed to explain the big differences between states.

Limitations

A limitation of this study is that it is for a single subject in social science and for a single year; the findings might be different for other social science subjects, for science subjects and for different years. In addition, the percentiles take only integer values, and so could be influenced significantly by the presence or absence of a small number of cases; for example, the 50th percentile for non-collaborative French articles is one, but the 55th percentile is two. For this reason, the 50th percentile and ratios need to be interpreted with care. A second limitation, mentioned in the data collection section, is that for technical reasons this article does not exclude self-citations. As also mentioned there, Glänzel et al. (2004) found that China had the highest percentage of self-citing publications. In order to gauge the effect of self-citation, self-citation was investigated in the cited Economics articles published in 2000 by authors from China. For the 66 cited collaborative articles 21% were self-cited, whereas for the 25 cited single-author articles 32% were self-cited; five of the cited collaborative articles were self-cited more than once, whereas only one cited single-author article was self-cited more than once. The five most highly cited articles were all collaborative and on average received 4.4 self-citations; the remaining 56 cited collaborative articles averaged .16 self-citations and the 25 cited single-author articles averaged .36 self-citations. These findings need to be interpreted with care, as they are based on small samples; however, they do not indicate that the findings would have been appreciably different if self-citation had been excluded. An unexpected finding is the much higher level of self-citation amongst the five most highly cited articles; as these were all collaborative, they contributed to the self-citations of the collaborative articles only. An interesting question is whether, on other samples, high self-citation amongst highly cited articles contributes appreciably to the association between collaboration and self-citation.

A third limitation is that ‘considerably’ in the second research question is a qualitative term and therefore some may regard the differences found here as being not substantial enough. A fourth limitation is that many citations occur outside the Web of Science database (WoS); 53% of the citations to SSCI Economics articles may be in non-WoS publications (Moed 2005, p. 126) and another study found that 134 Economics articles averaged .6 citations from WoS and as many as 5.1 citations from Google Scholar (Kousha and Thelwall 2007). It would be interesting to know whether the findings in this research would have been different if a different data source, such as Scopus or Google Scholar, had been investigated.

Conclusions

The finding that, for all four indicators, none of the 18 countries had a citation disadvantage in collaboration indicates that some findings on collaboration apply to a wide range of countries and are not dependent on the choice of indicator. But for five of the 17 US states, for at least one indicator, the citation level of collaborative articles was less than 80% of the citation level of solo articles.

The finding that for the countries and US states for all indicators the range of the citation advantage exceeds the median indicates that there is considerable variation in citation advantage from country to country and from state to state. This indicates that the extrapolation of findings on collaborative citation advantage from one country to another needs to be treated with caution and that studies examining multiple countries are likely to provide a more complete picture of collaborative citation advantage.

The Discussion section found that, even at the global level, findings on collaborative citation advantage can vary considerably from indicator to indicator. This indicates the importance of using multiple indicators when investigating citation advantage, in that studies using multiple citation indicators are likely to provide a more complete picture of collaborative citation advantage.

Finally, this article introduces a method that can be applied to other studies: Examining the extent to which findings between countries are reflected by findings within a single large country could provide a broader picture of collaboration or any other characteristic investigated. For example, the method can be used to investigate collaboration for other years and for other subjects.