Introduction

The development in the number of references in scientific publications over time has been studied at least since 1965, when Derek J. de Solla Price reported that journal papers contain 15 references on average. More specifically, he found that about 10% of all papers contained no references, whereas 5% contained more than 45 references. According to de Solla Price (1965), 85% of all papers contained 25 or fewer references. This way of reporting the number of references, as shares of papers falling within reference-count intervals, is quite unique. Many studies have followed in the footsteps of de Solla Price, but have only provided the average or median number of references, and thus not interval ratios. All of these studies show a clear tendency: the number of references in scientific publications is growing.

About 15 years after the publication of the study by Derek J. de Solla Price, Eugene Garfield noticed a substantial growth in the number of references per item in 37 core biochemical journals (Garfield 1979, 1980). Garfield examined the growth rate by developing an R/S journal indicator, where R is the number of references contained in all publications during a specified year and S is the number of source articles published that year. Furthermore, by revisiting earlier studies, Garfield confirmed the increasing R/S in biochemistry, mathematics and botany. The growth rates differed, but the tendency towards longer reference lists was clear (Garfield 1979, 1980). Garfield suggested five possible explanations for the increase in the number of references. Firstly, fragmented publishing implies more referencing, as each study is split into several publications. Secondly, the growth of the literature would also imply an increase, although this increase should eventually level out, as otherwise all publications would become reviews. Thirdly, citation consciousness may incite authors to cite more. Fourthly, improved current awareness systems help authors stay aware of recently published studies. Finally, available databases also help authors find older studies, thus strengthening their retrospective search capability (Garfield 1979).
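Stated compactly, and merely restating Garfield's definition above in formula form, the indicator for a given journal and year y is

$$\frac{R}{S} = \frac{\text{references contained in all items the journal published in year } y}{\text{source items the journal published in year } y}$$

so a rising R/S over successive years indicates lengthening reference lists per published item.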

More recent studies have also shown that the number of references has increased. The average number of references increased substantially over a 38-year period in the journal Polymer Testing (Jaunich 2018). Similar results are found in 8 engineering journals (Ucar et al. 2014), where the number of references shows a strong growth trend, although growth rates differ over time. Krampen (2010) found increasing numbers of references using a sample of 45 English and 45 German articles on developmental psychology, psychological diagnosis and assessment, and social psychology. Similarly, by examining one volume from each decade, Lipetz (1999) documented that papers without references disappeared from the Journal of the American Society for Information Science (JASIS) during 50 years of publication, and that the average number of references per paper had grown exponentially. Also studying the same journal, Tsay (2008) found that the average number of references cited per paper increased 2 to 3 times over a period of about 25 years. Sánchez-Gil et al. (2018) reported an increasing average number of references in all scientific areas and categories, except in some Arts and Humanities categories, although there is considerable variation across as well as within fields. However, the analysis by Sánchez-Gil et al. does not normalize for document types, and as reviews generally contain considerably more references, normalization is necessary. Without normalization, the findings could be caused by an increase in specific publication types.

A number of studies have shown that the number of references differs across fields. Hyland (2004) found that the average number of references per publication ranges from 24.8 in magnetic physics to 104.0 in sociology. Sánchez-Gil et al. (2018) found considerable differences across five knowledge areas, ranging from an average of 24.79 references per publication in health sciences to 36.55 in life sciences. Yang et al. (2006) reported differences in the average number of references in five andrology journals.

Meadows (1974) argued that the increase in the number of references will level out and reach a maximum. The number of references in a paper may also decrease over time due to journal limitations: Anger (1999) describes a maximum limit of four references for brief communications, case reports and letters to the editor in a health sciences journal. However, although a limited study, Ucar et al. (2014) found a doubling of the median number of references from 1972 to 2000 and an acceleration of growth after the year 2000. They explain the acceleration predominantly by greater access to information through the generalization of the Internet, and consequently argue that “there is still no sign of the so called saturation phase” (Ucar et al. 2014: 1863).

Pan et al. (2018) argue that the growth in scientific production as well as in the number of references per paper has widespread implications for scientific communities in general, and more specifically for how academic knowledge is connected, accessed and evaluated. Normalization is one of the areas where an increasing number of references plays an important role. Bibliometric studies often include the use of field-normalized indicators to account for the varying publication and citation cultures across fields (Waltman 2016). However, numerous approaches to normalized indicators exist (see Bornmann and Marx 2018). One of them is so-called “citing-side”, “a priori”, “ex ante” or “source” normalization (Zitt 2013). This approach is based on the references in the citing documents (for recent examples, see for instance Frandsen et al. (2019), Mingers and Meyer (2017), Fang (2015)). Consequently, any changing patterns in the number of references in specific document types as well as within fields are important for citing-side normalization of impact indicators.

In summary, studies have repeatedly reported a growing number of references in research publications from most fields. A number of reasons for this growth have been suggested, but consensus is yet to emerge. To get a better understanding of the growth rate, we return to the practice of reporting interval ratios. This helps identify the causes of the growth: Is the growth caused mainly by authors writing very long papers with many references, or is it primarily caused by more gradual shifts from shorter reference lists to medium size reference lists? Also: Is the growth rate the same across fields, document types, and time? Finally, what are the consequences for the normalization of impact indicators? The aim of this study is to answer these questions.

We therefore report interval ratios for seven fields (Arts and Humanities; Social Sciences; Computer Science; Mathematics; Engineering; Medicine; Physics and Astronomy) and three types of journal papers (journal articles; notes; reviews) over a period of 24 years (1996–2019), and discuss how the findings may affect citing-side normalization. Bornmann and Marx (2018) call for more studies analyzing the validity of the various field-normalized indicators. This study adds to the knowledge base within the field by examining potentially changing patterns in the number of references.

Methods

Data was retrieved from Scopus at the end of March 2020. The publication window was set to 1996–2019 and limited to three document types (articles, notes and reviews) and seven subject areas (Arts and Humanities, Computer Science, Engineering, Mathematics, Medicine, Physics and Astronomy, Social Sciences). The selection of document types and subject areas was guided by an aspiration to cover the variation in both, and the selection thus covers a total of 26,931,419 items.
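For readers who wish to approximate the retrieval, a Scopus advanced search restricting document type, subject area and publication year takes the following general form. This is an illustration of the query pattern only, not necessarily the exact query used for this study:

```python
# Illustrative Scopus advanced-search query (assumed pattern, not necessarily
# the exact query used for this study). Scopus document type codes include
# ar = article, no = note, re = review; subject area codes include ARTS,
# COMP, ENGI, MATH, MEDI, PHYS and SOCI.
query = "DOCTYPE(ar) AND SUBJAREA(ARTS) AND PUBYEAR > 1995 AND PUBYEAR < 2020"
```

Repeating such a query for each combination of document type and subject area yields one publication set per cell of the analysis.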

Retrieved items were organized by number of references using twelve intervals: 0–9; 10–19; 20–29; 30–39; 40–49; 50–59; 60–69; 70–79; 80–89; 90–99; 100–199; 200–299 (see Appendices 1–21). Interval ratios for each document type were calculated year by year as percentages. For example: In Arts and Humanities (2019), we retrieved 15,367 journal articles with a reference list containing 0 to 19 references. This amounts to 18.6 percent of the total number of retrieved journal articles that year (82,406). Table 1 illustrates the calculation of interval ratios for Arts and Humanities journal articles published in 2019.

Table 1 Calculation of interval ratios

Note that the percentages are calculated using reference lists in the range of 0–299 as the total. There is, of course, a small number of journal articles with longer reference lists, so the actual percentages are probably slightly lower.
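As a minimal sketch of the calculation described above, the following Python fragment bins items by reference list length and computes interval ratios. The toy data is invented for illustration and is not the Scopus data set:

```python
# Interval ratios: the share of items whose reference list length falls in
# each interval, using items with 0-299 references as the total (see above).
INTERVALS = [(0, 9), (10, 19), (20, 29), (30, 39), (40, 49), (50, 59),
             (60, 69), (70, 79), (80, 89), (90, 99), (100, 199), (200, 299)]

def interval_ratios(ref_counts):
    """Map each interval to the percentage of items falling within it."""
    in_range = [n for n in ref_counts if 0 <= n <= 299]
    total = len(in_range)
    return {(lo, hi): 100.0 * sum(1 for n in in_range if lo <= n <= hi) / total
            for lo, hi in INTERVALS}

# Toy example: five papers with reference lists of these lengths.
ratios = interval_ratios([4, 15, 18, 37, 120])
print(ratios[(10, 19)])  # 40.0 -- two of the five papers fall in 10-19
```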

The organization of the data in intervals makes it possible to collapse intervals and perform analyses on larger interval ratios. Articles can serve as an example: Publications with fewer than 60 references account for more than 92% of the publications, and some of the intervals were therefore collapsed to provide a clearer picture in the figures. Thus, analyses of the growth rate of references in journal articles were performed using six interval ratios (0–19; 20–39; 40–59; 60–79; 80–99; 100–199). Not all interval ratios are included in all figures, to enable an overview of the dominant tendencies for each document type. Reviews tend to have longer reference lists, and few reviews have short ones; consequently, analyses of reviews were performed using three interval ratios (0–49; 50–99; 100–299). The number of references tends to be smaller in notes, and we therefore concentrate on the first four intervals. Other interval ratios are easily constructed from the data provided in the appendices, as sketched below.
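Constructing coarser interval ratios from the fine-grained ones amounts to simple addition, as in this continuation of the sketch above (reusing the `ratios` dictionary from the previous fragment):

```python
# Coarser interval ratios are obtained by summing the fine-grained ones,
# e.g. the six intervals used for journal articles in the figures.
def collapse(ratios, groups):
    """Sum fine-grained interval percentages into coarser groups."""
    return {(glo, ghi): sum(p for (lo, hi), p in ratios.items()
                            if glo <= lo and hi <= ghi)
            for glo, ghi in groups}

article_groups = [(0, 19), (20, 39), (40, 59), (60, 79), (80, 99), (100, 199)]
print(collapse(ratios, article_groups)[(0, 19)])  # 60.0 in the toy example
```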

Results are presented in graphs showing the development over time for the period 1996–2019 in absolute numbers and ratios.

Results

First, we consider the development over time using absolute numbers, and a general increase in the number of publications is evident. The number of publications in our data set has more than tripled from 1996 to 2019. Serving as an example, Fig. 1 presents an overview of the development in the number of references in journal articles. The number of articles with at least 50 references is more than 10 times higher in 2019 than in 1996. Articles with 20 to 49 references have at least tripled from 1996 to 2019. Finally, articles with 0–19 references have only increased slightly or even decreased. Consequently, due to the general increase in the number of publications indexed in Scopus, we analyze the development over time using ratios in the remainder of this section.

Fig. 1 The development in the number of references in journal articles (absolute numbers)

Now we turn to interval ratios. The results of the investigation show significant differences between fields and document types.

We find that the number of references in journal articles is growing relatively in all fields, but not at the same pace (see Figs. 2, 3 and 4). Journal articles with short reference lists (0–19 references) dominated in the beginning of the investigated period, especially in the fields of Computer Science, Engineering, Mathematics, and Physics and Astronomy. Here the short reference list was the most frequent kind until 2010 (Computer Science), 2013 (Engineering and Mathematics), and 2014 (Physics and Astronomy), when it was replaced in first place by journal articles with somewhat longer reference lists (20–39 references). A similar replacement is found in Medicine, Social Sciences, and Arts and Humanities, yet in these fields the shift took place earlier (Medicine: 2000; Social Sciences: 2002; Arts and Humanities: 2005). Medium length reference lists (40–59 references) were rarer in the first part of the investigated period, but show a steady increase in all fields over time, taking over second place in Arts and Humanities (2015), Computer Science (2018), and Engineering (2019), and even ending up sharing first place in Social Sciences (2019).

Fig. 2 The development in the number of references in journal articles (interval ratios)

Fig. 3 The development in the number of references in reviews

Fig. 4 The development in the number of references in notes

Journal articles with longer reference lists are relatively much rarer, and their shares are more constant over time, except for journal articles with 60–79 references in Social Sciences, whose share almost doubled over the period.

These findings confirm that the number of references in journal articles has been growing over time. They also reveal that the main cause of the growth in all fields is a drop in the share of articles with short reference lists (0–19 references) and an increase in the shares of articles with somewhat longer and medium size reference lists (20–39 and 40–59 references, and even 60–79 references, especially in Social Sciences). Journal articles with long reference lists (80–99) and very long reference lists (100–199) have shares that are quite constant over time. There are marked differences in growth rates between fields: Arts and Humanities and Social Sciences display much steadier tendencies than the five other fields.

Turning to the development in the number of references in review articles during the period 1996–2019, we find that it is quite diverse (see Fig. 3). Review articles with shorter reference lists (0–49 references) dominate in all fields, but in Computer Science, Engineering, Mathematics, Medicine, and Physics and Astronomy their shares decreased heavily over time. Social Sciences display a moderate drop in shorter reference lists over time, and in Arts and Humanities the share of shorter reference lists is practically constant. In Medicine, the share of shorter reference lists went from almost 70 percent in 1996 to around 45 percent in 2019. In Computer Science (90% to 20%), Engineering (85% to 30%), Mathematics (85% to 50%), and Physics and Astronomy (75% to 30%) the drop was even more pronounced. The share of shorter reference lists only dropped around 15 percentage points in Social Sciences, from around 75% in 1996 to around 60% in 2019.

The shares of medium size reference lists (50–99 references) and long reference lists (100–299) gradually increased over time in all fields (except Arts and Humanities). As we are dealing with shares, the growth is of course most pronounced in the fields where a significant drop in shorter reference lists was observed. Yet, the two types of reference lists are not growing at the same pace across fields. Long reference lists are now the most frequent kind in Computer Science, Engineering, and Physics and Astronomy, but still take third place in the other fields.

Consequently, the primary cause of the observed growth in the number of references in reviews over time is a drop in shorter reference lists and an increase in medium size and long reference lists.

Compared to the two other document types (articles and reviews), the number of references in notes is found to be more constant over time. Most notes contain a limited number of references. Figure 4 therefore only shows results for notes with very short reference lists (0–9 references), short reference lists (10–19 references), and medium size reference lists (20–29 and 30–39 references).

Only a few notes were published within the field of Mathematics during the first part of the investigated period, which causes the observed variability in the reported shares. Setting this variability aside, it is evident that in four of the seven fields (Arts and Humanities, Mathematics, Medicine, and Physics and Astronomy), notes display the characteristics of a stable document type with a quite constant number of references. The same is not the case for the three other fields, where the share of very short reference lists (0–9) has dropped significantly over time. In these fields it is especially notes with short reference lists (10–19) that have grown over time.

Thus, the results show some differences between fields, but again the data reveal that in fields where the number of references in notes has grown, the growth is primarily caused by a drop in shorter reference lists and a growth in somewhat longer reference lists. The share of the longest reference lists seems to be quite stable.

Discussion and conclusion

The results of the investigation show significant differences between fields and document types. The number of references in journal articles and reviews is growing in all fields (except for reviews in Arts and Humanities, which remain stable over time), but at different paces; the number of references in notes is growing in some fields (again at different paces) whereas it remains stable in others.

By focusing on interval ratios, this study reveals that the main cause of the observed growth in the number of references is a drop in the share of short reference lists and a corresponding increase in the shares of somewhat longer and medium size reference lists. The shares of long and very long reference lists remain much more stable over time.

Before discussing the implications of the study, its limitations need to be considered, specifically the unit of analysis and the quality of the data. First, the analyses are performed on seven subject areas as indexed in Scopus. Differences in the number of references across fields as well as subfields are documented (Sánchez-Gil et al. 2018), and a more fine-grained analysis could thus potentially have shown even greater differences within fields. However, for the purpose of this paper, field differences remain clear using the current data set. Second, data quality is of great importance for any bibliometric study, and Scopus (as well as Web of Science) suffers from various documented inaccuracies (Krauskopf 2019; Van Eck and Waltman 2017). Consequently, we expect some inaccuracies in the obtained data. However, as we expect these inaccuracies to be evenly distributed across the data set, we do not expect them to bias the results.

Finding that the number of references in articles and reviews is increasing across all fields (except for Arts and Humanities reviews) lends support to Ucar et al. (2014), who state that a saturation phase is not in sight. Collectively, the seven fields display a substantial decrease in publications with very few references, but the pace of decline varies across fields. This supports the need for field-specific studies of document types (Sánchez-Gil et al. 2018). Yet, the stable number of references in notes in four of the seven fields suggests that the development in the number of references is also closely linked to document type. This may be explained by certain journal limitations (Anger 1999) or by certain document types having reached a maximum, as suggested by Meadows (1974). A third explanation offered in several studies is that the increased number of references is correlated with an increase in paper length. Abt and Garfield (2002) analyzed 41 journals from the physical, life, and social sciences and found a linear relationship between the average number of references and the normalized paper lengths; papers in review journals have on average twice the number of references as research papers of the same length. Similar results were reported by Costas et al. (2012), who analyzed the use of bibliographic references by individual scientists in three different research areas and found that within each area the number of references increased with paper length. However, Hyland (2004) found great disciplinary differences in the number of references even when correcting for the number of words (the number of references per 1000 words ranged from 7.3 in Mechanical Engineering to 15.5 in Molecular Biology), and Vosner et al. (2016) found that the average number of references is increasing whereas the number of pages per publication is decreasing, although the latter has remained stable in recent years. Thus, the third explanation is somewhat questionable.

Normalization is a key principle in citation analysis, and a number of normalization procedures have been suggested. The many field-normalized indicators measure citation impact comparably, but they are not equivalent, and the choice of field-normalized indicator can lead to different results in a research evaluation (Bornmann et al. 2019). One of these field-normalizing approaches is citing-side normalization, which normalizes impact indicators by correcting for the effect of reference list length (Waltman 2016). Zhou and Leydesdorff (2011: 362) describe the procedure as follows: “Each of the […] citing documents contains a number of references (k) and is accordingly attributed to the cited document with a fractional weight 1/k.” The number of references in the citing paper is thus used to normalize a specific citation. In some cases, the average number of references in the same journal as the citing document is used as weighting factor instead (Bornmann et al. 2019). It is important to keep in mind that the number of references used to normalize the citations only includes references that fall within a certain reference window and that point to a publication in a journal covered by the database used for the analysis, i.e., so-called active references (Zitt and Small 2008). Several indicators exist that are based on source normalization approaches; Waltman and van Eck (2012) examine three mean source normalized citation score indicators and find that they are all strongly correlated.
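As a minimal sketch of the fractional-weighting idea in the quotation above (the function name and data layout are ours, and this is the bare weighting step rather than any complete published indicator):

```python
# Citing-side normalization: each citation is weighted by 1/k, where k is
# the number of active references in the citing document.
def citing_side_score(active_ref_counts):
    """active_ref_counts: the active reference count k of each citing paper."""
    return sum(1.0 / k for k in active_ref_counts if k > 0)

# A citation from a paper with 10 active references counts more than one
# from a paper with 50: longer reference lists dilute each citation.
print(round(citing_side_score([10, 50, 25]), 4))  # 0.16 = 0.1 + 0.02 + 0.04
```

The guard `k > 0` also hints at the anomaly discussed next: a citing document with no active references cannot contribute a defined weight at all.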

Zitt (2010) acknowledges that there are limitations to citing-side normalization and explains that a pool of documents with few references may produce anomalies. Mingers and Kaymaz (2019) encountered problems with cases of extremely high as well as extremely low numbers of active references when computing normalized book citations in Google Scholar; when no active references were available, the target book could not be normalized. Furthermore, Zitt (2010) argues that journals with constraints on the number of references can also produce irregularities, although he argues that the principle of aggregation reduces this type of problem. Waltman and van Eck (2012, p. 714) conclude that “[w]hen taking a source normalization approach, it is especially important to exclude journals with very small numbers of active references”.

Consequently, researchers working with citing-side normalization recognize the importance of the number of references. However, little work has so far addressed the consequences of the general increase in the average number of references. In a recent study, Petersen et al. (2019) estimate that the increase in reference list length accounts for one-third of the growth rate in the total number of references produced by the scientific literature published in any given year. While they acknowledge the importance of normalizing citation data, they argue that the problem of citation inflation may be even more fundamental:

“The problem is rather simple—when citations are produced in distinct historical periods their ‘nominal values’ are inconsistent and thus cannot simply be added together” (Petersen et al. 2019: 1855).

The authors provide convincing examples showing that the cutoff for the top 1% of articles published in the journal Science in the year 2000 corresponds to 200 citations, whereas the top 1% cutoff for publications from 1965 was just under 100 citations. Similarly, they calculate that the cutoff for the top 10% of social science publications from 1965 is around 10 citations, whereas in 2000 the threshold had risen to more than 40 citations. Furthermore, Bornmann and Mutz (2015, p. 2221) argue that the growth may be caused by “a mixture of internal and external (sociological, historical, psychological) practices that continuously have altered the ways of viewing science”. These examples illustrate why neither field-normalization nor a fixed citation window can overcome the temporal bias induced by citation inflation. The authors therefore develop a six-step procedure for obtaining data that may be used for citation deflation. The procedure involves defining a target subject and obtaining deflator time series, but it does not address the importance of document types.
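The intuition can be rendered compactly by analogy with price deflation; the following is our illustrative sketch, not Petersen et al.'s actual six-step procedure. With a deflator time series $r(t)$, for instance the average number of references per paper in year $t$, and a base year $t_0$, a citation received in year $t$ would be revalued as

$$c_{\mathrm{deflated}}(t) = c_{\mathrm{nominal}}(t)\cdot\frac{r(t_0)}{r(t)},$$

so that citations minted in reference-rich years count for less than those minted in reference-poor years.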

Our results underline the importance of normalizing citation data and of taking citation inflation into account when conducting citation analyses spanning longer time periods. However, our results not only show differences in the number of references between fields and over time, but also that a third parameter plays a role. Thus, the number of references is not only field and time dependent, but also document type specific. To obtain even more fine-grained citation deflator data, future developers may therefore also take our results into account and develop procedures incorporating the document type of the citing documents into their equations.