Introduction

The impact factor is a quantitative tool for evaluating journals in the Journal Citation Reports (JCR) database. Librarians rely on the journal impact factor when selecting periodicals, primarily in scientific disciplines (Cameron 2005). However, the impact factor can be manipulated (Hemmingsson 2002), leading to incorrect evaluation of some journals. One method of manipulation is for journal editors, when accepting articles for publication, to ask authors to cite articles published in their own journals (Smith 1997). In this paper, we refer to this as self-cited manipulation of the impact factor (Smith 1997). Since Smith reported this manipulation in 1997, similar reports have followed in succession (Fassoulaki et al. 2000; Whitehouse 2001; Miller 2002; Hemmingsson 2002; Sevinc 2004; Caldwell 2006; Procianoy 2007; Krauss 2007; Falagas and Alexiou 2008). Manipulation that attempts to increase the numerator of the impact factor produces misleading evaluations of scientific journals and leads scientists and researchers to doubt whether the impact factor is a valid tool for measuring the quality of journals (Schreiber 2007). Therefore, manipulation of the impact factor should be banned in the scientific field (Wallner 2009). Jones (2003) proposed that a “clean” impact factor is needed, which should be developed by journal editors (Wallner 2009). It is easy to see that journal self-citations to the previous two years can inflate the IF abnormally (Falagas and Alexiou 2008), giving the impression that a journal is of higher quality. The reliability-based citation impact factor, abbreviated as “R-impact factor” and proposed by Kuo and Rupe (2007), considers both the influence of citations and the length of impact of published journals as measured by the cited half-life. The greater the number of self-citations of a journal, the shorter the cited half-life of the journal. Therefore, we should determine how the R-impact factor of a journal changes when the journal’s impact factor is manipulated.
Yu and Wang (2007) established a mathematical expression of the relation between the journal self-citation rate and its impact factor by the single-factor method and analyzed the possibility that journal editors manipulate impact factors of their journals by raising the self-cited rate. In this paper, we investigate different aspects of the manipulation process between the impact factor and R-impact factor. The paper is organized as follows: in the next section we deduce the calculation formulas of manipulated impact factor and R-impact factor based on the distribution model of citations, and then simulate changes of the manipulated impact factor and R-impact factor with different aging constants. In Sect. “Analyses on actual journals”, we calculate the R-impact factors and self-cited rates in the previous two years of eight journals from 2000 to 2007, and then compare the R-impact factors with the impact factors of those journals.

Simulation of manipulation processes

Definition of R-impact factor

According to the definition of R-impact factor (Kuo and Rupe 2007), its equation can be expressed as follows:

$$ RIF\; = \;IF\; \times \;T_{0.5} , $$
(1)

where RIF is the R-impact factor, IF is the impact factor and \( T_{0.5} \) is the cited half-life. Amin and Mabe (2000) described the accumulation of citations over time after publication using the generalized citation curve in Fig. 1. The impact factor and cited half-life can be clearly seen in the figure.

Fig. 1 Generalized citation curve (Amin and Mabe 2000) and ISI indicators

There are obvious differences between the R-impact factor and the impact factor. The impact factor is an index that relates the number of citations received by the papers a journal published in the previous two years to the number of those papers. The R-impact factor comprises two factors: (1) the impact factor; and (2) the long-lasting impact of published journals as measured by the cited half-life. As is commonly known, very long publication delays of scientific papers adversely affect citation distributions and journal impact factors (Garfield 1999; Yu et al. 2005). If the R-impact factor is used as a measure in the same way as the IF, this adverse effect might partly be avoided.

Simulation models

When the number of self-citations to papers a journal published in the previous two years is manipulated, the journal’s self-cited rate for those two years and its impact factor increase within a short time. In Eq. 1, the cited half-life, one of the two elements of RIF, reflects the long-term reference value of citations and therefore changes relatively slowly. When IF is manipulated, the cited half-life correspondingly changes in the opposite direction to IF. For example, when a journal’s self-cited rate is artificially increased, the impact factor rises and the cited half-life becomes shorter. Multiplying by the cited half-life therefore partly cancels the increase, and the negative impact of manipulation is partly reduced.

In this study, we simulate changes of manipulated IF and RIF to assess theoretically, rather than by subjective judgment, how the two indicators behave. To achieve this objective, a mathematical model is first chosen to describe the citation distribution of a journal, and then expressions for the manipulated IF and RIF are deduced, which provide the simulation models.

Yu and Li (2009) divide classical citation distribution models into two categories: (1) the Bernal first-order negative exponential model and its improved models; and (2) the Burton and Kebler equation and its improved models. Here, we choose the Bernal first-order negative exponential citation distribution model with non-dimensional parameters. Considering the delay phenomena in the research-citation cycle mainly caused by the publishing process (Egghe and Rousseau 2000; Yu et al. 2005), the model is corrected by pure delay \( \tau \). Let \( f(T) \) be the cited probability of the journal’s published papers and T be citation age, then:

$$ f(T) = Ke^{ - \alpha (T - \tau )} , $$
(2)

where K is the coefficient, α is the aging coefficient, τ is the pure delay for editing and waiting time in the publication process of cited articles after acceptance. Let C(T) be the cumulative probability of citations at age T, then we have

$$ C(T) = 1 - e^{ - \alpha (T - \tau )} $$
(3)

According to the definition of IF, its theoretical expression is

$$ IF(T) = \lambda (C(3) - C(1)), $$
(4)

where λ = M/Y, M is the number of times cited and Y is the number of papers published in the previous two years of one journal.

When manipulation behavior does not exist in the process of citing, we can obtain \( T_{0.5} \) from the definition of the cited half-life, namely \( C(T_{0.5}) = 0.5 \). From Eq. 3, we have

$$ T_{0.5} = \tau + \frac{\ln 2}{\alpha}. $$
(5)

However, \( T_{0.5} \) cannot be calculated by Eq. 5 if IF is manipulated. We assume that the manipulated citation function f′(T) is k times its theoretical value during the two years counted by the impact factor, while the number of citations remains unchanged at other ages; that is, f′(T) = k · f(T) for \( T \in [1,3] \). Integrating over the different age ranges gives the manipulated distribution function of cumulative citations:

$$ \left\{ \begin{array}{ll} C(T) = 1 - e^{ - \alpha (T - \tau )}, & \tau \le T \le 1 \\ C(T) = 1 + (k - 1)e^{ - \alpha (1 - \tau )} - ke^{ - \alpha (T - \tau )}, & 1 < T \le 3 \\ C(T) = 1 + (k - 1)e^{ - \alpha (1 - \tau )} - (k - 1)e^{ - \alpha (3 - \tau )} - e^{ - \alpha (T - \tau )} , & T > 3 \\ \end{array} \right. $$
(6)
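
As a brief check of the middle branch of Eq. 6, and assuming K = α (which is what Eq. 3 implies for Eq. 2), integrating the manipulated citation function from age 1 to age T gives

$$ C(T) = C(1) + k\int_{1}^{T} \alpha e^{ - \alpha (t - \tau )} \,dt = 1 - e^{ - \alpha (1 - \tau )} + k\left( e^{ - \alpha (1 - \tau )} - e^{ - \alpha (T - \tau )} \right) = 1 + (k - 1)e^{ - \alpha (1 - \tau )} - ke^{ - \alpha (T - \tau )} ,\quad 1 < T \le 3. $$

The third branch follows in the same way by adding the unmanipulated integral from age 3 to age T to C(3).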

According to Eqs. 4 and 6, we obtain the formulae of IF and RIF disturbed by manipulation as follows:

$$ IF^{\prime} = k{(e^{ - \alpha (1 - \tau )} - e^{ - \alpha (3 - \tau )} ) \cdot M/Y} $$
$$ RIF = T_{0.5} \cdot IF^{\prime} $$
(7)

IF′ expresses the manipulated impact factor. According to Eqs. 3 and 6, we have:

$$ C(T_{0.5} ) = 1 + (k - 1)e^{ - \alpha (1 - \tau )} - (k - 1)e^{ - \alpha (3 - \tau )} - e^{ - \alpha (T_{0.5} - \tau )} = 0.5 $$

After transforming the expression above, we obtain this formula:

$$ T_{0.5} = \tau + {\frac{1}{\alpha }}(\ln 2 - \ln (1 + 2(k - 1)e^{ - \alpha (1 - \tau )} - 2(k - 1)e^{ - \alpha (3 - \tau )} )) $$
(8)

Using Eqs. 4–8, the changing processes of IF′ and RIF can be simulated.
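
A minimal sketch of such a simulation is given below, written directly from Eqs. 7 and 8. The function names, the step size for k and the choice of the parameter set λ = 4, τ = 0.25, α = 0.125 for the demonstration loop are illustrative assumptions, not part of the original study.

```python
import math


def manipulated_if(k, alpha, tau, lam):
    """Eq. 7: IF' = k * (e^{-alpha(1-tau)} - e^{-alpha(3-tau)}) * lam, where lam = M/Y."""
    window = math.exp(-alpha * (1 - tau)) - math.exp(-alpha * (3 - tau))
    return k * lam * window


def manipulated_half_life(k, alpha, tau):
    """Eq. 8: cited half-life when citations at ages 1..3 are inflated by the factor k."""
    window = math.exp(-alpha * (1 - tau)) - math.exp(-alpha * (3 - tau))
    return tau + (math.log(2) - math.log(1 + 2 * (k - 1) * window)) / alpha


def simulate(alpha, tau, lam, k_values):
    """Return (k, IF', T_0.5, RIF) for each k, with RIF = T_0.5 * IF' as in Eq. 7."""
    rows = []
    for k in k_values:
        impact = manipulated_if(k, alpha, tau, lam)
        half_life = manipulated_half_life(k, alpha, tau)
        rows.append((k, impact, half_life, impact * half_life))
    return rows


# Illustrative sweep of k from 1.0 to 1.5 in steps of 0.1 (assumed step size).
k_grid = [1.0 + 0.1 * i for i in range(6)]
for k, impact, half_life, rif in simulate(alpha=0.125, tau=0.25, lam=4, k_values=k_grid):
    print(f"k={k:.1f}  IF'={impact:.3f}  T0.5={half_life:.2f}  RIF={rif:.3f}")
```

For k = 1 the half-life function reduces to Eq. 5. Eq. 8 is derived from the T > 3 branch of Eq. 6, so the sketch is only meaningful while the returned half-life exceeds 3 years; it does not check this condition.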

Note that the temporal dynamics of citation differ from journal to journal, so no single citation model fits all journals. Here, we study only those journals fitting the corrected first-order negative exponential distribution model. We believe this does not affect the results of this research.

Simulation results

Yu and Wang (2007) found that impact factors are obviously affected by self-cited rates for journals with low impact factors. It is easy to manipulate impact factors by increasing self-cited rates, especially self-citations to papers published in the previous one or two years. From Eq. 5, the smaller the aging coefficient α, the longer the cited half-life \( T_{0.5} \) and the slower the aging. Consequently, the parameter values of Eq. 8 are chosen under the following assumptions: (1) α < 0.25 (\( T_{0.5} > 2.97 \) years according to Eq. 5), because a change to a shorter \( T_{0.5} \) is more often seen than to a longer \( T_{0.5} \); (2) from Eq. 4, the larger λ (λ = M/Y) is, the higher IF is for the same citation distribution f(T), so we choose a small λ (between 4 and 6) so that the impact factor is low; and (3) τ is between 0.2 and 0.3 years because the smallest pure publication delay is commonly about 2–3 months. Under these assumptions, we consider four simulation examples with different parameter values:

  1. λ = 4, τ = 0.25 years, α = 0.125. When k = 1, \( T_{0.5} \) = 5.8 years according to Eq. 5.

  2. λ = 4, τ = 0.25 years, α = 0.15. When k = 1, \( T_{0.5} \) = 4.87 years according to Eq. 5.

  3. λ = 4, τ = 0.25 years, α = 0.2. When k = 1, \( T_{0.5} \) = 3.71 years according to Eq. 5.

  4. λ = 6, τ = 0.25 years, α = 0.2. When k = 1, \( T_{0.5} \) = 3.71 years according to Eq. 5.

Here, k varies from 1.0 to 1.5, i.e., \( k \in [1.0, 1.5] \). The four simulation results (Figs. 2, 3, 4, 5) show the changes of IF, RIF and \( T_{0.5} \) in the four cases.

Fig. 2 Simulation results of three parameters of example 1

Fig. 3 Simulation results of three parameters of example 2

Fig. 4 Simulation results of three parameters of example 3

Fig. 5 Simulation results of three parameters of example 4

In Figs. 2 and 3, when IFs are manipulated, the cited half-life decreases and IF and RIF increase, but RIF grows more slowly than IF. Figures 2 and 3 also show that, as k increases, the faster the aging (i.e., the larger the aging coefficient α or the shorter the half-life), the greater the decrease of the cited half-life and the smaller the growth rate of RIF. In Figs. 4b and 5b, we can also see that RIF decreases after reaching a certain level. In examples 3 and 4, manipulation is offset by using the R-impact factor to evaluate the journal’s impact. Here, the growth rate of IF or RIF means the rate of change between its initial value (k = 1) and its final value (k = 1.5).
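
The growth rates discussed above can be sketched for the four examples as follows. The sketch assumes that "growth rate" means the relative change (final − initial) / initial between k = 1 and k = 1.5, which is one reading of the definition given here; the function names are hypothetical.

```python
import math


def if_and_rif(k, alpha, tau, lam):
    """Return (IF', RIF) from Eqs. 7 and 8 for a manipulation factor k."""
    window = math.exp(-alpha * (1 - tau)) - math.exp(-alpha * (3 - tau))
    impact = k * lam * window  # Eq. 7
    half_life = tau + (math.log(2) - math.log(1 + 2 * (k - 1) * window)) / alpha  # Eq. 8
    return impact, impact * half_life


def growth_rates(lam, tau, alpha, k_final=1.5):
    """Relative change of IF' and RIF between k = 1 and k = k_final (assumed definition)."""
    if_start, rif_start = if_and_rif(1.0, alpha, tau, lam)
    if_end, rif_end = if_and_rif(k_final, alpha, tau, lam)
    return (if_end - if_start) / if_start, (rif_end - rif_start) / rif_start


# (lambda, tau, alpha) for simulation examples 1 to 4.
examples = [(4, 0.25, 0.125), (4, 0.25, 0.15), (4, 0.25, 0.2), (6, 0.25, 0.2)]
for number, (lam, tau, alpha) in enumerate(examples, start=1):
    g_if, g_rif = growth_rates(lam, tau, alpha)
    print(f"example {number}: IF growth = {g_if:.3f}, RIF growth = {g_rif:.3f}")
```

Because IF′ is proportional to k in Eq. 7, its growth rate under this definition depends only on the final value of k, whereas the growth rate of RIF also depends on α and τ through Eq. 8. Since λ cancels in the ratio, examples 3 and 4 yield the same growth rates.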

Analyses on actual journals

From the simulation results in the previous section, it is shown that RIF can partly offset the effect of manipulation; namely, the shorter the cited half-life is (i.e., the bigger the aging coefficient α is), the lower the effect of manipulation on RIF. In this section, we collect total citations, self-citations, impact factors as well as cited half-lives of eight journals over a period of 8 years (2000–2007) from the JCR database, and calculate their self-cited rates in the previous two years and R-impact factors, followed by analysis of changes of the self-cited rates in the previous two years, the impact factors and the R-impact factors of these journals.

Journals selection and data collection

In this paper, we selected four “normal” journals from the JCR database: Scientometrics, Journal of Materials Chemistry (Abbreviated Journal Title: J MATER CHEM), Transactions of Nonferrous Metals Society of China (Abbreviated Journal Title: T NONFERR METAL SOC) and Journal of Materials Processing Technology (Abbreviated Journal Title: J MATER PROCESS TECH). Here, normal journals are considered as those journals which have not requested authors to cite their journals. According to the explanatory statements and announcements made by some authors submitting their works to the journals, we selected four journals whose editors asked the authors to cite papers published in the journals in the previous two years (“manipulated journals”). In order to protect the identities, we conceal the journal names, instead referring to them as “Journal 1”, “Journal 2”, “Journal 3” and “Journal 4”. We collected the number of times of self-citation, the impact factors and the cited half-lives of the eight journals for each year from 2000 to 2007, and then calculated each journal’s self-cited rates in the previous two years and the R-impact factors. See Tables 1 and 2.

Table 1 Self-cited rates in the previous two years of eight journals for 2000–2007
Table 2 Impact factors of eight journals for 2000–2007

Analysis result of self-cited rates and impact factors

The change curves of the impact factors and self-cited rates in the previous two years of the four normal journals from 2000 to 2007 are shown in Figs. 6, 7 and 8, and those of the four manipulated journals in Figs. 9 and 10. Comparing the two parameters for the four normal journals, we find that their trends are inconsistent, very obviously so for three journals (SCIENTOMETRICS, J MATER CHEM and J MATER PROCESS TECH; see Figs. 6, 7 and 8, respectively): their impact factors are increasing while their self-cited rates are nearly unchanged or decreasing. This indicates that the increases of their impact factors do not rely on their self-citations in the previous two years. For the manipulated journals, it is obvious that the two parameters of Journal 1 follow the same trend (see Fig. 9). The two parameters show similar trends for Journal 2 between 2000 and 2005, Journal 3 between 2003 and 2007, and Journal 4 between 2004 and 2007. This means that the growth of these journals’ impact factors partially relies on increasing self-citations in the previous two years.

Fig. 6 Curves of self-cited rate and impact factor of Scientometrics

Fig. 7 Curves of self-cited rate and impact factor of J Mater Chem

Fig. 8 Curves of self-cited rates and impact factors of T Nonferr Metal Soc and J Mater Process Tech

Fig. 9 Curves of self-cited rates and impact factors of Journals 1 and 2

Fig. 10 Curves of self-cited rates and impact factors of Journal 3 and Journal 4

Analysis on RIF

Abnormal change of the self-cited rate can influence IF and RIF to a certain extent. The results of the eight journals’ R-impact factors are shown in Table 3. In order to clearly observe the changes of the impact factors and R-impact factors, the change rates of the IFs and RIFs of the four manipulated journals from 2000 to 2007 are calculated and shown in Fig. 11. Because there is no increase in the self-cited rates of the four normal journals, their R-impact factors’ changes are caused by the impact factors and cited half-lives.

Table 3 Results of R-impact factors of eight journals for 2000–2007
Fig. 11 Change rate curves of the IFs and RIFs of four manipulated journals

The curves of the rates of change of the four manipulated journals’ IFs and RIFs are shown in Fig. 11. Comparing the four graphs, we see that the rate of change of the IF of Journal 1 is larger than that of the RIF because there was a large increase in the journal’s self-citation, which was obviously manipulated. Table 3 and Fig. 11 show that the rate of change of the IF of Journal 1 ranges from −0.18 to 1.14 and that of the RIF from −0.18 to 0.574. So, when a journal’s IF is obviously manipulated, it is possible that the R-impact factor partially offsets the manipulation effect. For example, the impact factor of Journal 1 increased drastically in 2007, but its R-impact factor increased much less. However, this is not seen for Journals 2, 3 or 4 because there were no increases in self-citation by the authors. Since many factors influence the impact factor and the cited half-life, the R-impact factor can still have a large increase in practice. According to the results in Sect. “Simulation of manipulation processes”, the growth rate of the R-impact factor is theoretically much slower than that of the impact factor when the aging coefficient is particularly large (for example, α = 0.2 in Sect. “Simulation of manipulation processes”). However, in the case of manipulation of self-citations where the aging coefficient is small (for example, α ≤ 0.15), there is no convincing evidence for or against the suggested advantage of RIF over IF.

Conclusions

According to the above simulation results and analyses on actual data on impact factor manipulation, the following conclusions are obtained:

  1. As an evaluation tool for scientific journals, the impact factor is easily manipulated because of its definition. So a scientific method needs to be established to identify which journals have been manipulated by increased self-citation.

  2. The simulation study in Sect. “Simulation of manipulation processes” of how manipulation changes the R-impact factor shows that the R-impact factor partially offsets the manipulation because the cited half-life changes in the opposite direction to the manipulation, especially when the aging coefficient of a journal is large (or the cited half-life is short). So the R-impact factor is fairer than the impact factor when we evaluate journals whose cited half-lives are relatively short.

  3. As the impact factor and the cited half-life are influenced by many factors, similar plot results of the two parameters for journal evaluation may not be obtained in practice. So, whether the R-impact factor can substitute for the impact factor for evaluating scientific journals’ impact needs more concrete demonstration.

  4. The R-impact factor is superior to the impact factor for evaluation of a journal’s quality in the long term. It has an apparent ability to offset a significant manipulation according to the simulation results and Journal 1’s analysis results. However, in this study, we believe there is no convincing evidence of the suggested advantage of RIF over IF in the case of a small aging coefficient (or a long cited half-life).

  5. The R-impact factor can be used as a supplementary tool to monitor the development of journals. However, further research needs to be done before the R-impact factor can be recognized as a stand-alone tool for evaluation of the quality of journals.