Introduction

The study of the use and relative impact of scientific journals is one of the important applications of citation analysis (Egghe and Rousseau 1990, p. 254). The duality of references and citations of scientific journals is a central feature of the citing process. The aggregated citation relations among journals included in the Science Citation Index (SCI) provide a huge matrix that can be analyzed in various ways (Leydesdorff 2006), and data on the age distribution of the references or citations of a journal are easily obtained from SCI-JCR. Firstly, many authors have studied the inter-citation relationships of scientific journals using citation analysis. For example, Narin et al. (1972) showed journal citing relationships by means of “one-step” and “two-step” models; Leydesdorff (1994) described a method for generating journal-journal citation maps on the basis of the SCI; and Pudovkin and Garfield (2002) proposed that the number of citations from one journal to another should be (on average) proportional to the number of papers published in the cited journal and to the number of cited references in the citing journal. Secondly, the referencing or citation process is closely related to the publication delay (defined by Garvey 1979, p. 72; Luwel and Moed 1998). Leydesdorff (1998) proposed that the cited-citing pairs of a scientific journal are relations from a formal perspective and that each scientific publication potentially redistributes citation patterns. Some researchers (Luwel and Moed 1998; Egghe and Rousseau 2000) examined the influence of publication delays on the aging of scientific literature. Rinia et al. (2001) studied citation delay in inter-disciplinary knowledge exchange and indicated that referencing or citation speed is related to the publication delay process.
Lawrence (2001) analyzed 119,924 conference articles in computer science and related disciplines using citation analysis and indicated that articles freely available online are more highly cited. Based on the transfer function model of the observed citation distribution (Yu et al. 2005), the parameters of 12 citation distributions were identified from statistical data on the age distributions of references of 10 journals in JCR; the identified publication delay parameters of each journal were compared with the journal’s observed publication delay parameters, and some valuable conclusions were drawn (Yu and Li 2007).

In this article, we theoretically study the relationship between the cited-citing pair of a scientific journal and then show the inter-citation relationship between an individual journal and the journal group in a scientific field. Based on the model of the observed citation distribution, we analyze the differences between the parameters of the reference and citation distributions and explain why the identified average delay parameter was longer than the journal’s observed average publication delay in the earlier paper (Yu and Li 2007).

Analyses of the journal’s referencing and citation processes

The journal’s referencing process

The referencing process is the course of retrieving literature from information resources and is closely related to the obsolescence of cited literature. This process describes a citing action. As a result of obsolescence, the references of a journal or a journal group on a given subject should show a regular age distribution; the age distribution of the references of papers published in all journals of the subject is regarded as the observed aging process of the subject literature in a synchronous study. The observed referencing process includes the diffusing process and the aging process of cited literature, and the publishing process of source items in terms of the citation and publication cycle (Egghe and Rousseau 2000). Here the diffusing process of journal literature is the stage of processing information into knowledge (Garvey 1979), including abstracting articles and ordering and indexing abstracts; this process should be completed rapidly in the digital age. The observed referencing process of a journal can be described by the physical model in Fig. 1.

Fig. 1 The physical model of a journal’s observed referencing process

In Fig. 1, frame diagram (1) shows the diffusing process of the articles from the subject journal group that are cited by the journal; these articles are accumulated in the resource database, namely frame diagram (2), and age with the lapse of time; this aging process is reflected in the age distribution of citations of the subject. Frame diagram (3) shows the publishing process of the journal’s papers that cite articles in the subject journal group; the publication delay process is presumably reflected in the observed reference age distribution of the journal (Egghe and Rousseau 2000; Yu and Li 2007).

The journal’s citation process

A scientific journal is a source item (citing journal) in the referencing process and a cited object in the citation process. Its total citations reflect its scientific contribution to the development of the subject (or discipline); the citation distribution of the journal reflects the obsolescence of the journal’s literature. The physical model of the observed citation process of a journal is shown in Fig. 2. The process includes the diffusing process and the aging process of cited papers published in the journal, and the publishing process of the source items that cite the journal.

Fig. 2 The physical model of an observed citation process of a journal

Fig. 2 shows the physical meaning of the observed citation process. The citations received by articles published in the journal reflect its scientific contribution to the subject that it belongs to. Frame diagram (1) shows the diffusing process of the journal’s articles cited by the subject journal group, including the abstracting and indexing of these articles; this process is now completed rapidly thanks to the Web and computer technology. The journal’s articles are accumulated in the resource database, namely frame diagram (2), and age with the lapse of time; this aging process is reflected in the age distribution of cited articles. Frame diagram (3) shows the publishing process of the papers of the subject journal group that cite the journal’s articles accumulated in frame diagram (2); the publication delay process of the journal group is presumably reflected in the age distribution of cited articles of the journal.

Based on the qualitative analysis above, the referencing process of a journal relates to the information exchange and obsolescence of the subject (discipline) to which the journal belongs, and to the publication process of the journal. Likewise, the citation process of the journal presumably reflects its impact on the progress of the subject, the life cycle of the literature published in the journal, and the publishing process of the citing journal group.

To study the inter-citation relation between a journal and the subject journal group to which it belongs, the age distribution function of citations can be used to examine the duality of referencing and citation of an individual journal and the inter-citation relationship between the journal and the whole journal group. In this paper, the parameters of journals’ referencing and citation processes are identified from the actual reference distribution and the observed citation distribution, respectively, of journals in three groups in JCR, and the differences between the identified parameters are analyzed. It should be noted that the journals citing a journal differ from the journals it cites, and that its citing journals differ from those of other journals in the same journal group.

Models and methods

According to the above analyses of the disturbed referencing or citation process and the transfer function models of the aging process and the publishing process (Yu et al. 2005), the transfer function frame of the observed referencing (citation) process is shown in Fig. 3.

Fig. 3 The transfer function frame of the observed referencing (or citation) process

In Fig. 3, transfer function (1) describes the diffusing process of cited literature, transfer function (2) describes the aging process of cited literature (Yu et al. 2005), and transfer function (3) describes the publishing process of citing literature (Yu et al. 2005). Therefore, the two main reasons why the identified average delay \( \left( {\bar{T} = T_{s} + \tau } \right) \) is longer than the observed average publication delay are the diffusing process of cited literature and the limited data on the age distribution of citations (references); the delay parameters identified from the observed citation distribution data are inflated if the diffusing process is not included in the mathematical model of the observed citing process. Because the diffusing process is very short in modern times (about 1–2 months), it can be neglected according to the simplifying principle of mathematical modeling. The simplified model, a second-order transfer function model with delay of the disturbed citing process, was established using system identification in the earlier paper (Yu et al. 2005); the model is

$$ W(s) = {\frac{{e^{ - \tau s} }}{{(T_{1} s + 1)(T_{s} s + 1)}}} $$
(1)

In Eq. 1, s is the Laplace variable; \( T_{1} \) is the aging time constant, related to the obsolescence of cited literature; \( T_{s} \) is the publishing time constant, related to the average publication delay of citing literature; and τ is the pure publication delay of citing literature. In the earlier paper (Yu and Li 2007), we used the classical age distribution data of citations of two biochemistry journals (Wang 1997, p. 353) and validated that the model can be used to analyze theoretically the citation distribution law of a journal.

Analytical expressions of the citation age distribution

Based on the transfer function model (Eq. 1), we derived analytical expressions for the cumulative citation distribution function (Eq. 2) and the citation distribution function (Eq. 3) using the inverse Laplace transform:

$$ C(T) = 1 - {\frac{{T_{1} \cdot e^{{ - {\frac{T - \tau }{{T_{1} }}}}} }}{{T_{1} - T_{s} }}} + {\frac{{T_{s} \cdot e^{{ - {\frac{T - \tau }{{T_{s} }}}}} }}{{T_{1} - T_{s} }}} $$
(2)
$$ f(T) = {\frac{{e^{{ - {\frac{T - \tau }{{T_{1} }}}}} }}{{T_{1} - T_{s} }}} - {\frac{{e^{{ - {\frac{T - \tau }{{T_{s} }}}}} }}{{T_{1} - T_{s} }}} $$
(3)
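
As a reading aid, the step from Eq. 1 to Eqs. 2 and 3 can be sketched as follows. The sketch assumes \( T_{1} \ne T_{s} \) and \( T \ge \tau \) (both functions are taken as zero for \( T < \tau \)); the lowercase t is an auxiliary integration variable introduced here and is not a symbol from the original derivation. A partial-fraction expansion of the delay-free part of Eq. 1 gives

$$ \frac{1}{(T_{1} s + 1)(T_{s} s + 1)} = \frac{1}{T_{1} - T_{s}} \left( \frac{T_{1}}{T_{1} s + 1} - \frac{T_{s}}{T_{s} s + 1} \right) $$

Each term \( T_{i}/(T_{i} s + 1) = 1/(s + 1/T_{i}) \) inverts to \( e^{-t/T_{i}} \), and the factor \( e^{-\tau s} \) shifts the time argument by τ, which yields Eq. 3. Integrating Eq. 3 from τ to T then gives Eq. 2:

$$ C(T) = \int_{\tau}^{T} f(t)\,dt = \frac{T_{1}\left(1 - e^{-\frac{T - \tau}{T_{1}}}\right) - T_{s}\left(1 - e^{-\frac{T - \tau}{T_{s}}}\right)}{T_{1} - T_{s}} = 1 - \frac{T_{1} e^{-\frac{T - \tau}{T_{1}}} - T_{s} e^{-\frac{T - \tau}{T_{s}}}}{T_{1} - T_{s}} $$

Read this way, f(T) plays the role of the impulse response of W(s) and C(T) that of its unit step response.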

In the earlier paper (Yu and Li 2007), Eq. 2 was used to identify the parameters of 12 citation distributions; the results showed that, of the two time constants \( \left( {T_{1} \,{\text{and}}\,T_{s} } \right) \) identified from citation distribution data, the larger was \( T_{1} \) and the smaller was \( T_{s} \). Note that \( \bar{T} = T_{s} + \tau \) was defined as the journal’s average publication delay indicator (Yu et al. 2004), with \( T_{s} = N/Y \), where N is the deposited contribution quantity of the journal at steady state and Y is the number of papers published in the journal per year. Either model (Eq. 2 or Eq. 3) can be used to solve for the model parameters. When Eq. 3 is taken as the model, the parameter optimization fitting method is used to identify the optimum model parameters (Ljung 1999, p. 501). This method, introduced in our earlier papers (Yu et al. 2005; Yu and Li 2007), is an optimization problem for estimating \( T_{1} \), \( T_{s} \) and \( \tau \); the aim is to minimize the objective function:

$$ J^{\prime} = \sum\limits_{k = 1}^{m} {(\hat{f}(T_{k} ) - f(T_{k} ))^{2} } $$
(4)

In Eq. 4, \( \hat{f}(T_{k} ) \) is the actual citation distribution data in JCR, and \( f(T_{k} ) \) is the theoretical value calculated by Eq. 3. When the model parameters of the citing process are identified by fitting C(T) to the cumulative citation distribution data of a journal in JCR 2005, the objective function is:

$$ J = \sum\limits_{k = 1}^{m} {(\hat{C}(T_{k} ) - C(T_{k} ))^{2} } $$
(5)

In Eq. 5, \( \hat{C}(T_{k} ) \) is the actual cumulative percentage of the citation distribution in JCR, and \( C(T_{k} ) \) is the theoretical value calculated by Eq. 2.
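
The identification step described above can be illustrated with a minimal sketch. It is not the authors’ identification procedure: it replaces the parameter optimization fitting method with a plain exhaustive grid search, uses only the Python standard library, and fits synthetic data generated from Eq. 2 with made-up parameters rather than real JCR data. The function names, the grid bounds and the values in `TRUE_PARAMS` are all assumptions chosen for illustration.

```python
import math


def cumulative_citation(T, T1, Ts, tau):
    """Eq. 2: cumulative citation fraction at age T (taken as 0 before the pure delay tau)."""
    if T <= tau:
        return 0.0
    x = T - tau
    return 1.0 - (T1 * math.exp(-x / T1) - Ts * math.exp(-x / Ts)) / (T1 - Ts)


def objective(params, ages, observed):
    """Eq. 5: sum of squared differences between observed and model values of C(T_k)."""
    T1, Ts, tau = params
    return sum((c_hat - cumulative_citation(T, T1, Ts, tau)) ** 2
               for T, c_hat in zip(ages, observed))


def grid_search(ages, observed):
    """Exhaustive search over a coarse, hypothetical parameter grid (all values in years)."""
    best_params, best_J = None, float("inf")
    for i in range(45):              # T1 = 1.00, 1.25, ..., 12.00
        T1 = 1.0 + 0.25 * i
        for j in range(2, 20):       # Ts = 0.10, 0.15, ..., 0.95 (always below T1)
            Ts = 0.05 * j
            for k in range(31):      # tau = 0.00, 0.05, ..., 1.50
                tau = 0.05 * k
                J = objective((T1, Ts, tau), ages, observed)
                if J < best_J:
                    best_params, best_J = (T1, Ts, tau), J
    return best_params, best_J


# Synthetic "observations": Eq. 2 evaluated at ages 1..10 years with made-up
# parameters, standing in for the cumulative percentages listed in JCR.
TRUE_PARAMS = (6.25, 0.45, 0.65)
ages = list(range(1, 11))
observed = [cumulative_citation(T, *TRUE_PARAMS) for T in ages]

(T1_hat, Ts_hat, tau_hat), J_min = grid_search(ages, observed)
print(f"T1={T1_hat:.2f}, Ts={Ts_hat:.2f}, tau={tau_hat:.2f}, J={J_min:.3e}")
```

With real data, which are noisy and limited to a few yearly points, a gradient-based or simplex optimizer started from several initial guesses would normally replace the grid; the grid is used here only because it is easy to verify by inspection.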

Comparison and validation of citation distribution models

As a tool for quantitatively analyzing the observed aging process, Eq. 2 or 3 is an applicable and effective model; however, whether it is superior to other models should be validated in practice. In this section, we compare it with other citation distribution models using the classical age distribution data of 80,005 citations of 2,595 papers (published in 1980) from the Journal of Biological Chemistry and Biochemistry (Wang 1997, p. 353) and demonstrate its superiority.

In this paper, we let T describe citation age and put those classical citation distribution models into two categories as follows:

Bernal negative exponential model and its improved model

The Bernal negative exponential aging model is a classical citation distribution model proposed by Bernal, an English physicist, in 1958 (Wang 1997); its expression is

$$ f(T) = K \cdot e^{ - \alpha T} $$
(6)

In Eq. 6, T is the age of literature, K is a constant, and α is the aging coefficient. The Bernal model is concise and reflects, on the whole, how references change with time; but its fit is not good, with a large fitting error at the beginning of the reference age distribution, so researchers turned to the cumulative citation distribution model. By integrating Eq. 6, we can obtain the cumulative citation distribution function (dimensionless).

Brookes (1970) proposed an improved Bernal model, the negative exponential model of the cumulative citation distribution:

$$ C(T) = Ka^{T} $$
(7)

Here, C(T) is the number of cited documents whose age is greater than or equal to T (that is, documents published T years ago or earlier), and a < 1.

In substance, the two models above are equivalent in theory: the Brookes model (Eq. 7) can be deduced from the Bernal model. Based on the definitions of f(T) (Eq. 6) and C(T), we obtain

$$ C(T) = \int\limits_{T}^{\infty } {f(t)} \;dt = \int\limits_{T}^{\infty } {Ke^{ - \alpha t} } \;dt = \frac{K}{\alpha } e^{ - \alpha T} $$

Letting \( K^{\prime} = K/\alpha \), we have

$$ C(T) = K^{\prime}a^{T} $$

Here, \( a = e^{ - \alpha } < 1 \), so Eqs. 6 and 7 are equivalent.

Burton and Kebler equation and its improved models

The Burton and Kebler equation (B–K equation for short) is a classical obsolescence model based on citation analysis (Wang 1997); its expression is

$$ C(T) = 1 - (ae^{ - T} + be^{ - 2T} ) $$
(8)

In Eq. 8, a + b = 1, C(T) is a cumulative citation distribution function (dimensionless), and T is the age of cited literature measured in units of ten years. One improved form of the B–K equation is the corrected B–K equation proposed by Motylev (Мотылев) in 1982 (Wang 1997):

$$ C(T) = 1 - (ae^{ - (T - 0.1)} + be^{ - 2(T - 0.1)} ) $$
(9)

In Eq. 9, the meanings of a, b, T and C(T) are the same as in Eq. 8. The corrected equation accounts for the delay phenomenon in the research-citation cycle, but fixing the delay at exactly one year (0.1 in units of ten years) is imprecise. This delay is mainly caused by the publishing process.

Another corrected B–K equation is a mathematical series expression proposed by Ding (1993):

$$ \begin{gathered} C(T) = 1 - \sum\limits_{i = 1}^{n} {a_{i} } \cdot e^{ - iT} \\ \sum\limits_{i = 1}^{n} {a_{i} } = 1,\quad i = 1,2,3, \ldots ,n \end{gathered} $$
(10)

In Eq. 10, C(T) and T are defined as above. Eq. 10 reduces to Eq. 8 when n = 2; Eq. 10 may be superior to Eq. 8 when n > 2, but the additional parameters to be identified make solving the model very complicated.

Using the classical age distribution data of the two biochemistry journals, the fitting results of four models are shown in Figs. 4 and 5: Fig. 4 shows the fitted cumulative citation distribution curves of the four models, and Fig. 5 shows the fitting errors. The identified parameter of the Bernal negative exponential model (the integral expression of Eq. 6 is \( C(T) = 1 - e^{ - \alpha T} \)) is α = 0.1070; the identified parameters of the B–K equation are a = 0.1070, b = 0.8930; those of the corrected B–K equation are a = 0.0275, b = 0.9725; and those of the second-order transfer function model in this paper are \( T_{1} = 6.2430 \), \( T_{s} = 0.4323 \), τ = 0.6525. We can draw the following conclusions:

Fig. 4 Fitting results of four citation distribution models

Fig. 5 Fitting errors of four citation distribution models

Firstly, our model (Eq. 2) and the corrected B–K equation (Eq. 9) fit better than the Bernal aging model (Eq. 6) and the B–K equation (Eq. 8), especially at the beginning of the citation distribution curve; the publication delay correction is therefore necessary.

Secondly, Fig. 5 shows that the fitting error of our model is the smallest and that the large fitting error at the beginning of the citation distribution curve is avoided; at the same time, the parameters of this model have clear physical meanings. The model is therefore superior to the others and is used in this paper to study the citing and cited pairs of a scientific journal and to find the intrinsic law of the journal’s referencing or citation process.
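
For readers who wish to reproduce the shape of the four fitted curves, the models compared in this section can be evaluated with the identified parameters reported above. The sketch below is illustrative only: the function names are assumptions, ages are entered in years and converted to units of ten years for the B–K equations (as Eq. 8 requires), and no attempt is made to reproduce the original data or the fitting errors of Fig. 5.

```python
import math


def bernal_cumulative(T, alpha):
    """Integral form of Eq. 6 used in the comparison: C(T) = 1 - exp(-alpha*T), T in years."""
    return 1.0 - math.exp(-alpha * T)


def burton_kebler(T_years, a, b, shift=0.0):
    """Eq. 8 (shift=0.0) or Eq. 9 (shift=0.1); these equations measure age in units of ten years."""
    T = T_years / 10.0 - shift
    return 1.0 - (a * math.exp(-T) + b * math.exp(-2.0 * T))


def second_order(T, T1, Ts, tau):
    """Eq. 2, taken as 0 before the pure delay tau; T in years."""
    if T <= tau:
        return 0.0
    x = T - tau
    return 1.0 - (T1 * math.exp(-x / T1) - Ts * math.exp(-x / Ts)) / (T1 - Ts)


ages = list(range(1, 11))  # citation ages in years

# Parameter values are the identified values quoted in the text.
curves = {
    "Bernal": [bernal_cumulative(T, 0.1070) for T in ages],
    "B-K": [burton_kebler(T, 0.1070, 0.8930) for T in ages],
    "corrected B-K": [burton_kebler(T, 0.0275, 0.9725, shift=0.1) for T in ages],
    "second order": [second_order(T, 6.2430, 0.4323, 0.6525) for T in ages],
}

for name, values in curves.items():
    print(f"{name:>14}: " + " ".join(f"{v:.3f}" for v in values))
```

Comparing the printed rows with observed cumulative percentages would give residuals analogous to those plotted in Fig. 5.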

The calculating method of the half-life

In the JCR database, reliable data are lacking when the citing (cited) half-life exceeds ten years. Yu et al. (2006) derived the equation for calculating the half-life and studied how \( T_{0.5} \) changes with \( T_{s} \) or \( T_{1} \); \( T_{0.5} \) satisfies

$$ T_{1} \cdot e^{{ - {\frac{{T_{0.5} - \tau }}{{T_{1} }}}}} - T_{s} \cdot e^{{ - {\frac{{T_{0.5} - \tau }}{{T_{s} }}}}} = 0.5 \cdot (T_{1} - T_{s} ) $$
(11)

Eq. 11 is an implicit function of \( T_{0.5} \) and describes the relation between \( T_{0.5} \) and the model parameters (τ, \( T_{s} \) and \( T_{1} \)). When \( T_{s} \), \( T_{1} \) and τ of a journal’s referencing (citation) process have been identified from the citation (reference) distribution data using Eq. 2 or Eq. 3, \( T_{0.5} \) can be solved from Eq. 11.
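
Because Eq. 11 has no closed-form solution, it has to be solved numerically. The following minimal sketch uses bisection, which is one possible choice and not necessarily the method used by Yu et al. (2006); the function names and the search bracket are assumptions, and the example parameters are the values identified for the biochemistry data earlier in this paper.

```python
import math


def half_life_residual(T, T1, Ts, tau):
    """Left-hand side minus right-hand side of Eq. 11 at a candidate half-life T."""
    x = T - tau
    return T1 * math.exp(-x / T1) - Ts * math.exp(-x / Ts) - 0.5 * (T1 - Ts)


def solve_half_life(T1, Ts, tau, tol=1e-10):
    """Solve Eq. 11 for T_0.5 by bisection; assumes T1 > Ts > 0."""
    lo = tau               # residual is 0.5*(T1 - Ts) > 0 at T = tau
    hi = tau + 50.0 * T1   # residual is negative this far out in the tail
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if half_life_residual(mid, T1, Ts, tau) > 0.0:
            lo = mid       # the root lies to the right of mid
        else:
            hi = mid       # the root lies at or to the left of mid
    return 0.5 * (lo + hi)


# Parameters identified for the two biochemistry journals (years).
T1, Ts, tau = 6.2430, 0.4323, 0.6525
t_half = solve_half_life(T1, Ts, tau)
print(f"T_0.5 = {t_half:.3f} years")
```

Bisection is safe here because, for \( T_{1} > T_{s} \), the left-hand side of Eq. 11 decreases monotonically with \( T_{0.5} \), so the bracket contains exactly one root; for these parameters the half-life comes out at roughly 5.4 years.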

Results

In this paper, thirty-eight journals in three groups from two subjects are selected: 28 neurology journals from JCR 2003, and 14 neurology journals and 10 applied mathematics journals from JCR 2005. JCR reports inter-citation frequencies for thousands of journals and provides, for each journal, a set of its most closely related journals based on citation relationships. Statistically, the identified results are meaningful only when the original data are sufficient; therefore we selected journals with larger numbers of citations and references. In the data tables, the journals in each group are ranked by citing half-life in descending order. The two selected groups of neurology journals are shown in Tables 1 and 3; the group of applied mathematics journals is shown in Table 5.

Table 1 28 neurology journals and their data from JCR (2003)

Table 1 lists the original citing half-lives and cited half-lives of 28 neurology journals (from JCR 2003); the citing half-life and cited half-life of every journal differ. Table 2 shows the parameters \( T_{1} \), \( T_{s} \) and τ identified from the reference distribution data and from the citation distribution data of every journal in this group. Table 3 shows 14 neurology journals and their data from JCR 2005; Table 4 shows the identified parameters (\( T_{1} \), \( T_{s} \) and τ) of their referencing and citation processes. Table 5 shows 10 applied mathematics journals and their half-life data from JCR 2005; Table 6 shows the identified parameters (\( T_{1} \), \( T_{s} \) and τ) of the referencing and citation processes of these 10 journals. In these tables, parameters are given in years. Some interesting results are as follows:

Table 2 Identified parameters data of referencing and citation processes of 28 neurology journals
Table 3 14 neurology journals and their data from JCR (2005)
Table 4 Identified parameters of referencing and citation processes of 14 neurology journals
Table 5 The essence data of 10 journals of applied mathematics in JCR (2005)
Table 6 Identified parameters of referencing and citation processes of 10 mathematics journals
  1. (1)

    From the identified parameters of the referencing process in Tables 2, 4 and 6, the differences in the aging time constants (\( T_{1} \)) of the referencing processes among most journals are small: the aging time constants of most neurology journals lie between 6 and 8 years, and those of the applied mathematics journals between 12 and 17 years. The reason is that \( T_{1} \) is closely related to the aging of the subject to which the cited journals belong, and the journals cited by a journal mostly belong to the same subject (neurology or applied mathematics); the small differences among journals in the same group arise from their different cited journal groups or from the diversity of research domains. The identified parameters \( T_{s} \) and τ differ among journals because these two parameters of a journal’s referencing process presumably reflect the publication delay of that journal, so the identified average publication delays \( \left( {\bar{T}\, = T_{s} \, + \tau } \right) \) of these journals are quite different.

  2. (2)

    From the identified parameters of the citation process in Tables 2, 4 and 6, the citation process of each journal in the same subject mainly reflects the characteristics of the individual journal. The differences in the aging time constants (\( T_{1} \)) of the journals in every group are quite large: cited-\( T_{1} \) ranges from 0.971 to 11.4 years in Table 2, from 3.0 to 11.4 years in Table 4 and from 2.91 to 31.58 years in Table 6. This shows that the life cycles of these journals are very different; for example, Acta Neuropathologica and Pain have longer aging time constants and cited half-lives than the others. The publication delay parameters (\( T_{s} \) and τ) identified from the citation processes reflect, to a certain extent, the average publication delay of the referencing journal group of each journal.

Conclusions

According to the theoretical analysis and parameter identification results of observed reference or citation distributions of journals, some conclusions are obtained:

  1. (1)

    In an inter-citation journal group, the observed referencing process of a journal consists of the diffusing and aging of the cited literature and the publishing process of all citing articles published in the journal; the observed citation process of a journal consists of the diffusing and aging of all cited articles of the journal and the publishing process of the referencing papers published in the citing journal group.

  2. (2)

    For any subject journal group, the diffusing process of literature can be neglected given modern communication technology; so the cumulative citation distribution model (Eq. 2) and the citation distribution model (Eq. 3) are reasonable and can be used to describe the observed referencing or citation process of a journal.

  3. (3)

    For a journal in a subject journal group, its referencing process and its citation process reflect different relations between the individual and the whole: the former is active, the latter passive. The citation process of a scientific journal presumably reflects, to a certain extent, the impact and vitality of the articles published in the journal and the average publication delay of the citing papers published in the journal group; the journal’s referencing process presumably reflects the obsolescence of the subject literature, the citation mode and the average publishing speed of the journal.

  4. (4)

    The same model describes the referencing and citation processes, but the meanings of the model parameters differ. The identified citing-\( T_{1} \) of a journal reflects the obsolescence of the cited literature in the subject; the identified parameter \( \bar{T}\, = T_{s} + \,\tau \) presumably reflects the mean publishing delay of the journal but is not its actual value. The identified cited-\( T_{1} \) of the journal reflects the obsolescence of the journal’s own cited literature; the identified parameter \( \bar{T}\, = T_{s} + \,\tau \) presumably reflects the mean publishing delay of the articles in the journal group that cite the journal, but again is not the actual value.

  5. (5)

    In the earlier paper (Yu and Li 2007), one main reason for the difference between the identified publication delay parameter and the actual delay is that the diffusing process of the cited literature was not taken into account in modeling; another is the fitting error caused by the limited number of data points in ISI-JCR.

  6. (6)

    In the identification results of the journals’ referencing processes, the differences among the citing-\( T_{1} \) values of journals are smaller than those among the cited-\( T_{1} \) values; the reason is that citing-\( T_{1} \) is closely related to the literature obsolescence of the subject journal group, and the small differences arise because the cited journal group of each citing journal is different. In the identification results of the journals’ citation processes, the differences among the cited-\( T_{1} \) values are large. One reason is that cited-\( T_{1} \) is closely related to the obsolescence of each journal’s own literature, whose aging rate varies considerably; another is that the scholarly quality and research domain of each journal are distinctive, so its life cycle is quite different from those of the others. For example, the cited-\( T_{1} \) values of the applied mathematics journals vary from 2.9 to 31.54 years. Aging also depends on document and journal types: the half-life of a basic theoretical discipline is expected to be longer than that of a technical discipline; thus, the citing-\( T_{1} \) values and citing half-lives of the applied mathematics journals are longer than those of the neurology journals.

  7. (7)

    According to the model of the citation distribution, the parameters \( T_{s} \) and τ of the citation process theoretically reflect, to a certain extent, the average publication delay of all citing journals (a journal group); but the parameters \( \bar{T}\, = T_{s} + \,\tau \) identified for each journal from actual citation data are quite different, which might be caused by the different citing journal groups of the journals and by fitting errors. The significance of the identified \( \bar{T} \) cannot be clearly explained because it is quite difficult to calculate the actual \( \bar{T} \) of the citing journal group.