1 Introduction

A citation is a relation between two scientific publications (see footnote 1). We can visualize it as an arrow from a node representing a citing publication to a node representing a cited publication. A collection of articles and the citations between them forms a directed graph called a citation graph or citation network [1]. Citation network analysis provides useful data for many research information systems. However, a citation is more than a bare relationship between two papers, without any precise meaning or inner structure. Consequently, there is a vast amount of literature on the creation and analysis of citation content data (e.g. [2, 3]). For example, the citation context can tell us about the reasons for a citation. This knowledge allows us to add meaning to an arrow representing a citation. To this end, we can use the CiTO ontology, which enables characterization of the nature or type of citations [4]. In our opinion, we can do more. After reading two papers (citing and cited), we can add meaning to relations between parts of the papers (e.g. concepts, definitions, figures) linked by a citation. This is possible because we know which parts (entities) of a cited paper are used in a citing paper and how they are used. Moreover, we can name the relations between papers and entities from these papers. In this way, the structure of a citation emerges. This structure is usually known to a reader, but it is not represented explicitly, so machines cannot process it. Until recently, such a representation was not possible. Nowadays, however, using semantic technologies, we can represent the structure of a citation in a machine-readable way. We proposed such a representation, based on so-called expanded citations, in our previous papers [5, 6].

Fig. 1. The global and local structure of a citation network (\(\bigcirc \): publications, \(\bullet \): entities from publications).

There is a vast amount of literature on citation networks and their global structure (see, e.g. [7]). In this paper, we are interested in the local properties of a citation network. Namely, we present and briefly analyze the structures of individual citations, which contain not only papers but also entities from them (Fig. 1). The paper is organized as follows. Section 2 gives a brief overview of expanded citations. In Sect. 3, we analyze selected structures of citations. In particular, we discuss their meaning and consider whether the structures can be useful in the evaluation of a scientist’s work. The paper ends with a short discussion and an outline of future work.

2 Expanded Citations

A bibliographic citation links two articles (see Fig. 2a). Egghe and Rousseau [1] state: the fact that a document is mentioned in a reference list indicates that in the author’s mind there is a relationship between a part or the whole of the cited document and a part or the whole of the citing document. Most studies have focused on the relationship between entire papers [1, 4, 7]. With expanded citations, by contrast, we can represent a relationship between parts of papers in a machine-readable form. Consequently, instead of one relation (cites) between two papers, we consider several relations between these papers and also between parts of them, called concepts (see Fig. 2b). We define a concept as any entity (part) of a paper named with a URI (Uniform Resource Identifier) [5]. We assume that it is possible to assign a URI to each entity from a scientific publication (for details, see [5]). In the rest of this paper, we denote a concept from a publication X by \(C_X\).

Fig. 2. A citation A cites B (a) and its exemplary structure (b).

We are now ready to introduce the main definition of this paper (see [6] and references therein). Let A and B be two publications. We say that a citation \(A\rightarrow B\) (A cites B) is expandable if there exist concepts \(C_A\) and \(C_B\), relations r, \(r_A\), \(r_B\) represented by object properties from some ontology O and the following RDF (Resource Description Framework) triples:

$$\begin{aligned} B\,r_B\,C_B.\,\,\,C_A\,r\,C_B.\,\,\,A\,r_A\,C_A. \end{aligned} \tag{1}$$

We call the set of triples (1) an expanded citation. Moreover, we call an expanded citation created for a standard citation its expansion.
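The definition above can be illustrated with a minimal Python sketch. It stores RDF triples as plain tuples and checks whether a triple set contains the pattern (1) for a given pair of publications. The function names (`expanded_citation`, `find_expansion`) and the `ex:` identifiers and predicates are hypothetical placeholders introduced only for illustration; they do not come from CiTO or from [5, 6], and the sketch does not check that predicates belong to an ontology O.

```python
from typing import Iterable, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def expanded_citation(a: str, b: str, c_a: str, c_b: str,
                      r_a: str, r: str, r_b: str) -> Set[Triple]:
    """Return the three triples of (1): B r_B C_B, C_A r C_B, A r_A C_A."""
    return {(b, r_b, c_b), (c_a, r, c_b), (a, r_a, c_a)}


def find_expansion(triples: Iterable[Triple], a: str, b: str) -> Optional[Tuple[str, str]]:
    """Look for concepts (C_A, C_B) showing that the citation A -> B is expandable.

    Returns the first matching pair, or None if the pattern (1) is absent.
    """
    triples = set(triples)
    for s1, _, c_a in triples:
        if s1 != a:
            continue  # need a triple A r_A C_A
        for s2, _, c_b in triples:
            if s2 != c_a:
                continue  # need a triple C_A r C_B
            if any(s3 == b and o3 == c_b for s3, _, o3 in triples):
                return (c_a, c_b)  # found B r_B C_B as well
    return None


# Hypothetical identifiers and predicates, used only as placeholders.
g = expanded_citation("ex:A", "ex:B", "ex:A#concept", "ex:B#concept",
                      r_a="ex:contains", r="ex:uses", r_b="ex:contains")
print(find_expansion(g, "ex:A", "ex:B"))
print(find_expansion(g, "ex:B", "ex:A"))
```

The search is directional: swapping A and B finds no expansion, because the triples of (1) describe a path starting at the citing paper.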

Fig. 3. An expanded citation.

Note that one expanded citation describes in RDF a single “path” between the nodes A and B representing articles (Fig. 3). Consequently, to represent a citation structure made up of several “paths”, we may need several expansions (compare Figs. 2b and 3). Moreover, to create expanded citations, we need terms from appropriate ontologies that add semantics to the relations between publications and concepts. An example of such an ontology is CiTO [4].
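As a rough illustration of how several “paths” combine into one citation structure, the following Python sketch takes the union of the triple sets of two expansions that share a cited concept. The helper `citation_structure` and all `ex:` names are assumptions made for this example, not terms from an existing ontology.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def citation_structure(expansions: Iterable[Set[Triple]]) -> Tuple[Set[Triple], Set[str]]:
    """Merge several expansions into one structure.

    Returns the union of all triples and the set of nodes they mention.
    Triples shared by two expansions are stored only once.
    """
    triples: Set[Triple] = set()
    for expansion in expansions:
        triples |= expansion
    nodes = {s for s, _, _ in triples} | {o for _, _, o in triples}
    return triples, nodes


# Two expansions of the same citation A -> B (hypothetical placeholder names).
# Both paths end in the same concept of B, so they overlap in one triple.
path_1 = {("ex:B", "ex:contains", "ex:B#c"),
          ("ex:A#c1", "ex:uses", "ex:B#c"),
          ("ex:A", "ex:contains", "ex:A#c1")}
path_2 = {("ex:B", "ex:contains", "ex:B#c"),
          ("ex:A#c2", "ex:uses", "ex:B#c"),
          ("ex:A", "ex:contains", "ex:A#c2")}

triples, nodes = citation_structure([path_1, path_2])
print(len(triples), len(nodes))
```

Because sets discard duplicates, the shared triple is counted once, so the merged structure has five triples and five nodes rather than six of each.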

3 Citations and Their Structures

Let us now consider three examples of citations and their structures (see footnote 2).

  (I)

    Citing entity: DOI 10.2478/plc-2013-0010

    Cited entity: ISBN 10 0195070038

    Citation context: This process is called “knowledge construction” (e.g. Rogoff, 1990).

    The author refers to the concept of knowledge construction introduced in the cited article. We present in Fig. 4-I an expansion for this citation.

  (II)

    Citing entity: DOI 10.1103/PhysRevA.58.4336

    Cited entity: DOI 10.1103/PhysRevA.54.4676

    Citation contexts: Grot, Rovelli and Tate [13] introduced a regularized self-adjoint operator and considered the full expression (10) for possible application to more general states having both positive and negative momenta but vanishing in the proximity of \(p=0\).

    Grot, Rovelli, and Tate [13] (...) produce a “regularized” self-adjoint time operator \(T_\varepsilon \) with eigenstates (...), \(\langle p|T\pm \rangle _\varepsilon =...\) (17).

    Formulas (10) and (17) from the citing paper are in well-defined relations with formulas (56) and (41) from the cited paper. Hence, we represent the structure of this citation by two expanded citations (see Fig. 4-II).

  (III)

    Citing entity: DOI 10.1007/s10814-010-9045-7

    Cited entity: ISBN 10 0122598504

    Citation contexts: Fig. 3 Ground plans Ground plans of prehispanic houses from Oaxaca redrawn and adapted from the following sources: (...)(Flannery and Winter 1976, Fig. 2.17)(...) Fig. 6 Two extensively and meticulously excavated houses.(...) Flannery and Winter 1976, Fig. 2.17(...).

    Figures 3 and 6 use figure 2.17 from the cited paper. The structure of this citation is presented in Fig. 4-III.

Fig. 4. The structures of citations from examples (I)-(III).

Fig. 5. The structures of citations with at most 2 expansions.

The structures of citations presented in Fig. 4 do not exhaust all possible connections between two papers. Let us consider which structures are possible in general. We limit ourselves to structures of citations with at most two expansions. All possible structures in this case are presented in Fig. 5. The citation presented above in example (I) has structure \(\mathbf {1}\) (1-chain). Citations having this structure are often used in the Introduction, Related Work or Discussion sections. Note that, in structure \(\mathbf {1}\), paper A refers directly to \(C_B\). If \(C_B\) is somehow used in A, then there may exist a concept \(C_A\) in some relation with \(C_B\). The citation then has structure \(\mathbf {2}\) (2-chain). Structure \(\mathbf {3}\) (diamond) corresponds to the situation in which A refers directly to two concepts from B. However, these concepts are not “linked” to any concepts from A. In turn, structures \(\mathbf {4}\) (pentagon) and \(\mathbf {5}\) (hexagon) contain concepts from A related to concepts from B. Structure \(\mathbf {5}\) contains two concepts from A. This structure already appeared in example (II) above. In structure \(\mathbf {4}\), one concept from B is related to a concept from A, while another concept from B is only discussed or mentioned in paper A. It is important to note that two expansions of a citation may overlap. For example, they may contain the same concept. In structure \(\mathbf {6}\), two different concepts from A are linked to the same concept from B. By analogy to bibliographic coupling [7], we can say that the two concepts from A are conceptually coupled because they are both linked to the same concept from B. Figures 3 and 6 from our example (III) are conceptually coupled (see Fig. 4-III). In structure \(\mathbf {7}\), a concept from A is linked to two concepts from B. In this case, by analogy to co-citation [7], we can say that the two concepts from B are co-used by a concept from A.

It seems reasonable to assume that all structures \(\mathbf {1}\)–\(\mathbf {7}\) may appear in practice. So one may ask: how can we use them? Let us now consider whether the structures of citations can be useful in the evaluation of a scientist’s work.
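The case analysis above can be sketched in Python. The encoding is an assumption made for this illustration: each expansion is reduced to a pair (C_A, C_B), with `None` in place of C_A when paper A refers to C_B directly. The function `classify` and this encoding are hypothetical and follow only the verbal description of Fig. 5 given in the text, not the figure itself.

```python
from typing import Optional, Sequence, Tuple

# One "path" of a citation: (concept from A, or None if A refers to C_B directly; concept from B).
Expansion = Tuple[Optional[str], str]


def classify(expansions: Sequence[Expansion]) -> Optional[int]:
    """Return the structure number 1-7 described in the text, or None if no case matches."""
    if len(expansions) == 1:
        c_a, _ = expansions[0]
        return 1 if c_a is None else 2  # 1-chain or 2-chain
    if len(expansions) != 2:
        return None  # only structures with at most two expansions are covered
    (a1, b1), (a2, b2) = expansions
    direct = [a1, a2].count(None)  # number of paths without a concept from A
    if b1 != b2:
        if direct == 2:
            return 3  # diamond: two concepts from B, none from A
        if direct == 1:
            return 4  # pentagon: one linked concept from B, one only mentioned
        return 7 if a1 == a2 else 5  # co-used concepts, or hexagon
    if direct == 0 and a1 != a2:
        return 6  # conceptually coupled concepts from A
    return None  # combination not among the seven described structures


print(classify([("fig3", "fig2.17"), ("fig6", "fig2.17")]))
```

The usage line mirrors example (III): two figures of the citing paper linked to the same figure of the cited work fall under structure 6.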

Nowadays, in the evaluation of a scientist’s work, only the presence of a citation is taken into account [1]. The structure (meaning) of a citation is ignored. However, the structure may contain important information. It is reasonable to assume that authors particularly value those publications by others in which concepts (e.g. propositions, approaches, formulas) from their own publications are actually used, i.e. related to (new) concepts from the citing publications. We may say that such used concepts contribute to the progress of science. By contrast, citations of the form: In the literature, there are many examples of... placed in the Introduction section are of less value. At present, both kinds of citations are treated equally. Knowledge of a citation structure enables us to distinguish between them. Without considering the details of the structures, we can assume that the more concepts from a citing paper a citation structure contains, the higher the value of the citation. Using this “rule”, we can sort the structures presented in Fig. 5 by their values, in order of increasing importance, as follows: \(\mathbf {1,3,2,4,7,6,5}\). Note that the above considerations are valid only if citations are positive or at least neutral. However, this is not necessarily the case. There are also negative citations, which may indicate problems or flaws in the work or an opposing viewpoint. For example, paper A may give two counterexamples to a statement proposed in B. This negative citation has structure \(\mathbf {6}\), which ranks as highly valuable according to the above list. Thus, for negative citations, the proposed “rule” of citation evaluation does not apply.
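A minimal Python sketch of the ordering “rule” follows. The text states only the primary criterion (the number of concepts from the citing paper). The two tie-breakers used here, the number of concepts from the cited paper and then the number of links between concepts, are assumptions added for this illustration; they happen to reproduce the order given above. The per-structure counts are read from the verbal description of Fig. 5, and the names `COUNTS`, `rank_structures` and `value_rank` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# structure number -> (concepts from A, concepts from B, links between a C_A and a C_B)
COUNTS: Dict[int, Tuple[int, int, int]] = {
    1: (0, 1, 0),  # 1-chain
    2: (1, 1, 1),  # 2-chain
    3: (0, 2, 0),  # diamond
    4: (1, 2, 1),  # pentagon
    5: (2, 2, 2),  # hexagon
    6: (2, 1, 2),  # conceptually coupled concepts from A
    7: (1, 2, 2),  # co-used concepts from B
}


def rank_structures() -> List[int]:
    """Sort structures by increasing value; tuples compare element by element."""
    return sorted(COUNTS, key=lambda s: COUNTS[s])


def value_rank(structure: int, negative: bool = False) -> Optional[int]:
    """Position in the ranking (0 = least valuable), or None for negative citations,
    to which the rule does not apply."""
    if negative:
        return None
    return rank_structures().index(structure)


print(rank_structures())
```

Returning `None` for negative citations keeps the counterexample case (structure 6 used to refute a statement) out of the ranking instead of assigning it a misleadingly high value.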

4 Discussion and Future Work

Expanded citations make it possible to represent the structure of a citation in a machine-readable way. This possibility opens new perspectives for processing (searching and visualizing) scientific domains and for the evaluation of a scientist’s work (see [6] and references therein). However, the picture is still far from complete. Further studies are needed to estimate how large the class of expandable citations is. Does the size of this class depend on the scientific domain? In order to create an expanded citation, we need appropriate ontologies. Consequently, future work should also identify ontologies, and terms from them, that are useful in expanded citations. Another critical issue for future studies is to determine the extent to which machines can support the creation of expanded citations. The results obtained in the automatic classification of citation function using CNNs (Convolutional Neural Networks) and NLP (Natural Language Processing) suggest that machine support in the creation of expanded citations cannot be ruled out [8].