1 Introduction

In a temporal network, the presence and activity of nodes and links can change through time. In the last two decades, the interest for the analysis of temporal networks increased partially motivated by travel-support services and the analysis of sequences of interaction events (e-mails, news, phone calls, collaboration, etc.). The approaches and results were recently surveyed in the book Holme and Saramäki (2013). See also papers Holme and Saramäki (2012) and Holme (2015).

Most of temporal social networks data contain the information about activity time intervals of their links, sometimes augmented by the activity intensity. The usual approach to the (data) analysis of temporal networks is to transform the network into a sequence of time slices—static networks corresponding to selected time intervals—see, for example, Moody et al. (2005), Kim et al. (2012), Gulyás et al. (2013). Afterward, each time slice is analyzed using the standard methods for analysis of static networks. Finally, the results are collected into a temporal sequence of results. In this paper, we propose an alternative approach, based on the notion of a temporal quantity, that bypasses explicit construction of time slices. The developed algorithms are transforming temporal networks directly into results in the form of temporal quantities, vectors, temporal vectors or partitions, and temporal networks.

In the paper, we first present the basic notions about temporal networks. In Sect. 3, we introduce the temporal quantities and propose an algebraic approach, based on semirings, to the analysis of temporal networks. In the following sections we show that most of the traditional network analysis concepts and algorithms such as degrees, clustering coefficient, closeness, betweenness, weak and strong connectivity, and PathFinder skeleton can be straightforwardly extended to their temporal versions.

2 Description of temporal networks

For the description of temporal networks we propose an elaborated version of the approach used in Pajek (de Nooy et al. 2012). Pajek supports two types of descriptions of temporal networks based on presence and on events (Pajek 0.47, July 1999). Here, we describe only the approach to capturing the presence of nodes and links.

A temporal network \(\mathcal {N}_T =(\mathcal {V},\mathcal {L}, \mathcal {T},\mathcal {P},\mathcal {W})\) is obtained by attaching the time, \(\mathcal {T}\), to an ordinary network, where \(\mathcal {T}\) is a set of time points, \(t \in \mathcal {T}\). \(\mathcal {V}\) is the set of nodes, \(\mathcal {L}\) is the set of links, \(\mathcal {P}\) is the set of node properties, and \(\mathcal {W}\) is the set of link properties or weights (Batagelj 2009). The time \(\mathcal {T}\) is usually either a subset of integers, \(\mathcal {T}\subseteq \mathbb {Z}\), or a subset of reals, \(\mathcal {T}\subseteq \mathbb {R}\). In Pajek \(\mathcal {T}\subseteq \mathbb {N}\). In a general setting, it could be any linearly ordered set.

In a temporal network, nodes \(v \in \mathcal {V}\) and links \(l \in \mathcal {L}\) are not necessarily present or active at all time points. Let T(v), \(T \in \mathcal {P}\), be the activity set of time points for the node v; T(l), \(T \in \mathcal {W}\), the activity set of time points for the link l. The following consistency condition is imposed: if a link l(uv) is active at the time point t then its end-nodes u and v should be active at the time t. Formally, we express this by

$$\begin{aligned} T(l(u,v)) \subseteq T(u) \cap T(v) . \end{aligned}$$

The activity set T(e) of a node/link e is usually described as a sequence of activity time intervals

$$\begin{aligned} ([s_i,f_i))_{i=1}^k, \end{aligned}$$

where \(s_i\) is the starting time and \(f_i\) is the finishing time.

We denote a network consisting of links and nodes active at the time \(t \in \mathcal {T}\) by \(\mathcal {N}(t)\) and call it the (network) time slice or footprint of t. Let \(\mathcal {T}' \subset \mathcal {T}\) (for example, a time interval). The notion of a time slice is extended to \(\mathcal {T}'\) by

$$\begin{aligned} \mathcal {N}(\mathcal {T}') = \bigcup _{t\in \mathcal {T}'} \mathcal {N}(t) . \end{aligned}$$

2.1 Examples

Let us look at some examples of temporal networks.

Citation networks can be obtained from bibliographic data bases such as Web of Science (Knowledge) and Scopus. In a citation network \(\mathcal {N} =(\mathcal {V},\mathcal {L}, \mathcal {T},\mathcal {P},\mathcal {W})\), its set of nodes \(\mathcal {V}\) consists of selected works (papers, books, reports, patents, etc.). There exists an arc l(uv) \(\in \mathcal {L}\) iff the work u cites the work v. The time set \(\mathcal {T}\) is usually an interval of years \([\textit{year}_{\textit{first}}, \textit{year}_{\textit{last}}]\) in which the works were published. The activity set of the work v, T(v), is the interval \([\textit{year}_{\textit{pub}}(v), \textit{year}_{\textit{last}}]\); the activity set of the arc l(uv), T(l), can be set to the interval \([\textit{year}_{\textit{pub}}(u), \textit{year}_{\textit{pub}}(u)]\) (instances approach) or to the interval \([\textit{year}_{\textit{pub}}(u), \textit{year}_{\textit{last}}]\) (cumulative approach). An example of a property \(p \in \mathcal {P}\) is the number of pages or the number of authors. Other properties, such as work’s authors and keywords, are usually represented as two-mode networks.

Project collaboration networks are usually based on some project data base such as Cordis. The set of nodes \(\mathcal {V}\) consists of participating institutions. There is an edge \(e(u:v) \in \mathcal {L}\) iff institutions u and v work on a joint project. The time set \(\mathcal {T}\) is an interval of dates/days \([\textit{day}_{\textit{first}}, \textit{day}_{\textit{last}}]\) in which the collaboration data were collected. \(T(v) = \mathcal {T}\) and \(T(e) = \{ [s,f] :\) there exists a project P such that u and v are partners on P; s is the start and f is the finish date of \(P \}\).

KEDS/WEIS networks are networks registering political events in critical regions in the world (Middle East, Balkans, and West Africa) on the basis of daily news. Originally they were collected by KEDS (Kansas Event Data System). Currently they are hosted by Parus Analytical Systems. The set of nodes \(\mathcal {V}\) contains the involved actors (states, political groups, international organizations, etc.). The links are directed and are describing the events:

$$\begin{aligned} ( date, actor_1, actor_2, action ) \end{aligned}$$

on a given date the \(actor_1\) made the action on the \(actor_2\). Different actions are determining different relations—we get a multirelational network with a set of links partitioned by actions \(\mathcal {L} = \{ \mathcal {L}_\alpha : \alpha \in \textit{Actions} \}\). The time set is determined by the observed period \(\mathcal {T}= [day_{\textit{first}},day_{\textit{last}}]\). Since most of the actors are existing during all the observed period their node activity time sets are \(T(v) = \mathcal {T}\). Another option is to consider as their node activity time sets the period of their engagement in the region. The activity time set T(l) of an arc \(l(u,v) \in \mathcal {L}_\alpha\) contains all dates – intervals \([day,day+1)\) – in which the actor u made an action α on the actor v. Another possibility is to base the description on a single relation network and store the information about the action α as a structured value in a triple \((day,day+1,value)\) \(value = [(action_1,count_1), (action_2,count_2), \ldots ,\) \((action_k,count_k) ]\) and introduce an appropriate semiring over such values (see Sect. 3).

There are many other examples of temporal networks such as genealogies, contact networks, and networks of phone calls. In Sect. 4, we shall analyze the Franzosi’s temporal network on violence in Italy (1919–1922), in Sect. 6, we shall apply proposed methods to the temporal bibliographic network on the stem cell research in Spain (1997–2012), and in Sect. 13 to the September 11 Reuters terror news temporal network.

3 Temporal quantities

Besides the presence/absence of nodes and links also their properties can change through time. For example, in the World trade flows (1962–2000) temporal network an arc from a country u to country v contains for each year the information on the value of export from u to v (Feenstra et al. 2005).

To describe the changes, we introduce the notion of a temporal quantity a with the activity set \(T_a \subseteq \mathcal {T}\)

where \(a'(t)\) is the value of a at an instant t, and denotes the value undefined.

We assume that the values of temporal quantities belong to a set A which is a semiring \((A,\oplus ,\odot ,0,1)\) for binary operations \(\oplus : A\times A \rightarrow A\) and \(\odot : A\times A \rightarrow A\) (Gondran and Minoux 2008; Batagelj 1994). This means that \((A,\oplus ,0)\) is an Abelian monoid—the addition \(\oplus\) is associative and commutative, and has 0 as its neutral element; \((A,\odot ,1)\) is a monoid – the multiplication \(\odot\) is associative and has 1 as its neutral element. In addition, multiplication distributes from both sides over the addition. Note that 0 and 1 denote the two elements of A that satisfy the required properties. In expressions, the precedence of the multiplication \(\odot\) over the addition \(\oplus\) is assumed. We can extend both operations to the set by requiring that for all it holds

The structure is also a semiring.

Fig. 1
figure 1

Semiring addition and multiplication in networks

The “default” semiring is the combinatorial semiring \((\mathbb {R}_0^+,+,\cdot ,0,1)\) where \(+\) and \(\cdot\) are the usual addition and multiplication of real numbers. In some applications other semirings are useful.

In applications of semirings in the analysis of graphs and networks the addition \(\oplus\) describes the composition of values on parallel walks and the multiplication \(\odot\) describes the composition of values on sequential walks—see Fig. 1. For the combinatorial semiring these two schemes correspond to basic principles of combinatorics: the Rule of Sum and the Rule of Product Riordan (1958).

The semiring \((\overline{\mathbb {R}_0^+}, \min , +, \infty , 0)\), \(\overline{\mathbb {R}_0^+} = \mathbb {R}_0^+ \cup \{\infty \}\), is suitable to deal with the shortest paths problem in networks; the semiring \((\{0,1\}, \vee , \wedge , 0, 1)\) for reachability problems. The standard references on semirings are Carré (1979) and Gondran and Minoux (2008).

3.1 Semiring of temporal quantities

Let denote the set of all temporal quantities over in the time \(\mathcal {T}\). To extend the operations to networks and their matrices, we first define the sum (parallel links) \(a \oplus b = s\) as

$$\begin{aligned} (a \oplus b)(t) = a(t) \oplus b(t) \end{aligned}$$

and \(T_s = T_a \cup T_b\); the product (sequential links) \(a \odot b = p\) as

$$\begin{aligned} (a \odot b)(t) = a(t) \odot b(t) \end{aligned}$$

and \(T_p = T_a \cap T_b\).

In these definitions and also in the following text, to avoid the ‘pollution’ with many different symbols, we use the symbols \(\oplus\) and \(\odot\) to denote the semiring operations. The appropriate semiring can be determined from the context. For example, in the definition of addition of temporal quantities the symbol \(\oplus\) on the left-hand side of the equation operates on temporal quantities and the symbol \(\oplus\) on the right-hand side denotes the addition in the basic semiring .

Let us define the temporal quantities \(\mathbf {0}\) and \(\mathbf {1}\) with requirements and \(\mathbf {1}(t) = 1\) for all \(t \in \mathcal {T}\). It is easy to verify that the structure is also a semiring, and therefore so is the set of square matrices of order n over it for the addition \(\mathbf {A} \oplus \mathbf {B} = \mathbf {S}\)

$$\begin{aligned} s_{ij} = a_{ij} \oplus b_{ij} \end{aligned}$$

and multiplication \(\mathbf {A} \odot \mathbf {B} = \mathbf {P}\)

$$\begin{aligned} p_{ij} =\mathop{\bigoplus}\limits_{{h=1}}^{n} a_{ih} \odot b_{hj} . \end{aligned}$$

Again, the symbols \(\oplus\) and \(\odot\) on the left-hand side operate on temporal matrices and on the right-hand side in the semiring of temporal quantities.

The matrix multiplication is closely related to traveling on networks. Consider an entry \(p_{ij}\) at an instant t

$$\begin{aligned} p_{ij}(t) =\mathop{\bigoplus} \limits _{{h=1}}^{n} a_{ih}(t) \odot b_{hj}(t) . \end{aligned}$$

For a value \(p_{ij}(t)\) to be defined (different from ) there should exist at the instant t at least one node h such that both the link (ih) and the link (hj) exist—the transition from the node i to the node j through a node h is possible. Its contribution is \(a_{ih}(t)\odot b_{hj}(t)\). This means that the matrix multiplication is taking into account only the links inside the time slice \(\mathcal {N}(t)\).

A construction of semirings for problems on temporal walks—journeys is a topic for further research (Praprotnik and Batagelj 2016b).

3.2 Operationalization

In the following, we shall limit our discussion to temporal quantities that can be described in the form of time interval/value sequences

$$\begin{aligned} a = ( (I_i, v_i) )_{i=1}^k \end{aligned}$$

where \(I_i\) is a time interval and \(v_i\) is a value of a on this interval. The number k denotes the length (number of terms) of the sequence a. In general, the intervals can be of different types: 1) \([ s_i, f_i]\); 2) \([ s_i, f_i)\); 3) \(( s_i, f_i]\); 4) \(( s_i, f_i)\). Also the value \(v_i\) can be structured. For example, \(v_i = (w_i, c_i, \tau _i)\)—weight, capacity, and transition time; or \(v_i = (d_i,n_i)\)—the length of geodesics and the number of geodesics, etc. We require \(s_i \le f_i\), for \(i=1,\ldots ,k\) and \(s_{i-1} < s_i\), for \(i=2,\ldots ,k\).

To simplify the exposition we will assume in the following that all the intervals in our descriptions of temporal quantities are of type 2: \([ s_i, f_i)\) and \(f_{i-1} \le s_i\), for \(i=2,\ldots ,k\). Therefore, we can describe the temporal quantities with sequences of triples

$$\begin{aligned} a = ( (s_i, f_i, v_i) )_{i=1}^k . \end{aligned}$$

In the examples we also assume that

$$\begin{aligned} \mathcal{T} = [t_{min}, t_{max}] \subset \mathbb {N}. \end{aligned}$$

To provide a computational support for the proposed approach we are developing in Python a library TQ (Temporal Quantities). In the examples, we will use the Python notation for temporal quantities.

The following are two temporal quantities a and b represented in Python straightforwardly as a list of triples

figure k
figure l
figure m

The temporal quantity a has on the interval [1, 5) (i.e., in instances 1, 2, 3, and 4) value 2; on the interval [6, 8) value 1; on the interval [11, 12) value 3, etc. Outside the specified intervals its value is undefined, .

The temporal quantities can also be visualized as it is shown for a and b at the top half of Fig. 2.

For the simplified version of temporal quantities we wrote procedures \(\textit{sum}\) (Algorithm 1) for the addition and \(\textit{prod}\) (Algorithm 2) for the multiplication of temporal quantities over the selected semiring. Because, by assumption, the triples in a description of a temporal quantity are ordered by their starting times, we can base both procedures on the ordered lists merging scheme. The basic semiring operations of addition and multiplication are provided by functions \(\textit{sAdd}\) and \(\textit{sMul}\).

The function \(\textit{length}(a)\) returns the length (number of items) of the list a. The function \(\textit{get}(a)\) returns the current item of the list a and moves to the next item; if the list is exausted it returns a ‘sentinel’ triple \((\infty ,\infty ,0)\). The statement \((s,f,v) \leftarrow e\) describes the unpacking of the item e into its parts. The statement \(c.\textit{append}(e)\) appends the item e to the tail of the list c. The function \(\textit{standard}(a)\) joins, in the list a, adjacent time intervals with the same value into a single interval.

Fig. 2
figure 2

Addition and multiplication of temporal quantities

Fig. 3
figure 3

Addition and multiplication of temporal quantities—growth of size

The following are the sum s and the product p of temporal quantities a and b. They are visually displayed at the bottom half of Fig. 2.

figure n

Let \(l_a = \textit{length}(a)\) and \(l_b = \textit{length}(b)\). Then, assuming that the semiring operations take constant time each, the time complexity of both algorithms is \(O(l_a+l_b)\). The example in Fig. 3 shows that in extreme cases the sum can be almost four times longer than each of its arguments, and the product almost twice as long as the arguments. If \(\mathcal{T} = [t_{min}, t_{max}] \subset \mathbb {N}\) the length of a list describing a temporal quantity can not exceed \(L = 1 + t_{max}-t_{min}\).

3.3 The aggregated value

In some applications over the combinatorial semiring we shall use the aggregated value of a temporal quantity \(a = ( (s_i, f_i, v_i) )_{i=1}^k\). It is defined as

$$\begin{aligned} \Sigma a = \sum _{i=1}^k \left( f_i-s_i\right) \cdot v_i \end{aligned}$$

and is computed using the procedure \(\textit{total}(a)\). For example \(\Sigma a = 23\) and \(\Sigma b = 30\). Note that \(\Sigma a + \Sigma b = \Sigma (a+b)\).

3.4 Temporal partitions

The description of temporal partitions has the same form as the description of temporal quantities

$$\begin{aligned} a = ( (s_i, f_i, v_i) )_{i=1}^k. \end{aligned}$$

They differ only in the interpretation of values \(v_i \in \mathbb {N}\). In case of partitions \(v_i = j\) means that the unit described with a belongs to a class j in the time interval \([s_i,f_i)\). We shall use temporal partitions to describe connectivity components in Sect. 10.

We obtain a more adequate description of temporal networks using vectors of temporal quantities (temporal vectors and temporal partitions) for describing properties of nodes and making also link weights into temporal quantities. In the current version of the library TQ, we use a representation of a network \(\mathcal {N}\) with its matrix \(\mathbf {A} = [a_{uv}]\)

figure a

where w(uv) is a temporal weight attached to a link (uv).

3.5 Products of a temporal matrix and a temporal vector

In some applications the product of a temporal matrix with a temporal vector is useful. There are two products—left and right.

Let \(\mathbf {A}\) be a temporal matrix of size \(n\times m\), \(\mathbf {v}\) a temporal vector of size n, and \(\mathbf {u}\) a temporal vector of size m. The product from left of \(\mathbf {A}\) with \(\mathbf {v}\), denoted by \(\mathbf {u} = \mathbf {v} \bullet \mathbf {A}\), is defined by

$$\begin{aligned} u_j =\mathop {\bigoplus} \limits_{{i=1}}^{n} v_i \odot a_{ij}, \qquad j = 1, \ldots , m \end{aligned}$$

and the product from right of \(\mathbf {A}\) with \(\mathbf {u}\), denoted by \(\mathbf {v} = \mathbf {A} \bullet \mathbf {u}\), is defined by

$$\begin{aligned} v_i = \mathop{\bigoplus}\limits_{{j=1}}^{m} a_{ij} \odot u_j, \qquad i = 1, \ldots , n. \end{aligned}$$

In the TQ library both products are implemented as functions \(\textit{MatVecMulL}(A,v)\) and \(\textit{MatVecMulR}(A,v)\).

If a vector \(\mathbf {v}\) of size n is considered as a column vector – an \(n \times 1\) matrix – it holds \(\mathbf {v} \bullet \mathbf {A} = (\mathbf {v}^T \odot \mathbf {A})^T\) and \(\mathbf {A} \bullet \mathbf {u} = \mathbf {A} \odot \mathbf {u}\). The symbol T denotes the matrix transposition operation.

4 Node activities

In this section, we show how we can use the proposed operations with temporal quantities (the addition) for a simple analysis of temporal networks.

Assume that the values in temporal quantities \(a_{uv}\) from a temporal network matrix \(\mathbf {A}\) are positive real numbers measuring the intensity of the activity of the node u on the node v. We define the activity of a group of nodes \(\mathcal {V}_1\) on a group \(\mathcal {V}_2\) (using the combinatorial semiring) as

$$\begin{aligned} \text{act}(\mathcal {V}_1,\mathcal {V}_2) = \sum _{u\in \mathcal {V}_1} \sum _{v\in \mathcal {V}_2} a_{uv} . \end{aligned}$$

To illustrate the notion of activity, we applied it on Franzosi’s violence temporal network (Franzosi 1997). Roberto Franzosi collected from the journal news in the period January 1919–December 1922 information about the different types of interactions between political parties and other groups of people in Italy. The violence network contains only the data about violent actions and counts the number of interactions per month.

Fig. 4
figure 4

Intensity of violent activities of police, fascists, and all

We determined the temporal quantities

$$\begin{aligned} pol = \text{act}(\{ \text{ police } \}, \mathcal {V}) + \text{act}(\mathcal {V}, \{ \text{ police } \}),\\ \textit{fas} = \text{act}(\{ \text{fascists} \}, \mathcal {V}) + \text{act}(\mathcal {V}, \{ \text{fascists} \}), {\text {and}} \\ all = \text{act}(\mathcal {V}, \mathcal {V}).\\ \end{aligned}$$

They are presented in Fig. 4. Comparing the intensity charts of police and fascists activity with overall activity, we see that most of the violent activities in the first two years 1919 and 1920 were related to the police. In the next two years (1921 and 1922) they were taken over by the fascists.

Fig. 5
figure 5

First example network. All unlabeled links have a value of [(1, 9, 1)]

5 Temporal degrees

For an ordinary graph with a (binary) adjacency matrix \(\mathbf {A}\) we can compute the corresponding indegree, \(\mathbf {i}\), and outdegree, \(\mathbf {o}\), vectors using (over the combinatorial semiring) the relations

$$\begin{aligned} \mathbf {i} = \mathbf {e} \bullet \mathbf {A} \qquad \text{ and } \qquad \mathbf {o} = \mathbf {A} \bullet \mathbf {e}, \end{aligned}$$

where \(\mathbf {e}\) is a column vector of size \(n = |\mathcal {V}|\) with all its entries equal to 1. The same holds for temporal networks. In this case, the vector \(\mathbf {e}\) contains as values the temporal unit \(\mathbf {1} = [(0,\infty ,1)]\).

For a temporal network presented in Fig. 5, the corresponding temporal indegrees and outdegrees are given in Table 1. For example, the node 5 has in the time interval [1, 5) outdegree 2. Because the arc (5, 7) disappears at the time point 5, the outdegree of the node 5 diminishes to 1 in the interval [5, 9).

Table 1 Temporal indegrees and outdegrees for the first example network

We will use the simple temporal network from Fig. 5 also for the illustration of some other algorithms because it allows the users to manually check the presented results.

6 Temporal co-occurrence networks

Let the binary matrix \(\mathbf {A}=[a_{ep}]\) describe a two-mode network on the set of events E and the set of participants P:

$$\begin{aligned} a_{ep} = \left\{ \begin{array}{ll} 1 \qquad &{} p \text{ participated } \text{ in } \text{ the } \text{ event } e \\ 0 &{} \text{ otherwise } \end{array}\right. . \end{aligned}$$

The function \(d: E \rightarrow \mathcal {T}\) assigns to each event e the date d(e) when it happened. \(\mathcal {T}= [\textit{first}, \textit{last}]\). Using these data we can construct two temporal affiliation matrices:

  • instantaneous \(\mathbf {Ai}=[ai_{ep}]\), where

  • cumulative \(\mathbf {Ac}=[ac_{ep}]\), where

In Python, the value is represented as \([\ ]\).

Using the multiplication of temporal matrices over the combinatorial semiring we get the corresponding instantaneous and cumulative co-occurrence matrices

$$\begin{aligned} \mathbf {Ci} = \mathbf {Ai}^T \cdot \mathbf {Ai} \qquad \text{ and } \qquad \mathbf {Cc} = \mathbf {Ac}^T \cdot \mathbf {Ac} . \end{aligned}$$

A typical example of such a matrix is the papers' authorship matrix where E is the set of papers, P is the set of authors, and d is the publication year (Batagelj and Cerinšek 2013).

Table 2 Temporal collaboration

The triple (sfv) in a temporal quantity \(ci_{pq}\) tells that in the time interval [sf) there were v events in which both p and q took part.

The triple (sfv) in a temporal quantity \(cc_{pq}\) tells that in the time interval [sf) there were in total v accumulated events in which both p and q took part.

The diagonal matrix entries \(ci_{pp}\) and \(cc_{pp}\) contain the temporal quantities counting the number of events in the time intervals in which the participant p took part.

For example, in a data set on the stem cell research during 1997–2012 in Spain collected by Gisela Cantos-Mateos (Cantos-Mateos et al. 2014), we get from the basic two-mode network, where E is the set of papers and P is the set of institutions, for selected two institutions (HCL/B \(=\) University Hospital Clínic de Barcelona, Barcelona and IDI/B \(=\) Institut d’Investigacions Biomé-diques August Pi i Sunyer, Barcelona) the collaboration temporal quantities presented in Table 2.

The first column in the table contains the yearly collaboration (co-authorship) data and the second column contains the cumulative collaboration data. Let us read the table:

  • \(ci[\texttt {IDI/B},\texttt {HCL/B}](2005, 2006) = 3\)—in the year 2005 researchers from both institutions published three joint papers;

  • \(ci[\texttt {IDI/B},\texttt {HCL/B}](2011, 2013) = 18\)—in the years 2011 and 2012 researchers from both institutions published 18 joint papers each year;

  • \(ci[\texttt {HCL/B},\texttt {HCL/B}](2010, 2011) = 78\)—in the year 2010 researchers from the institution HCL/B published 78 papers;

  • \(cc[\texttt {IDI/B},\texttt {HCL/B}](2008, 2009) = 16\)—till the year 2008 (included) researchers from both institutions published 16 joint papers.

Note that the violence network from Sect. 4 is essentially a co-occurrence network that could be obtained from the more primitive instantaneous two-mode network about violent actions reported in journal articles and the involved political actors.

7 Clustering coefficients

Let us assume that the network \(\mathcal {N}\) is based on a simple directed graph \(\mathcal {G} = (\mathcal {V},\mathcal {A})\) without loops. From a simple undirected graph we obtain the corresponding simple directed graph by replacing each edge with a pair of opposite arcs. In such a graph the clustering coefficient, C(v), of the node v is defined as the proportion between the number of realized arcs among the node’s neighbors and the number of all possible arcs among the node’s neighbors N(v), that is

$$\begin{aligned} C(v) = \frac{|\mathcal {A}(N(v))|}{k(k-1)} \end{aligned}$$

where k is the number of neighbors of the node v. For a node v without neighbors or with a single neighbor we set \(C(v)=0\).

The clustering coefficient measures a local density of the node’s neighborhood. A problem with its applications in network analysis is that the identified densest neighborhoods are mostly very small. For this reason we provided in Pajek the corrected clustering coefficient,

$$\begin{aligned} C'(v) = \frac{|\mathcal {A}(N(v))|}{\Delta (k-1)} \end{aligned}$$

where \(\Delta\) is the maximum number of neighbors in the network.

To count the number of realized arcs among the node’s neighbors, we use the observation that each arc from \(\mathcal {A}(N(v))\) forms a triangle with links from its end-nodes to the node v; and that the number of triangles in a simple undirected graph can be obtained as the diagonal value in the third power of the graph matrix (over the combinatorial semiring).

Fig. 6
figure 6

Counting triangles

For simple directed graphs the counting of triangles is slightly more complicated. Let us denote \(\mathbf {T} = \mathbf {A}^T\) and \(\mathbf {S} = \mathbf {A}+\mathbf {T}\). From Fig. 6 we see that each triangle (determined with a link opposite to the dark node) appears exactly once in

$$\begin{aligned} \mathbf {AAA} + \mathbf {AAT} + \mathbf {TAT} + \mathbf {TAA} =\\ = \mathbf {AAS} + \mathbf {TAS} = \mathbf {SAS} . \end{aligned}$$

This gives us a simple way to count the triangles which is used in Algorithm 4 (see "Appendix"). Its time complexity is \(O(n^3 \cdot L)\).

In Tables 3 and 4, the ordinary and the corrected clustering coefficients are presented for the example network from Fig. 5 and its undirected skeleton.

Table 3 Clustering coefficients for the first example network
Table 4 Clustering coefficients for the skeleton of the first example network

8 Closures in temporal networks

When the basic semiring \((A,\oplus ,\odot ,0,1)\) is closed—an unary closure operation \(\star\) with the property

$$\begin{aligned} a^\star = 1 \oplus a\odot a^\star = 1 \oplus a^\star \odot a , \qquad \text{ for } \text{ all } a \in A, \end{aligned}$$

is defined in it – this property can be extended also to the corresponding matrix semiring. When it exists, a standard closure is obtained as

$$\begin{aligned} a^\star =\mathop {\bigoplus} \limits_{{i=0}}^{\infty} a^i . \end{aligned}$$

In some semirings different closures can exist. For computing the matrix closure we can apply the Fletcher’s algorithm (Fletcher , 1980). The entry \(c_{uv}\) in the matrix \(\mathbf {C} = \mathbf {A}^\star\) is equal to the sum of values of all walks from the node u to the node v. In most of the semirings, except the combinatorial, for which we are interested in determining the closures, also the absorption law holds

$$\begin{aligned} 1\oplus a=1, \qquad \text{ for } \text{ all } a \in A . \end{aligned}$$

In these semirings \(a^\star = 1\), for all \(a \in A\), and therefore the Fletcher’s algorithm can be simplified and performed in place as implemented in Algorithm 3.

For a temporal quantity a over a closed semiring it holds \(T_{a^\star } = \mathcal {T}\).

The time complexity of Algorithm 3 is \(O(n^3 \cdot L)\).

figure r

9 Temporal node partitions

In the previous sections, the nodes of temporal networks were considered as being present all the time. We can describe the presence of nodes through time using a temporal binary (single valued) node partition

figure s
$$\begin{aligned} T(u) = ( (s_i, f_i, 1) )_{i=1}^k, \quad \text{ for } u \in \mathcal {V}, \end{aligned}$$

specifying that a node u is present in time intervals

$$\begin{aligned}{}[s_i, f_i), i = 1, \ldots , k. \end{aligned}$$

The node partition \(T_{\mathrm{Min}}\) determined from the temporal network links by

$$\begin{aligned} T_{\mathrm{Min}}(u) = \bigcup _{l \in \mathcal {L}: u \in \text{ ext }(l)} \textit{binary}(a_l) , \end{aligned}$$

for \(u \in \mathcal {V}\), is the smallest temporal partition of nodes that satisfies the consistency condition from Sect. 2. The term \(\text{ ext }(l)\) denotes the set of end-nodes of the link l, \(a_l\) is the temporal quantity assigned to the link l, and the function \(\textit{binary}\) sets all values in a given temporal quantity to 1. In the library TQ, the partition \(T_{\mathrm{Min}}\) can be computed using the function \(\textit{minTime}\).

A temporal node partition q can also be used to extract a corresponding subnetwork from the given temporal network described with a matrix \(\mathbf {A}\). The subnetwork contains only the nodes active in the partition q and the active links satisfying the consistency condition with respect to q.

To formalize the described procedure we first define the procedure \(\textit{extract}(p,a) = b\), where p is a binary temporal quantity and a is a temporal quantity, as

figure b

Let \(\mathbf {B}\) be a temporal matrix describing the links of the subnetwork determined by the partition q. Its entries for \(l(u,v) \in \mathcal {L}\) are determined by

$$\begin{aligned} b_l = \textit{extract}(q(u) \cap q(v),a_l) . \end{aligned}$$

In TQ this operation is implemented as a procedure \(\textit{MatExtract}(\mathbf {q},\mathbf {A})\).

10 Temporal reachability and weak and strong connectivity

For a temporal network represented with a binary matrix \(\mathbf {A}\) its transitive closure \(\mathbf {A}^\star\) (over the reachability semirings based on the semiring \((\{0,1\},\vee ,\wedge ,0,1)\)) determines its reachability relation matrix. We obtain its weak connectivity temporal matrix \(\mathbf {W}\) as

$$\begin{aligned} \mathbf {W} = (\mathbf {A} \cup \mathbf {A}^T)^\star \end{aligned}$$

and its strong connectivity temporal matrix \(\mathbf {S}\) as

$$\begin{aligned} \mathbf {S} = \mathbf {A}^\star \cap (\mathbf {A}^\star )^T . \end{aligned}$$

The use of the strict transitive closure instead of a transitive closure in these relations preserves the inactivity value \(\mathbf {0}\) on the diagonal for all isolated nodes.

10.1 Reachability degrees

Let \(\mathbf {R} = \overline{\mathbf {A}} = \mathbf {A} \odot \mathbf {A}^\star\) be the strict reachability relation of a given network. Then the temporal vectors \(\textit{inReach} = \textit{inDeg}(\mathbf {R})\) and \(\textit{outReach} = \textit{outDeg}(\mathbf {R})\) contain temporal quantities counting the number of nodes: from which a given node v is reachable \(( \textit{inReach}[v] )\) / which are reachable from the node v \(( \textit{outReach}[v] )\). The results for our example network are presented in Table 5. For example, 8 nodes \(\{ 4,5,6,7,8,9,10,11 \}\) are reachable from node 6 in the time interval [1, 5), and 3 nodes \(\{ 4,5,6 \}\) are reachable in the time interval [5, 9).

Table 5 Temporal input and output reachability degrees for the first example network

10.2 Temporal weak connectivity

The function \(\textit{weakConnMat}(\mathbf {A})\) for a given temporal network matrix \(\mathbf {A}\) determines the corresponding temporal weak connectivity matrix \(\mathbf {W}\). Every time slice \(\mathcal {N}(t)\), \(t \in \mathcal {T}\), of the matrix \(\mathbf {W}\) is an equivalence relation that can be compactly described with the corresponding partition.

To transform the temporal equivalence matrix \(\mathbf {E}\) into the corresponding temporal partition \(\mathbf {p}\), we use in the function \(\textit{eqMat2Part}\) (see Algorithm 5) the fact that on a given time interval equivalent (in our case weakly connected) nodes get the same value on this interval in the product of the matrix \(\mathbf {E}\) with a vector computed over the combinatorial semiring \((\mathbb {N},+,\cdot ,0,1)\). We take for the vector values randomly shuffled integers from the interval 1 : n. With a very high probability, the values belonging to different equivalence classes are different.

The classes of the obtained temporal partition are finally renumbered with consecutive numbers using the function renumPart(p) (see Algorithm 6).

For our first example network, we obtain the temporal weak partition presented in Table 6.

10.3 Temporal strong connectivity

The procedure \(\textit{strongConnMat}(\mathbf {A})\) for a given temporal network matrix \(\mathbf {A}\) determines the corresponding temporal strong connectivity matrix \(\mathbf {S}\). To determine the intersection of temporal network binary matrices \(\mathbf {A}\) and \(\mathbf {B}\) we use the function \(\textit{MatInter}(\mathbf {A},\mathbf {B})\). Again, to get the strong connectivity partition we have to apply the function \(\textit{eqMat2Part}\) to the strong connectivity matrix.

The time complexity of algorithms for temporal weak and strong connectivity partitions is \(O(n^3 \cdot L)\).

For our first example network, we obtain the temporal strong partition presented in Table 6. In the library TQ both matrices and partitions are based on the strict transitive closure.

Table 6 Temporal weak and strong connectivity partitions for the first example network

11 Temporal closeness and betweenness

Closeness and betweenness are among the traditional social network analysis indices measuring the importance of nodes (Freeman 1978). They are somehow problematic when applied to non-(strongly) connected graphs. In this section, we will not consider these questions. We will only show how to compute them for non-problematic temporal graphs.

11.1 Temporal closeness

The output closeness of the node v is defined as

$$\begin{aligned} ocl(v) = \frac{n-1}{\displaystyle \sum _{u \in \mathcal {V}\setminus \{v\}} d_{vu}} . \end{aligned}$$

To determine the closeness, we first need to compute the matrix \(\mathbf {D} = [d_{uv}]\) of geodetic distances \(d_{uv}\) between the nodes u and v. It can be obtained as a closure of the network matrix \(\mathbf {A}\) over the shortest paths semiring \((\overline{\mathbb {R}_0^+},\min ,+,\infty ,0)\). Note that the values in the matrix \(\mathbf {A}\) can be any nonnegative real numbers.

Fig. 7
figure 7

Second example network. All unlabeled arcs have the value [(1, 9, 1)]

In Fig. 7, we present our second example temporal network which is an extended version of the example given in Fig. 3 from Batagelj (1994).

Because a complete strict closure matrix \(\mathbf {D}\) is too large to be listed, we present only some of its selected entries:

figure u

To compute the vector of closeness coefficients of nodes, we have to sum the temporal distances to other nodes over the combinatorial semiring. See Algorithm 7 in the "Appendix". The time complexity of this algorithm is \(O(n^3 \cdot L)\).

The temporal closeness coefficients for our second example network are given in Table 7.

Table 7 Output closeness for the second example network

11.2 Temporal betweenness

The betweenness of a node v is defined as

$$\begin{aligned} b(v) = \frac{1}{(n-1)(n-2)} \sum _{\begin{array}{c} u,w \in \mathcal {V} \\ |\{v,u,w\}| = 3 \end{array}} \frac{n_{u,w}(v)}{n_{u,w}} \end{aligned}$$

where \(n_{u,w}\) is the number of u-w geodesics (shortest paths) and \(n_{u,w}(v)\) is the number of u-w geodesics passing through the node v.

Suppose that we know the matrix

$$\begin{aligned} \mathbf {C} = [( d_{u,v}, n_{u,v} ) ]\end{aligned}$$

where \(d_{u,v}\) is the length of u-v geodesics. Then it is also easy to determine the quantity \(n_{u,w}(v)\):

$$\begin{aligned} n_{u,w}(v) = \left\{ \begin{array}{ll} n_{u,v} \cdot n_{v,w} \qquad &{} d_{u,v} + d_{v,w} = d_{u,w} \\ 0 &{} \text{ otherwise } \end{array} \right. . \end{aligned}$$

This gives the following scheme of procedure for computing the nontemporal betweenness coefficients \(\mathbf {b}\)

figure v

In Batagelj (1994), it is shown that the matrix \(\mathbf {C}\) can be obtained by computing the closure of the network matrix over the geodetic semiring

$$\begin{aligned} (\overline{\mathbb {N}}^2, \oplus , \odot , (\infty ,0), (0,1)), \end{aligned}$$

where \(\overline{\mathbb {N}} = \mathbb {N}\cup \{ \infty \}\) and we define addition \(\oplus\) with

$$\begin{aligned} (a,i) \oplus (b,j) = ( \min (a,b), \left\{ \begin{array}{ll} i &{} a < b \\ i+j \qquad &{} a = b \\ j &{} a > b \end{array} \right. ) \end{aligned}$$

and multiplication \(\odot\) with:

$$\begin{aligned} (a,i) \odot (b,j) = (a+b,i\cdot j) . \end{aligned}$$

To compute the geodetic closure, we first transform the network temporal adjacency matrix \(\mathbf {A}\) to a matrix \(\text{ G } = [(d,n)_{u,v} ]\) which has for entries pairs defined by

figure c

where d is the length of a geodesic and n is the number of geodesics from u to v. In temporal networks, the distance d and the counter n are temporal quantities.

The presented scheme adapted for computing the temporal betweenness vector is implemented in TQ as the function \(\textit{betweenness}(A)\). First we compute the strict geodetic closure \(\mathbf {C}\) of the matrix \(\mathbf {A}\) over the geodetic semiring. We present selected entries of the temporal matrix \(\mathbf {C}\) for our second example network:

figure x

For example, the value \(\mathbf {C}[4,6]\) reflects the facts that an arc exists from node 4 to node 6 in time intervals [1, 4) and [6, 9); in the time interval [4, 6) they are connected with 3 geodesics of length 5: (4, 7, 8, 2, 5, 6), (4, 7, 1, 3, 5, 6), (4, 7, 1, 2, 5, 6).

We continue and using the combinatorial semiring we compute the temporal betweenness vector \(\mathbf {b}\). The specificity of temporal quantities d[uv] and n[uv] is considered in the auxiliary function \(\textit{between}\) that implements the temporal version of the statement

$$\begin{aligned}&\mathbf{if}\, d[u,w]=d[u,v]+d[v,w]\, \mathbf{then}\\&r \leftarrow r + n[u,v]\cdot n[v,w]/n[u,w] \end{aligned}$$

from the basic betweenness algorithm. Again, we apply the merging scheme. The time complexity of the procedure \(\textit{betweenness}\) is \(O(n^3 \cdot L)\).

The temporal betweenness coefficients for our second example network are presented in Table 8.

Table 8 Betweenness for the second example network

12 Temporal PathFinder

The Pathfinder algorithm was proposed in the 80s (Schvaneveldt 1990) for the simplification of weighted networks—it removes from the network all links that do not satisfy the (generalized) triangle inequality—if for a weighted link there exists a shorter path connecting its end-nodes then the link is removed. The basic idea of the Pathfinder algorithm is simple. It produces a network \(\text{ PFnet }(\mathbf {W},r,q)\) \(= (\mathcal {V},\mathcal {L}_{PF})\) determined by the following scheme of procedure

figure y

where \(\mathbf {W}\) is a network dissimilarity matrix and \(\mathbf {W}^{(q)} = \bigoplus _{i=1}^q \mathbf {W}^i = (\mathbf {1} \oplus \mathbf {W})^q\) is the matrix of the values of all walks of length at most q computed over the Pathfinder semiring with and \(a \oplus b = \min (a,b)\). The value of \(w_{uv}(q)\) in the matrix \(\mathbf {W}^{(q)}\) is equal to the value of all walks of length at most q from the node u to the node v.

The scheme of Pathfinder is implemented as the function \(\textit{pathFinder}\) (Algorithm 8 in the "Appendix"). The time complexity of Algorithms 8 + 9 is \(O(L \cdot n^3 \cdot \log q)\) (Guerrero-Bote et al 2006).

Fig. 8
figure 8

Pathfinder example

The bottom network in Fig. 8 presents the Pathfinder skeleton \(\text{ PFnet }(\mathcal {N},1,\infty )\) of a network \(\mathcal {N}\) presented in the top part of the same figure. Because \(r=1\), a link e is removed if there exists a path, connecting its initial node to its terminal node, with the value (sum of link values) smaller than the value of the link e. The arc (1, 2) is removed because \(3 = v(1,2) > v(1,3)+v(3,2) = 2\). The arc (1, 6) is removed in the time interval [5, 9) because in this interval \(5 = v(1,6) > v(1,3)+v(3,4)+v(4,5) +v(5,6) = 4\).

13 September 11 Reuters terror news

The Reuters terror news network was obtained from the CRA (Centering Resonance Analysis) networks produced by Steve Corman and Kevin Dooley at Arizona State University. The network is based on all the stories released during 66 consecutive days by the news agency Reuters concerning the September 11 attack on the U.S., beginning at 9:00 AM EST 9/11/01. The nodes of this network are important words (terms). There is an edge between two words iff they appear in the same utterance [for details see the paper Corman et al. (2002)]. The weight of an edge is its frequency. The network has \(n = 13332\) nodes (different words in the news) and \(m = 243447\) edges, 50859 with value larger than 1. There are no loops in the network.

The Reuters terror news network was used as a case network for the Viszards visualization session on the Sunbelt XXII International Sunbelt Social Network Conference, New Orleans, USA, 13-17. February 2002.

We transformed the Pajek version of the network into the Ianus format used in TQ. To identify important terms, we computed their aggregated frequencies and extracted the subnetwork of the 50 most frequently used (during 66 days) nodes. They are listed in Table 9.

Trying to draw this subnetwork, it turns out to be almost a complete graph. To obtain something readable, we removed all temporal edges with the aggregated value smaller than 10. The corresponding underlying graph is presented in Fig. 9. The isolated nodes were removed.

For each of the 50 nodes we determined its temporal activity and drew it. By visual inspection we identified 6 typical activity patterns—types of terms (see Fig. 10). For all charts in the figure the displayed values are in the interval [0, 200]—the largest activity value for the term Wednesday is larger than 200. The timescale contains 66 days from September 11 to November 15.

Table 9 50 most frequent terms in the Terror news network
Fig. 9
figure 9

September 11, subnetwork of the most frequently used words

The primary terms are the terms with a very high frequency of appearance in the first week after September 11 and smaller, slowly declining values in the following period. The representative of this group in Fig. 10 is hijack and other members are: airport, american, attack, city, day, flight, nation, New York, official, Pentagon, people, plane, police, president Bush, security, tower, United States, Washington, world, World Trade center. These are the terms describing the event.

The secondary terms are a reaction to the event. There are no big changes in their values. We identified three subgroups: (a) slowly declining represented with bin Laden (country, foreign, government, military, minister, new, Pakistan, tell, terrorism, terrorist, time, war, week); (b) stationary represented with taliban (afghan, Afghanistan, force, group, leader); (c) occasional with several peaks, represented with bomb (air, building, office, strike, worker).

There are three special patterns—two periodic Wednesday and Tuesday; one episodic anthrax.

Fig. 10
figure 10

Types of activity

To consider in a measure of importance of the node \(u \in \mathcal {V}\) also the node’s position in the network, we constructed the attraction coefficient \(\text{ att }(u)\).

Let \(\mathbf {A} = [ a_{uv}]\) be a network matrix of temporal quantities with positive real values. We define the node activity \(\text{act}(u)\) as (see Sect. 4)

$$\begin{aligned} \text{act}(u) = \text{act}\left( \{u\},\mathcal {V}\setminus \{u\}\right) = \sum _{v \in \mathcal {V}\setminus \{u\}} a_{uv} . \end{aligned}$$

Then the attraction of the node u is defined as

$$\begin{aligned} \text{att}(u) = \frac{1}{\Delta } \sum _{v \in \mathcal {V}\setminus \{u\}} \frac{a_{vu}}{\text{act}(v)} . \end{aligned}$$

Note that the fraction \(\frac{a_{vu}}{\text{ act }(v)}\) is measuring the proportion of the activity of the node v that is shared with the node u.

From \(0 \le \frac{a_{vu}}{\text{ act }(v)} \le 1\) and \({\text{deg}} (v)=0 \Rightarrow a_{vu}=0\) it follows that

$$\begin{aligned} \sum _{v \in \mathcal {V}\setminus \{u\}} \frac{a_{vu}}{\text{ act }(v)} \le {\text{deg}}(u) \le \Delta \end{aligned}$$

where \(\Delta\) denotes the maximum degree. Therefore, we have \(0 \le \text{ att }(u) \le 1\), for all \(u \in \mathcal {V}\).

The maximum possible attraction value 1 is attained exactly for nodes: (a) in an undirected network: that are the root of a star; (b) in a directed network: that are the only out-neighbors of their in-neighbors—the root of a directed in-star.

We computed the temporal attraction and the corresponding aggregated attraction values for all the nodes in our network. We selected 30 nodes with the largest aggregated attraction values. They are listed in Table 10. Again, we visually explored them. In Fig. 11, we present temporal attraction coefficients for the six selected terms. For all charts in the figure the displayed attraction values are in the interval [0, 0.2].

Comparing on the common terms (Taliban, bomb, anthrax) the activity charts in Fig. 10 with the corresponding attraction charts in Fig. 11, we see that they are “correlated” (obviously \(\text{act}(a;t) = 0\) implies \(\text{att}(a;t) = 0\)) but different in details.

For example, the terms Taliban and bomb have small attraction values at the beginning of the time window—the terms were disguised by the primary terms. On the other hand, the terms Taliban and Kabul get increased attraction towards the end of the time window.

Table 10 30 most attractive terms in the Terror news network
Fig. 11
figure 11

Attraction patterns

14 Conclusions

In the paper, we proposed an algebraic approach to the “deterministic” analysis of temporal networks based on temporal quantities and presented algorithms for the temporal variants of basic network analysis measures and concepts. We expect that the support for temporal variants of many other network analysis notions can be developed in similar ways. Our results on temporal variants of eigen value-/vector-based indices (Katz, Bonacich, hubs and authorities, page rank) are presented in a separate paper (Praprotnik and Batagelj 2016a).

The proposed approach is an alternative to the traditional cross-sectional approach based on time slices. Its main advantages are:

  • The data and the results are expressed using temporal quantities that are natural descriptions of properties changing through time;

  • The user does not need to be careful about the intervals on which the time slices are determined—exactly the right intervals are selected by the merging (sub)operations. This also improves, on average, the efficiency of the proposed algorithms.

All the described algorithms (and some others) are implemented in a Python library TQ (Temporal Quantities) available at http://vladowiki.fmf.uni-lj.si/doku.php?id=tq.

We started to develop a program Ianus that will provide a user-friendly (Pajek like) access to the capabilities of the TQ library.

The main goal of the paper was to show: it can be done. Therefore, we based the current version of the library TQ on a matrix representation of temporal networks as it is presented in the paper. For this representation most of the network algorithms have the time complexity of \(O(n^3 \cdot L)\) and the space complexity of \(O(n^2 \cdot L)\). This implies that their application is limited to networks of moderate size (up to some thousands of nodes). Large networks are usually sparse. On this assumption, more efficient algorithms can be developed based on a graph (sparse matrix) representation—one of the directions of our future research.

In a description of a temporal network \(\mathcal {N}\) we can consider also a transition time or latency \(\tau \in \mathcal {W}\): \(\tau (l,t)\) is equal to the time needed to traverse the link l starting at the instant t. Problems considering latency are typical for operations research but could be important, when such data are available, also in social network analysis (Moody 2002; Xuan et al. 2003; George et al. 2007; Casteigts et al. 2012; Kontoleon et al. 2013). The analysis of temporal networks considering also the latency seems a much harder task—for example, in such temporal networks the strongly connected components problem is NP-complete (Bhadra and Ferreira 2003).

The results obtained from temporal procedures are relatively large. To identify interesting elements, we used in the paper the aggregated values and the visualization of selected elements. Additional tools for browsing and presenting the results should be developed.