1 Introduction

A network (also called a graph) is a mathematical object used extensively to represent a binary relation among a group of agents. Analyzing such networks for different structural patterns remains an active area of study in domains including Computational Biology (Hulovatyy et al. 2015), Social Network Analysis, Computational Epidemiology (Masuda and Holme 2017), Criminal Network Analysis (Ficara et al. 2021), and many more. One such structural pattern is the clique, a subgraph in which every pair of vertices is adjacent. Finding a maximum cardinality clique in a given network is a well-known NP-hard problem (Garey and Johnson 2002). However, from the network analysis perspective, the more general problem is not only to find a maximum size clique but also to enumerate all the maximal cliques present in the network. Bron and Kerbosch (1973) proposed an enumeration algorithm for maximal cliques that forms the foundation of the study of this problem. Later, there were advancements on this problem for different types of networks (Cheng et al. 2012; Eppstein et al. 2013).

Real-world networks, from biological to social, are time-varying, which means that the existence of an edge between any two agents changes with time. Temporal networks (Holme and Saramäki 2012), also known as link streams or time-varying networks, are the mathematical objects used to formally represent such time-varying relationships. For these networks, a natural analogue of the clique is the temporal clique, which consists of two things: a subset of the vertices and a time interval. In this direction, Viard et al. (2016) put forward the notion of \(\Delta \)-clique, where a vertex subset along with a time interval is said to be a \(\Delta \)-clique if every vertex pair from that set has at least one edge in every \(\Delta \) duration within the time interval. Existing studies on clique enumeration are reviewed in Sect. 2.

As mentioned previously, a temporal network consists of a set of agents and a time-varying relationship. The following questions are essential to understand the contact pattern among them: Which subsets of agents come in contact with each other very frequently? Given a time duration, how many times do they contact each other? The frequency of communication also adds another dimension of information about the strength of their relationship. Motivated by such questions, the notion of \(\Delta \)-clique has recently been extended to \((\Delta , \gamma )\)-cliques. A \((\Delta , \gamma )\)-clique is a vertex subset and time interval pair in which each pair of vertices of the subset has at least \(\gamma \) interactions in every \(\Delta \) duration within the time interval. In this paper, we propose a different approach for listing all the maximal \((\Delta , \gamma )\)-cliques contained in a temporal network. The main contributions of this paper are as follows:

  • We propose a different approach for listing the maximal \((\Delta , \gamma )\)-cliques present in a temporal network.

  • By drawing sequential arguments, we prove the correctness of the proposed methodology.

  • A detailed analysis of the proposed methodology has been done to understand its computational time and space requirement.

  • The proposed methodology has been implemented and evaluated on five publicly available temporal network datasets to bring out nontrivial insights about contact patterns and to compare its efficiency with that of the existing method.

  • A set of experiments has also been conducted to show that the proposed methodology of maximal \((\Delta , \gamma )\)-clique enumeration can be used efficiently for enumerating maximal \(\Delta \)-cliques as well (by setting \(\gamma =1\)).

The remaining portion of this article is arranged in the following way: Sect. 2 describes some relevant studies from the literature. Section 3 discusses some preliminary concepts regarding temporal networks and formally defines the maximal \((\Delta , \gamma )\)-clique enumeration problem. Section 4 contains the proposed enumeration technique with its detailed analysis, proof of correctness, and an illustrative example. Section 5 describes an experimental evaluation of the proposed methodology. Finally, Sect. 6 concludes this study and gives future directions.

2 Related work

In recent times, mining and analysis of temporal networks have become an active area of research, as most real-world networks, from social to biological, are temporal in nature (Rozenshtein and Gionis 2019). Several problems including community analysis (Qin et al. 2020), finding matchings (Zschoche 2022), finding separators (Zschoche et al. 2020), coloring (Mertzios et al. 2021), traversal (Byun et al. 2019), etc., have been studied in the context of temporal graphs. In line with the topic of this paper, we discuss the literature on clique enumeration, first in static graphs and then in temporal graphs.

The problem of maximal clique enumeration is a classic computational problem in network algorithms and has been extensively studied on static networks. Akkoyunlu (1973) was the first to propose an algorithm for this problem. Later, Bron and Kerbosch (1973) introduced a recursive approach for the maximal clique enumeration problem. These two studies are the foundations of maximal clique enumeration and triggered a huge amount of research due to many practical applications, from computational biology to spatial data analytics (Al-Naymat 2008; Bhowmick and Seah 2015). In the past two decades, several methodologies have been developed for enumerating maximal cliques in different computational paradigms and different kinds of networks, such as in sparse graphs (Eppstein et al. 2013; Manoussakis 2019), in large networks (Cheng et al. 2010, 2011; Rossi et al. 2014), in the map-reduce framework (Hou et al. 2016; Xiang et al. 2013), in uncertain graphs (Mukherjee et al. 2016; Zou et al. 2010; Dai et al. 2022), in parallel computing frameworks (Chen et al. 2016; Rossi et al. 2015; Schmidt et al. 2009), in signed networks (Chen et al. 2020), in temporal networks (Banerjee and Pal 2022), and many more (Dai et al. 2023; Manoussakis 2023).

Though there are many existing studies on maximal clique enumeration on static networks, the literature on temporal graphs is limited. Viard et al. (2015) proposed an enumeration algorithm for the maximal \(\Delta \)-cliques of a temporal network. Based on their methodology, they carried out a detailed analysis of contact relationships among a group of students and showed that the analysis yields deeper insights into their communication pattern (Viard et al. 2015). Later, Himmel et al. (2016) proposed a different approach for the maximal \(\Delta \)-clique enumeration problem, based on the Bron–Kerbosch algorithm for maximal clique enumeration in static graphs. Their methodology is better both theoretically (measured in terms of worst-case computational complexity) and practically (measured in terms of computational time when the algorithm is run on real-world datasets). Molter et al. (2019) introduced the notion of isolation in clique enumeration of a time-varying graph. They developed fixed-parameter enumeration algorithms based on different notions of isolation, employing the parameter “degree of isolation.” Viard et al. (2018) generalized the notion to contacts with duration and introduced the concept of \(\Delta \)-clique with duration. They also proposed an algorithm for enumerating such cliques present in a temporal network. Bentert et al. (2019) studied the maximal \(\Delta \)-Plex enumeration problem. Recently, Banerjee and Pal (2019) proposed an enumeration algorithm for maximal \((\Delta , \gamma )\)-cliques present in a time-varying graph. The method initializes a clique for each link in the temporal network and expands its duration and cardinality to find the maximal cliques.
In this work, we propose a two-phase approach that generates the initial cliques as duration-wise maximal cliques, which significantly reduces the number of intermediate cliques generated in the enumeration process. As far as we know, other than Banerjee and Pal (2019), there is no other work available that studies \((\Delta , \gamma )\)-cliques.

3 Background and problem definition

In this section, we present some preliminary concepts needed to understand the problem studied in this paper and the proposed solution methodology. In a temporal network, the edges are marked with their corresponding occurrence timestamp(s). This is stated formally in Definition 1.

Definition 1

(Temporal Network) A temporal network is defined as \(\mathcal {G}(V, E, \mathcal {T})\), where \(V(\mathcal {G})\) is the set of vertices of the network, and \(E(\mathcal {G})\) is the set of edges among them. \(\mathcal {T}\) is the mapping that maps each edge of the graph to its occurrence time stamp(s), i.e., \(\mathcal {T}:E(\mathcal {G}) \longrightarrow 2^{T} {\setminus } \emptyset \) where \(T=\{1,2, \ldots , \mathbb {T}\}\) is the set of discrete time stamps in which the network is observed.

A temporal network can be represented in two ways. One approach is to represent a temporal network using the link stream model, where we show the relationships among the entities over the time horizon. The other approach is the time stamp-wise snapshot graph representation. In this approach, a temporal network is represented as a collection of static graphs, one per time stamp. Figures 1 and 2 show the same temporal network in the link stream model and in the snapshot graph representation model, respectively. In the rest of the paper, we consider that the temporal network is represented in the link stream model.
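As a minimal sketch of the two representations, the following Python snippet stores a small link stream as a list of (u, v, t) triples and groups it into one static edge set per time stamp. The sample links, the vertex names, and the helper name `to_snapshots` are illustrative assumptions and are not taken from Fig. 1.

```python
from collections import defaultdict

# Hypothetical link stream: each triple (u, v, t) is one undirected contact at time t.
links = [("v1", "v2", 1), ("v2", "v3", 1), ("v1", "v2", 2), ("v3", "v4", 3)]


def to_snapshots(link_stream):
    """Group an undirected link stream into one static edge set per time stamp."""
    snapshots = defaultdict(set)
    for u, v, t in link_stream:
        # A frozenset ignores endpoint order, so (u, v, t) and (v, u, t) coincide.
        snapshots[t].add(frozenset((u, v)))
    return dict(snapshots)


snapshots = to_snapshots(links)
```

Both structures carry the same information; the link stream is convenient when scanning the occurrences of one vertex pair over time, while the snapshots are convenient when inspecting the topology at one time stamp.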

Fig. 1 Link stream representation of a temporal network

Fig. 2 Snapshot graph representation of the temporal network shown in Fig. 1

As just mentioned, Fig. 1 shows a temporal graph with five vertices and 29 edges, where the edges are shown over the time horizon. In temporal network analysis, it is assumed that the network changes its topology in discrete time steps. So, starting at time t, if the network is observed at every dt time difference till \(t^{'}\), the time instances are \(\mathbb {T}=\{t, t+dt, t+2dt, \dots , t^{'}\}\). In the rest of our study, we assume \(t,t^{'} \in \mathbb {Z}^{+}\) and \(dt=1\). The difference between the beginning and ending time stamps, i.e., \(t^{'}-t\), is called the Lifetime of the Network. In the temporal network \(\mathcal {G}\), if there is an edge between two vertices \(v_i\) and \(v_j\) at time \(t^{''}\), then it is symbolized as \((v_i,v_j,t^{''})\), signifying that there is a contact between \(v_i\) and \(v_j\) at time \(t^{''}\). If \((v_i,v_j,t^{''}) \in E(\mathcal {G})\) for some \(t^{''} \in \mathbb {T}\), then we say that there exists a static edge between \(v_i\) and \(v_j\). The frequency of an edge, denoted as \(f_{(v_iv_j)}\), is the number of distinct time stamps \(t^{''}\) in the time span \(\mathbb {T}\) such that \((v_i,v_j,t^{''}) \in E(\mathcal {G})\), i.e., \(f_{(v_iv_j)}=|\{t^{''} \in \mathbb {T}: (v_i,v_j,t^{''}) \in E(\mathcal {G}) \}|\). If there does not exist any \(t^{''} \in \mathbb {T}\) such that \((v_i,v_j,t^{''}) \in E(\mathcal {G})\), then \(f_{(v_iv_j)}=0\). In the rest of our study, we work with undirected temporal networks, i.e., there is no difference between \((v_i,v_j,t^{''})\) and \((v_j,v_i,t^{''})\).
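The frequency of a static edge can be illustrated with a short Python sketch. It assumes the link stream is given as a list of (u, v, t) triples; the helper names `edge_timestamps` and `frequency` and the sample data are hypothetical.

```python
from collections import defaultdict


def edge_timestamps(link_stream):
    """Map each static edge {u, v} to the sorted list of its distinct time stamps."""
    stamps = defaultdict(set)
    for u, v, t in link_stream:
        # The network is undirected: (u, v, t) and (v, u, t) are the same link.
        stamps[frozenset((u, v))].add(t)
    return {edge: sorted(ts) for edge, ts in stamps.items()}


def frequency(stamps, u, v):
    """Number of distinct time stamps at which u and v are in contact (0 if none)."""
    return len(stamps.get(frozenset((u, v)), []))


# Hypothetical link stream with one duplicated undirected link.
links = [("a", "b", 1), ("b", "a", 1), ("a", "b", 4), ("b", "c", 2)]
stamps = edge_timestamps(links)
lifetime = max(t for _, _, t in links) - min(t for _, _, t in links)
```

Storing the time stamps in sorted order is a design choice that makes later window counts straightforward.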

In a static network, a subset of vertices, where every pair is adjacent, is known as a clique. The size of the clique is defined as the number of vertices it contains. A clique is said to be maximal if it is not part of another clique of larger size. In one of our recent studies, we introduced the notion of \((\Delta , \gamma )\)-clique by extending the concept of \(\Delta \)-clique and incorporating an additional parameter \(\gamma \) as a frequency threshold. This is stated in Definition 2.

Definition 2

(\((\Delta , \gamma )\)-clique) (Banerjee and Pal 2019) Given a temporal network \(\mathcal {G}(V, E, \mathcal {T})\), time duration \(\Delta \), and a frequency threshold \(\gamma \in \mathbb {Z}^{+}\), a \((\Delta , \gamma )\)-clique of \(\mathcal {G}\) is a tuple consisting of vertex subset, and time interval, i.e., \((\mathcal {X}, [t_a,t_b])\) where \(\mathcal {X} \subseteq V(\mathcal {G})\), \(\vert \mathcal {X} \vert \ge 2\), and \([t_a,t_b] \subseteq \mathbb {T}\). Here \(\forall v_i,v_j \in \mathcal {X}\) and \(\tau \in [t_a, max(t_b - \Delta , t_a)]\), there must exist at least \(\gamma \) number of edges, i.e., \((v_i, v_j, t_{ij}) \in E(\mathcal {G})\) and \(f_{(v_iv_j)} \ge \gamma \) with \(t_{ij} \in [\tau , min (\tau + \Delta , t_b)]\). Here, \(f_{(v_iv_j)}\) denotes the frequency of the static edge \((v_i, v_j)\).
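Definition 2 can be checked directly by sliding a window of length \(\Delta \) over the interval. The following Python sketch is a brute-force checker written for illustration only; it assumes that `stamps` maps each static edge (a frozenset of two vertices) to its sorted list of time stamps, and the function names are hypothetical.

```python
from bisect import bisect_left, bisect_right
from itertools import combinations


def count_in_window(ts, lo, hi):
    """Number of time stamps in the sorted list ts that fall in [lo, hi]."""
    return bisect_right(ts, hi) - bisect_left(ts, lo)


def is_dg_clique(stamps, X, ta, tb, delta, gamma):
    """Check Definition 2 for the vertex set X and the interval [ta, tb]."""
    if len(X) < 2 or ta > tb:
        return False
    last_start = max(tb - delta, ta)  # last window start tau to be examined
    for u, v in combinations(sorted(X), 2):
        ts = stamps.get(frozenset((u, v)), [])
        for tau in range(ta, last_start + 1):
            # The window is [tau, min(tau + delta, tb)], as in Definition 2.
            if count_in_window(ts, tau, min(tau + delta, tb)) < gamma:
                return False
    return True


# Hypothetical sorted time stamps of three static edges.
stamps = {
    frozenset(("a", "b")): [1, 2, 4, 6],
    frozenset(("a", "c")): [1, 3, 5],
    frozenset(("b", "c")): [2, 3, 6],
}
```

With \(\Delta =3\) and \(\gamma =2\), the call `is_dg_clique(stamps, {"a", "b", "c"}, 1, 6, 3, 2)` succeeds on this data, while raising \(\gamma \) to 3 makes it fail because the pair (a, b) has only two contacts in the window [2, 5].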

In a static graph G(V, E), a clique \(\mathcal {S} \subseteq V(G)\) is maximal if, for each \( v \in V(G) {\setminus } \mathcal {S}\), \(\mathcal {S} \cup \{v\}\) is not a clique. Now, as the \((\Delta , \gamma )\)-clique is defined in the setting of temporal networks, its maximality depends on two parameters: one is the cardinality (referred to as inclusion-wise maximality) and the other is the time interval (referred to as temporal maximality). We introduce the maximality conditions for an arbitrary \((\Delta , \gamma )\)-clique in Definition 3, considering both factors.

Definition 3

(Maximal \((\Delta , \gamma )\)-clique) Given a temporal network \(\mathcal {G}(V, E, \mathcal {T})\) and a \((\Delta , \gamma )\)-clique \((\mathcal {X}, [t_a,t_b])\) of \(\mathcal {G}\), \((\mathcal {X}, [t_a,t_b])\) will be maximal if none of the following is true.

  • \(\exists v \in V(\mathcal {G}) \setminus \mathcal {X}\) such that \((\mathcal {X} \cup \{v\}, [t_a,t_b])\) is a \((\Delta , \gamma )\)-clique.

  • \((\mathcal {X}, [t_a - 1,t_b])\) is a \((\Delta , \gamma )\)-clique. This applies only if \(t_a - 1 \ge t\).

  • \((\mathcal {X}, [t_a,t_b + 1])\) is a \((\Delta , \gamma )\)-clique. This applies only if \(t_b + 1 \le t^{'}\).
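The three conditions of Definition 3 translate into a direct test. The sketch below is a brute-force illustration, not the enumeration method proposed in this paper; it assumes `stamps` maps each static edge (a frozenset of two vertices) to its time stamps, and that `t_start` and `t_end` are the first and last observed time stamps. All function and parameter names are hypothetical.

```python
from itertools import combinations


def is_dg_clique(stamps, X, ta, tb, delta, gamma):
    """Brute-force test of Definition 2 for the vertex set X and interval [ta, tb]."""
    if len(X) < 2 or ta > tb:
        return False
    for u, v in combinations(sorted(X), 2):
        ts = stamps.get(frozenset((u, v)), [])
        for tau in range(ta, max(tb - delta, ta) + 1):
            hi = min(tau + delta, tb)
            if sum(tau <= t <= hi for t in ts) < gamma:
                return False
    return True


def is_maximal(stamps, vertices, X, ta, tb, delta, gamma, t_start, t_end):
    """Apply the three conditions of Definition 3 to the clique (X, [ta, tb])."""
    if not is_dg_clique(stamps, X, ta, tb, delta, gamma):
        return False
    # Condition 1: no outside vertex can be added over the same interval.
    for v in set(vertices) - set(X):
        if is_dg_clique(stamps, set(X) | {v}, ta, tb, delta, gamma):
            return False
    # Condition 2: the interval cannot be stretched one step to the left.
    if ta - 1 >= t_start and is_dg_clique(stamps, X, ta - 1, tb, delta, gamma):
        return False
    # Condition 3: the interval cannot be stretched one step to the right.
    if tb + 1 <= t_end and is_dg_clique(stamps, X, ta, tb + 1, delta, gamma):
        return False
    return True


# Hypothetical network observed over the time stamps 1..6.
stamps = {
    frozenset(("a", "b")): [1, 2, 4, 6],
    frozenset(("a", "c")): [1, 3, 5],
    frozenset(("b", "c")): [2, 3, 6],
}
vertices = {"a", "b", "c"}
```

On this data with \(\Delta =3\) and \(\gamma =2\), the pair ({a, b}, [1, 6]) is a clique but not maximal, since vertex c can be added over the same interval.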

In this paper, we study the problem of listing all the maximal \((\Delta ,\gamma )\)-cliques of a given temporal network, which we call the maximal \((\Delta , \gamma )\)-clique enumeration problem and define next.

Definition 4

(Maximal \((\Delta , \gamma )\)-clique enumeration problem) Given a temporal network \(\mathcal {G}(V, E, \mathcal {T})\), \(\Delta \), and \(\gamma \) the maximal \((\Delta , \gamma )\)-clique enumeration problem asks to list out all the maximal \((\Delta , \gamma )\)-cliques (as mentioned in Definition 3) present in \(\mathcal {G}\).

Table 1 lists out all the symbols and notations used in this paper along with their interpretation. Next, we proceed to describe the proposed enumeration methodology for maximal \((\Delta , \gamma )\)-cliques.

Table 1 Symbols and notations used in this paper

4 Proposed enumeration technique

As stated earlier, the proposed methodology is broadly divided into two steps, each of which is described in one of the following two subsections. The broad idea of the proposed enumeration process is as follows: given all the links of the temporal network with their occurrence times, we first find the duration-wise maximal cliques of cardinality two. Next, taking these duration-wise maximal cliques, we add vertices to the clique without violating the definition of \((\Delta , \gamma )\)-cliques.

4.1 Stretching phase (initialization)

Algorithm 1 describes the initialization process of the proposed methodology. For a given temporal network \(\mathcal {G}\), we first construct the dictionary \(\mathcal {D}_{e}\) with the static edges as the keys and the corresponding occurrence time stamps as the values. By the definition of \((\Delta , \gamma )\)-clique, if the end vertices of an edge are part of the same clique, then the edge has to occur at least \(\gamma \) times in the link stream. Hence, each static edge \((u, v)\) of \(\mathcal {G}\) is processed further only if its frequency is at least \(\gamma \). The occurrence time stamps of \((u, v)\) are fed into the list \(\mathcal {T}_{(u,v)}\). A temporary list, Temp, is created to store the timestamp from \(\mathcal {T}_{(u,v)}\) currently being processed together with its previous occurrences, as long as the \((\Delta , \gamma )\)-clique property is maintained. The for loop in Lines 8–32 computes all the \((\Delta , \gamma )\)-cliques with maximum duration whose vertex set is \(\{u,v\}\). During the processing of \(\mathcal {T}_{(u,v)}\), one of the following two cases can happen. In the first case, if the current length of Temp is less than \(\gamma \), the difference between the current timestamp from \(\mathcal {T}_{(u,v)}\) and the first entry of Temp is checked (Line 10). If the difference is less than or equal to \(\Delta \), the current timestamp is appended to Temp. Otherwise, all the previous timestamps that have occurred within the past \(\Delta \) duration from the current timestamp are added to Temp (Line 14). This process basically checks \(\Delta \) timestamps backward from each occurrence time of the static edge \((u, v)\). In the second case, when the current length of Temp is greater than or equal to \(\gamma \), it is checked whether the current processing time from \(\mathcal {T}_{(u,v)}\) falls within the interval from (last \(\gamma \)-th occurrence time + 1) to (last \(\gamma \)-th occurrence time + 1 + \(\Delta \)).
Now, if it is true, the current timestamp is appended to Temp. It can be easily observed that this appending is done only when at least \(\gamma \) consecutive occurrences lie within each \(\Delta \) duration. Otherwise, the clique is added to \( \mathcal {C}^I_T\) with the vertex set \(\{u,v\}\) and time interval \([t_a, t_b]\) (Line 22), where \(t_a\) is the timestamp \(\Delta \) before the first \(\gamma \)-th entry in Temp, and \(t_b\) is the timestamp \(\Delta \) after the last \(\gamma \)-th entry in Temp. Next, all the previous timestamps that have occurred within the past \(\Delta \) duration from the current timestamp are added to Temp as before (Line 24), which allows overlapping cliques to be considered. It may happen that the last occurrence from \(\mathcal {T}_{(u,v)}\) is appended to Temp when it is processed. In that case, no clique is added by Lines 9–26 even though the length of Temp is greater than or equal to \(\gamma \). This situation is handled by Lines 27–31. This process is iterated for each key of the dictionary \(\mathcal {D}_{e}\). Next, we present lemmas that together help to argue the correctness of the proposed methodology. An illustrative example of Algorithm 1 for one link is shown in Fig. 3.

Fig. 3 An illustrative example of Algorithm 1 using the temporal network of Fig. 1, for the link \((v_1, v_2)\) with \(\Delta =4\) and \(\gamma =2\). All the temporally maximal \((\Delta , \gamma )\)-cliques of the vertex pair \(\{v_1, v_2\}\) are kept in the initialized clique set, \(\mathcal {C}^I_T\), marked in gray color

Algorithm 1 Stretching phase of the \((\Delta , \gamma )\)-clique enumeration
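The output of the stretching phase for a single vertex pair can be illustrated with the following Python sketch. It is not a transcription of Algorithm 1: instead of maintaining Temp, it works from Definition 2 directly, under the assumptions that the time stamps are distinct integers in sorted order and that intervals are not clipped to the lifetime of the network. A window start \(\tau \) is called good here when \([\tau , \tau + \Delta ]\) contains at least \(\gamma \) occurrences, and each maximal run of good starts \([a, b]\) gives the interval \([a, b + \Delta ]\). The function name is hypothetical.

```python
def duration_maximal_pair_cliques(ts, delta, gamma):
    """Return the duration-wise maximal intervals (ta, tb) for one vertex pair.

    ts is the sorted list of distinct integer time stamps of the static edge.
    """
    runs = []  # maximal runs of good window starts, stored as [first, last]
    for i in range(len(ts) - gamma + 1):
        first, last = ts[i], ts[i + gamma - 1]
        if last - first > delta:
            continue  # these gamma consecutive occurrences do not fit in one window
        # Every window start in [lo, hi] covers the occurrences ts[i .. i + gamma - 1].
        lo, hi = last - delta, first
        if runs and lo <= runs[-1][1] + 1:
            runs[-1][1] = max(runs[-1][1], hi)  # adjacent or overlapping: extend the run
        else:
            runs.append([lo, hi])
    # A run of good starts [a, b] corresponds to the time interval [a, b + delta].
    return [(a, b + delta) for a, b in runs]


# Hypothetical occurrence times of one static edge.
example = duration_maximal_pair_cliques([4, 5, 6, 12, 13], delta=3, gamma=2)
overlapping = duration_maximal_pair_cliques([1, 4, 6], delta=3, gamma=2)
```

In this sketch, two runs are merged exactly when the next occurrence is at most \(\Delta + 1\) after the occurrence \(\gamma \) positions earlier, which mirrors the interval test described for the second case of the stretching phase. The start of each interval is \(\Delta \) before the \(\gamma \)-th occurrence of the run, and the end is \(\Delta \) after the \(\gamma \)-th occurrence from the end of the run. The second call shows that overlapping intervals for the same pair can be produced.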

Lemma 1

For a link \((u, v)\), if there exist any consecutive \(\gamma \) occurrences within \(\Delta \) duration, then they have to be in “Temp” at some stage in Algorithm 1.

Proof

Follows from the description of Algorithm 1. \(\square \)

Lemma 2

In any arbitrary iteration of the “for loop” at Line 8 in Algorithm 1, each consecutive \(\gamma \) occurrences of “Temp” will be within \(\Delta \) duration.

Proof

Initially, Temp contains the first occurrence of a link. Now, when the length of Temp is less than \(\gamma \) (Line 9), next occurrence times are added in Temp (Line 11) if the difference from initial to current occurrence time lies within \(\Delta \) (Line 10), else the times at which the links have occurred in previous \(\Delta \) duration from the current time are added (Line 13, 14). This shows that all the entries in Temp are within \(\Delta \) duration when the length of Temp is less than \(\gamma \).

When the length of Temp is greater than or equal to \(\gamma \), without loss of generality, let us take any arbitrary \(\gamma \) occurrences of Temp as \( t^1, t^2, \dots t^{(\gamma -1)}, t^{\gamma }\), which are not within a \(\Delta \) duration, i.e., \(t^{\gamma } -t^1 > \Delta \). Let us also assume that from \(t^{(\gamma -1)}\), all the previous occurrences in Temp follow the statement of this lemma. Now, from our assumptions, we have the following conditions:

$$t^{0}+\Delta \ge t^{(\gamma -1)} \implies t^{1}+\Delta > t^{(\gamma -1)} \tag{1}$$
$$t^{1}+\Delta < t^{\gamma } \tag{2}$$
$$t^{1} \ge t^{0}+1 \tag{3}$$

Now, let us assume the previous occurrence of the link from \(t^1\) in Temp is \(t^0\), and our goal is to infer the possible positions of \(t^{0}\) in the time horizon. From the definition of \((\Delta , \gamma )\)-clique, there will be \(\gamma \) occurrences from \(t^{1}-\Delta \) to \(t^{1}\). If the first \((\gamma -1)\) links have occurred in consecutive times, then \(t^{0}=t^{1}-\Delta + \gamma -2\). This is the minimum value for \(t^{0}\). From Eq. 3, the maximum value for \(t^{0}\) is \(t^{1}-1\). Hence, \(t^{0} +1 \le t^{1} \le t^{0}+ \Delta +2 -\gamma \). Now, from Eq. 2, we have \(t^{0} + \Delta + 1 < t^{\gamma }\), when \(t^{1} = t^{0} +1\) and replacing \(t^1\) with \( t^{0}+ \Delta +2 -\gamma \) in Eq. 2, we get \(t^{0} + \Delta + 1 + (\Delta +1 -\gamma )< t^{\gamma } \implies t^{0} + \Delta + 1 < t^{\gamma } \) as \(\Delta +1 \ge \gamma \). This violates the condition imposed in Line 17. Hence, \(t^{\gamma }\) cannot be added in Temp. So, we reach a contradiction and this completes the proof. \(\square \)

Lemma 3

Let \(t^f\) and \(t^l\) be the first and last occurrences in Temp. In the interval \([t^f, t^l]\), Temp contains at least \(\gamma \) links in each \(\Delta \) duration.

Proof

When the length of Temp is at most \(\gamma \), Lines 9–15 in Algorithm 1 ensure the statement of the lemma by adding consecutive \(\gamma \) occurrences in \(\Delta \) duration. So, it remains to prove the statement when the length of Temp is greater than \(\gamma \). Let us assume that the occurrence times of the first \(\gamma +1\) entries of Temp are \(t^1, t^2, \dots , t^{\gamma }, t^{(\gamma +1)}\), where \(t^1=t^f\) and \(t^{(\gamma +1)} \le t^l\).

Now, by Lemma 2, \(t^{\gamma } - t^1 \le \Delta \) and \(t^{(\gamma +1)} - t^2 \le \Delta \). Without loss of generality, we want to show that there exist at least \(\gamma \) links from \(t^1+1\) to \(t^1 +1 + \Delta \). As \(t^{\gamma } - t^1 \le \Delta \), the maximum difference between \(t^1\) and \(t^2\) can be \((\Delta - \gamma + 2)\), and this case will arise when all the \(\gamma -1\) links appear in each consecutive timestamp from \(t^1+\Delta \) toward \(t^1\) (shown in Fig. 3). Now, as \(t^{(\gamma +1)} - t^2 \le \Delta \), we have to show \(t^{(\gamma +1)}=t^{\gamma }+1\). This extreme case will intuitively prove the rest of the cases. So, we can infer the following conclusion from Lemma 2 and the assumption \(t^2=t^1+\Delta -\gamma +2\). Now,

$$\begin{aligned}&t^{(\gamma +1)} - t^2 \le \Delta \\&t^{(\gamma +1)} - t^1-\Delta +\gamma -2 \le \Delta \\&t^{(\gamma +1)} \le t^1 + \Delta + 1 + \{ (\Delta + 1) - \gamma \} \end{aligned}$$

Again, from the condition imposed at Line 17 in Algorithm 1, we also have \(t^{(\gamma +1)} \le t^1 + \Delta + 1\). Now, as per our assumption of extreme case \(t^{\gamma } = t^1 + \Delta \). So, \(t^{(\gamma +1)} \le t^{\gamma } + 1 \implies t^{(\gamma +1)}= t^{\gamma } + 1\).

Now, as \(t^{(\gamma +1)} \le t^1 + \Delta + 1\), we can argue \(t^{(\gamma +1)} < t + \Delta \), for all \( t \in (t^1 + 1, t^2]\). Moreover, from Lemma 2, there are \(\gamma \) links within \([t^2, t^{(\gamma +1)} ]\), which concludes the existence of at least \(\gamma \) links from t to \(t + \Delta \). Now, for any \(t^i \in [t^f, t^l-\Delta ]\), there will be at least \(\gamma \) links in Temp from \(t^i\) to \(t^i+\Delta \). This completes the proof of the claimed statement. \(\square \)

Lemma 4

In Algorithm 1, the contents of \(\mathcal {C}_{T}^{I}\) are \((\Delta , \gamma )\)-cliques of size 2.

Proof

We process each static edge of the temporal network \(\mathcal {G}\) over its time horizon and add the \((\Delta , \gamma )\)-clique(s) formed by the end vertices of the edge to \(\mathcal {C}_{T}^{I}\). Hence, the cliques in \(\mathcal {C}_{T}^{I}\) are of size 2. Now, in Algorithm 1, the cliques are added to \(\mathcal {C}_{T}^{I}\) in Lines 22 and 30. In both cases, cliques are added if the current length of Temp is greater than or equal to \(\gamma \). As per Lemma 3, Temp contains at least \(\gamma \) links in each \(\Delta \) duration. While setting the duration of the clique, \(t_a\) is obtained by subtracting \(\Delta \) from the first \(\gamma \)-th occurrence time, and \(t_b\) is obtained by adding \(\Delta \) to the last \(\gamma \)-th occurrence time in Temp. This ensures the existence of at least \(\gamma \) occurrences of the link in each \(\Delta \) duration between \(t_a\) and \(t_b\). \(\square \)

Lemma 5

All the cliques returned by Algorithm 1 and contained in \(\mathcal {C}_{T}^{I}\) are duration-wise maximal.

Proof

We prove the duration-wise maximality of each clique in \(\mathcal {C}_{T}^{I}\) by contradiction. Let us assume, a clique \((\{u,v\}, [t_a,t_b]) \in \mathcal {C}_{T}^{I}\) is not duration-wise maximal. Then, there exists a \(t_a^{'}\) with \(t_a^{'} < t_a\) such that \((\{u,v\}, [t_a^{'},t_b])\) is a \((\Delta , \gamma )\)-clique or a \(t_{b}^{'}\) with \(t_{b}^{'} > t_b\) such that \((\{u,v\}, [t_a,t_b^{'}])\) is a \((\Delta , \gamma )\)-clique.

Now, if \((\{u,v\}, [t_a^{'},t_b])\) is a \((\Delta , \gamma )\)-clique, then its first \(\gamma \) occurrences will be in Temp at some stage as per Lemma 1. Later, this Temp is expanded till \(t_b\) either by Line 11 or 18 in Algorithm 1. Hence, \((\{u,v\}, [t_a^{'},t_b])\) will be added in \(\mathcal {C}_{T}^{I}\), instead of \((\{u,v\}, [t_a,t_b])\). So, the assumption that there exists a \(t_a^{'}\) with \(t_a^{'} < t_a\) is false.

Now, by Lemma 4, as \((\{u,v\}, [t_a,t_b])\) is a \((\Delta , \gamma )\)-clique, in each \(\Delta \) duration within \(t_a\) to \(t_b\), there will be at least \(\gamma \) links between u and v. Let us assume that \(l^{\gamma }\) and \(l^{(\gamma -1)}\) are the last \(\gamma \)-th and \((\gamma -1)\)-th occurrence times of \((u, v)\), respectively. From the definition of \((\Delta , \gamma )\)-clique, \(l^{\gamma }+\Delta \ge t_b\), hence, \(l^{(\gamma -1)}+\Delta > t_b\). Now, if \(\{u, v\}\) is a \((\Delta , \gamma )\)-clique in the interval \([t_a, l^{(\gamma -1)}+\Delta ]\), then there must be at least one link between u and v in the interval \([t_b, l^{(\gamma -1)}+\Delta ]\). If such a link exists, it indicates the presence of \(\gamma \) or more links in the interval \([l^{(\gamma -1)}, l^{(\gamma -1)}+\Delta ]\). This case is handled by Algorithm 1 in either Line 11 or 18, and \((\{u,v\}, [t_a,t_b])\) will not be added to \(\mathcal {C}_{T}^{I}\). So, there cannot exist any \(t_b^{'}\) which is greater than \(t_b\).

Hence, all the cliques of \(\mathcal {C}_{T}^{I}\) returned by Algorithm 1 are duration-wise maximal. \(\square \)

Lemma 6

All the duration-wise maximal \((\Delta , \gamma )\)-cliques of size 2 are contained in \(\mathcal {C}_{T}^{I}\).

Proof

In Lemmas 4 and 5, we have already shown that each \((\Delta , \gamma )\)-clique of \(\mathcal {C}_{T}^{I}\) is of size 2 and duration-wise maximal, respectively. Hence, in this lemma, we have to prove that none of such cliques are missed out in the final \(\mathcal {C}_{T}^{I}\). As each edge is processed independently by Algorithm 1, it is sufficient to prove that all the duration-wise maximal \((\Delta , \gamma )\)-cliques for a particular vertex pair (corresponding to an edge) are contained in \(\mathcal {C}_{T}^{I}\).

Let \((\{u,v\}, [t_a,t_b])\) be a duration-wise maximal \((\Delta , \gamma )\)-clique that is not present in \(\mathcal {C}_{T}^{I}\). As \((\{u,v\}, [t_a,t_b])\) is a \((\Delta , \gamma )\)-clique, there exist at least \(\gamma \) links in each \(\Delta \) duration from \(t_a\) to \(t_b\). Let \(f^{\gamma }\) and \(l^{\gamma }\) be the first \({\gamma }\)-th and last \({\gamma }\)-th occurrence times of the link \((u, v)\) between \(t_a\) and \(t_b\). We denote the occurrence timestamps of the static edge \((u, v)\) as \(t^1, t^2, \dots , t^{f_{(u,v)}}\), with \(f_{(u,v)} \ge \gamma \). Now, there can be one of the following cases for the values of \(t_a\) and \(t_b\).

  1. \(t_a = t^{1+\gamma -1} - \Delta \) and \(t_b \le t^{f_{(u,v)} - \gamma +1} + \Delta \): The clique is formed at the beginning of the occurrence stream of \((u, v)\). According to Lemma 1, all the occurrence times will be in Temp. Now, if \(t_b = t^{f_{(u,v)} - \gamma +1} + \Delta \), the clique will be added to \(\mathcal {C}_{T}^{I}\) by Line 30 of Algorithm 1. Otherwise, \(\exists t^k: t^k > l^{\gamma } + 1+ \Delta \) and \(t^{k-1} \le t_b\). Hence, it breaks the if condition at Line 17, and the clique will be added to \(\mathcal {C}_{T}^{I}\) by Line 22.

  2. \(t_a \ge t^{1+\gamma -1} - \Delta \) and \(t_b = t^{f_{(u,v)} - \gamma +1} + \Delta \): The clique is formed at the end of the occurrence stream of \((u, v)\). If \(t_a = t^{1+\gamma -1} - \Delta \), it follows from the above case. Otherwise, we need to show that \(t_a = f^{\gamma }+\Delta > t^{1+\gamma -1} - \Delta \) is handled by Algorithm 1. Here, \(\exists t^k: t^k < f^{\gamma } -1 - \Delta \) and \(t^{k-1} \ge t_a\). Along with Lemmas 1 and 2, Lines 14 and 24 ensure that all the timestamps within \([t_a, t_b]\) are in Temp. So, the clique will be added to \(\mathcal {C}_{T}^{I}\) by Line 30.

  3. \(t_a > t^{1+\gamma -1} - \Delta \) and \(t_b < t^{f_{(u,v)} - \gamma +1} + \Delta \): The clique is formed in the middle of the occurrence stream of \((u, v)\). Both the scenarios for the values of \(t_a\) and \(t_b\) are covered in the above two cases, so the clique will be added to \(\mathcal {C}_{T}^{I}\) by Line 22.

\(\square \)

Lemma 7

The running time of finding all the duration-wise maximal \((\Delta , \gamma )\)-cliques of size 2 in Algorithm 1 is of \(\mathcal {O}(\gamma m)\).

Proof

Preparing the dictionary \(\mathcal {D}_{e}\) at Line 1 in Algorithm 1 will take \(\mathcal {O}( \sum _{(u,v,t^{''}) \in E(\mathcal {G})} f_{(u,v)})\) time. Assuming the frequency of each static edge is at least \(\gamma \), we evaluate the running time for processing one static edge; the analysis is identical for the rest of the edges. During the processing, all the operations from Line 8 to 32 take \(\mathcal {O}(1)\) time, except the appending at Lines 14 and 24. The appending of previous occurrences within the past \(\Delta \) duration can lead to copying of at most \(\gamma -2 \) previous entries into Temp, which takes \(\mathcal {O}(\gamma )\) time. The worst case occurs when, in every iteration of the for loop at Line 8, \(\gamma -2\) previous occurrences are copied into Temp (at Line 24), and this may occur at most \(f_{(u,v)} -\gamma + 1\) times. In this case, the running time of the for loop from Line 8 to 32 is \((\gamma -2)(f_{(u,v)} -\gamma + 1) = \mathcal {O}(\gamma f_{(u,v)})\) for a particular static edge. Over all the static edges, the for loop at Line 3 will take \(\mathcal {O}(\sum _{(u,v,t^{''}) \in E(\mathcal {G})} \gamma f_{(u,v)} )\) time. Hence, the total running time of Algorithm 1 is \(\mathcal {O}( \sum _{(u,v,t^{''}) \in E(\mathcal {G})} f_{(u,v)} + \gamma \sum _{(u,v,t^{''}) \in E(\mathcal {G})} f_{(u,v)}) = \mathcal {O}( \gamma \sum _{(u,v,t^{''}) \in E(\mathcal {G})} f_{(u,v)}) \). Here, summing up all the frequencies of the static edges gives the total number of links of the temporal network, i.e., \(m = \sum _{(u,v,t^{''}) \in E(\mathcal {G})} f_{(u,v)}\). So, the time complexity of the initialization is \(\mathcal {O}(\gamma m)\). \(\square \)

We have provided a weak upper bound on the running time of the initialization process (Algorithm 1) in Lemma 7. Now, we focus on the space requirement of Algorithm 1. Storing the dictionary \(\mathcal {D}_{e}\) in Line Number 1 requires \(\mathcal {O}(m)\) space. In the worst case, the space requirement of the list \(\mathcal {T}_{uv}\) is \(\mathcal {O}(m)\). The size of Temp can go up to the maximum number of times that any static edge has occurred consecutively more than \(\gamma \) times in each \(\Delta \) duration, and in the worst case, it may take \(\mathcal {O}(m)\) space. As all the initial cliques are of size 2, the space requirement due to \(\mathcal {C}_{T}^{I}\) is \(\mathcal {O}(n^{2}.f_{{{\text{max}}}} )\), where \(f_{{{\text{max}}}} \) is the highest frequency of the initial cliques. So, the total space requirement of Algorithm 1 is \(\mathcal {O}(m+n^{2}.f_{{{\text{max}}}} )= \mathcal {O}(n^{2}.f_{{{\text{max}}}} )\). Hence, Lemma 8 holds.

Lemma 8

The space requirement of Algorithm 1 is of \(\mathcal {O}(n^{2}.f_{{{\text{max}}}} )\).

Now for the temporal network shown in Fig. 1, the initial cliques with \(\Delta =3\) and \(\gamma =2\), in \(\mathcal {C}_{T}^I\) are \((\{v_1, v_2\}, [1,7])\), \((\{v_1, v_2\}, [7,13])\), \((\{v_1, v_3\}, [2,7])\), \((\{v_1, v_3\}, [8,14])\), \((\{v_2, v_3\}, [2,6])\), \((\{v_2, v_3\}, [7,11])\), \((\{v_2, v_3\}, [5,8])\), \((\{v_2, v_4\}, [4, 12])\), \((\{v_3, v_4\}, [1,9])\), \((\{v_3, v_5\}, [5, 10])\), \((\{v_4, v_5\}, [4,8])\).
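As a small illustration, the cliques listed above can be grouped by vertex set so that all intervals of one vertex pair are found with a single lookup. The sketch below is only an assumption about a convenient in-memory layout (the names `initial_cliques` and `group_by_vertex_set` are hypothetical); vertex \(v_i\) is written as the integer `i`.

```python
from collections import defaultdict

# Initial cliques listed above for Fig. 1 (Delta = 3, gamma = 2).
initial_cliques = [
    ({1, 2}, (1, 7)), ({1, 2}, (7, 13)),
    ({1, 3}, (2, 7)), ({1, 3}, (8, 14)),
    ({2, 3}, (2, 6)), ({2, 3}, (7, 11)), ({2, 3}, (5, 8)),
    ({2, 4}, (4, 12)), ({3, 4}, (1, 9)),
    ({3, 5}, (5, 10)), ({4, 5}, (4, 8)),
]


def group_by_vertex_set(cliques):
    """Map each vertex set to the list of intervals in which it is a clique."""
    grouped = defaultdict(list)
    for vertices, interval in cliques:
        grouped[frozenset(vertices)].append(interval)
    return dict(grouped)


grouped = group_by_vertex_set(initial_cliques)

# Every listed interval spans at least Delta = 3 time units.
delta = 3
all_long_enough = all(t_b - t_a >= delta for _, (t_a, t_b) in initial_cliques)
```

Here the pair \(\{v_2, v_3\}\) maps to three intervals, and every listed interval has length at least \(\Delta = 3\).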

4.2 Shrink and bulk phase (enumeration)

Algorithm 2 describes the enumeration strategy of our proposed methodology. For the given temporal network \(\mathcal {G}\), we construct a static graph G where V(G) is the vertex set of \(\mathcal {G}\), and each link of \(\mathcal {G}\) induces the corresponding edge in E(G) without the time component, which we call a static edge. Next, the dictionary \(\mathcal {D}\) is built from the initial clique set \(\mathcal {C}^I_T\) of Algorithm 1, where the vertex set of the clique is the key, and the corresponding occurrence time intervals are the values. This data structure is also updated in the intermediate steps of Algorithm 2. Two sets, \(\mathcal {C}^{\mathcal {T}_1}\) and \(\mathcal {C}^{\mathcal {T}_2}\), are maintained during the enumeration process. At any i-th iteration of the while loop at Line 5, \(\mathcal {C}^{\mathcal {T}_1}\) maintains the current set of cliques that are yet to be processed for vertex addition, and \(\mathcal {C}^{\mathcal {T}_2}\) stores the new cliques formed in that i-th iteration. At the beginning, all the initial cliques from \(\mathcal {C}^I_T\) are copied into \(\mathcal {C}^{\mathcal {T}_1}\). A clique \((\mathcal {X},[t_a,t_b])\), which is duration-wise maximal, is taken out from \(\mathcal {C}^{\mathcal {T}_1}\), and the IS_MAX flag is set to TRUE, tentatively marking the current clique as a maximal \((\Delta , \gamma )\)-clique. For vertex addition, it is easy to see that only for the neighboring vertices of \(\mathcal {X}\) \((v \in \mathcal {N}_{G}(\mathcal {X}))\) is there a possibility of \((\mathcal {X} \cup \{v\},[t_a^{'},t_b^{'}])\) being a \((\Delta , \gamma )\)-clique. If the new vertex set \(\mathcal {X} \cup \{v\}\) is found in \(\mathcal {D}\) with one of its values equal to \([t_a,t_b]\), the IS_MAX flag is set to FALSE, signifying that the clique being processed, \((\mathcal {X},[t_a,t_b])\), is not maximal.
Otherwise, if \(\mathcal {X} \cup \{v\}\) is not present in \(\mathcal {D}\), all the possible time intervals in which \(\mathcal {X} \cup \{v\}\) can form a \((\Delta , \gamma )\)-clique are computed from Line 16 to 37. This process is iterated for all the neighboring vertices of \(\mathcal {X}\) (Lines 10–38). Now, we describe the statements from Line 17 to 36 in detail. As mentioned earlier, to form a \((\Delta , \gamma )\)-clique with the new vertex set \(\mathcal {X} \cup \{v\}\), all the possible combinations from \(\mathcal {X} \cup \{v\}\) of size \(\vert \mathcal {X} \vert \) (represented as C\((\mathcal {X} \cup \{v\}, \mathcal {X})\)) have to be \((\Delta , \gamma )\)-cliques. If every \(z \in \) C\((\mathcal {X} \cup \{v\}, \mathcal {X})\) is present in \(\mathcal {D}.keys()\), a new clique with the vertex set \(\mathcal {X} \cup \{v\}\) may be formed (Line 17). All the entries of these combinations are then taken from \(\mathcal {D}\) into a temporary data structure \(\mathcal {D}_{Temp}\). For clarity of presentation, we describe the operations from Line 19 to 35 for one vertex addition, i.e., \(\mathcal {X} \cup \{v\}\), with the help of the example shown in Fig. 4. Let the entries of \(\mathcal {D}_{Temp}\) be \(z_1, z_2, \dots , z_n\), i.e., all \(z_i \in \) C\((\mathcal {X} \cup \{v\}, \mathcal {X})\), and let the lengths of the corresponding entries in \(\mathcal {D}_{Temp}\) be \(l_1, l_2, \dots , l_n\), respectively. One sample from \(z_1 \otimes z_2 \otimes \dots \otimes z_n\) is taken as timeSet in Line 19 of Algorithm 2. One possible value of timeSet is \([t_{11}, t_{21}, \dots , t_{n1}]\). For this value, the resultant interval \([t_{a}^{'}, t_{b}^{'}]\) is computed as \(t_{11} \cap t_{21} \dots \cap t_{n1}=[\max (t_{z_1}^{a^1}, t_{z_2}^{a^1}, \dots , t_{z_n}^{a^1}), \ \min (t_{z_1}^{b^1}, t_{z_2}^{b^1}, \dots , t_{z_n}^{b^1})]\).
If the difference between \(t_{b}^{'}\) and \(t_{a}^{'}\) is more than or equal to \(\Delta \), then the newly formed \((\Delta , \gamma )\)-clique, \((\mathcal {X} \cup \{v\}, [t_{a}^{'}, t_{b}^{'}])\), is added to \(\mathcal {C}^{\mathcal {T}_2}\) and \(\mathcal {D}\). Also, if \([t_{a}^{'}, t_{b}^{'}]\) matches the current interval of \(\mathcal {X}\), then the flag \(IS\_MAX\) is set to FALSE, i.e., \((\mathcal {X}, [t_a, t_b])\) is not maximal. This step is repeated for all the samples from \(z_1 \otimes z_2 \otimes \dots \otimes z_n\) from Line 19 to 35. This ensures that all the intervals in which \(\mathcal {X} \cup \{v\}\) forms a \((\Delta , \gamma )\)-clique are added to \(\mathcal {D}\). If none of the vertices from \(\mathcal {N}_{G}(\mathcal {X}) {\setminus } \mathcal {X}\) can be added to \(\mathcal {X}\), then \((\mathcal {X}, [t_a, t_b])\) is a maximal \((\Delta , \gamma )\)-clique and is added to the final maximal clique set \(\mathcal {C}_{T}\) at Line 40. Vertex addition checking is performed for all the cliques of \(\mathcal {C}^{\mathcal {T}_{1}}\) in the while loop from Line 7 to 42. When \(\mathcal {C}^{\mathcal {T}_{1}}\) is exhausted and \(\mathcal {C}^{\mathcal {T}_{2}}\) is not empty, the contents of \(\mathcal {C}^{\mathcal {T}_{2}}\) are copied back into \(\mathcal {C}^{\mathcal {T}_{1}}\) for further processing, signifying that not all the maximal cliques have been found yet. This is controlled using the flag \(ALL\_MAXIMAL\) in the while loop at Line 5. If no clique is added to \(\mathcal {C}^{\mathcal {T}_{2}}\), the flag \(ALL\_MAXIMAL\) is set to TRUE so that in the next iteration, the condition of the while loop at Line 5 is false, and Algorithm 2 terminates. At the end, \(\mathcal {C}_{T}\) contains all the maximal \((\Delta , \gamma )\)-cliques of the temporal network \(\mathcal {G}\). One illustrative example of the enumeration algorithm is given in Fig. 5.
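The vertex-addition loop described above can be condensed into the following Python sketch. It is a simplified reading of the prose, not the authors' code: the function name `enumerate_maximal`, the use of sets of `(frozenset, (t_a, t_b))` pairs, and the derivation of the static adjacency from the size-2 initial cliques (instead of from all links) are assumptions made to keep the example self-contained.

```python
from itertools import combinations, product


def enumerate_maximal(initial_cliques, delta):
    """Sketch of the vertex-addition loop on (vertex set, interval) pairs."""
    # D: vertex set -> set of intervals in which it forms a clique.
    d = {}
    for vertices, interval in initial_cliques:
        d.setdefault(frozenset(vertices), set()).add(interval)

    # Assumption: adjacency is taken from the size-2 initial cliques only.
    adj = {}
    for pair in list(d):
        u, v = tuple(pair)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    current = {(frozenset(x), iv) for x, iv in initial_cliques}
    maximal = set()
    while current:  # stops once an iteration creates no new clique
        new_cliques = set()
        for x, (t_a, t_b) in current:
            is_max = True
            neighbours = set().union(*(adj[u] for u in x)) - x
            for v in neighbours:
                x_new = x | {v}
                if x_new in d:
                    if (t_a, t_b) in d[x_new]:
                        is_max = False
                    continue
                # All |X|-sized subsets of X ∪ {v} must already be cliques.
                subsets = [frozenset(c) for c in combinations(x_new, len(x))]
                if not all(s in d for s in subsets):
                    continue
                # One interval per subset; intersect each combination.
                for time_set in product(*(d[s] for s in subsets)):
                    start = max(iv[0] for iv in time_set)
                    end = min(iv[1] for iv in time_set)
                    if end - start >= delta:
                        d.setdefault(x_new, set()).add((start, end))
                        new_cliques.add((x_new, (start, end)))
                        if (start, end) == (t_a, t_b):
                            is_max = False
            if is_max:
                maximal.add((x, (t_a, t_b)))
        current = new_cliques
    return maximal


# Hypothetical initial cliques on three vertices, with Delta = 3.
toy = [({1, 2}, (0, 10)), ({1, 3}, (0, 10)), ({2, 3}, (2, 8))]
result = enumerate_maximal(toy, delta=3)
```

On this toy input, the pair \(\{2, 3\}\) over \([2, 8]\) is absorbed by the triangle over the same interval, while the two longer pairs remain maximal because no vertex can be added over their full interval.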

Algorithm 2 Shrinking and bulking phase of the maximal \((\Delta , \gamma )\)-clique enumeration

Now, from the description of the enumeration process of our proposed methodology, we have the following claims:

Claim 1

For any arbitrary clique \((\mathcal {X}, [t_a, t_b]) \in \mathcal {C}^{\mathcal {T}_{1}}\) and \(v \in \mathcal {N}_G(\mathcal {X}) {\setminus } \mathcal {X}\), all the time intervals in the whole lifespan of the linked stream \(\mathcal {L}\), at which \(\mathcal {X} \cup \{v\}\) forms a \((\Delta , \gamma )\)-clique, are added in \(\mathcal {D}\).

Claim 2

In any arbitrary iteration i of the while loop at Line 5, the cliques of \(\mathcal {C}^{\mathcal {T}_{1}}\) and \(\mathcal {C}^{\mathcal {T}_{2}}\) are of size \(i+1\) and \(i+2\), respectively.

Fig. 4 The entries of \(\mathcal {D}_{Temp}\) and \(z_i \in \mathcal {D}_{Temp}.keys()\)
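The interval computation on the entries of \(\mathcal {D}_{Temp}\) can be sketched as follows. This is a minimal illustration under assumed data: the dictionary contents and the function name `candidate_intervals` are hypothetical, and intervals are represented as `(t_a, t_b)` tuples. One interval is drawn from each entry, the selection is intersected by taking the maximum start and the minimum end, and only results of length at least \(\Delta \) are kept.

```python
from itertools import product


def candidate_intervals(d_temp, delta):
    """Intersect one interval per entry of d_temp; keep lengths >= delta."""
    results = set()
    for time_set in product(*d_temp.values()):
        start = max(t_a for t_a, _ in time_set)
        end = min(t_b for _, t_b in time_set)
        if end - start >= delta:
            results.add((start, end))
    return sorted(results)


# Hypothetical D_Temp for the vertex set {1, 2, 3}.
d_temp = {
    frozenset({1, 2}): [(1, 7), (7, 13)],
    frozenset({1, 3}): [(2, 7), (8, 14)],
    frozenset({2, 3}): [(2, 6), (7, 11), (5, 8)],
}
intervals = candidate_intervals(d_temp, delta=3)
```

Of the 12 combinations in this example, only two yield an intersection of length at least 3, namely \([2, 6]\) and \([8, 11]\).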

Lemma 9

In Algorithm 2, the elements of \(\mathcal {C}_{T}\) are \((\Delta , \gamma )\)-cliques.

Proof

All the cliques are added to \(\mathcal {C}_{T}\) only from \(\mathcal {C}^{\mathcal {T}_{1}}\), at Line 40 in Algorithm 2. Initially, \(\mathcal {C}^{\mathcal {T}_{1}}\) contains the elements from \(\mathcal {C}_{T}^{I}\), which are \((\Delta , \gamma )\)-cliques by Lemma 4, and later, it is updated with the entries of \(\mathcal {C}^{\mathcal {T}_{2}}\). So, if we show that the elements of \(\mathcal {C}^{\mathcal {T}_{2}}\) are \((\Delta , \gamma )\)-cliques, the statement will be proved. All the cliques of \(\mathcal {C}^{\mathcal {T}_{2}}\) are of at least \(\Delta \) duration, by the condition at Line 28. Also, from the description of Algorithm 2, it is easy to verify that, in each iteration, a vertex can be added to a clique of \(\mathcal {C}^{\mathcal {T}_{1}}\) only if all the possible combinations of vertices form \((\Delta , \gamma )\)-cliques. This ensures that all the vertex pairs of the clique in \(\mathcal {C}^{\mathcal {T}_{2}}\) are linked at least \(\gamma \) times in each \(\Delta \) duration within the intersected time interval of all the combinations. Hence, the elements of \(\mathcal {C}_{T}\) are \((\Delta , \gamma )\)-cliques. \(\square \)

Fig. 5 Illustrative example of the proposed maximal \((\Delta , \gamma )\)-clique enumeration algorithm: (a) input temporal graph with \(\Delta =4\) and \(\gamma =2\), (b) output of Algorithm 1 (stretching phase), and (c) and (d) the content of \(\mathcal {C}^{\mathcal {T}_1}\) at different iterations of Algorithm 2. The cliques in red are duration-wise maximal but not w.r.t. cardinality
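The property established in Lemma 9 can also be checked by brute force on small instances. The sketch below is an illustration under a simplifying assumption that is not taken from the paper: time is discrete, timestamps are integers, and only windows \([\tau , \tau + \Delta ]\) with integer \(\tau \) inside \([t_a, t_b]\) are tested. The function name `is_delta_gamma_clique` and the `(u, v, t)` link layout are hypothetical.

```python
from itertools import combinations


def is_delta_gamma_clique(links, vertices, t_a, t_b, delta, gamma):
    """Brute-force check with integer timestamps and integer window starts."""
    if t_b - t_a < delta:
        return False
    # Collect the link timestamps of every undirected vertex pair.
    times = {}
    for u, v, t in links:
        times.setdefault(frozenset((u, v)), []).append(t)
    for pair in combinations(sorted(vertices), 2):
        stamps = times.get(frozenset(pair), [])
        # Every window [tau, tau + delta] inside [t_a, t_b] needs >= gamma links.
        for tau in range(t_a, t_b - delta + 1):
            if sum(tau <= t <= tau + delta for t in stamps) < gamma:
                return False
    return True


# Hypothetical link stream: every pair of {1, 2, 3} is linked at t = 1..6.
pairs = [(1, 2), (1, 3), (2, 3)]
links = [(u, v, t) for (u, v) in pairs for t in range(1, 7)]
ok = is_delta_gamma_clique(links, {1, 2, 3}, 1, 6, delta=3, gamma=2)
too_strict = is_delta_gamma_clique(links, {1, 2, 3}, 1, 6, delta=3, gamma=5)
```

Such a checker is far too slow for enumeration, but it is handy for validating the output of a faster implementation on toy inputs.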

Lemma 10

In Algorithm 2, all the intermediate cliques are duration-wise maximal.

Proof

From the proof of Lemma 9, it is sufficient to show that the contents of \(\mathcal {C}^{\mathcal {T}_{1}}\) are duration-wise maximal. We prove the statement by induction. From Lemma 5, the contents of the initial clique set are duration-wise maximal. Let us assume that in the i-th iteration of the while loop at Line 5, the contents of \(\mathcal {C}^{\mathcal {T}_{1}}\) are duration-wise maximal. We need to show that the same holds in the \((i+1)\)-th iteration also. After adding a vertex to an existing clique obtained in the i-th iteration for possible expansion, the new vertex set is considered to be a \((\Delta , \gamma )\)-clique within the intersected interval of all \((i+2)\)-combinations, if the length of the intersected interval is at least \(\Delta \) (Lines 17–36 in Algorithm 2). Now, it can be observed that the latest first \(\gamma \)-th occurrence time \((f_{i+1}^{\gamma })\) of the resultant clique must be the same as the latest first \(\gamma \)-th occurrence time \((f_{i}^{\gamma })\) of the constituting clique from which \(t_a\) is coming. Similarly, the earliest last \(\gamma \)-th occurrence time \((l_{i+1}^{\gamma })\) of the resultant clique must be the same as the earliest last \(\gamma \)-th occurrence time \((l_{i}^{\gamma })\) of the constituting clique from which \(t_b\) is coming. When both \(t_a\) and \(t_b\) come from the same constituting clique, the original clique is not maximal, as vertex addition is possible. Now, for the resultant clique, the beginning time \(t_a\) cannot be extended to \(t_a - 1\), as in the i-th iteration the constituting clique is also duration-wise maximal by the assumption, i.e., \(f_i^{\gamma } - \Delta =t_a \implies f_{i+1}^{\gamma } - \Delta =t_a\). Similarly, \(t_b\) cannot be extended to \(t_b + 1\), as in the i-th iteration the constituting clique is also duration-wise maximal by the assumption, i.e., \(l_i^{\gamma } + \Delta =t_b \implies l_{i+1}^{\gamma } + \Delta =t_b\).
So, the resultant clique at \((i+1)\)-th iteration is also duration-wise maximal. This is true for all the cliques generated in each iteration. Hence, all the intermediate cliques in Algorithm 2 are duration-wise maximal. \(\square \)

Lemma 11

In Algorithm 2, at the beginning of any i-th iteration, \(\mathcal {C}^{\mathcal {T}_{1}}\) holds all the duration-wise maximal \((\Delta , \gamma )\)-cliques of size \(i+1\).

Proof

For \(i=1\), \(\mathcal {C}^{\mathcal {T}_{1}}\) holds all the duration-wise maximal \((\Delta , \gamma )\)-cliques of size 2 by Lemma 6. Let \(\mathcal {C}^{\mathcal {T}_{1}}_{i-1}\) and \(\mathcal {C}^{\mathcal {T}_{1}}_{i}\) be the clique sets at the beginning of iterations \(i-1\) and i, respectively, and suppose that \(\mathcal {C}^{\mathcal {T}_{1}}_{i-1}\) holds all the duration-wise maximal \((\Delta , \gamma )\)-cliques of size i. Then, we have to show that during the construction of \(\mathcal {C}^{\mathcal {T}_{1}}_{i}\) from \(\mathcal {C}^{\mathcal {T}_{1}}_{i-1}\), the clique set \(\mathcal {C}^{\mathcal {T}_{1}}_{i}\) remains exhaustive. For a clique from \(\mathcal {C}^{\mathcal {T}_{1}}_{i-1}\), we check all the possible \(i+1\) vertex combinations in Line 17 of Algorithm 2, which does not leave out any possible vertex addition to the clique. Next, for each added vertex, all the possible time interval combinations are generated and checked from Line 19 to 35. For each possible time combination, the \((\Delta , \gamma )\)-clique is generated from the maximum possible common interval. This guarantees that all the possible cliques are generated during this process. Again, by Lemma 10, all the cliques generated in the i-th iteration are also duration-wise maximal, and these are now in \(\mathcal {C}^{\mathcal {T}_{1}}_{i}\). The same argument applies to the clique building from the i-th to the \((i+1)\)-th iteration. Hence, for any value of i, the claimed statement is true. \(\square \)

Lemma 12

All the \((\Delta , \gamma )\)-cliques returned by Algorithm 2 and contained in \(\mathcal {C}_{T}\) are maximal.

Proof

We prove this statement by contradiction. Assume that \(C_{i}=(\mathcal {X}, [t_a, t_b])\) is an element of \(\mathcal {C}_{T}\) that is not maximal. In Algorithm 2, the cliques are added to \(\mathcal {C}_{T}\) from \(\mathcal {C}^{\mathcal {T}_{1}}\), and all the cliques in \(\mathcal {C}^{\mathcal {T}_{1}}\) are duration-wise maximal \((\Delta , \gamma )\)-cliques by Lemma 10. If \(C_{i}\) is not maximal, then the only possibility is that one or more vertices can still be added to it. So, let us assume that \(\exists v \in \mathcal {N}_{G}(\mathcal {X})\) such that \((\mathcal {X} \cup \{v\}, [t_a, t_b])\) is a \((\Delta , \gamma )\)-clique. From the enumeration process described in Algorithm 2, if a clique is added to \(\mathcal {C}_{T}\), it must have been in \(\mathcal {C}^{\mathcal {T}_{1}}\) in some previous iteration. As \((\mathcal {X} \cup \{v\}, [t_a, t_b])\) is a \((\Delta , \gamma )\)-clique, the \(IS\_MAX\) flag becomes FALSE when \(C_{i}\) is processed, so \(C_{i}\) is not added to \(\mathcal {C}_{T}\); instead, the extended clique is added to \(\mathcal {C}^{\mathcal {T}_{2}}\). Hence, the assumption \(C_{i} \in \mathcal {C}_{T} \) leads to a contradiction. So, all the elements of \(\mathcal {C}_{T}\) returned by Algorithm 2 are maximal \((\Delta , \gamma )\)-cliques. \(\square \)

Theorem 1

All the maximal \((\Delta , \gamma )\)-cliques of \(\mathcal {G}\) are contained in \(\mathcal {C}_{T}\).

As mentioned previously, m denotes the number of temporal links in the time-varying graph \(\mathcal {G}\). At Line Number 2, computing the static graph from the given time-varying graph requires \(\mathcal {O}(m)\) time. Creating the dictionary \(\mathcal {D}\) requires \(\mathcal {O}(|\mathcal {C}_{T}^{I}|. f_{max})\) time, where \( f_{max} \) denotes the highest number of times a clique appeared. Copying the cliques from the list \(\mathcal {C}_{T}^{I}\) to \(\mathcal {C}^{\mathcal {T}_{1}}\) requires \(\mathcal {O}(|\mathcal {C}_{T}^{I}|)\) time. Setting the \(ALL\_MAXIMAL\) flag to FALSE in Line Number 4 requires \(\mathcal {O}(1)\) time. So, from Line Number 1 to 4, the time requirement is \(\mathcal {O}(m+|\mathcal {C}_{T}^{I}|. f_{max})\). Now, it is easy to verify that the instructions in Line Numbers 6, 8, and 9 require \(\mathcal {O}(1)\) time. The for loop in Line Number 10 can run at most \(\mathcal {O}(n)\) times. Adding the vertex v to the existing clique \(\mathcal {X}\) to form \(\mathcal {X}_{new}\) in Line Number 11 requires \(\mathcal {O}(1)\) time. The maximum number of comparisons in the condition of the if statement in Line Number 12 is \(\mathcal {O}(|\mathcal {C}_{T}|)\). In the worst case, each comparison can take at most \(\mathcal {O}(n^{2})\) time. Hence, the total time requirement for Line Number 12 is \(\mathcal {O}(|\mathcal {C}_{T}|. n^{2})\). The comparisons in the conditional statement in Line Number 13 require at most \(\mathcal {O}(f_{max})\) time. Setting the \(IS\_MAX\) flag to FALSE in Line Number 14 requires \(\mathcal {O}(1)\) time. Now, in the if statement of Line Number 17, the number of combinations can be \(\mathcal {O}(n)\) in the worst case. Hence, the number of comparisons for checking the existence in the dictionary \(\mathcal {D}\) is \(\mathcal {O}(n|\mathcal {C}_{T}|)\). As mentioned previously, each individual comparison requires \(\mathcal {O}(n^{2})\) time.
Hence, the total execution time for Line 17 is \(\mathcal {O}(n^{3}.|\mathcal {C}_{T}|)\). Copying the newly generated combinations from the dictionary \(\mathcal {D}\) to \(\mathcal {D}_{Temp}\) requires \(\mathcal {O}(n f_{max})\) time. It can be verified from the description of Algorithm 2 that the number of possible combinations among the time durations is \(\mathcal {O}(f_{max}^{n})\). Hence, the for loop in Line Number 19 will execute \(\mathcal {O}(f_{max}^{n})\) times. Line Numbers 20 and 21 take \(\mathcal {O}(1)\) time. Executing the for loop from Line Number 22 to 25 requires \(\mathcal {O}(n)\) time. Computing the maximum and minimum value among the elements of the lists \(max\_t_a\) and \(min\_t_b\) requires \(\mathcal {O}(n)\) time. It is easy to verify that the execution of Line Numbers 28 to 34, 39 to 41, 43 to 45, and 46 requires \(\mathcal {O}(1)\) time. Copying the cliques from \(\mathcal {C}^{\mathcal {T}_{2}}\) to \(\mathcal {C}^{\mathcal {T}_{1}}\) in Line Number 44 can take \(\mathcal {O}(|\mathcal {C}_{T}|)\) time. Now, we need to combine the time requirements of the looping structures to obtain the total time requirement of Algorithm 2. From the previous analysis, it can be verified that the time requirement for executing the for loop from Line Number 19 to 35 is \(\mathcal {O}(f_{max}^{n}.n)\). The for loop from Line Number 10 to 38 will execute at most \(\mathcal {O}(n)\) times. Hence, the running time from Line Number 10 to 38 is \(\mathcal {O}(n(n^{2}.|\mathcal {C}_{T}|.f_{max}+n^{3}.|\mathcal {C}_{T}|+n.f_{max}+f_{max}^{n}.n))=\mathcal {O}(n^{3}.|\mathcal {C}_{T}|.f_{max}+n^{4}.|\mathcal {C}_{T}|+n^{2}.f_{max}+f_{max}^{n}.n^{2})=\mathcal {O}(n^{3}.|\mathcal {C}_{T}|.f_{max}+n^{4}.|\mathcal {C}_{T}|+f_{max}^{n}.n^{2})\). The while loop from Line Number 7 to 42 can execute at most \(\mathcal {O}(|\mathcal {C}_{T}|)\) times. Hence, the execution time of this while loop is \(\mathcal {O}(n^{3}.|\mathcal {C}_{T}|^{2}.f_{max}+n^{4}.|\mathcal {C}_{T}|^{2}+ |\mathcal {C}_{T}|. f_{max}^{n}.n^{2})\).
Also, the while loop from Line Number 5 to 48 can execute at most \(\mathcal {O}(n)\) times. Hence, the time requirement for the execution of Line Numbers 5–48 is \(\mathcal {O}(n(n^{3}.|\mathcal {C}_{T}|^{2}.f_{max}+n^{4}.|\mathcal {C}_{T}|^{2}+ |\mathcal {C}_{T}|. f_{max}^{n}.n^{2} + |\mathcal {C}_{T}|))=\mathcal {O}(n^{4}.|\mathcal {C}_{T}|^{2}.f_{max}+n^{5}.|\mathcal {C}_{T}|^{2}+ |\mathcal {C}_{T}|. f_{max}^{n}.n^{3} + n.|\mathcal {C}_{T}|)=\mathcal {O}(n^{4}.|\mathcal {C}_{T}|^{2}.f_{max}+n^{5}.|\mathcal {C}_{T}|^{2}+ |\mathcal {C}_{T}|. f_{max}^{n}.n^{3})\). As already derived, the running time from Line Number 1 to 4 is \(\mathcal {O}(m+|\mathcal {C}_{T}^{I}|. f_{max})\); hence, the total time requirement for Algorithm 2 is \(\mathcal {O}(n^{4}.|\mathcal {C}_{T}|^{2}.f_{max}+n^{5}.|\mathcal {C}_{T}|^{2}+ |\mathcal {C}_{T}|. f_{max}^{n}.n^{3} + m+|\mathcal {C}_{T}^{I}|. f_{max})=\mathcal {O}(n^{4}.|\mathcal {C}_{T}|^{2}.f_{max}+n^{5}.|\mathcal {C}_{T}|^{2}+ |\mathcal {C}_{T}|. f_{max}^{n}.n^{3})\). The number of cliques can be at most \(2^{n}\). Hence, plugging in the worst-case value of \(|\mathcal {C}_{T}|\), the running time of Algorithm 2 is \(\mathcal {O}(n^{4}.2^{2n}.f_{{{\text{max}}}} +n^{5}.2^{2n}+ 2^{n}. f_{{{\text{max}}}} ^{n}.n^{3})\).

The additional space requirement of Algorithm 2 is due to the “static graph” G, which requires \(\mathcal {O}(m)\) space; the dictionary \(\mathcal {D}\), which requires \(\mathcal {O}(|\mathcal {C}_{T}^{I}|. f_{{{\text{max}}}} )\) space; the dictionary \(\mathcal {D}_{Temp}\), which requires \(\mathcal {O}(n.f_{{{\text{max}}}} )\) space; the list \(\mathcal {X}_{new}\), which requires \(\mathcal {O}(n)\) space; the lists \(\mathcal {C}^{\mathcal {T}_{1}}\), \(\mathcal {C}^{\mathcal {T}_{2}}\), and \(\mathcal {C}_{T}\), which in the worst case may require \(\mathcal {O}(n 2^{n})\) space; and the lists \(max\_t_{a}\) and \(min\_t_{b}\), which require \(\mathcal {O}(|\mathcal {C}_{T}|)\) space. Hence, the total space requirement of Algorithm 2 is \(\mathcal {O}(m+|\mathcal {C}_{T}^{I}|. f_{{{\text{max}}}} +n.f_{{{\text{max}}}} + n + n.2^{n}+2^{n})=\mathcal {O}(m+|\mathcal {C}_{T}^{I}|. f_{{{\text{max}}}} +n.f_{{{\text{max}}}} + n.2^{n})\). Hence, Lemma 13 holds.

Lemma 13

The running time and space requirement of Algorithm 2 are of \(\mathcal {O}(n^{4} \cdot 2^{2n} \cdot f_{{{\text{max}}}} + n^{5} \cdot 2^{2n}+ 2^{n} \cdot f_{{{\text{max}}}} ^{n} \cdot n^{3})\) and \(\mathcal {O}(m+|\mathcal {C}_{T}^{I}| \cdot f_{{{\text{max}}}} + n \cdot f_{{{\text{max}}}} + n \cdot 2^{n})\), respectively.

As mentioned previously, Algorithms 1 and 2 together constitute the proposed enumeration strategy for maximal \((\Delta ,\gamma )\)-cliques of a temporal network. It has been shown in Lemma 7 that the time requirement of Algorithm 1 is \(\mathcal {O}(\gamma .m)\). Hence, the total time requirement of the proposed methodology (i.e., Algorithms 1 and 2) is \(\mathcal {O}(n^{4}.2^{2n}.f_{{{\text{max}}}} +n^{5}.2^{2n}+ 2^{n}. f_{{{\text{max}}}} ^{n}.n^{3} + \gamma .m)\). As stated in Lemma 8, the space requirement of Algorithm 1 is \(\mathcal {O}(n^{2}. f_{{{\text{max}}}} )\). Hence, the total space requirement of the proposed methodology is \(\mathcal {O}(m+|\mathcal {C}_{T}^{I}|. f_{{{\text{max}}}} +n.f_{{{\text{max}}}} + n.2^{n}+n^{2}. f_{{{\text{max}}}} )=\mathcal {O}(m+|\mathcal {C}_{T}^{I}|. f_{{{\text{max}}}} + n.2^{n}+n^{2}. f_{{{\text{max}}}} )\). Theorem 2 states the time and space requirements of the proposed methodology.

Theorem 2

The computational time and space requirement of the proposed methodology are of \(\mathcal {O}(n^{4}\cdot 2^{2n}\cdot f_{{{\text{max}}}} +n^{5}\cdot 2^{2n}+ 2^{n}\cdot f_{{{\text{max}}}} ^{n}\cdot n^{3} + \gamma \cdot m)\) and \(\mathcal {O}(m+|\mathcal {C}_{T}^{I}| \cdot f_{{{\text{max}}}} + n \cdot 2^{n}+n^{2} \cdot f_{{{\text{max}}}} )\), respectively.
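To get a feel for which term of the time bound dominates, the four terms can be evaluated numerically. The sketch below ignores the constants hidden by the \(\mathcal {O}\)-notation, and the parameter values are hypothetical; the function name `time_bound_terms` is an assumption for the example.

```python
def time_bound_terms(n, f_max, gamma, m):
    """Evaluate the four terms of the stated worst-case time bound."""
    return {
        "n^4 * 2^(2n) * f_max": n**4 * 2 ** (2 * n) * f_max,
        "n^5 * 2^(2n)": n**5 * 2 ** (2 * n),
        "2^n * f_max^n * n^3": 2**n * f_max**n * n**3,
        "gamma * m": gamma * m,
    }


# Hypothetical instance: 10 vertices, 1000 links.
terms_high_f = time_bound_terms(n=10, f_max=8, gamma=2, m=1000)
terms_low_f = time_bound_terms(n=10, f_max=2, gamma=2, m=1000)

dominant_high_f = max(terms_high_f, key=terms_high_f.get)
dominant_low_f = max(terms_low_f, key=terms_low_f.get)
```

For these values, the term \(2^{n} \cdot f_{\text{max}}^{n} \cdot n^{3}\) dominates when \(f_{\text{max}} = 8\), whereas \(n^{5} \cdot 2^{2n}\) dominates when \(f_{\text{max}} = 2\); the contribution \(\gamma \cdot m\) of Algorithm 1 is negligible in both cases.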

5 Experimental evaluation

In this section, we present the experimental evaluation of the proposed methodology and compare its efficacy with the existing methods from the literature. First, we outline the background of the used datasets, followed by the objectives, comparing algorithm description, and discussion on results.

5.1 Description of the datasets

In our experiments, we have used the following datasets: (1) Hypertext 2009 dynamic contact network (Hypertext) (Isella et al. 2011): This dataset was collected during the ACM Hypertext 2009 conference, where the attendees volunteered to wear radio badges that monitored their face-to-face proximity. The dataset represents the dynamical network of face-to-face proximity of 110 conference attendees over about 2.5 days. (2) College Message Temporal Network (College Message) (Panzarasa et al. 2009): This dataset contains the interaction information among a group of students from the University of California, Irvine. (3) Bitcoin OTC Trust Weighted Signed Network (Bitcoin) (Kumar et al. 2016, 2018): This is a who-trusts-whom network of people who trade using Bitcoin on a platform called Bitcoin OTC. Members of Bitcoin OTC rate other members on a scale of -10 (total distrust) to +10 (total trust) in steps of 1. This is a weighted, signed, and directed network. However, as per our requirement, we do not consider the direction and weight. As the trust of a person changes over time, it is a temporal network. (4) Infectious SocioPatterns Dynamic Contact Network I & II (Infectious I & II) (Isella et al. 2011): The datasets were collected during the Infectious SocioPatterns event that took place in Dublin, Ireland, during the art science exhibition INFECTIOUS: STAY AWAY. Each dataset contains a set of tuples of the form (t, u, v), where u and v are the anonymous ids of the persons who are in contact for at least 20 s. Basic statistics of the datasets are given in Table 2.

Table 2 Basic statistics of the datasets
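Link streams of the (t, u, v) form described above can be read and summarized with a few lines of Python. The sketch below is illustrative only: the whitespace-separated record layout, the function names, and the choice of statistics are assumptions, not a description of the actual dataset files or of the columns of Table 2. Direction is dropped by storing each pair in sorted order.

```python
def parse_link_stream(lines):
    """Parse whitespace-separated 't u v' records into undirected links."""
    links = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        t, u, v = line.split()[:3]
        if u == v:
            continue  # ignore self-loops
        a, b = sorted((u, v))  # drop direction
        links.add((int(t), a, b))  # duplicates collapse in the set
    return sorted(links)


def basic_statistics(links):
    """Count nodes, links, static edges, and the lifetime of the stream."""
    nodes = {x for _, u, v in links for x in (u, v)}
    static_edges = {(u, v) for _, u, v in links}
    timestamps = [t for t, _, _ in links]
    return {
        "nodes": len(nodes),
        "links": len(links),
        "static_edges": len(static_edges),
        "lifetime": max(timestamps) - min(timestamps),
    }


# Hypothetical records in 't u v' form.
sample = ["20 a b", "40 b a", "40 a c", "# comment", "60 c c"]
links = parse_link_stream(sample)
stats = basic_statistics(links)
```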

5.2 Setup of the experimentation

The only parameters involved in our study are \(\Delta \) and \(\gamma \). For analyzing a temporal network dataset, one intuitive question is to find the frequently connected groups for a given time duration that is comparable with the lifetime of the network. For this reason, we select the \(\Delta \) value based on the network lifetime only. For the “Hypertext” and “Infectious II” datasets, we start with a \(\Delta \) value of 1 min and keep increasing it by 1 min until it reaches 10 min. For the “Infectious I” dataset, owing to its larger lifetime, \(\Delta \) is instead increased by multiplicative factors of 10, from 1 and 2 min up to 100 and 200 min. For the “College Message” and “Bitcoin” datasets, we choose \(\Delta \) values of 1, 12, 64, 72, and 168 h.

For \(\Delta \)-clique enumeration in all the datasets, we set the \(\gamma \) value to 1. For enumerating \((\Delta , \gamma )\)-cliques in the “Hypertext” and “Infectious II” datasets, we start with a \(\gamma \) value of 2 and keep increasing it by 1 until the maximal clique set becomes empty. For the “Infectious I” dataset, for the initial \(\Delta \) values (e.g., 60, 120), the \(\gamma \) value is chosen in the same way as for the “Infectious II” dataset. However, for larger \(\Delta \) values (e.g., 6000, 12000), we start with a \(\gamma \) value of 5, and then 10; it is next incremented by 10 until it reaches 30, and subsequently incremented by 30 until it reaches 330. For the “Bitcoin” dataset, for every \(\Delta \) value, if we increase the \(\gamma \) value beyond 2, the maximal clique set becomes empty, owing to the very small ratio of links per static edge compared to the lifespan of the temporal network. For the “College Message” dataset, as the chosen \(\Delta \) values are larger, the \(\gamma \) value is incremented by 5 until it reaches 20 and then by 10 until the maximal clique set becomes empty. The goals of the experiments are to analyze how the number of maximal cliques, maximum cardinality, maximum duration, computational time, and space change with \(\Delta \) and \(\gamma \), and to compare the results with the existing algorithms.
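The parameter schedules described in this subsection can be written down compactly. The sketch below is an illustration with hypothetical function names; it assumes that timestamps are measured in seconds, which is suggested by the example values 60, 120, 6000, and 12000 but is not stated explicitly.

```python
MINUTE = 60  # assumption: timestamps are in seconds


def delta_grid_linear(start_min=1, stop_min=10):
    """Delta from 1 to 10 min in steps of 1 min (Hypertext, Infectious II)."""
    return [k * MINUTE for k in range(start_min, stop_min + 1)]


def delta_grid_multiplicative(bases_min=(1, 2), factor=10, steps=3):
    """Delta of 1 and 2 min, scaled by powers of 10 (Infectious I)."""
    return [b * factor**p * MINUTE for p in range(steps) for b in bases_min]


def gamma_schedule_large_delta():
    """Gamma = 5, 10, then +10 up to 30, then +30 up to 330."""
    return [5, 10, 20, 30] + list(range(60, 331, 30))


deltas_linear = delta_grid_linear()
deltas_infectious_1 = delta_grid_multiplicative()
gammas = gamma_schedule_large_delta()
```

The open-ended schedules (increasing \(\gamma \) until the maximal clique set becomes empty) depend on the enumeration output and are therefore not reproduced here.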

5.3 Algorithms compared

In our experiments, we compare the performance of the proposed methodology with the following methods from the literature. (1) Viard et al.’s method (Viard et al. 2016): This is the first method proposed to enumerate maximal \(\Delta \)-cliques of a temporal network. (2) Himmel et al.’s method (Himmel et al. 2017): This method incorporates the famous Bron–Kerbosch algorithm to improve Viard et al.’s method. (3) Banerjee & Pal’s method (Banerjee and Pal 2019): This is the existing maximal \((\Delta , \gamma )\)-clique enumeration method proposed by us in one of our previous studies. We obtained the source code of the first two methodologies as implemented by the respective authors. The proposed methodology is developed in Python 3.4 along with NetworkX 2.0. All the experiments have been carried out on a high-performance computing cluster having 5 nodes, each of them having 40 cores and 160 GB of RAM. Implementations of the algorithms are available at https://github.com/BITHIKA1992/Delta-gamma-Clique-Update.

Fig. 6 Plots for the change in Clique Count (denoted as CC), Maximum Cardinality (denoted as MC), Maximum Duration (denoted as MD), Computational Time (denoted as CT), and Space Requirement (denoted as SR) with the change of \(\Delta \) and \(\gamma \) for different datasets: (a–e) Hypertext; (f–j) College Message; (k–o) Infectious I; and (p–t) Infectious II; the computational time and space of Banerjee & Pal’s method (Banerjee and Pal 2019) are marked in red

Table 3 Results for the Maximal Clique Count (N), Maximum Cardinality (C), Maximum Duration (D), Computational Time (in Secs.) / Space (in MB) for maximal \(\Delta \)-clique (\((\Delta , \gamma )\)-clique with \(\gamma =1\)) enumeration for different datasets

5.4 Experimental results with discussions

Results for \(\Delta \)-clique enumeration: First, we focus on the \(\Delta \)-clique, which is equivalent to the \((\Delta , \gamma )\)-clique with \(\gamma =1\). This result is shown in Table 3. In all the datasets, the maximal clique count (N) tends to decrease as \(\Delta \) increases. The count can, however, increase for a large \(\Delta \) when there exist some user pairs who contact each other very frequently for a long duration. This generates many maximal cliques with cardinality 2 and different \([t_a, t_b]\) for the Bitcoin and Infectious datasets. The maximum cardinality (C) identifies whether there is a large group, and the maximum duration (D) indicates the longest period over which users stayed in contact. Both C and D are non-decreasing with the growth in \(\Delta \) in all the datasets except “Hypertext.” It is observed that the proposed methodology is the fastest one compared to the existing methods. The computational time increases with \(\Delta \) in Viard et al. (2016), as the algorithm starts with a clique (link) of duration 1, expands to both the right and the left by \(\Delta \), and thereby creates more intermediate cliques. In contrast, the processing time depends on the maximal clique count in both Himmel et al. (2017) and the proposed method. The computational space mainly depends on the size of the intermediate clique set, and Viard et al.’s method is penalized more for the reason just discussed. The effect can be seen in “Infectious I” for \(\Delta = 6000 \text { and } 12000\), where the system’s memory becomes insufficient to complete the computation. For all the datasets, the proposed method beats Viard et al.’s method in terms of both space and time. Comparing Himmel et al.’s method with the proposed method, a trade-off between time and space can be observed for the dense datasets.

Results for \((\Delta , \gamma )\)-clique enumeration: The results for \(\gamma > 1\) are shown in Fig. 6. As the maximal clique set becomes empty for \(\gamma >2\) in Bitcoin, we do not show the plots for it. For a fixed \(\Delta \), the maximal clique count decreases exponentially with the increase in \(\gamma \) (refer to Fig. 6 [a,f,k,p]), which, in turn, reduces the computational time and space as well. The maximum cardinality and the maximum duration also reduce as \(\gamma \) increases. For a fixed \(\gamma \), the observations are the same as for \(\gamma =1\). When comparing with the only existing method (Banerjee and Pal 2019), it can be observed that the improvement is more significant for larger values of \(\Delta \) and small values of \(\gamma \) (refer to Fig. 6 [d,e,i,j,n,o,s,t]). Lastly, we can conclude that, for both \(\Delta \)-clique and \((\Delta , \gamma )\)-clique enumeration, the proposed methodology is better when the input dataset is sparse. Among all the datasets, Hypertext and Infectious II are comparatively denser than the others, in terms of static graph density and number of links per timestamp. The Hypertext dataset is the densest network, with density 34.7% (3.2% for Infectious II), and a node in the Hypertext dataset is involved in three links at a time (5 for Infectious II). The same quantities for the other datasets are in the order of \(\sim 10^{-3}\), which is much lower. Now, from Table 3, it can be seen that Himmel et al.’s method performs much better than the proposed method on the Hypertext data, and the difference is larger for small values of \(\Delta \). This dataset has a very high ratio of MaxDuration/\(\Delta \), which signifies that the network is dense and that there is frequent communication among many of the nodes within a small time interval. This results in the generation of many more duration-wise maximal cliques and their corresponding vertex expansions.
Himmel et al.’s method grows based on neighborhood creation at specific time intervals for each node, and as the number of nodes is smaller in the dense network, this boosts its performance. Again, when the number of initial cliques is larger (as in Infectious I), Viard et al.’s method fails to proceed even if the dataset is sparse. A significant improvement in \(\Delta \)-clique enumeration is seen in the performance for the other datasets, which are sparse. The scenario for sparse cases is more evident for the \((\Delta , \gamma )\)-clique enumeration case in Fig. 6. It can be seen that if \(\Delta \) is increased for small \(\gamma \), the improvement of the proposed method over the existing method of Banerjee & Pal is larger. For the increased \(\Delta \), the initial duration-wise maximal clique set reduces the intermediate clique count, and the neighborhood in that large \(\Delta \) also becomes smaller due to the sparse nature of the problem instance. Hence, we conclude that the proposed methodology improves significantly when the graph is sparse.
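The two density notions used in this discussion can be computed directly from a link stream. The sketch below is a hedged illustration: the function names are hypothetical, links are assumed to be `(t, u, v)` tuples, static density is taken as \(2|E|/(n(n-1))\), and "links per node at a time" is interpreted as the average number of links a node takes part in at a timestamp at which it is active. The paper's exact definitions may differ.

```python
from collections import Counter


def static_density(links):
    """Density of the static graph: 2|E| / (n (n - 1))."""
    nodes, edges = set(), set()
    for _, u, v in links:
        nodes.update((u, v))
        edges.add(frozenset((u, v)))
    n = len(nodes)
    return 2 * len(edges) / (n * (n - 1)) if n > 1 else 0.0


def mean_links_per_active_node(links):
    """Average number of links per (node, timestamp) pair with activity."""
    counts = Counter()
    for t, u, v in links:
        counts[(u, t)] += 1
        counts[(v, t)] += 1
    return sum(counts.values()) / len(counts) if counts else 0.0


# Hypothetical link stream of (t, u, v) tuples.
links = [(1, "a", "b"), (1, "a", "c"), (2, "b", "c")]
density = static_density(links)
per_node = mean_links_per_active_node(links)
```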

6 Conclusion and future directions

In this paper, we have proposed a methodology to enumerate all the maximal \((\Delta , \gamma )\)-cliques present in a temporal network. The proposed methodology has been analyzed for time and space requirements, and its correctness has been shown. To highlight its effectiveness, we have compared the execution time of the proposed methodology with that of the existing methods on five real-world publicly available datasets. As links are probabilistic in nature in many real-world applications, extending this study to such scenarios may be one possible future direction.