1 Introduction

Since communities are at the heart of human society, community analysis has always been one of the chief topics in sociology. Although there is no unique definition of a community amongst researchers [27], a community, cluster, or module can be seen as a set of nodes which share similar properties such as features, objectives, and roles. With information technology revolutionising almost every aspect of life in the last decade, social networks have become immensely popular both in industry and academia.

Depending on the availability of network information, community detection methods are categorised into global and local approaches. Global community detection algorithms need global information of the entire network and are computationally expensive when applied to all but the smallest networks. Considering these limitations, it is preferable to address the problem of community detection from a local perspective. Local methods try to find community patterns in subsets of a graph without considering the global information. This avoids the need to calculate and consider information about the entire graph whenever a node is included in a community. Also, in some cases such as the World Wide Web or the Internet the information of the entire network is not available. For these reasons, local community detection methods scale more easily to large networks of interest.

As a result, many researchers try to analyse communities by focusing on a particular region of the network. In most local community detection approaches, the goal is to find the communities around a seed node. Web search engines are interested to find similar pages around a particular page. HITS [30] and PageRank [58] are some of popular ranking algorithms which can be seen as local community detection algorithms in the network of the Web.

Communities can also be investigated based on the dynamics of the networks. As most real-world networks change over time, analysing and tracking dynamic communities is a crucial task. Surprisingly, the majority of studies in this area consider communities in static networks.

In order to address some shortcomings of the existing methods, especially efficiency and the ability to trace dynamic communities, we propose the Derivative-based Community Detection (DCD) method inspired by geometric active contours [7]. Geometric Active Contours are used extensively for detecting objects in 2D images in the field of machine vision. They are known for their speed, their autonomous and unsupervised nature, and their ability to track dynamic objects. The analogy between the discovery of shapes in images and the detection of communities in graphs suggests that an application of the active contours method to graph spaces might provide an efficient alternative to existing community detection techniques.

In geometric active contours, the boundary of an object evolves until it accurately reflects the edge of an object, which is defined by large differences in the pixel values. The boundary of the object is defined in terms of curvature and gradient, which are based on derivatives. Curvature and gradient constitute the velocity of the boundary, which determines whether the boundary moves to include further pixels.

This principle of moving a boundary until all suitable nodes have been included can be translated into graph space provided we can determine the derivatives of a function in a graph. In this work, which is the extended version of our previous work [47], we provide a novel means of approximating derivatives of functions in graph space and use these derivatives to find the boundaries of communities and, also, to update the boundaries when network changes in dynamic settings. The contributions of this research are providing a general framework with feasible time complexity for approximating derivatives in graph space; proposing a derivative based method for detecting local communities; providing an effortless method using the proposed local community detection approach to track local communities in dynamic networks; and, addressing and covering the gap between image processing and networks. In the experiments, we compare our proposed local community detection method with two other local community detection methods presented in the work of Jeub et al. [27].

The rest of this paper is organized as follow: Section 2 reviews related works, Section 3 presents active contours and the application of differential geometry in image processing, Section 4 generalizes the differential geometry to graph space, and finally in Sections 5 and 6 the proposed derivative approach has been validated on community detection and tracking applications.

2 Related work

Community detection is a well studied area [33, 38]. Detecting communities is an important problem in various domains. In sociology, in order to discover human behaviour patterns, researchers have studied social communities for decades [14, 31]. In biology and cancer research, researchers are looking for communities in protein interaction networks which lead to some specific actions in the cell [10, 46]. Community detection algorithms have been extensively used in computer science. For example, clustering Web clients in the World Wide Web is used to improve the performance of service providers [32], or finding communities in social networks. Community detection approaches can be classified by the amount of information available about the graph. If the information of the entire network is available, the method is called a global method; otherwise, it is called local method. In this section, firstly, global and local methods are briefly reviewed. Secondly, the problem of tracking communities in dynamic networks is reviewed. Thirdly, we investigate the problem of finding a local community around a seed node. Finally, since one main contribution of this research is a framework for approximating derivatives in graph space, the studies in this domain are reviewed.

2.1 Global community detection

In global community detection algorithms, the data on the entire network is available. In this class of algorithm each method has a quality function which takes the partitions as input and returns a number as a quality measure. Since there is no global census regarding the definition of a community, different methods consider different criteria for defining quality functions. The main drawback of global community detection is that it has to extract pairwise information for all pairs of nodes in the entire graph.

Even though most of the complex networks in the real world are dynamic, the larger part of the studies focuses on static global community detection methods. Numerous survey papers on the topic exist [16, 22, 49, 50].

Spectral Clustering uses the eigenvectors of Laplacian adjacency matrices of graphs, and then eigenvectors are used as a basis of a clustering algorithm such as hierarchical or K-means in order to clusters vertices into communities [41, 45]. Modularity [13, 18] is a well-known quality function in community detection research. Modularity based methods assume that the edges that have the highest degree of betweenness centrality connect two communities. The algorithm finds the betweenness centrality for every edge and removes the one with highest degree at each step until graph becomes disconnected, and splits into two partitions. The same procedure is repeated in each partition. Modularity optimisation is proved to be an NP-hard problem [5]. Correlation clustering assumes edges have real-valued weights. It tries to maximise the sum of the intra-cluster edge weights, or, alternatively, minimise the sum of the extra-cluster weights [9]. In both cases, finding the optimal solution is an NP-hard problem [4].

2.2 Local community detection

In local community detection, researchers assume that communities have independent patterns. They try to find community patterns in subsets of a graph without considering global information. They only need to calculate information for a part of a graph. Therefore, they are more applicable to real-world large networks such as the Internet.

In local community detection, most algorithms try to find a community that surrounds a node or a seed. Many algorithms in this category are extensions of global community detection algorithms. In local modularity, one defines a quality function for one community, and then, in an agglomerative procedure, adds nodes to the community [12, 34]. At each step of this class of algorithms a candidate node which has the highest quality (based on local modularity) is added to the community until the maximum size of the community has been reached. Mahoney et al. [39] proposed the local spectral clustering algorithm discussed before. Anderson and Lang [2] used random walks in order to find local communities. When random walks start with a small number of steps from an initial seed node, the random walks are more likely to become trapped in the same community rather than traveling to other communities. Another problem in local community detection in social networks is how to find different social circles of a node. A node may belong to more than one community, which induces overlapping communities. The majority of studies in this domain are categorised into two domains: Detecting communities based on network structure [1, 43] and based on nodes features [28]. There are very few works that consider both structure and attributes [40, 62]. In this research, we propose a method which is able to find overlapping and hierarchical communities of a user (or seed) by only considering the structure of network. The proposed local community detection method, DCD, belongs to seed-expansion methods. Seed-expansion methods are a class of algorithms that find local communities around a seed node [3, 13, 39] .

2.3 Tracking communities in dynamic networks

Based on the nature of changes (over time) or the nature of the algorithms, there are different categorisation for tracking communities.

Two approaches based on changes are snapshot and temporal smoothness. In the snapshot model, different snapshots of the network at different time steps are available and the task is to use find communities or clusters and track clusters over time in order to interpret their evolution over time [61]. In the temporal smoothness model, the goal is to derive the communities over time given a stream of atomic changes.

Two approaches based on the nature of tracking algorithms are evolutionary clustering and dynamic clustering. In evolutionary clustering, after each change, communities are discovered using a static clustering method whereas in dynamic clustering the goal is to update the clusters in the next time step using the information of the current time step.

2.3.1 Snapshot model

In the snapshot model, snapshots of the evolving network at different time steps are available. Due to emergence or disappearance of nodes or edges, there exist a collection of changes between snapshots. In the first snapshot at t = 0 there are n communities {C1,..., Cn}. In a more general case, n, the number of communities, can also change over time. This means communities can be born, die, or merged. Each node belongs to a certain community. The job of community maintenance algorithms is to update communities (membership of nodes to communities) in the next time steps (t = 1,2,...), and then to analyse the evolution of communities over time.

Since in the snapshot model at each time step the fate of very node should be decided, the algorithms in this domain should consider the entire network. As a result, the community tracking models in this framework are usually adaptations of global community detection methods.

2.3.2 Temporal smoothness

In temporal smoothness, the available information is the graph at t = 0, and a continuous stream of atomic changes. An atomic change can be node/edge addition deletion. Up to now, all studies in temporal smoothness fall into global community detection category. That is to say they assume at t = 0, the initial network is partitioned into n clusters {C1,..., Cn}. Each node belongs to a certain community. Then, with the stream of atomic changes, the cluster and membership for each node a cluster is updated.

2.3.3 Tracking global communities in dynamic networks

In snapshot model, using evolutionary methods, one takes different snapshots of network, finds communities in each snapshot with a static clustering model, and, then, interprets their change over time [61]. Basically, all methods in Section 2.1 can be used in such settings.

In temporal-smoothness, the goal is to derive the communities over time given a stream of changes. A change can be the addition or removal of a node or edge. Falkowski et al. [19] use Girvan-Newman modularity-based community detection for both finding and tracking communities. Tong et al. [57] suggested low rank approximation for detecting dynamic networks; however, their research lacks evaluation. Xu et al. [60] used a hidden Markov model to address dynamic networks. In Verted-centred methods [6, 29], which have similar concept as K-means clustering algorithm, evolving leaders and, therefore, the communities around leaders are found in each time step. Leskovec et al. [36] used the Clique Percolation Method (CPM) to identify communities at each time step, and then match them with community evolution methods. MONIC, a framework for modeling and monitoring clusters transitions over time, was suggested by Spiliopoulou [53]. Graphscope [54] is a parameter-free algorithm which mines time evolving graphs in order to find communities, and their change over time. Nguyen et al. [42] developed a framework for identifying and tracking overlapping communities by defining a global objective function which is summation of a set of local communities. Samie et al. [48] developed a two-phased model that is comprised from a global and local method. In the first phase, they find global communities and, in the second phase, the find and track local communities in the detected clusters using the global approach. Shang et al. [51] proposed a learning based approach for tracking global communities in dynamic networks. They train and use a classifier in order to find and inspect the vertices that are more likely to change their community after the network is changed.

2.4 Tracking local communities in dynamic networks

Based on the categorisation of previous section, local communities can be investigated in four different ways.

  1. 1.

    Snapshot model: There are several changes in each time step.

    • Evolutionary clustering: Any local community detection method from Section 2.2 can simply be used at each time step.

    • Dynamic clustering: In this category, a method must be able to maintain and update the local community given several changes, and the current structure of the community. With the assumption that the changes between two snapshots are not considerable, only few studies [55, 56] have tackled this problem. Takaffoli et al. [56] use static L-metric method [11] to find local communities in each snapshot from using the structure of the local communities in the previous snapshot.

  2. 2.

    Temporal smoothness: Stream of atomic changes.

    • Evolutionary clustering: Local methods can be used after each atomic change. For example, Hu et a.l [26] proposed a personalised page rank algorithm that finds the local communities in the area of changes after each change atomic change.. This class of algorithms might not be computationally efficient because after every single atomic change, local community detection method should run.

    • Dynamic clustering: The goal in this category is to update the structure of a local community after every atomic change without recalculations. According to best of our knowledge, there is no other methods under this setting. DCD belongs to this category.

2.5 Derivatives in graphs

Friedmann and Tillich [23] developed a connection between graph theory and calculus. The authors defined continuous concepts like continuously differentiable functions, boundary, and gradient over the graph space to develop a wave equation in order to understand the changes in the connectivity of graph nodes. In this domain, Diao et al. [17], explored a bounded symmetric function defined over the edges of a finite labeled graph called graphon space. They proposed a general theory of differentiation over this space. As this space is not a vector space, the authors refined Gateaux derivative to make it appropriate for graphon space. Both of these studies proposed partial differential equations (PDEs) over a continuous topology given on a graph. In an attempt to avoid complex differential theory and to take advantage of finite dimensional linear algebra, an alternative approach is to formulate derivatives on the original discrete graph space.

However, according to Solomon [52], the derivatives cannot be derived as easily as in the continuous domain; Solomon modeled the continuous Laplacian in the discrete domain of graphs by finite difference approximation. Inspired by discretisation of continuous PDEs, the Laplacian was computed for each two nodes in the graph and used to solve Laplace and Poisson equations in the graphs domain. Solomon approach has its own limitations which makes it impractical and difficult to use. Solomon [52] mentions: “Differential graph analysis, on the other hand, requires the development of a new framework to understand diffusion, oscillation, and other phenomena as they might occur on graphs, which may not be physically realizable.”.

In this research, it is shown that extracting differential geometry properties using the proposed framework is fairly easy – with some intuitive imagination. Moreover, when it comes to finding higher order derivatives (second or higher) Solomon’s framework is computationally unfeasible since it needs to solve a challenging combinotorial problem whereas the time complexity of the proposed framework is polynomial. The proposed framework finds the derivatives by solving systems of linear equations which is considerably faster than Solomon’s approach [52]. The proposed approach also does not deal with the mathematical difficulty and limitations of Friedmann and Tillich [23] and Diao et al. [17] approaches.

3 Geometric active contours

In the field of image processing, the problem of object detection has been addressed in many different ways. Active Contours is a method devised first in 1988 [7]. A related approach, based on differential geometry, was devised in 1997. Due to its efficiency, autonomy, and unsupervised nature geometric active contours [7, 20, 21] are used extensively for detecting objects in 2D images in the field of machine vision. In this method, an initial contour deforms and evolves in order to find the boundary of an object. In an image, shapes distinguish themselves from the background by boundaries characterised by pixels whose properties are very different from those of the adjacent pixels which form part of the background.

Initially, a curve is created at a random location of the image with the goal of finding the boundary of an object. The curve evolves based on two concepts, curvature and gradient. The curvature of a function f(x), defined in (1), describes how fast the curve changes its tangent or direction.

$$ \begin{array}{@{}rcl@{}} \kappa = \frac{f^{\prime\prime}(x)}{(1+f^{\prime2}(x))^{\frac{3}{2}}} \end{array} $$
(1)

\(f^{\prime }(x)\) and \(f^{\prime \prime }(x)\) are the first and second derivatives. The vector differential operator ∇ has the following definition:

$$ \begin{array}{@{}rcl@{}} \nabla = \frac{\partial}{\partial x} i + \frac{\partial}{\partial y} j+ \frac{\partial}{\partial z} \end{array} $$
(2)

Assuming three-dimensional Euclidian space, the gradient of f(x, y, z) is obtained by applying the vector operator ∇ to the scalar function f(x, y, z) as defined in (3).

$$ \begin{array}{@{}rcl@{}} \nabla f(x,y,z) = \frac{\partial f}{\partial x} i + \frac{\partial f}{\partial y} j+ \frac{\partial f}{\partial z} k \end{array} $$
(3)

In geometric active contours, the curve evolves in the direction that is perpendicular to the curve. The curve is considered the current boundary, and an adjacent pixel on the movement direction of the boundary is evaluated for inclusion based on the magnitude of the gradient between the pixel on the boundary and the neighbouring pixel in that direction. In images, the gradient is obtained by subtracting the grey values of neighbouring pixels. A second criterion for the inclusion of a neighbouring pixel is the curvature of the current boundary. A straight line has a curvature of zero. A curve that ’recedes’ inward towards the shape has a high curvature. Intuitively, an object is likely to strive to include ’inserts’ into its area. Hence an increased curvature favours the inclusion of pixels on the outside of the boundary. In combination, gradient and curvature result in the velocity s of the curve, expressed in (4). The velocity decides the likelihood of a pixel being included in the shape.

$$ \begin{array}{@{}rcl@{}} s = g \kappa \boldsymbol{n}- (\nabla g \boldsymbol{n}) \boldsymbol{n}, where\ g=\frac{1}{1+|\nabla I|^{2}} \end{array} $$
(4)

where n denotes the normal direction and I the pixel values in an image with |∇I| as the magnitude of the gradient between two pixels [7].

4 Differential geometry in graph space

Image processing is a data-intensive process which benefits from localised methods like active contours. Graphs as encountered in social networks are similarly demanding because of the potential sizes of graphs and their high dimensionality. The analogy between the discovery of shapes in images and the detection of communities in graphs suggests that an application of the active contours method to graph spaces might provide an efficient alternative to existing clustering techniques.

Mapping the relevant concepts from Euclidean to graph space poses a few challenges. While in image processing, the goal is to identify shapes with an external boundary, communities in graphs are defined as sets of nodes that share more properties with other nodes within the same community than they do with nodes outside the community. Unlike images, where the number of dimensions is uniform across the pixels, each node in a graph can have different numbers of neighbours, giving rise to high fluctuations in dimensionality. An image has a clearly defined boundary, whereas it is hard even to define the boundary of an entire graph. As a consequence of the non-uniform dimensions of a graph, most matrix operations used in machine vision cannot be applied to graphs.

In order to be able to identify communities using geometric features, it is necessary to define shapes and their boundaries in a graph. We define a shape χ(ν, ξ) in a graph G(V, E) as a connected subgraph of a set of nodes ν that are connected by the edges ξ; (\(\nu \subseteq V, \xi \subseteq E \)). The boundary of a shape in a graph is defined as the nodes inside the shape that have common edges with nodes outside the shape, formally a node vi is on the boundary of shape χ if ∃eijE|viνvjVvjν

These connected nodes in the graph space create an object with its boundaries. The following figure demonstrates two shapes in the two different graphs.

In Figure 1a, the nodes in red colour compose a shape which consists of only two nodes. Figure 1b shows a shape that is comprised of 4 nodes. Nodes v2 and v4 in Figure 1b are considered the external boundary of the shape. Having defined the concept of shapes and boundaries in graph space, we have to answer the question how a boundary can move to align with desirable communities in graphs. To evolve boundaries, a method requires criteria which decide whether a node in the neighbourhood of the boundary is to be included in the shape. This can be achieved by applying the concept of velocity (4) to graph spaces. Velocity is calculated using the components of gradient and curvature, both of which require the calculations of derivatives.

Figure 1
figure 1

Examples of shapes and their boundaries in graphs

4.1 Derivatives in graphs

The notion of velocity in image processing is based on derivatives which produce gradient and curvature on the boundary of a shape. In a graph, we define the derivative as the rate of change of a function F(vi) for a given node vi. Function F(vi, vp) can be defined as the similarity, match or the difference between node vi and a constant node vp. For example, Fdegree(vi, t) can represent the degree of the node vi at time t in a dynamic network, or F(vi, vp, t) can represent the similarity of the node vi to the node vp at the time t.

Since derivatives do not exist in discrete space, the Taylor series (5) expansion can be used to approximate derivatives.

$$ \begin{array}{@{}rcl@{}} f(x+{\Delta} x)=f(x)+({\Delta} x)f^{\prime}(x)+...+\frac{({\Delta} x)^{i}}{i!}f^{i}(x)+... \end{array} $$
(5)

In the reminder of this section we will show how the derivatives in graphs can be approximated from the Taylor series. Calculating the curvature of a function requires the first and second derivatives.

According to the Taylor series, the numerical approximation of the first-order derivatives for a function f(x) is [15]:

$$ \begin{array}{@{}rcl@{}} \frac{\partial f}{\partial x} = \frac{f(x+{\Delta} x)-f(x)}{\Delta x}+O({\Delta} x) \end{array} $$
(6)

Similarly, the first-order backward derivative is:

$$ \begin{array}{@{}rcl@{}} \frac{\partial f}{\partial x} = \frac{f(x)-f(x-{\Delta} x)}{\Delta x} \end{array} $$
(7)

The second-order first derivative is:

$$ \begin{array}{@{}rcl@{}} \frac{\partial f}{\partial x} = \frac{f(x+{\Delta} x)-2f(x)+f(x-{\Delta} x)}{2{\Delta} x} +O({\Delta} x)^{2} \end{array} $$
(8)

The terms Ox) and Ox)2 in (6) and (8) are called truncation error and represent the remaining parts on the right side of the (5) which are neglected if one wishes to approximate derivatives.

The approximations in (6) and (8) can be extended to graphs. As and example, we will determine the derivative of F(vi) in Figure 2.

Figure 2
figure 2

Example graph with three nodes

First-order derivative:

$$ \begin{array}{@{}rcl@{}} F^{\prime}_{v}(v_{i}) \approx \frac{F(v_{i+1})-F(v_{i})}{v_{i+1}-v_{i}} \end{array} $$
(9)

vi+ 1vi shows the distance, or dissimilarity between these two nodes (e.g. difference of two friends in a social network). Any derivative based operation needs a distance function. Since this paper only focuses on the structures of networks for finding communities, structural distance or similarity measures in networks are considered. Nodes are structurally equivalent if they are in the same area of the graph and have the same neighbours. Three main structural similarity measures are simple structural equivalence, Cosine similarity measure, and Pearson product-moment correlation coefficient. In this research, normal structural distance (45) is used. Analysing different distance measures and their effect on detecting communities are left for future works.

Second-order first derivative:

$$ \begin{array}{@{}rcl@{}} F^{\prime}_{v}(v_{i}) \approx \frac{F(v_{i+1})- 2F(v_{i})+ F(v_{i-1}) } {d_{1}+d_{2}} \end{array} $$
(10)

where d1 = vivi− 1 and d2 = vi+ 1vi. di, in general, shows the difference between the nodes in the graph. Applying the model to weighted graphs is straightforward: If the weight between vi and vi− 1 is w, then \(d_{i}(w)= \frac {d_{i}}{w}\). .

The second derivative according to the Taylor series:

$$ \begin{array}{@{}rcl@{}} f(x+{\Delta} x)=f(x)+{\Delta} xf^{\prime}(x)+\frac{({\Delta} x)^{2}}{2!}f^{\prime\prime}(x)+ O({\Delta} x)^{3} \end{array} $$
(11)

In the rest of this section, after analysing two examples, a general model for finding the derivatives for a given node is proposed.

Example 1

Approximating derivatives for a node with two neighbours (Figure 3):

$$ \begin{array}{@{}rcl@{}} F(v_{c}+d_{1}) = F(v_{1})= F(v_{c})+d_{1} F^{\prime}_{v}(v_{c})+ \frac{{d^{2}_{1}}}{2}F^{\prime\prime}_{v}(v_{c}) \end{array} $$
(12)
$$ \begin{array}{@{}rcl@{}} F(v_{c}+d_{2}) = F(v_{2})= F(v_{c})+d_{2} F^{\prime}_{v}(v_{c})+ \frac{{d^{2}_{2}}}{2}F^{\prime\prime}_{v}(v_{c}) \end{array} $$
(13)
Figure 3
figure 3

Node with two neighbours

This can be shown and solved as a linear system with two equations and two unknowns:

$$ \begin{array}{@{}rcl@{}} \left[\begin{array}{llll} d_{1} & \frac{{d^{2}_{1}}}{2} \\ \\ d_{2} & \frac{{d^{2}_{2}}}{2} \end{array}\right] \left[\begin{array}{llll} F^{\prime}(C)\\ \\ F^{\prime\prime}(C) \end{array}\right] = \left[\begin{array}{llll} F(V_{1})-F(C)\\ \\ F(V_{2})-F(C) \end{array}\right] \end{array} $$
(14)

Example 2

Approximating the derivatives of F(vc) in Figure 4a:

$$ \begin{array}{@{}rcl@{}} F(v_{c}+d_{1}) = F(v_{1})= F(v_{c})+d_{1} F^{\prime}_{v}(v_{c})+ \frac{{d^{2}_{1}}}{2}F^{\prime\prime}_{v}(v_{c}) \end{array} $$
(15)
$$ \begin{array}{@{}rcl@{}} F(v_{c}+d_{2}) = F(v_{2})= F(v_{c})+d_{2} F^{\prime}_{v}(v_{c})+ \frac{{d^{2}_{2}}}{2}F^{\prime\prime}_{v}(v_{c}) \end{array} $$
(16)
$$ \begin{array}{@{}rcl@{}} F(v_{c}+d_{3}) = F(v_{3})= F(v_{c})+d_{3} F^{\prime}_{v}(v_{c})+ \frac{{d^{2}_{3}}}{2}F^{\prime\prime}_{v}(v_{c}) \end{array} $$
(17)
Figure 4
figure 4

a.vc a node with three neighbours; b.vc a non-central node

Accordingly, the system of equations is:

$$ \begin{array}{@{}rcl@{}} \left[\begin{array}{llll} d_{1} & \frac{{d^{2}_{1}}}{2} \\ d_{2} & \frac{{d^{2}_{2}}}{2} \\ d_{3} & \frac{{d^{3}_{2}}}{2} \end{array}\right] \left[\begin{array}{llll} F^{\prime}(v_{c})\\ F^{\prime\prime}(v_{c}) \end{array}\right] = \left[\begin{array}{llll} F(v_{1})-F(v_{c})\\ F(v_{2})-F(v_{c})\\ F(v_{3})-F(v_{c}) \end{array}\right] \end{array} $$
(18)

This is an overdetermined system with three equations and two unknowns. Overdetermined systems are usually inconsistent with no unique solution. In this case, one way of solving the problem of overdetermination is to convert an overdetermined system to a determined one by adding more unknowns in the form of higher derivatives, of course at the cost of additional complexity. Higher order derivatives are impractical in the case of nodes that have a larger number of neighbours. Other alternatives are the least squares approximation methods discussed later.

Here, the unknowns are higher derivatives. Although higher orders of derivatives are not necessary in the curve evolution of the gedosic active contour technique, at the price of higher computations, we will achieve higher accuracy for the first and second derivatives by considering them.

$$ \begin{array}{@{}rcl@{}} F(C+d_{1}) = F(V_{1})= F(C)+d_{1} F^{\prime}_{V}(C)+ \frac{{d^{2}_{1}}}{2}F^{\prime\prime}_{V}(C) + \frac{{d^{3}_{1}}}{3!}F^{\prime\prime\prime}_{V}(C)\end{array} $$
(19)
$$ \begin{array}{@{}rcl@{}} F(C+d_{2}) = F(V_{2})= F(C)+d_{2} F^{\prime}_{V}(C)+ \frac{{d^{2}_{2}}}{2}F^{\prime\prime}_{V}(C) + \frac{{d^{3}_{2}}}{3!}F^{\prime\prime\prime}_{V}(C) \end{array} $$
(20)
$$ \begin{array}{@{}rcl@{}} F(C+d_{3}) = F(V_{3})= F(C)+d_{3} F^{\prime}_{V}(C)+ \frac{{d^{2}_{3}}}{2}F^{\prime\prime}_{V}(C)+ \frac{{d^{3}_{3}}}{3!}F^{\prime\prime\prime}_{V}(C) \end{array} $$
(21)
$$ \begin{array}{@{}rcl@{}} \left[\begin{array}{llll} d_{1} & \frac{{d^{2}_{1}}}{2} & \frac{{d^{3}_{1}}}{3!} \\ \\ d_{2} & \frac{{d^{2}_{2}}}{2} & \frac{{d^{3}_{2}}}{3!} \\ \\ d_{3} & \frac{{d^{3}_{2}}}{2} & \frac{{d^{3}_{3}}}{3!} \end{array}\right] \left[\begin{array}{llll} F^{\prime}(C)\\ \\ F^{\prime\prime}(C)\\ \\ F^{\prime\prime\prime}(C) \end{array}\right] = \left[\begin{array}{llll} F(V_{1})-F(C)\\ \\ F(V_{2})-F(C)\\ \\ F(V_{3})-F(C) \end{array}\right] \end{array} $$
(22)

Example 3

Approximating the derivatives of F(vc) for the non-central node vc in Figure 4b. After extracting the equations based on the Taylor series, similar to example 1, the system of the equations is (Fig. 5):

$$ \begin{array}{@{}rcl@{}} F(v_{c} +m) = F(v_{1})= F(v_{c})+m F^{\prime}_{v}(v_{c})+ \frac{m^{2}}{2}F^{\prime\prime}_{v}(v_{c}) + \frac{m^{3}}{3!}F^{\prime\prime\prime}_{v}(v_{c}) \end{array} $$
(23)
Figure 5
figure 5

Non-central nodes

$$ \begin{array}{@{}rcl@{}} F(v_{c}+m+d_{1}) = F(v_{2})= F(v_{c})+(m+d_{1}) F^{\prime}_{v}(v_{c}) + \frac{(m+d_{1})^{2}}{2}F^{\prime\prime}_{v}(v_{c}) + \frac{(m+d_{1})^{3}}{3!}F^{\prime\prime\prime}_{v}(v_{c}) \end{array} $$
(24)
$$ \begin{array}{@{}rcl@{}} F(v_{c}+m+d_{2}) = F(v_{3})= F(v_{c})+(m+d_{2}) F^{\prime}_{v}(v_{c})+ \frac{(m+d_{2})^{2}}{2}F^{\prime\prime}_{v}(v_{c}) + \frac{(m+d_{2})^{3}}{3!}F^{\prime\prime\prime}_{v}(v_{c}) \end{array} $$
(25)
$$ \begin{array}{@{}rcl@{}} \left[\begin{array}{llll} m & \frac{m^{2}}{2} & \frac{m^{3}}{3!} \\ (m+d_{1}) & \frac{(m+d_{1})^{2}}{2} & \frac{(m+d_{1})^{3}}{3!} \\ (m+d_{2}) & \frac{(m+d_{2})^{2}}{2} & \frac{(m+d_{2})^{3}}{3!} \end{array}\right] \left[\begin{array}{llll} F^{\prime}(v_{c})\\ F^{\prime\prime}(v_{c})\\ F^{\prime\prime\prime}(v_{c}) \end{array}\right] = \left[\begin{array}{llll} F(v_{1})-F(v_{c})\\ F(v_{2})-F(v_{c})\\ F(v_{3})-F(v_{c}) \end{array}\right] \end{array} $$

begincenter .

$$ \begin{array}{@{}rcl@{}} \left[\begin{array}{llll} m & \frac{m^{2}}{2} & \frac{m^{3}}{3!} \\ (m+d_{1}) & \frac{(m+d_{1})^{2}}{2} & \frac{(m+d_{1})^{3}}{3!} \\ (m+d_{2}) & \frac{(m+d_{2})^{2}}{2} & \frac{(m+d_{2})^{3}}{3!} \end{array}\right] \left[\begin{array}{llll} F^{\prime}(v_{c})\\ F^{\prime\prime}(v_{c})\\ F^{\prime\prime\prime}(v_{c}) \end{array}\right] = \left[\begin{array}{llll} F(v_{1})-F(v_{c})\\ F(v_{2})-F(v_{c})\\ F(v_{3})-F(v_{c}) \end{array}\right] \end{array} $$

4.1.1 A general framework for approximating derivatives of a function in the graphs

The approximation of derivatives of nodes depends on the position of the node. When a node has a degree of one, we consider it a non-central node. Nodes with degrees larger than one are central nodes. For the purpose of approximating derivatives, if a node has only one neighbour, it is only possible to calculate the first derivative. Unlike with central nodes, in the case of non-central nodes it is necessary to consider neighbours of neighbours to obtain higher-order derivatives.

Central nodes (Figure 6a)

A central node vc has n neighbours. The Taylor series equation for the i-th (1 ≤ in) neighbouring node of vc can be written as:

$$ \begin{array}{@{}rcl@{}} F(v_{c}+d_{i}) = F(v_{i})= F(v_{c})+d_{i} {F^{1}_{v}}(v_{c})+... + \frac{{d^{n}_{i}}}{n!}{F^{n}_{v}}(v_{c}) \end{array} $$
(26)

The equations system is:

$$ \begin{array}{@{}rcl@{}} \left[\begin{array}{llll} d_{1} & ... & \frac{{d^{n}_{1}}}{n!} \\ &...& \\ d_{n} & ... & \frac{{d^{n}_{n}}}{n!} \end{array}\right] \left[\begin{array}{llll} F^{1}(v_{c})\\ ...\\ F^{n}(v_{c}) \end{array}\right] = \left[\begin{array}{llll} F(v_{1})-F(v_{c})\\ ...\\ F(v_{n})-F(v_{c}) \end{array}\right] \end{array} $$
(27)

It is a system of linear equations with n equations and n unknowns.

Non-central nodes

Figure 6b shows one of the peripheral nodes as the node vc for which we approximate the derivatives. The neighbour of vc is always a central node unless it is part of a two-node component of the graph, in which case it is only possible to calculate the first derivative.

Figure 6
figure 6

a. Node vc with n neighbours; b. Node vc with one neighbour

In this case, vn has several neighbours, therefore the derivatives of F(vc) can be approximated by solving the following system of equations:

$$ \begin{array}{@{}rcl@{}} \begin{array}{llll} F(v_{c}+m) = F(v_{n})= F(v_{c})+m F^{\prime}_{v}(v_{c})+ ... + \frac{m^{n}}{n!}{F^{n}_{v}}(v_{c}) \end{array} \end{array} $$
(28)

For all other nodes Vi where 1 ≤ in − 1 we have the following equation:

$$ \begin{array}{@{}rcl@{}} \begin{array}{llll} F(v_{c}+m+d_{i})=F(v_{i})=\\ F(v_{c})+(m+d_{i})F^{\prime}_{v}(v_{c})+...+\frac{(m+d_{i})^{n}}{n!} {F^{n}_{v}}(v_{c}) \end{array} \end{array} $$
(29)
$$ \begin{array}{@{}rcl@{}} \left[\begin{array}{llll} m & ... & \frac{m^{n}}{n!} \\ (m+d_{1}) & ... & \frac{(m+d_{1})^{n}}{n!} \\ &...& \\ (m+d_{n-1}) & ... & \frac{(m+d_{n-1})^{n}}{n} \end{array}\right] \left[\begin{array}{llll} F^{\prime}(v_{c})\\ F^{\prime\prime}(v_{c})\\ ... \\ F^{n}(v_{c}) \end{array}\right] = \left[\begin{array}{llll} F(v_{n})-F(v_{c})\\ F(v_{1})-F(v_{c})\\ ... \\ F(v_{n-1})-F(v_{c}) \end{array}\right] \end{array} $$

4.1.2 Derivatives in inconsistent and overdetermined systems

In general terms of linear algebra, a system of equations is expressed as Af = b. A system is overdetermined if the constraints (number of rows n in A) exceed the number of unknowns (number of rows r in f ). In general, if in (30) r < n, the system is overdetermined. In the case of community detection, (1) that defines curvature requires only the first and second derivatives; thus, r = 2.

$$ \begin{array}{@{}rcl@{}} \left[\begin{array}{llll} d_{1} & ... & \frac{{d^{r}_{1}}}{r!} \\ &...& \\ d_{n} & ... & \frac{{d^{r}_{n}}}{r!} \end{array}\right] \left[\begin{array}{llll} F^{1}(v_{c})\\ ...\\ F^{r}(v_{c}) \end{array}\right] = \left[\begin{array}{llll} F(v_{1})-F(v_{c})\\ ...\\ F(v_{n})-F(v_{c}) \end{array}\right] \end{array} $$
(30)

For this case, the least square approximation can be used. Reduced QR factorization [24] and singular value decomposition (SVD) [35] are well-known methods for approximating the least square solutions; in this work we use QR factorization.

4.1.3 Gradient of a node

The vector differential operator ∇ (‘del’ or ‘nabla’) has the following definition:

$$ \begin{array}{@{}rcl@{}} \nabla = \frac{\partial}{\partial x} i + \frac{\partial}{\partial y} j+ \frac{\partial}{\partial z} k \end{array} $$
(31)

In the Cartesian space the gradient of f(x, y, z) is obtained by applying the vector operator ∇ to the scalar function f and is defined as:

$$ \begin{array}{@{}rcl@{}} \nabla f(x,y,z) = \frac{\partial f}{\partial x} i + \frac{\partial f}{\partial y} j+ \frac{\partial f}{\partial z} k \end{array} $$
(32)

The concept of gradient as used in the context of active contours (3) can be extended to the graph space. In a graph, the dimension of the gradient for a given node is the number of its neighbours. Thus, the gradient of F(v) in the Figure 6 can be obtained using (33).

$$ \begin{array}{@{}rcl@{}} \nabla F(v) = \frac{\partial F}{\partial v_{1}} i_{1} + ...+ \frac{\partial F}{\partial v_{j}} i_{j} + ... + \frac{\partial F}{\partial v_{n}} i_{n} \end{array} $$
(33)

Where ij represents the unit vector which is directed to the node vj. \(\frac {\partial F}{\partial v_{j}}\) is calculated from the numerical approximation of the derivative as shown in (34).

$$ \begin{array}{@{}rcl@{}} \frac{\partial F}{\partial v_{j}} = \frac{F(v)-F(v_{j})}{d_{j}} \end{array} $$
(34)

Therefore, the magnitude of the gradient |∇F(v)| is defined by (35).

$$ \begin{array}{@{}rcl@{}} | \nabla F(v)| = \sqrt{ (\frac{\partial F}{\partial v_{1}})^{2} + ... + (\frac{\partial F}{\partial v_{n}})^{2}} \end{array} $$
(35)

4.1.4 Analysing the error for derivative approximations

To analyse the error of derivatives, we generalise two cases of central and non-central nodes into one equation:

$$ \begin{array}{@{}rcl@{}} \left[\begin{array}{llll} L_{1} & ... & \frac{{L^{n}_{1}}}{n!} \\ ... \\ L_{n} & ... & \frac{{L^{n}_{n}}}{n!} \end{array}\right] \left[\begin{array}{llll} F^{1}(v_{c})\\ ...\\ F^{n}(v_{c}) \end{array}\right] = \left[\begin{array}{llll} F(v_{1})-F(v_{c})\\ ...\\ F(v_{n})-F(v_{c}) \end{array}\right] \end{array} $$
(36)

In the case of a central node Li = di, and Li = (m + di) for non-central cases. Every equation in (36), is associated with a Taylor truncation error. This error is committed by replacing F(Vi) by the first n terms of Taylor expansion, and is modeled by the remainder δFi:

$$ \begin{array}{@{}rcl@{}} F(v_{i})-F(v_{c})=L_{i} {F^{1}_{V}}(v_{c})+ ... + \frac{{L^{n}_{i}}}{n!}{F^{n}_{V}}(v_{c})+\delta F_{i} \end{array} $$
(37)

where, \(\delta F_{i}=\frac {L_{i}^{(n+1)}}{(n+1)!}F^{(n+1)}_{V}(C+t_{i})\), and ti is in the interval (0, Li). It shows the truncation error and propagates through the system once the derivatives are computed. To observe how it affects the derivatives, these errors are stacked into an error vector δF = [δF1, δF2,..., δFn]T, and (36) is simplified to L(f + δf) = F + δF, where f is the vector of derivatives, δF is the error vector, and δf is the vector of derivative errors. We can then write:

$$ \begin{array}{@{}rcl@{}} \delta f = \delta F \end{array} $$
(38)
$$ \begin{array}{@{}rcl@{}} \delta f = L^{(-1)}\delta F \end{array} $$
(39)

We should mention here that the system proposed in (36) is called well-conditioned when the answer does not change much for small perturbations in the input. The system is ill-conditioned if the matrix L is close to a singular matrix; at least one of the singular values is very close to zero [24]. In this case, L(− 1) magnifies the error and it is necessary to check the error of the derivatives; δFi can be simplified as:

$$ \begin{array}{@{}rcl@{}} \mid \delta F_{i}\mid \leq e_{i} = k_{i} L_{i}^{n+1}, where\ k_{i}=\sup_{y\in [C,C+L_{i}]} \mid\frac{F_{v}^{(n+1)}}{(n+1)!}\mid\end{array} $$
(40)

Then the error of each derivative follows the equation:

$$ \begin{array}{@{}rcl@{}} \mid \delta f_{i}\mid \leq L^{(-1)}e_{i} \end{array} $$
(41)

4.1.5 Derivative of basic operations

Suppose F and G are two functions defined with \(F^{\prime }(v_{c})\) and \(G^{\prime }(v_{c})\) as the first order derivative in node vc. Then:

$$ \begin{array}{@{}rcl@{}} \text{Addition rule: }(F \pm G)^{\prime}(v_{c})=F^{\prime}(v_{c}) \pm G^{\prime}(v_{c}) \end{array} $$
(42)
$$ \begin{array}{@{}rcl@{}} \text{Multiplication rule: }[F(v_{c}).G(v_{c})]^{\prime}=F^{\prime}(v_{c}).G(v_{c})+F(v_{c}).G^{\prime}(v_{c}) \end{array} $$
(43)
$$ \begin{array}{@{}rcl@{}} \text{Division rule: }[\frac{F}{G}(v_{c})]^{\prime}=\frac{F^{\prime}(v_{c}).G(v_{c})-F(v_{c}).G^{\prime}(v_{c})}{[G(v).G(v_{c})]^{2}} \end{array} $$
(44)

Proofs are provided in the Appendices ?? and ??.

5 Community Detection using differential geometry properties

A community can be seen as a shape in a graph whose nodes are highly connected while their connections to nodes outside the shape are sparse. Since the velocity and its components gradient and curvature are based on derivatives which use the connections between a node on the boundary and its neighbours inside the shape as a basis for deciding the inclusion of a candidate node outside the shape, the approach can be expected to detect good boundaries of shapes.

Before describing the algorithm we need to define a proper F function. F(vi, vj) represents the distance between vi and vj. Any suitable distance measure can be chosen for it. The criterion used for F(vi, vj) in this research is the structural equivalence. Nodes are structurally equivalent if they are in the same area of the graph and have the same neighbours. So F(vi, vj) is defined as:

$$ \begin{array}{@{}rcl@{}} F(v_{i},v_{j}) =1- \frac{|N(v_{i})\cap N(v_{j})|}{|N(v_{i})\cup N(v_{j})|} \end{array} $$
(45)

where N(vi) is the set of neighbours of node vi. and \(\frac {|N(v_{i})\cap N(v_{j})|}{|N(v_{i})\cup N(v_{j})|}\), or the structural similarity, shows the proportion of the common neighbours.

The algorithm starts from a single node which is assumed to be part of the shape. Initially, the seed node vi is considered the current boundary of the shape. A second node vj, which has the minimal distance F(vi, vj), is chosen for inclusion in the shape. As the calculation of the second derivative requires the presence of at least three nodes, a hypothetical node, with the maximum distance of one from the other two nodes, is added, assumed to be part of the shape. This procedure is represented by the line InitialiseCommunity in Algorithm 1.

The shape χ initially comprises these three nodes, from which it expands through the inclusion of nodes adjacent to the boundary. Nodes adjacent to the boundary on the outside of the shape are candidates considered for inclusion. Each node vi on the boundary which is connected to the candidate node vp considers its inclusion based on the velocity function (46).

$$ \begin{array}{@{}rcl@{}} s(v_{i},v_{p}) = \frac{\kappa(v_{i},v_{p})}{1+ \alpha |\nabla F(v_{i},v_{p})|^{2}} - arctan(|F^{\prime}(v_{i},v_{p})|) \end{array} $$
(46)

In (46), the curvature is represented by κ(vi, vp), which is defined in (47). The magnitude of the gradient |∇F(vi, vp)| describes the difference between vi and the candidate node vp as calculated by (35). The parameter α moderates the difference between nodes. The larger alpha, the stricter the criterion for the inclusion of a node. The term \(arctan(|F^{\prime }(v_{i},v_{p})|)\) has been added to map the value of \(|F^{\prime }(v_{i},v_{p})|\) to a value between zero and one with the purpose of achieving a negative impact to sudden changes in the derivative of the distance function in order to reduce noise.

$$ \begin{array}{@{}rcl@{}} \kappa(v_{i},v_{p}) = \frac{F^{\prime\prime}(v_{i},v_{p})}{\Big( 1+\big(F^{\prime}(v_{i},v_{p})\big)^{2} \Big)^ {\frac{3}{2}}} \end{array} $$
(47)

As shown in (47), curvature uses the first and second derivatives of the distance function from node vi on the boundary to the candidate vp. While the gradient bases the decision of an inclusion of node vp on the difference between vp and the boundary node vi, curvature represents the curve of the boundary at vp - essentially, ‘concave’ boundaries are more likely to include a node vp, because, loosely speaking, it could be seen as ‘enclosed’ by that boundary. In Figure 7, values of curvature and gradient for two simple graphs are shown. By putting the values of curvature and gradient in (46), the velocity from v5 towards vp is a negative value (− 0.06) in Figure 7a; thus, vp will not be included in the community. In Figure 7b, the velocity value towards vp is positive for all v3, v4, and v5; therefore, the first one which, according to Algorithm 1, has the chance to include vp, will include it and curvature and gradient for the rest of them will not be computed.

Figure 7
figure 7

In both a and b, shape χ consists of v1, v2, v3, v4, v5 and vp is the candidate for the inclusion. In a, values of curvature (47) and gradient (35) from v5 towards vp are shown. In b, values of curvature and gradient from v3, v4, and v5 towards vp are shown

Starting from a given seed node, the boundary of a shape moves until the velocity function s no longer warrants the inclusion of further nodes. Candidate nodes are evaluated from all nodes on the boundary they are connected to, but the evaluation stops as soon as one of the boundary nodes favours the inclusion of the node. This means that most of the time, the algorithm achieves a significantly better run time than required by its worst case complexity. Figure 9 illustrates an example of the proposed algorithm. The diagram in Figure 8 shows the process of DCD.

Figure 8
figure 8

Flowchart of DCD

In (46), the velocity function has only one parameter, α. To give the user control over size and quality of the desired communities α is added to the inclusion criteria. The larger α is, the stronger the effect of the gradient, and therefore the sharper the edge.

figure a
Figure 9
figure 9

The red nodes show the current community and the green nodes are candidates for inclusion. The number on the edges shows the velocity from a node to the candidate. When the velocity towards all neighbouring nodes is negative the algorithm stops

6 Tracking communities

To our knowledge, the proposed model is the only local community detection method which is able to track temporal smoothness changes in a network. In temporal smoothness, there is a stream of atomic changes. The community updates itself triggered by a change. Our model, currently, considers addition and removal of edges and addition of nodes. Dealing with node removal, due to its complex nature, is left for future work. To affect a community, a change must happen either inside or in the neighbourhood of the community. The steps taken to deal with different types of changes are:

  • Edge addition: When an edge is added there are three possibilities:

    • The new edge is inside the community. This strengthens the existing community. Therefore, no adjustment is necessary.

    • The new edge is outside the community. Following the same logic, no change is necessary.

    • The new edge connects a node inside the community to a node outside the community. The velocity to the node which is located outside the community has to be calculated from all of its neighbours inside the community. If the velocity is positive, the node is included.

  • Edge deletion:

    • An edge inside the community is removed. The nodes on both ends of the removed edge are checked by calculating the velocities from their neighbouring nodes.

    • An edge outside the community is removed. This does not affect the community and no adjustment is necessary.

    • An edge that connects a node inside the community to a node outside is removed. Removing the edge only makes the community stronger, therefore no adjustment is necessary.

  • Node addition: We have the following possibilities when a node is added to the graph:

    • One or more edges connect the node to the community. The velocity from each of the neighbouring nodes to the new candidate are calculated. In case the velocity is positive, the node is included and the algorithm stops.

    • The additional node is not connected to the community. The community is not affected and no adjustment is necessary.

7 Time Complexity

This research has two sections: finding a local community around a seed node and tracking a local community given a stream of atomic changes. The time complexity analysis of both section are analyses.

7.1 Time complexity of finding a local community

The main computation in DCD is to find the first and second derivatives for the nodes on the boundary. These derivatives only consider the inside neighbours which in average their number is less than the average of node’s degree. Because average degree considers outside neighbours, too. For example in Figure 9f, node 3 has three neighbours inside the community and two neighbours outside of the community.

If the average degree of the nodes on the boundary is d, and there are m nodes on the boundary then the calculation of velocity require O(md2) calculations where d2 is the time complexity for QR factorization to approximate the first and the second derivatives. Knowing the fact that every node inside the community was once on the boundary, if the size of community is n, where mn, then time complexity of DCD should be O(nd2). However, the real time complexity is more than this because of the re-calculation of the velocity function for some nodes. This happens when the current node vc that is located on the boundary includes an outside neighbour into the community (because velocity from vc to the outside neighbour was positive). This means all of the inside neighbours of vc which are on the boundary has to update their velocity values. The number of neighbours which are on the boundary is less than d. Therefore, time complexity of DCD will be O(nd3) where n is the size of community and d is the average degree of the nodes inside the community.

7.2 Time complexity of tracking a local community

To track the community after a single atomic change, according to Section 6, the velocity of a constant number of nodes needs to be calculated. Therefore, the time complexity of updating the community after an atomic change is O(d2) where d is the average degree of the nodes, and d2 the time to approximate derivatives. The efficient time complexity of DCD makes it ideal for tracking local communities in large scale dynamic networks.

8 Experimental evaluation

8.1 Finding local communities

This work proposes a framework for approximating derivatives in a graph space to be used for community detection. The goal of this evaluation is to demonstrate that the derivatives can be used to identify good communities as defined by a commonly used quality measure for communities in graphs. The aim is also to show that these communities are of comparable, sometimes better, quality as communities produced by two well-known community detection methods. We used Local Spectral Partitioning (LSP) [39], parameterised as recommended by Jeub et al. [27], and Local Geodesic Spreading (LGS) [3] as benchmarks for comparison. The graphs used in the experimentation are Facebook graph FB-JHK of John Hopkins University with 5180 nodes and 186572 edges, and FB-CALTC of California Institute of Technology with 769 nodes and 33312 edges, both captured in September 2005 and are part of the FACEBOOK100 dataset [59].

Quality of communities

The three most widely known measures for determining the quality of local communities are relative density, intra-cluster density and conductance. Relative density is the ratio between the number of intra-cluster edges and the number of edges that connect the community to the rest of the graph. Intra-cluster density [38, 44] is the fraction of the number of edges inside the community as defined as:

$$ \begin{array}{@{}rcl@{}} Internal\ density (C) = \frac{e_{C}}{\frac{n_{C}(n_{C}-1)}{2} } \end{array} $$
(48)

where eC is the number of edges in community C and nC is the number of nodes in C. \(\frac {n_{C}(n_{C}-1)}{2}\) shows maximum possible number of edges within community C. Higher internal density means better quality for the community.

Conductance is defined as:

$$ \begin{array}{@{}rcl@{}} Conductance(C)= \frac{vol (C,\bar{C})}{min \Big(vol(C, G),vol(\bar{C},G)\Big)} \end{array} $$
(49)

In (49), C is the set of nodes which comprise the community, \(\bar {C}=V-C\) denotes the nodes in the graph which are not in C, and \(vol(C_{1},C_{2})= {\sum }_{i\in C_{1}}{\sum }_{j\in C_{2}}A_{ij} \). A is the adjacency matrix. Conductance(C) has a lower value when the community is loosely connected to the rest of the graph. Therefore, the lower the conductance, the higher the quality of the community. Following the practice of a number of recent studies of significance [27, 37, 39], we use conductance as the main quality measure in this paper. We also compare the results of DCD with baselines using in

8.2 Tracking local communities

To evaluate DCD for tracking communities, the dynamic communities network generator by Gorke et al. [25] is used. The benefit of their clustered network generator is its capability to generate ground truth dynamic communities, given an atomic change stream. The stream of atomic changes is generated in a way that the community label of every newly added node is known. The ground truth data can be compared against our method’s result. Since our method is the first local community detection method of its kind which can track local communities given a stream of changes, we only compared it against ground truth data. In this experiment, several networks with 1000 nodes and 5 communities with different intra-cluster and inter-cluster edge probabilities are generated. Then, 200 nodes are added to the network through a stream of atomic changes. DCD tracks and maintains each of the communities. Since it is known a priori which cluster every newly added node belongs to, we report precision, recall, and F1 score for different scenarios in Section 9. Although convergence of the algorithm cannot be mathematically proven, in all experiments we have observed that as long as the initial node is selected inside the community, the algorithm terminates and separate the community.

9 Results

9.1 Finding local communities

Comparing the outcome of LSP and derivative-based community detection (DCD) has its challenges because both methods depend on a parameter which leads to different combinations of quality and size in the communities detected. The teleportation parameter of LSP defines the type of community being developed. In the experiments, we ran LSP with teleportation set to 20 equally spaced values as explained by Jeub [27]. The parameter α in DCD defines the stringency of the inclusion criterion, with larger values being more restrictive. Unlike LSP, DCD stops when according to (46), no further candidate nodes qualify for inclusion.

Figure 10
figure 10

Conductance plot for different methods and different starting nodes in FB-JHK

In Figure 10, we included the progress of the one among the 20 LSP instances that produced the community with the best conductance (regardless of the size of the community) for the JHK-FB network. LGS, which is based on PageRank, has no parameters except the seed node, hence there is no choice in the result to include. Because of the variation in the parameter α, we included two result graphs for DCD. Figure 10a-d represent trials starting from four different seed nodes and were chosen because they are representative of the different behaviours of the algorithms. Figure 10a shows a case where LSP outperforms all others except DCD with α = 2.5 when the community has a size of around 200 nodes. In Figure 10b the smaller communities found by LSP are of slightly better quality than those of DCD, but DCD with α = 1.5 discovers a community with around 330 nodes with better conductance. In Figure 10c, the performances of LSP and DCD are almost equivalent – in most cases, DCD with α = 2.5 produces slightly better quality than LSP, but all three algorithms produce similar results. In Figure 10d, DCD with α = 2.5 produces considerably better quality for smaller communities while DCD with α = 1.5 shows better conductance for larger communities. LGS is not a competitive algorithm in any of the cases examined.

The results shown in Figure 10 illustrate the difference in performance of DCD that the parameter α entails. This raises the question how to identify the best setting for α. Further investigations, illustrated in Figure 11, show that smaller values of α lead to the detection of larger communities, while larger values of α discover small-size communities. Because larger values of α restrict the inclusion of new nodes earlier, the algorithm stops at a smaller community size. Table 1 shows the average sizes of the communities detected with different values of α. The quality of the community found, large or small, depends on the initial seed. This property is common to DCD and most other methods, including LSP and LGS (Fig. 12).

Figure 11
figure 11

Effect of α on finding local communities around a seed node in FB-JHK

Figure 12
figure 12

Conductance of the different detected communities in FB-JHK where α = 1.5

Table 1 Effect of α on the size of detected communities in FB-JHK for 20 different initial random seeds

Figures 13 and 14 compare the average conductances achieved by the different algorithms for a community of a particular size starting from 20 different random seeds. The size is dictated by the number of nodes included by DCD with the value of α given. Since several restarts were used, the size is not exactly identical in each of the restarts, but for each restart, the community with an equivalent size produced by LSP and LGS were chosen to calculate the average conductivity. For LSP, the trials were repeated with each of the 25 parameters for teleportation and the average is reported. Figures 13 and 14 show the conductance of the detected communities for DCD, LSP, and LGS for the 20 random seeds in FB-JHK and FB-CALTC. As it can be seen, DCD has the best performance followed closely by LSP and then with some margin is the LGS. Figures 15 and 16 illustrate the average internal density of the detected communities. For lower values of α DCD outperforms both LSP and LGS. However, for the greater values of α, which according to Figure 11 results into smaller community sizes, LSP outperforms DCD and LGS remains the weakest among the three methods.

Figure 13
figure 13

Average conductances of the communities in FB-JHK

Figure 14
figure 14

Average conductances of the communities in FB-CALTC

Figure 15
figure 15

Average internal density of the communities in FB-JHK

Figure 16
figure 16

Average internal density of the communities in FB-CALTC

Figure 17
figure 17

Finding the local community of node 3554 in FB-JHK using a:LGS, b:LSP, and c:DCD

Figure 17 illustrates the detected communities around node 3554 using LGS, LSP, and DCD respectively. For the sake of visualisation, nodes with the degree less than 15 were ignored. The experiments show LSP, unlike DCD and LGS, tends to include a considerable number of nodes with the relatively low degree.

9.2 Tracking local communities

To test the performance of DCD for tracking local communities, seven different scenarios with ground truth dynamic communities were generated. Each network initially has 1000 nodes with an average degree of 30. Then, 200 nodes are successively added to the network. The seven experiments differ in their probabilities of inter-cluster and intra-cluster edges. Experiments are labeled in alphabetical order. Their parameterisations are shown in Table 2.

Table 2 Different parameterisation for intra-cluster (Pin) and inter-cluster (Pout) probabilities

Figure 18a to c show the structure of the networks in scenarios A, D, and G from Table tab:param respectively. In Figure 18a, the five ground-truth communities are very well defined without much overlaps. However, since the Pintracluster increases as we go through subsequent scenarios, the quality of the communities drops. In Figure 18c, the communities due to the high value of Pintracluster, the communities are intertwined.

Figure 18
figure 18

The structure of the networks for different scenarios in Table tab:param. a: Scenario A, b: Scenario D, c: Scenario G

The precision, recall, and F1 scores for each of the experiments are shown in Figure 19. As the probability of edges within clusters decreases and the probability of edges between clusters increases, tracking communities becomes more difficult and the accuracy decreases. For well-defined communities, DCD performs well.

Figure 19
figure 19

Precision, recall, and F1 score for each scenario

10 Conclusion and further work

This study has introduced a new local community detection algorithm based on derivatives. It has necessitated the definition of an approach for approximating derivatives in graphs. The method has been shown to perform on a par with or better than a well-known local community detection method, and both methods have similar computational complexity. DCD offers the advantage of stopping when all nodes that qualify have been included in the community. The α parameter offers a means of controlling the size of the desired community. Although DCD does not consistently outperform LPS, it has significant potential as a local community detection method. Given its inspiration from Active Contours, many of the improvements that have been applied to Active Contours over time can also be adapted to DCD, for example Chan and Vese’s work [8] which serves to reduce the sensitivity to noise, can be added in the form of considering the density of the edges inside the community. One direction of future research is to investigate the problem of tracking local communities using different dynamic network generators with ground truth communities. One significant advantage of DCD is its success in tracking dynamic communities in a computationally cheap fashion, because the algorithm only needs to recalculate the ’velocity’ function (46) for the nodes that have changed if they are located inside the community or in the vicinity of the boundary. This is a considerable contribution given the lack of local community tracking methods and the prevalence of very large graphs whose communities matter in a social and commercial context. The application of derivatives to graph space in itself has a multitude of applications, such as using features of differential geometry for visualisation or using heat diffusion in thermodynamics for modeling information diffusion in social networks.