1 Introduction

Recently, complex networks have been studied and applied in many areas of science. The Internet, biological networks, power grids, social networks, food webs, and communication networks are well-known examples of complex networks. One of the main issues in modelling such networks is extracting their hidden structures, called communities [18]. Detecting communities helps us make sense of complex networks [9].

Communities consist of objects, called nodes, and their relationships, called edges. They emerge as dense parts of a network that have only a few relationships to the rest of it. Indeed, communities lie hidden among a mass of nodes and edges in a sparse network, and this characteristic makes the community detection process more difficult. Moreover, we usually have no prior knowledge about the number of existing communities or their sizes (i.e., the number of nodes they include).

In this paper, we focus on one of the well-known approaches for detecting communities, modularity-based optimization, initiated by the Newman and Girvan (NG) method [44]. Their method maximizes a measure called the modularity function Q, which distinguishes the edges of a community from random connections, by means of a heuristic search algorithm. In order to improve the NG method, some researchers proposed optimizing the modularity function through a mathematical programming model [1, 60, 66]. However, although such models find the best solution for small networks, solving them may fail to complete in reasonable time for medium and large networks because of their NP-hardness. To tackle this issue, researchers have proposed various heuristic methods (e.g., [36, 57, 62]) and many metaheuristic algorithms (e.g., [3, 21, 23, 27, 33]).

The Q measure can work well if the respective network is truly modular [51] and communities do not significantly differ in terms of their sizes [31]. Unfortunately, no guarantee exists for network modularity and community uniformity; therefore, this measure may not provide reliable results [15, 52]. Consequently, the abovementioned models inherit the stated limitations and drawbacks of the Q function.

Li et al. [34] proposed another measure, called modularity density or the D measure, to cope with this problem. The D measure targets the average in-degree (the average number of edges coming into nodes), denoted \(d_{\mathrm{in}}\), and the average out-degree (the average number of edges going out of nodes), denoted \(d_{\mathrm{out}}\), of each potential community. It aims to find communities such that \(d_{\mathrm{in}}\) is maximized and \(d_{\mathrm{out}}\) is minimized. They also developed a mixed integer non-linear programming (MINLP) model for detecting communities based on the D measure. However, the number of communities must be known in advance as input to their mathematical model, a limitation that is problematic for medium and large networks.

In this paper, first, a new mathematical model is proposed to tackle these problems, so that automatically discovering the correct number of communities becomes possible. Then, as the core contribution of this paper, a new metaheuristic is developed based on the D measure. Since the quality of detected communities depends strongly on the initial grouping and on the method of exploring the search space, a metaheuristic that provides a highly robust and parallel search is needed for successful detection. The desired algorithm should also employ a dynamic local search for a discrete optimization problem because, in community detection, a small change in the grouping of nodes might have a big effect on the quality of the grouping. These characteristics, i.e., a highly robust and parallel search with a dynamic local search, can be found in artificial immune systems [55]. Thus, we select the artificial immune system algorithm family and develop a hybrid artificial immune network (HAIN), not only for detecting communities but also for simultaneously finding the correct number of them.

The rest of the paper is organized as follows. The next section presents related studies. Section 3 describes our new mathematical model. We illustrate components of the proposed metaheuristic in Sect. 4. Section 5 shows and analyses the experimental results of employing the proposed HAIN on artificial networks and well-known real ones. Finally, Sect. 6 provides some concluding remarks.

2 Related studies

Community structure is one of the most important features of networks, and community detection has attracted attention from different disciplines such as sociology, biology, and computer science. Despite extensive studies in different fields over recent years, the community detection problem has not yet been solved satisfactorily. Community detection studies can be classified into seven major groups, as follows [16]:

Traditional methods: The traditional approaches for finding the natural grouping of given objects can be classified as graph partitioning, hierarchical clustering, partitional clustering, and spectral clustering methods. Graph partitioning methods try to divide nodes into a predefined number of clusters; algorithms that work well include those of Kernighan and Lin [29], the Fiedler vector [13], Goldberg and Tarjan [19], and Flake et al. [14]. Hierarchical clustering methods aim to find a hierarchy of clusters (groups of objects). In this group of methods, two strategies can be taken for clustering nodes: (1) agglomerative, grouping nodes with high similarity into the same clusters, and (2) divisive, separating nodes with low similarity into different clusters. Here the similarity measure is very important, because nodes are grouped based on it [10]. Partitional clustering methods are another popular category; in these methods, nodes are clustered based on distance as a measure of dissimilarity between nodes, and one of the most famous methods in this category is that of [63]. The last group in this class is spectral clustering, which consists of all methods that partition nodes into communities by using the eigenvectors of matrices [11].

Divisive algorithms: The idea of this group of methods is to detect and remove the links that connect nodes of different communities, in order to disjoin the communities from each other. Algorithms in this category originate from the research of Girvan and Newman [18] and are based on the concept of edge betweenness: the betweenness of an edge is defined as the number of shortest paths between all node pairs in the network that pass through that edge [17]. These divisive algorithms cannot detect overlapping communities; to cope with this issue, Nepusz et al. [43] proposed a fuzzy community detection method for networks.

Modularity based methods: The word 'modularity' refers to the degree of separation of a network's components. Newman and Girvan proposed the modularity measure Q [44] to ascertain the quality of network clusters; under the modularity assumptions, a higher value of modularity indicates a higher quality of clustering. Thus, they used a simple greedy search method to find communities that maximize the Q function. Unlike this approach, some studies have used mathematical modelling to focus on optimality. For example, Xu et al. [60] put forward a mixed integer quadratic programming (MIQP) method, Agrawal and Kempe [1] proposed a rounding linear programming (RLP) method, and Zhang and Wang [64] developed a mixed integer non-linear programming method to maximize the modularity function. Another group of researchers proposed metaheuristic algorithms in order to find a trade-off between optimality and speed, such as simulated annealing [23, 40, 41], genetic algorithms [33, 37, 53, 56], memetic algorithms [20, 21], immune systems [22], ant colony optimization [27], and the firefly algorithm [3].

Dynamic algorithms: Dynamic processes inspire these algorithms. Indeed, these algorithms can detect groups of similar nodes as communities based on their behaviour in the presence of a dynamic process. Random walks [28, 67] and synchronization [4] are two well-known processes used for detecting communities in graphs. Other similar methods, called spin models [50], are based on the Potts model [59] developed in statistical mechanics. These algorithms, although inspired by nature, are slow in general.

Statistical inference: The aim of methods based on statistical inference is to fit a model to the actual structure of a graph using observations and hypotheses. Generative models and block modelling are the two most popular groups. The basis of generative models is to maximize the likelihood that the fitted model reproduces the actual graph; Bayesian inference [25] and the expectation-maximization clustering technique [65] are two popular methods of this group. The two commonly used approaches in block modelling are to define communities based on their connectivity patterns, called regular equivalence [12], or as groups of nodes that have common neighbours, called structural equivalence [38].

Spectral algorithms: Among the traditional methods, we defined a group of clustering algorithms that use the eigenvectors of adjacency matrices to find partitions, called spectral clustering methods. The idea of using the spectral properties of adjacency matrices is not limited to clustering algorithms: previous works have shown that the eigenvectors of these matrices can reveal useful information about latent structures, and many algorithms have been developed based on this idea [42, 61]. The main drawback of these algorithms is that they work well only if the network is modular.

Alternative methods: Other algorithms exist that have quite different ideas in comparison with the previous groups, although they may seem to overlap with some abovementioned approaches. A method based on the minimal spanning tree [7], agglomerative technique [5], clique percolation method [48], label propagation method [49], and bridge-bounding method [47] are some of the well-known alternatives.

3 Proposed mathematical model

As stated before, Li et al. [34] put forward the D measure. Based on this measure, they also proposed a simple MINLP where the number of communities must be known in order for it to start working. Their MINLP is the basis of our proposed MINLP. Thus, before describing new models, we illustrate the D measure and its respective MINLP.

Given a network G = (V, E) with N nodes and M edges, where V is the set of nodes and E is the set of edges between them, Li et al. [34] proposed the general D measure for C communities as follows:

$$\begin{aligned} D = \sum _{i=1}^{C} \frac{(1-\lambda )\, L(V_i, V_i) - \lambda \, L(V_i, \bar{V}_i)}{|V_i|} \end{aligned}$$
(1)

where \(\lambda \) is an adjusting parameter, \(V_i\) denotes the set of nodes in community i, \(\bar{V}_i\) denotes the set of nodes in the network that are not included in \(V_i\), and \(|V_i|\) represents the cardinality of \(V_i\). The function L is defined as follows:

$$\begin{aligned} L\left( V_a, V_b \right) = \sum _{i \in V_a} \sum _{j \in V_b} a_{ij} \end{aligned}$$
(2)

where \(a_{ij}\) denotes the element in the ith row and jth column of the adjacency matrix A; that is, both i and j are node indices.
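To make the notation concrete, the following minimal Python sketch (illustrative only, independent of the reference implementation) evaluates Eqs. 1 and 2 for a given partition of a toy graph:

```python
import numpy as np

def L(A, Va, Vb):
    """Eq. 2: sum of adjacency entries a_ij over i in Va and j in Vb."""
    return A[np.ix_(sorted(Va), sorted(Vb))].sum()

def modularity_density(A, communities, lam=0.5):
    """Eq. 1: sum over communities of ((1-lam) L(Vi,Vi) - lam L(Vi,Vi_bar)) / |Vi|."""
    all_nodes = set(range(A.shape[0]))
    return sum(((1 - lam) * L(A, Vi, Vi) - lam * L(A, Vi, all_nodes - set(Vi)))
               / len(Vi) for Vi in communities)

# Toy example: two triangles (nodes 0-2 and 3-5) joined by the edge (2, 3).
A = np.zeros((6, 6), dtype=int)
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1
print(modularity_density(A, [{0, 1, 2}, {3, 4, 5}]))  # approx. 1.667
```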

In order to build a mathematical model, Li et al. [34] defined a binary variable \(x_{il}\) that equals one if community l includes node i, and zero otherwise. Consequently, their MINLP is as follows:

$$\begin{aligned} \max \quad z = \sum _{l=1}^{C} \frac{(1-\lambda ) \sum _{i=1}^{N} \sum _{j=1}^{N} a_{ij} x_{il} x_{jl} - \lambda \sum _{i=1}^{N} \sum _{j=1}^{N} a_{ij} x_{il} (1-x_{jl})}{\sum _{i=1}^{N} x_{il}} \end{aligned}$$
(3)

Subject to

$$\begin{aligned}&0 < \sum _{i=1}^{N} x_{il} < N \qquad \qquad l = 1, 2, \ldots , C \end{aligned}$$
(4)
$$\begin{aligned}&\sum _{l=1}^{C} x_{il} = 1 \qquad \qquad \qquad i = 1, 2, \ldots , N \end{aligned}$$
(5)
$$\begin{aligned}&x_{il} \in \left\{ 0, 1 \right\} \qquad \qquad \quad i = 1, 2, \ldots , N, \ l = 1, 2, \ldots , C \end{aligned}$$
(6)
$$\begin{aligned}&0 < \lambda < 1 \end{aligned}$$
(7)

Here, before introducing our models, we simplify and rewrite the objective function as follows:

$$\begin{aligned} z = \sum _{l=1}^{C} \frac{\sum _{i=1}^{N} \sum _{j=1}^{N} a_{ij} x_{il} x_{jl} - \lambda \sum _{i=1}^{N} \sum _{j=1}^{N} a_{ij} x_{il}}{\sum _{i=1}^{N} x_{il}} . \end{aligned}$$
(8)

After simplification, the adjusting parameter \(\lambda \) appears as the coefficient of the negative term. The negative term of Eq. 8 represents the maximum number of edges that the model can score for each community. In order to detect large communities, we should assign a large value to \(\lambda \), creating a big penalty for all missed edges; detecting small natural communities becomes possible as that penalty approaches zero. However, for simplicity, we set \(\lambda = 0.5\), although the problem could also be resolved by finding the optimal \(\lambda \).

In order to utilize this model, the correct number of communities, called \(C^{*}\), should be known in advance, but we often have no such knowledge. To cope with this problem, a new solution is proposed in this paper that specifies the maximum possible number of communities, called \(C_{\mathrm{max}}\), as the upper bound of C. Since each community should have at least three members, \(C_{\mathrm{max}}\) equals \(\lfloor N/3 \rfloor \), the integer part of N/3. The model then needs to be flexible in the number of available communities, i.e., from 2 to \(\lfloor N/3 \rfloor \). This requirement dictates that some communities will have no members if \(C^{*} < C_{\mathrm{max}}\), which means that some denominators of the objective function (Eq. 3) must take the value zero, producing a division by zero, which is undefined and computationally impossible. To escape this situation, we define another set of binary variables, denoted \(b_l\), one for each community l, and add them to their respective fractions. Each \(b_l\) takes the value one if the size of the respective community is zero, and zero otherwise. In this way, the new proposed MINLP can find the correct number of communities and detect them in only one execution, while the Li et al. [34] model needs \(\lfloor N/3 \rfloor - 1\) executions to do the same. Our model, subject to Eqs. 4-6, has the following objective function:

$$\begin{aligned} \max \quad z = \sum _{l=1}^{C} \frac{\sum _{i=1}^{N} \sum _{j=1}^{N} a_{ij} x_{il} x_{jl} - \lambda \sum _{i=1}^{N} \sum _{j=1}^{N} a_{ij} x_{il}}{\sum _{i=1}^{N} x_{il} + b_l} \end{aligned}$$
(9)

The optimal number of communities is equal to \(\lfloor N/3 \rfloor \) minus the sum of \(b_l\) over l.
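As a sketch of how the \(b_l\) indicators keep every fraction defined (our Python illustration of Eq. 9, not the GAMS model; x is assumed to be an N by C_max binary assignment matrix):

```python
import numpy as np

def objective_with_indicator(A, x, C_max, lam=0.5):
    """Eq. 9: b_l = 1 for empty communities so no denominator is zero."""
    deg_weighted = A @ x                       # (A x)_il = sum_j a_ij x_jl
    internal = (x * deg_weighted).sum(axis=0)  # sum_ij a_ij x_il x_jl per l
    rowsum = (A.sum(axis=1)[:, None] * x).sum(axis=0)  # sum_ij a_ij x_il
    size = x.sum(axis=0)                       # community sizes |V_l|
    b = (size == 0).astype(float)              # indicator of empty communities
    z = ((internal - lam * rowsum) / (size + b)).sum()
    C_star = C_max - int(b.sum())              # recovered number of communities
    return z, C_star
```

An empty community contributes (0 - 0)/1 = 0 to the sum, so unused communities neither break nor bias the objective.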

4 Proposed hybrid artificial immune network

Metaheuristic algorithms often provide a suitable trade-off between solution quality and the cost of optimization. They search the solution space using a combination of intensification and diversification mechanisms. As different metaheuristics differ in their intensification and diversification strategies, hybridizing two or more of them may help produce better solutions [55].

Artificial immune systems (AIS), introduced by Castro and Timmis [8], are a powerful group of bio-inspired metaheuristics. This group of metaheuristics is adaptive, parallel, and highly robust. To solve an NP-Hard problem, AIS offers several biological operations, such as clonal selection, negative selection, affinity maturation, danger theory, and immune network theory. The latter two underlie a well-known variant of AIS called the artificial immune network (AIN). Each of these operations can provide at least one of the mechanisms mentioned (i.e., intensification and diversification).

AIN can utilize the components mentioned for optimization problems as well as for data mining tasks. The method works on a population of solutions, called antibodies, against a number of goal solutions, called antigens. Antibodies try to make their paratopes, i.e., current solutions, as similar as possible to the antigens' epitopes, i.e., goal solutions. The algorithm calculates this similarity, called affinity, for improvement and clonal expansion. Successful antibodies proliferate in proportion to their degree of matching; Fig. 1 illustrates this. According to the figure, antibodies 1 and 5 completely match their paratopes with the epitopes of antigens 1 and 2, respectively.

Fig. 1

Antibodies try to match their paratopes with epitopes of antigens for cloning

The next section describes the proposed hybrid metaheuristic algorithm (HAIN), and its application to solve the proposed MINLP model.

4.1 The HAIN procedure

A crucial characteristic of complex networks is sparsity: each node often has links to only a few other nodes, called its neighbours. Although this characteristic makes the mathematical modelling of a network more problematic, it provides a great opportunity for search methods, which, for each node, need to explore only a small subset of the network, i.e., the node's neighbours. Therefore, the success of search methods depends on exploiting sparsity to enhance intensification and diversification in the search space.

The strategy of the hybrid artificial immune network (HAIN) is to move across the neighbourhood of each node so as to maximize the total value of the D measure. This strategy drives the generation of the initial population as well as the intensification and diversification steps. Like parallel methods, HAIN starts from multiple initial points. When generating the initial population of antibodies (AB) and antigens (AG), HAIN avoids putting two non-adjacent nodes in one community; the next subsection describes this approach. The algorithm then calculates the affinity of each AB. Another component of HAIN is clonal selection and expansion; its basic idea is that the larger the affinity of an antibody, the larger the population of new antibodies cloned from it. In order to maturate the new AB, the next component, called affinity maturation, starts to work after cloning. This component maturates each AB by applying one of two different moves with equal probability. The moves, described in the affinity maturation subsection, try to find the best neighbour of a randomly selected node to add to (or remove from) its respective community in order to increase the total D measure; they exploit sparsity and maximize the D measure simultaneously. This component particularly emphasizes the intensification mechanism. Since weak AB may survive, the algorithm also needs cleaning and suppression components; HAIN uses metadynamics and network suppression for these purposes. The diversity component takes a diversification strategy by splitting the largest community into two smaller ones and merging connected communities at random. The last subsection describes these procedures. Finally, the pseudocode of the proposed HAIN algorithm is given in Fig. 2. A computer program in the MATLAB language can be obtained on request from the authors.

Fig. 2

The HAIN algorithm

4.2 Representation and initial population

In order to apply a metaheuristic algorithm, the first question is how to represent a solution; for the HAIN algorithm, this means the representation of antibodies (AB) and antigens (AG). To the best of our knowledge, there are three ways of representing a community detection solution in the literature: locus-based adjacency or graph-based encoding [2, 24, 54], binary matrix encoding [35], and string-of-group encoding [37]. The last of these assigns a specific cell of the solution vector to each node and stores the node's community label in that cell. Note that the length of each AB (or AG) vector equals N (i.e., the number of nodes). Figure 3 shows this simple encoding for a case of ten nodes and three communities; the proposed algorithm uses this representation for both AB and AG.
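In code, the encoding is simply a length-N vector of labels. The labels below are hypothetical and only illustrate the scheme of Fig. 3:

```python
# String-of-group encoding for ten nodes in three communities:
# cell i holds the community label of node i (hypothetical labels).
antibody = [1, 1, 1, 2, 2, 2, 2, 3, 3, 3]
# Members of community 2 are the indices whose cell equals 2:
members = [i for i, g in enumerate(antibody) if g == 2]  # -> [3, 4, 5, 6]
```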

Fig. 3

Representation of AB (or AG)

Since the optimal communities are not available to serve as goal solutions, in each iteration the algorithm chooses the best AB related to each AG, called AB*, and uses it as the updated AG if AB* has a higher affinity than the current AG. The number of AGs can be more than one, i.e., the algorithm starts and proceeds from multiple initial points in parallel.

Initial populations crucially need two properties: randomness and completeness. Randomness provides the opportunity to search the whole search space. Completeness means that the solution vectors should be able to include all admissible labels, i.e., between 1 and \(\lfloor N/3 \rfloor \) in our representation; this gives the algorithm the opportunity to define different communities. Indeed, the algorithm can determine not only the optimal communities but also the correct number of communities. Accordingly, in the first step, the algorithm consecutively chooses nodes and assigns a specific integer, called the community number, to each selected node and subsequently to its neighbours, in order to avoid putting two non-adjacent nodes in one community. This process repeats for each of the \(N_{AB}\) antibodies of each of the \(N_{AG}\) antigens. Next, the algorithm runs one iteration of the label propagation method (for more information about this method, see [36]). According to this heuristic, each node adopts the most frequent community number among its neighbours, using the following formula:

$$\begin{aligned} x(i) = \mathrm{mode}\left( \eta (i) \right) \end{aligned}$$
(10)

where x(i) denotes the integer assigned to node i and \(\eta (i)\) is the set of neighbours of node i. The mode function returns the most frequent number in the set given as its argument. The other components of HAIN use the output of this heuristic. Thus, in the terminology of [55], HAIN is a high-level relay hybrid (HRH), because another heuristic (label propagation) provides the set of initial solutions for HAIN instead of randomly generated antibodies.
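A minimal sketch of one such pass follows (ours; the paper does not specify the update order or tie-breaking, so a synchronous update with smallest-label tie-breaking is assumed):

```python
import numpy as np
from collections import Counter

def label_propagation_sweep(A, labels):
    """One pass of Eq. 10: each node adopts the most frequent label among
    its neighbours (ties broken here by the smallest label)."""
    new_labels = labels.copy()
    for i in range(len(labels)):
        neigh = np.flatnonzero(A[i])           # neighbours of node i
        if neigh.size:
            counts = Counter(labels[j] for j in neigh)
            new_labels[i] = max(counts.items(),
                                key=lambda kv: (kv[1], -kv[0]))[0]
    return new_labels
```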

The abovementioned processes supply both randomness and completeness for each solution. Figure 4 shows this algorithm.

4.3 Affinity calculation

Each AB (or AG) represents a complete solution and indicates the members of each community by an identical number. Note that these numbers are neither values nor orders: they work as labels. In other words, each AB (or AG) specifies which variables \(x_{il}\) of the MINLP take the value one. As this representation satisfies all constraints of the mathematical model, the affinity of each AB (or AG) is equivalent to the respective objective function value of the MINLP addressed above. Thus, we employ Eq. 9 (the objective function of the proposed MINLP) for affinity calculation, with \(\lambda = 0.5\).

Fig. 4

The proposed algorithm for generating initial antibodies

4.4 Clonal selection and expansion

In the main variation of AIN, called aiNET, the clonal selection and expansion (CSE) component clones each antibody with respect to each antigen based on its affinity. The cloning process follows a famous rule: the larger the affinity of an antibody, the larger the clone generated from it. Under this definition, each antigen is a different type of output pattern that is expected to be recognized by a respective set of antibodies. However, since modularity density maximization is an optimization problem rather than a pattern recognition task, no predefined pattern exists to play the role of an antigen. In such cases, an antigen can be defined as the best antibody, in terms of affinity (i.e., objective function value), in each set of antibodies. Thus, in each iteration, our algorithm works on \(N_{AG}\) different sets of antibodies in parallel; in other words, this mechanism works as a parallel population-based metaheuristic.

In order to employ the CSE component for each antigen, the algorithm ranks the respective antibodies and processes them in that order. It then generates \(c_i\) identical copies of the selected antibody using the following formula:

$$\begin{aligned} c_i = \mathrm{Round}\left( \beta \times N_{AB} / r_i \right) \end{aligned}$$
(11)

where \(\beta \) is a constant coefficient and \(r_i\) is the rank of the respective antibody i. Formula 11 implies that antibodies with better affinity have larger colonies in HAIN. This component provides proper intensification when a local search is applied to each newborn antibody.
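A small sketch of Eq. 11 (our Python; rank 1 is assumed to denote the highest affinity):

```python
import numpy as np

def clone_counts(affinities, beta=1.0):
    """Eq. 11: c_i = round(beta * N_AB / r_i), with r_i the affinity rank."""
    n_ab = len(affinities)
    order = np.argsort(-np.asarray(affinities))   # best antibody first
    ranks = np.empty(n_ab, dtype=int)
    ranks[order] = np.arange(1, n_ab + 1)
    return np.rint(beta * n_ab / ranks).astype(int)

print(clone_counts([0.9, 0.2, 0.5], beta=1.0))  # -> [3 1 2]: best gets most clones
```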

4.5 Affinity maturation

Affinity maturation works as a powerful local search to exploit the search space. The maturation of new clones usually occurs by random mutation; here, our algorithm instead utilizes two new greedy procedures, called firing and hiring, for maturating newborn clones. The algorithm randomly selects one of the two procedures for each antibody with equal probability, and the selected procedure then maturates the respective antibody.

The first procedure, firing, chooses one of the communities present in the respective antibody at random. It then determines which member contributes least to the D measure of that community, using the following formula:

(12)

where \(D_i\) and \(\bar{D}_i\) denote the number of internal and external edges, respectively, of the community of node i, including node i. The algorithm chooses the node that has the largest value of \(F_i\). Since the measure \(F_i\) evaluates the change in the numerator of the D measure of the respective community, it indicates the node's contribution to the D measure; the change in the denominator can be treated as negligible because it amounts to plus or minus one. The firing procedure then fires that member. The fired node must be placed in one of the other communities, so the procedure seeks the best new community for it, based on the maximum number of edges between this node and the members of the other communities. Finally, the fired node joins its new community.

The second procedure, called hiring, starts by choosing a community at random. It then seeks the node outside the selected community that would increase the total D measure more than any other node; the idea behind this procedure is to increase the total D measure by means of a greedy search. Finally, this node joins the selected community. The algorithm determines which node is the most suitable to hire using the following formula:

$$\begin{aligned} H_i = \frac{I_{\mathrm{New}} - O_{\mathrm{New}}}{N_{\mathrm{New}}} - \frac{I_{\mathrm{Old}} - O_{\mathrm{Old}}}{N_{\mathrm{Old}}} \end{aligned}$$
(13)

where I (O) denotes the number of edges between node i and its neighbours inside (outside) its respective community, the 'Old' and 'New' indices refer to the community of node i before and after the hiring process, respectively, and N is the number of nodes in the community of node i with respect to its index. These fractions calculate the change in the total D value resulting from moving node i from one community to another. The algorithm chooses the node i with the largest value of \(H_i\).
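A sketch of Eq. 13 for one candidate move (our Python; A is assumed to have a zero diagonal, and community sizes are taken to include node i both before and after the move, which is our reading of the convention above):

```python
import numpy as np

def hiring_gain(A, labels, i, new_community):
    """Eq. 13: change in the summed (I - O)/N terms if node i moves from
    its current community to new_community."""
    in_old = labels == labels[i]            # current community of i (with i)
    in_new = labels == new_community        # target community (without i yet)
    I_old, O_old = A[i, in_old].sum(), A[i, ~in_old].sum()
    I_new, O_new = A[i, in_new].sum(), A[i, ~in_new].sum()
    N_old, N_new = in_old.sum(), in_new.sum() + 1   # sizes before/after move
    return (I_new - O_new) / N_new - (I_old - O_old) / N_old
```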

Although these two moves consider both sparsity and the maximization of the D measure, hiring or firing only one node may make intensification too slow. Thus, each procedure works on a set of antibodies in succession: after maturating the first newborn antibody, the procedure continues from this antibody to generate the second matured one, then from the second to generate the third, and so on. Consequently, the matured antibodies, except the first one, may have been maturated using more than one move. This strategy provides more rapid intensification. Moreover, since these procedures are greedy, we can determine after the affinity calculation which of them performed better. Figure 5 depicts these procedures in one algorithm, called Algorithm 3.

Fig. 5

The proposed algorithm for maturating newborn antibodies

4.6 Metadynamics and network suppression

After the affinity maturation component has run, some maturated antibodies may represent identical or useless solutions. In order to reduce computation time, the algorithm employs two cleaning components: metadynamics, applied with respect to each antigen, and network suppression, applied to all antibodies.

In the main variation of aiNET, the metadynamics component eliminates all antibodies whose affinities with their respective antigen are lower than a given threshold. As the proposed algorithm uses the best antibodies as antigens, it instead eliminates only the newly matured antibodies that have a lower affinity than the others.

Bringing all the newly matured antibodies together forms a network in itself. This network provides powerful exploration of the solution space if it is empowered by a proper diversification procedure. In our algorithm, this kind of diversification is obtained by removing antibodies that have twins in the network; in other words, the algorithm keeps only unique antibodies for the next iteration. Removing duplicate antibodies gives the remaining ones a chance at a better local search (via the CSE component) in the next iteration, because an antibody can reach a better rank once higher-affinity duplicates have been removed. This step also reduces the number of solutions and the cost of computation.
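A sketch of the suppression step (ours; whether raw label vectors or label-invariant partitions are compared is not stated above, so partitions are canonicalized here before comparison):

```python
def canonical(ab):
    """Relabel communities in order of first appearance so that identical
    partitions with different label names compare equal."""
    mapping = {}
    return tuple(mapping.setdefault(g, len(mapping)) for g in ab)

def suppress(antibodies):
    """Network suppression sketch: keep one antibody per distinct partition."""
    seen, unique = set(), []
    for ab in antibodies:
        key = canonical(ab)
        if key not in seen:
            seen.add(key)
            unique.append(ab)
    return unique
```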

4.7 Diversity

The algorithm may suffer from early convergence if there is no good diversification mechanism. aiNET includes a strong diversification procedure called the diversity component, which generates new antibodies at random. Unlike the main aiNET, which generates new antibodies purely at random, our algorithm uses two smart procedures for diversification: splitting the largest community into two smaller ones of equal size, and merging all the neighbours of a node to make a bigger community. In the splitting procedure, the larger the size of a community, the higher the probability of selecting it for division into two partitions; this is reasonable because larger communities need more changes than smaller ones. Figure 6 illustrates the procedures of the diversity component, and a sketch follows the figure:

Fig. 6

The proposed algorithm for the diversity component
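The two moves can be sketched as follows (our Python; how members are assigned to the two halves of a split is not specified, so an arbitrary half-split is used):

```python
import random
from collections import Counter

def split_community(labels):
    """Diversity move 1: pick a community with size-proportional probability
    and split it into two halves of equal size."""
    sizes = Counter(labels)
    comms, weights = zip(*sizes.items())
    chosen = random.choices(comms, weights=weights)[0]
    members = [i for i, g in enumerate(labels) if g == chosen]
    new_label = max(labels) + 1
    for i in members[len(members) // 2:]:   # arbitrary half takes the new label
        labels[i] = new_label
    return labels

def merge_neighbourhood(labels, adjacency):
    """Diversity move 2: pull all neighbours of a random node into its
    community, merging the communities around it."""
    v = random.randrange(len(labels))
    for u in adjacency[v]:                  # adjacency: list of neighbour lists
        labels[u] = labels[v]
    return labels
```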

5 Experimental results

This section provides experiments comparing the effectiveness of HAIN with other search methods on real-world and artificial networks. All of the datasets are undirected, unweighted, and unattributed. To evaluate the results obtained by the different methods, we used the normalized mutual information (NMI) measure. Given two partitions A and B, this measure is defined as follows:

$$\begin{aligned} I(A, B) = \frac{-2 \sum _{i=1}^{C_A} \sum _{j=1}^{C_B} C_{ij} \log \left( C_{ij} N \big / (C_{i.} C_{.j}) \right) }{\sum _{i=1}^{C_A} C_{i.} \log \left( C_{i.} / N \right) + \sum _{j=1}^{C_B} C_{.j} \log \left( C_{.j} / N \right) } \end{aligned}$$
(14)

where N is the number of nodes; \(C_A\) and \(C_B\) are the numbers of communities in partitions A and B, respectively; \(C_{ij}\) denotes the number of nodes of community i of partition A that appear in community j of partition B; and \(C_{i.}\) (\(C_{.j}\)) is the number of nodes in community i of A (community j of B). I(A, B) attains its maximum, i.e., 1, if A = B.
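For reference, a compact implementation of Eq. 14 (our Python with numpy; partitions are given as label vectors, one label per node):

```python
import numpy as np

def nmi(part_a, part_b):
    """Eq. 14: normalized mutual information between two partitions."""
    a = np.unique(part_a, return_inverse=True)[1]
    b = np.unique(part_b, return_inverse=True)[1]
    n = len(a)
    C = np.zeros((a.max() + 1, b.max() + 1))          # confusion matrix C_ij
    np.add.at(C, (a, b), 1)
    ci, cj = C.sum(axis=1), C.sum(axis=0)             # row and column sums
    nz = C > 0                                        # skip log(0) terms
    num = -2 * (C[nz] * np.log(C[nz] * n / np.outer(ci, cj)[nz])).sum()
    den = (ci * np.log(ci / n)).sum() + (cj * np.log(cj / n)).sum()
    return num / den
```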

NMI is helpful when we know exactly which nodes belong to which community, and obtaining this knowledge differs between artificial and real-world networks. For artificial networks, we usually employ a computer program to build a dataset, and most such programs determine which node belongs to which community before generating the respective adjacency matrix. To make this clearer, consider a program that creates 2N nodes, the first N of which are called community 1 and the second N community 2; the program then creates edges between nodes with a probability p that is larger for nodes in the same community than for nodes in different communities. For real-world networks, however, additional information is required to know the community structure, and such information is unavailable in most cases. In those cases, in order to make a fair comparison between methods, we attempt to discover the natural patterns in a network and compare the D measures obtained. Therefore, in this paper, we calculated the D measure on the real-world datasets and employed NMI only on the artificial networks.

To the best of our knowledge, only two search methods in the literature, the Genetic Algorithm to Optimize the D value (GAOD) [37] and the Improved Memetic algorithm (iMeme-Net) [21], employ modularity density (the D measure) as their optimization function. GAOD [37] hybridizes a simple genetic algorithm, as its main algorithm, with a heuristic method that avoids generating invalid initial chromosomes. iMeme-Net employs a heuristic method for generating the initial population, a variation of simulated annealing as its local search, and a new memetic algorithm as its main body. We selected these algorithms and our MINLP for the comparison on the real network datasets. For the artificial networks, since the actual communities are specified by the generator program, there is no need to include our MINLP; instead, we add one of the best methods in the literature, the Multi-Objective Community Detection algorithm with Max-Q model selection (MOCDQ) [54], to the comparison. MOCDQ is a multi-objective genetic algorithm that employs variations of the Q function as its objective functions. We solved the MINLP model with GAMS/BARON at http://www.neos-server.org/ and ran the algorithms in MATLAB 7.9 on a computer with an Intel Core 2 Duo, 2.50 GHz, and 4.0 GB RAM.

5.1 Real world network

5.1.1 Datasets

The Zachary karate club network: Zachary [64] generated this network after studying the friendship network of 34 members of a karate club over a period of two years. Although he reported that the network split into two groups during this period, some researchers have discovered more than two communities in it [1, 24, 34, 35, 37].

The Bottlenose dolphins network: Lusseau et al. [39] collected this social network of frequent associations between 62 dolphins living off Doubtful Sound, New Zealand.

Political books: V. Krebs compiled this network (available at http://www.orgnet.com/) of books about US politics published around the time of the 2004 presidential election. The 105 nodes are books sold by the online bookseller Amazon, and edges indicate frequent co-purchasing of pairs of books by the same buyers.

American college football: This dataset [18] represents a network of American football games between colleges during the 2000 season. Nodes and edges demonstrate teams and regular season games, respectively.

Word adjacencies: This dataset [45] is an adjacency network of common adjectives and nouns in the novel David Copperfield by Charles Dickens.

Les Miserables: This dataset [30] contains the network of characters in the novel Les Miserables that appear together.

Neural network: White and Southgate [58] compiled this dataset, which is a network representing the neural network of C. Elegans.

Co-authorship network: This dataset contains a network of scientists who have researched a network theory area of science [45]. This dataset has 128 isolated nodes out of 1589 nodes.

Power grid: The structure of the Western States power grid of the US is illustrated in this network. White and Southgate [58] provided its respective dataset.

Table 1 illustrates more details about the abovementioned datasets.

Table 1 Real-world networks in detail

Table 2 lists the parameter values of the HAIN method after tuning by trial and error. The tuning involved running the algorithm several times with different combinations of the three parameters' values on various datasets and finding which combination obtained the best results. The table also shows the parameter values of GAOD, iMeme-Net, and MOCDQ, extracted from their respective papers.

Table 2 Parameter values of the methods

5.1.2 Results

Table 3 shows the best results obtained by the proposed MINLP, HAIN, GAOD, and iMeme-Net methods on the real-world networks. The C columns give the number of communities corresponding to the best obtained result, shown in the D columns. The bold entries mark the best result(s) for each dataset. Figures 7 and 8 depict the communities discovered by the HAIN algorithm in the karate club network and the co-authorship network, respectively.

Table 3 Number of communities and modularity density results obtained by the methods
Fig. 7

Discovered communities in karate club network using HAIN algorithm

Fig. 8

Discovered communities in co-authorship network using HAIN algorithm

For the karate club dataset, all methods report three as the best number of communities, with \(D = 3.9225\). HAIN and GAOD established better results than the other methods on the dolphin dataset. One might expect the MINLP model to find the globally optimal solution; however, the model is non-convex, as it contains integer variables [26], so it may become trapped in a local optimum. Another drawback of this kind of model is that it fails on medium and large networks; this is marked in Table 3 by the character F.

Generally, HAIN achieves the best results in six of the nine experiments. In two of the three remaining experiments (the word adjacencies and neural network datasets), this method correctly finds the best-reported number of communities and differs only slightly from the best results. On the neural network and power grid datasets, the GAOD method reports excellent results, although it reaches only near-best results on the other datasets. Consequently, HAIN demonstrates appropriate performance on real-world datasets.

Another question arising here is which of these algorithms provides the better trade-off between solution quality and CPU time. To answer it, we need a criterion that measures the performance of the algorithms fairly. Fortunately, for evaluating algorithms developed for maximization models, Osman and Al-Ayoubi [46] defined such a criterion, called the Marginal Improvement per unit of Computational effort (MIC), as follows:

$$\begin{aligned} \mathrm{MIC} = \frac{\left( D(A) - LB \right) \big / \left( OP - LB \right) }{CPU} \end{aligned}$$
(15)

where D(A) is the value of the objective function obtained by the metaheuristic, CPU is the time in seconds consumed by running the metaheuristic, and LB and OP are the lower bound and the global optimum value of the objective function, respectively. These latter two values can be set by the user if they are unavailable. The higher the value of MIC, the better the algorithm.
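For example (hypothetical numbers; LB = 0 is our choice):

```python
def mic(d_a, lb, op, cpu_seconds):
    """Eq. 15: marginal improvement per unit of computational effort."""
    return ((d_a - lb) / (op - lb)) / cpu_seconds

# A method reaching D = 3.8 against an optimum of 3.9225 (LB = 0) in 12 s:
print(mic(3.8, 0.0, 3.9225, 12.0))  # approx. 0.0807
```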

For comparison purposes, a representative dataset was selected from each level of network size (i.e., \(<\)50 nodes, \(\sim \)100 nodes, and \(>\)1,000 nodes). Table 4 presents the best results obtained by the methods in terms of the MIC criterion on the three selected datasets:

Table 4 The MIC results obtained by methods

According to Table 4, the HAIN algorithm provides better results in two of the three experiments. However, other issues crucial for algorithm evaluation, such as robustness and the way the algorithms are programmed, are not addressed by the MIC criterion. Moreover, claims about the superiority of the HAIN algorithm need further evidence. To provide it, the next section describes and employs a set of computer-generated datasets.

5.2 Artificial network

5.2.1 Datasets

Lancichinetti et al. [32] proposed the Lancichinetti-Fortunato-Radicchi (LFR) benchmark datasets, a well-known family of artificial graphs that can be generated by varying the defined parameters. In order to evaluate our algorithm and analyse its behaviour, we employed a set of LFR datasets. These datasets differ in the number of edges (M), the maximum and minimum community sizes (\(S_{\mathrm{Max}}\) and \(S_{\mathrm{Min}}\)), the permitted interval of node degrees (\(k_{\mathrm{min}}\), \(k_{\mathrm{max}}\)), and the mixing parameter (\(\upmu \)), while the two exponents \(\upbeta \) and \(\upgamma \) are held constant at 1 and 2, respectively. Using these datasets, we designed three different experiment sets for comparing our HAIN with GAOD, iMeme-Net, and MOCDQ. The next sections show the experimental results.
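For reproducibility, one member of this family can be generated with networkx's LFR generator (a sketch under our assumptions: tau1 is the degree exponent \(\upgamma \) and tau2 the community-size exponent \(\upbeta \); networkx requires both to be strictly greater than 1, so \(\upbeta = 1\) is approximated by 1.01, and generation may need a few retries with different seeds to converge):

```python
import networkx as nx

# One dataset from the Sect. 5.2.2 family: N = 350, mu = 0.1,
# degrees in [25, 50], community sizes in [50, 75].
G = nx.LFR_benchmark_graph(
    n=350, tau1=2.0, tau2=1.01, mu=0.1,
    min_degree=25, max_degree=50,
    min_community=50, max_community=75, seed=42,
)
truth = {frozenset(G.nodes[v]["community"]) for v in G}  # planted communities
```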

5.2.2 Results of varying the mixing parameter

The mixing parameter \(\upmu \) denotes the probability of a connection between a node and the nodes of other communities; the higher the value of \(\upmu \), the more mixed the network. Thus, we study the behaviour of the methods as \(\upmu \) increases from 0 to 0.6 in the following experiments. In all of them, N = 350, C* = 6, k = [25, 50], and S = [50, 75]. Figures 9 and 10 show the results obtained by the methods.

Fig. 9

NMI results obtained by the methods when the mixing parameter varies

Fig. 10

Changing the obtained number of communities when the mixing parameter varies

According to these figures, in most of the experiments HAIN reaches the correct number of communities even when NMI \(<\)1. The other methods (GAOD, MOCDQ, and iMeme-Net), although competitive with one another in terms of NMI, achieve smaller NMI values than HAIN. GAOD and MOCDQ report too many communities, whereas iMeme-Net reports too few in most of the experiments. Consequently, HAIN shows higher reliability than the other methods.

5.2.3 Results of varying the allowed community sizes

A robust method maintains effective performance when its input changes. In order to analyse the robustness of the methods in detail, we provide six LFR datasets that differ in the minimum and maximum allowed community sizes; each dataset includes 350 nodes. Tables 5 and 6 present the details of the datasets and the experimental results, respectively.

Table 5 The parameters of artificial LFR networks
Table 6 The results obtained by methods in terms of the number of communities and NMI

In all of the experiments, HAIN achieves the best results. GAOD and MOCDQ, although achieving near-optimal NMI values, report incorrect numbers for C* in five of the six experiments. GAOD reaches the correct number of communities only on the dataset with the widest interval of community sizes, whereas MOCDQ achieves the correct number only when the communities are approximately identical in size. As in the previous set of experiments (the results depicted in Fig. 10), iMeme-Net reaches values smaller than C* in most of the experiments; like GAOD and MOCDQ, it reports values far from the true C*. Indeed, this method has problems when the community sizes differ significantly. The main reason is that iMeme-Net starts with a set of good solutions made by five iterations of the label propagation method, which converges rapidly towards a local optimum; then, in the absence of strong diversification, moving to better points is impossible. In general, the results over the varying permitted community sizes again confirm the effectiveness of our HAIN method.

5.2.4 Results of varying the allowed interval of degrees

In this section, we analyse the performance of the methods by varying the permitted interval of node degrees. First, we generate nine datasets by changing \(k_{\mathrm{max}}\) from 10 to 50 while keeping the other parameters constant: \(k_{\mathrm{min}}=5\), S = [40, 90], \(\upmu = 0.1\), and N = 350. Then, we run all the methods on these datasets. Figures 11 and 12 depict the respective results. Note that C* = 7 for all datasets.

Fig. 11

NMI results reached by the methods when \(\hbox {k}_{\mathrm{max}}\) varies

Fig. 12

The number of communities reached when \(\hbox {k}_{\mathrm{max}}\) varies

According to Figs. 11 and 12, there is a close competition between our HAIN method and the MOCDQ method. Unlike HAIN and MOCDQ, the two other methods (GAOD and iMeme-Net) provide worse results: they reach lower NMI values than HAIN and MOCDQ in most of the experiments. In terms of the number of communities, GAOD demonstrates the worst results. Among all the methods, HAIN presents the best results.

6 Conclusion

This paper focused on modularity density maximization for detecting communities in networks. Modularity density targets the average in-degree and the average out-degree of each potential community; therefore, this measure can avoid the resolution limit. To this end, two models were developed in this paper: the MINLP model and the HAIN algorithm. The MINLP model can discover not only the latent communities but also the correct number of communities. The HAIN algorithm is able to detect optimal communities without any prior knowledge about the number of communities. This method combines a key characteristic of complex networks, sparsity, with the core components of pure immune networks, i.e., cloning, affinity maturation, network suppression, and diversity, and uses a well-known local search method, label propagation, to generate initial solutions.

The experimental results on real-world complex networks suggest that the HAIN algorithm reaches the best results in comparison with the GAOD, MOCDQ, and iMeme-Net methods. We also built three sets of experiments to evaluate the performance of the HAIN algorithm on the LFR benchmark datasets. The first experiment set showed that the proposed methods are quite efficient at discovering the community structure of complex networks with varying mixing parameters. We designed another experiment by changing the permitted interval of community sizes, and the results again confirmed the effectiveness of our HAIN method, although in the third experiment set the differences between the results were negligible. Consequently, the experiments on artificial datasets indicate that the HAIN algorithm reaches the best results in terms of NMI and the number of communities.

The proposed methods can achieve the same gains on directed and weighted networks by means of simple modifications. However, attributed networks remain a strong requirement for future research. Further future work could involve employing and improving state-of-the-art algorithms such as [6] in order to solve modularity-based optimization models.