1 Introduction

In recent years, understanding, managing and analysing large or complex networks or social networks, has received significant interest from researchers in various domains and social network analysis (SNA) is considered as one of the most significant challenges that has to be addressed. Because of the fast advancement in Internet, several types of social networks like Facebook, Twitter and LinkedIn have emerged as online social networks with significantly large network size (due to the large number of users getting added on a daily basis). There are many features that can be interpreted by analyzing these social networks. Network analysis provides an effective tool to investigate statistical properties of the network such as small world phenomenon (Watts and Strogatz 1998; Gandomi et al. 2013b; Yang et al. 2013) and scale free property (Wang and Chen 2003).

In Social Network Analysis (SNA), discovering communities has great significance. The social networks are usually denoted in graphs. In a social network, the community refers to a group of nodes that are tightly coupled to each other and loosely connected with other nodes. Nodes in graphs having the same community often share fascinating characteristics such as common interest, and common purpose. Hence community detection is one of the most important problems in SNA. Here, the important challenge is identifying connections or links in the communities (Wang and Liao 2014). Several surveys on identifying communities in networks can be found in Chintalapudi and Krishna Prasad (2015), Pourkazemi and Keyvanpour (2013). Deep auto encoders is another successful method to address the problem of community detection. this work proposed a novel deep autoencoder with Particle Swarm Optimization (PSO) and continuation algorithms to reveal community structures in complex networks (Al-Andoli 2021). The research of dynamic community detection where community structures change over time. In the paper an efficient and effective multi-objective method, namely DYN-MODPSO, is proposed, and where the traditional evolutionary clustering framework and the particle swarm algorithm are modified and enhanced (Ying Yin and YuhaiZhao 2020). Many significant challenges arises during community detection. The dynamic nature of networks, scalability issues, complexity of algorithm computation and heterogeneity of the networks like issues are arising with the increase in network size.Local structure based method of link prediction captures the similarity scores based on common neighbor which gives an accurate measure to know the link structure arising between nodes. As a result some interesting and potential links may be missed and also as a matter of fact it will be difficult and time consuming for computing similar score for all nodes in network. Similarly for global based methods for link prediction computes score paths separated by greater than 2. But calculating is again a time consuming process. In SNP, Most of the problems are NP-hard in nature (Cai et al. 2016). There is always a considerable amount of challenges available when it comes to optimization problems as the networks grow in wider as the number of users getting increased this can be treated as a scalable issue and need to have effective computation when working on networks related to various types and also many issues will come into picture as the network size grows. Many meta-heuristic algorithms have been investigated to solve the problems discussed above (Gandomi et al. 2013a; Mirjalili et al. 2017a). For instance, the challenges outlined above, have been successfully addressed using dynamics of many biological systems (Kotteeswaran and Rajesh 2014). The inherent characteristics such as adaptive, self organize and collective behaviour of biological systems are effectively applied to solve such complex problems. The most predominant and successful classes in the bio inspired computing are evolutionary based computing and swarm intelligence based computing. Hence, researchers are adopted to use nature-inspired algorithms to solve different problems in SNA. In this article, we are reviewing a number of bio-inspired algorithms used in community detection from different aspects in the following sections.

Fig. 1
figure 1

Taxonomy of the paper

The major contributions in this paper are:

  • Discussion on bio-inspired algorithms and their classifications.

  • Discussion on the challenges that arise while analyzing the social graph, such as community analysis, connection analysis, and path analysis.

  • The objective functions used in community detection research has been discussed. How the evolutionary algorithms,swarm-based algorithms helps to find out the solutions in community and connection analysis have been discussed.

  • In swarm-based algorithms particularly, four well-known algorithms namely Particle Swarm Optimization (PSO), Ant Colony Optimization (ACO), Artificial Bee Colony (ABC), and Firefly algorithm have been discussed.

Apart from solving the link prediction and community detection problems. More recently, a number of studies have reported on the success of such techniques for solving difficult problems on all key areas and computer science like optimization problems in data mining and rule extraction dynamic and multiple criteria website optimizations using genetic algorithms, in the field of data mining, ML algorithms using genetic programming and differential evolution in the field of structural engineering with evolution strategy, in the field of multi-modal biomedical imaging using PSO, ACO and many more. The organization of the paper is shown in Fig. 1. In Sect. 2, we summarized the nature-inspired algorithms. Sect. 3 highlights different challenges while analyzing the social network, that is performed on the graphs. In Sects. 3 and 4 addresses the different evolutionary approaches and Sect. 5 introduces the swarm intelligence based algorithms in social networks. Section 6 discusses the future scope of the work followed by a conclusion in Sect. 7.

2 Background of nature inspired algorithms

Usually, real-world optimization problems are NP-hard in nature, which are difficult to solve using classical deterministic techniques (Rai and Tyagi 2013). In this aspect, the role of nature-inspired techniques have been proven to be great and have been applied to solve many such problems associated with different research domains. Last few decades the popularity and usage of nature-inspired algorithms have been increased. These algorithms are generally inspired by different biological swarm activities that happen in nature. Figure 2, shows the taxonomy of nature-inspired techniques.

Fig. 2
figure 2

Taxonomy of Bio-Inspired Optimization Techniques

The evolutionary algorithms (EAs) are the well-known and extensively used bio-inspired algorithms, for solving real-time problems to find optimum solutions for complex optimization problems. The Genetic algorithm (GA), Differential Evolution (DE), Evolutionary strategy (ES)are the most well-known algorithms in the class of EA. These techniques have their own properties. For instance, the GA is purely reliant on the genetic parameters. By choosing the appropriate values of mutation and crossover, GA may produce better optimization results.

Swarm Intelligence (SI) is another field of studying and designing well-organized computational intelligent interactive solutions. The complex problems are being solved by using the behavior of living swarms such as birds, reptiles, fish, and ants. For instance, the Ant Colony Optimization (ACO) algorithm follows the behavior of ants’ food searching pattern. Generally, ants like insects cooperate indirectly by an odorous chemical called pheromone trails. In a swarm, Each ant in a swarm represents an agent. They laid down the pheromone while searching the food source. In case any ant discovers the origin of food, it returns back by detecting a pheromone trail.

Bacterial Forging algorithm is an evaluation to study of developing a new bio-inspired optimization algorithm. COSMIC (Paton et al. 2004) and RUBAM algorithms are comes under bacterial forging.

3 Challenges in analyzing the social graphs

Some of the interesting challenges in analyzing the social networks are finding the interaction between the people, how they act, how diffusion happens in networks, finding the influential nodes in the network, finding the shortest paths between the nodes, and the evolution of communities in the networks. All these challenges are solved by analyzing the networks which are modeled as graphs. In this section, we review the different analysis techniques that can be performed on the graph along with the corresponding solution approaches using Bio-Inspired computing. Most of the analysis is related to graphs such as directed, undirected, signed, and unsigned graphs.

3.1 Community analysis

One of the main challenges in studying SNA is identifying communities in the network. Community analysis in a graph refers to the process of determining the groups that are structurally equivalent in the network. These structurally equivalent groups are usually termed as communities. Most of the conventional approaches to community discovery such as proposed by Girvan and Newman (2002), Wasserman and Faust (1994), and Kernighan and Lin (1970) etc. use the modularity measure to discover communities. But these methods do not reveal the overlaps among the communities and also need prior information of the size and number of the communities. However, there exists many real situations where a node belongs to multiple groups. For example, nodes in Facebook connect to multiple groups(i.e., a person can be a part of many groups on Facebook). In order to overcome these drawbacks, k – clique percolation method proposed by Palla et al. (2005) and several such methods has been developed (Ahn et al. 2010; Evans and Lambiotte 2009; Fortunato 2010) for detecting overlapped communities. But these methods do not scale well when the size of the network increases and also results in high computational complexity. It has been proved that the optimization of objective functions such as community score (Clauset et al. 2004), community fitness (Lancichinetti et al. 2009), modularity (Brandes et al. 2008), minimum cut and normal cut (Shi and Malik 2000), which attempts to measure how well a partition divides the network into communities, are NP-Hard problems. To optimize such problems several approaches based on meta heuristics are proposed in the literature. Sections 3 and 4 briefly explains the biologically inspired approaches applied to perform community analysis in social networks.

3.2 Connection analysis

Connection Analysis in a graph is a process of determining the new connections among two vertices in a graph. This process is mentioned as link prediction (Lichtenwalter et al. 2010). Link prediction problems has variety of applications. Link prediction methods are classified into supervised and unsupervised techniques. Pujari and Kanawati (2012) proposed a technique which aggregate the rank of the nodes for link prediction in complex networks. Yl et al. (2015) proposed the link prediction algorithm by assigning weights to the nine structural related features and then aggregated their results to conclude final prediction scores. Another method proposed by De Sá and Prudêncio (2011) used weights to improve supervised link prediction. Several algorithms such as (Newman 2001; Adamic and Adar 2003; Zhou et al. 2009; Lü et al. 2009; Kumari et al. 2020) based on structural properties of a network have been used for link prediction. To avoid the high computation cost of these methods, some heuristic based methods are used in link prediction. In Sects. 3 and 4 several biologically inspired approaches are Discussed to predict the links in social networks.

3.3 Path analysis

Path Analysis in a graph is a process of determining the best path or the shortest path in the graph. Several applications based on the path analysing are discussed here.

  1. 1.

    for finding the communities in social networks which are known by calculating the number of shortest paths (Wasserman and Faust 1994; Freeman 1977) or density of random walks (Newman 2005),

  2. 2.

    for predicting the links in social networks such as SIM rank (Jeh and Widom 2002), superimposed random walk (Liu and Lü 2010), Random walk with restart (Liben-Nowell and Kleinberg 2007), local path index (Zhou et al. 2009)

  3. 3.

    for investigating statistical property of network called small world phenomenon which is based on finding average shortest path length

Analyzing the paths in the network has been computationally expensive when the size of the network increases. In order to optimize the efficiency of the methods in the literature, recently many metaheuristics are applied to the problem under consideration which are discussed in the upcoming sections.

4 Approaches to evolutionary computation

4.1 Evolutionary computing algorithms

Evolutionary computing algorithms are inspired from biological models of natural evolution of genomes (Coello et al. 2002). The field of evolutionary computation encompasses several algorithms such as genetic algorithms which uses the genetic operators and evolutionary strategies which also includes differential evolution etc.

Evolutionary algorithms are population-based algorithms, i.e., the algorithm maintains a collection of feasible solutions to the problem under consideration, which is used to create new feasible solutions through the use of genetic operators. Based on the fitness value, solutions are chosen to be part of next-generation and new solutions are produced by applying genetic operators like reproduction, mutation, and recombination. This process repeats until some termination condition is satisfied. Differences among the different evolutionary computation algorithms depend on the representation of the solution and the way the genetic operators are implemented. For example, genetic algorithms use binary or discrete-valued variables to represent a solution and they favor the use of recombination and mutation. While the evolutionary strategies and evolutionary programming often use the vector of real numbers and put more emphasis on mutation operation. Reproduction of new solutions acts as key a factor for persevering population diversity. Crossover operators (also called recombination) and mutation operators are used to producing new solutions. Crossover operator combines two or more individuals to generate new individuals also called offsprings. The mutation operator is an unary operator operated as a single individual by introducing random permutations on the individual. The general framework for evolutionary algorithms is outlined below.

figure a

4.1.1 Community analysis

In the literature, many community detection algorithms have been proposed. Among them, one category is based on the principles of evolutionary computation inspired by biological mechanisms. Community mining methods determine clusters in a large network. A cluster is defined as a group of nodes in the network which share similar properties. Finding the clusters in large networks is considered a NP-hard problem. Evolutionary algorithms are considered as most popular methods for solving such NP-Hard problems either by considering a single objective function or by using multi-objective function that has to be optimized by using the EC algorithms or SI based algorithms. The single objective function that has to be optimized for community identification is modularity, proposed by Newman and Girvan (2003). Modularity is defined as strength of dividing a network in communities (Zhuang et al. 2017).

Modeled the network as a graph \(G = (V, E)\) with V indicating the number of nodes in the network and E representing edges in the network, modularity(Q) is termed as follows (Newman 2006):

$$\begin{aligned} Q = \frac{1}{2|E|} \sum _{l,m}^{n} \bigg [A_{lm} - \frac{k_{l}k_{m}}{2|E|} \bigg ] \delta (c_{l},c_{m}) \end{aligned}$$
(1)

where |E| represents the number of edges in the graph G, \(A_{lm}\) is the adjacency matrix of graph G and k is the degree distribution and \(\delta (c_{l},c_{m})\) is a function that takes a value 1 when nodes l and m are in the same community.

Many bio-inspired metaheuristics have been presented to optimize single objective function i.e., Modularity (Q) (Tasgin et al. 2007; Pizzuti 2008; Hajeer et al. 2013; Sahoo et al. 2020). In Tasgin et al. (2007), the population is expressed as a set of chromosomes. Each chromosome is represented by a vector with n elements. The position value of the vector is a number assigned to the community and two-position values are the same if both nodes belong to the same community. In Pizzuti (2008), the author uses the adjacency representation for chromosome representation and uses the community scores as the fitness function to evaluate the new solutions. The drawbacks of these methods are maximizing Q is proved to be NP - Hard, this cannot detect communities in overlapped networks and also modularity maximization which is not able to find the similar group structures whose size falls below a certain resolution limit. To overcome these disadvantages many researchers have modeled multi-objective optimization algorithms (Fortunato 2010; Cheng et al. 2019). Pizzuti (2009) proposed multi-objective genetic algorithm in unsigned networks. In this method, two objective functions that need to be optimized are the Maximization of community fitness and Maximization of average degrees within the communities. Other optimization models such as proposed by Agrawal (2011) Maximize the function which is the combination of modularity and average degree measure within the communities. The drawback of this method is that it cannot be extended to other networks in the real world, where the relations can be positive and negative. For example Friend and Foe networks etc. Amelio and Pizzuti (2013) proposed novel multi-objective optimization for detecting communities in the signed networks. In order to detect the communities in the signed network, the authors introduce two objective functions. The first objective function is to maximize modularity Q and the second objective function is to minimize the sum of positive relationships between different communities and the sum of negative relationships within a community. Recently to detect communities in a signed network Shi et al. (2012) proposed a multi-objective evolutionary algorithm based method called MEA-SN. In this method author modeled the community detection task for signed networks, where weights used on the edge represent the positive or negative relationship. The weight \(> 0\) represents a positive relationship and the weight \(< 0\) represents a negative relationship. The objective function designed for a signed network is based on social balance theory. The 2 objectives designed are maximization of the sum of positive structural similarity values within the community and also maximization of negative structure similarity values between the communities. To overcome the difficulty with the crossover operator and mutation operator Li and Song (2013) proposed a compact genetic algorithm that uses the probability distribution model using the MI algorithm instead of using traditional genetic operations. Shang et al. (2013) proposed an improved genetic algorithm that uses the simulated annealing method as the local search method to improve the ability in adjusting the parameters. In order to improve the search space and the performance of the algorithm, Guerrero et al. (2017) considers the safe and balanced initialization of the number of community structures that are defined using the community density vector that holds the number of nodes in each community. The evolutionary parameters used for analyzing the complex networks in order to detect the communities are listed in the Table. 1.

Table 1 Evolutionary parameters for community analysis

4.1.2 Connection analysis

Authors in Silva et al. (2010) proposed a friend recommendation system for online social networking sites like LinkedIn, Twitter, FaceBook etc. Their study analyzes the sub-graph, where a node in the sub-graph is connected to another node by at least 3 hops. In this method author constructs a linear combination of topological similarity measures like number of adjacent nodes, density of group formed by common friend of person i and person j and density of group formed by adjacent vertices of node \(n_i\) and node \(n_j\) and uses the genetic algorithm to calibrate the quotients \(w_i\) in relation to optimization function which is given by Eq. 2.

$$\begin{aligned} M(n,w) = I_{01}(n_{c},n)w_{01} + I_{02}(n_{c},n)w_{02} + I_{03}(n_{c},n)w_{03} \end{aligned}$$
(2)

Weights are represented in binary strings. Crossover operator is used to generating new individuals after excluding the worst individual in the population according to the fitness function value and then mutation operation is applied to children. The algorithm terminates when no improvement in the fitness of the best individual is found. Naruchitparames et al. (2011) extended friend recommendation system proposed by Silva et al. (2010) by adding social features of an individual like shared friends, location, age, etc. In this method, the social genome is represented as a binary string of social features. The drawback of this method is that the algorithm will perform better only if the user information is complete and truthful. Recently Bliss et al. (2014) proposed an evolutionary approach to link recommendation in dynamic networks. The author uses an evolution strategy to optimize weights which are used in linear combination of sixteen topological based similarity indexes. Initially from one potential solution algorithm generates a Gaussian cloud of feasible solutions using the covariance matrix. A population of feasible solutions is selected from the Gaussian cloud. The feasible solution with the best fitness value is selected and the process is repeated for some determined iterations. The algorithm gets trapped in local optima which is a drawback of this method. In Table. 2 evolutionary parameters for connection analysis is discussed.

Table 2 Evolutionary parameters for connection analysis

5 Approaches to swarm based algorithms

5.1 Ant colony optimization(ACO)

Ant colony optimization is a swarm intelligent based algorithm that is derived from the natural behaviour of the ants (Newman 2001). In general, swarm intelligence is a discipline that deals with a system composed of many individuals that coordinates using decentralized and self organization of natural and artificial systems such as Ant colonies, flocks of fishes, or swarms of birds or bees (Gandomi and Alavi 2012). Such systems are typically composed of individuals that are relatively homogeneous and interaction among the individuals are local to each other and with their environment. Ant colony optimization is inspired by foraging behaviour of the ant colonies. In natural world, ants wander randomly in search of food, and upon locating food return to their nest while laying down the pheromone trails. Further ants follow the trial which are laid in the previous iteration. In this manner, whenan ant finds a decent way from the home to the food source, different ants are bound to follow a similar way. In the long run prompts all the ants following a solitary way.

The complete ACO calculation is portrayed as underneath (Dorigo and Stützle 2009).

figure b

5.1.1 Construct solution()

In this step ants build the solution by applying a step by step decision policy. The decision policy is given by

$$\begin{aligned} p_{ij}^{k} = {\left\{ \begin{array}{ll} \frac{[\tau _{ij}(t)]^{\alpha } \times [n_{ij}]^{\beta }}{\sum _{l \in N_{i}^{k}} [\tau _{ij}(t)]^{\alpha } \times [n_{il}]^{\beta }} &{} \text{ if } \ j \in N_{i}^{k} \\ 0 &{} \text{ Otherwise } \end{array}\right. } \end{aligned}$$
(3)

The probability decision rule is a function of

  1. 1.

    Locally available pheromone tails (\(\tau _{ij}\)) and the heuristic values that are associated with nodes \(\tau _{i}n_{i}\) and with the edges (\(\tau _{ij}n_{ij}\))

  2. 2.

    \(N_{i}^{k}\) is the neighborhood of the anti K when in node i. In S-ACO the neighborhood of a node i contains all the nodes that are not yet visited thereby avoiding the repetition of the nodes.

  3. 3.

    \(\alpha \) and \(\beta \) are the adjustable parameters that controls the pheromone trails and the heuristic information attached to the edges.

Once an ant has completed the tour or while building the tour, the ant evaluates the complete tour or partial tour by the update pheromone procedure to decide how much pheromone to be deposited.

5.1.2 Update pheromone trails

After all the ants completing the tour, the pheromone trails are updated. First, pheromone values are lowered by a constant factor and then add pheromone on the edges the ants have visited in their tours. Pheromone evaporation is given by

$$\begin{aligned} \tau _{ij}(t) \leftarrow (1-p) \tau _{ij}(t)\ \text{ for } \text{ all } \ i, j \in L \end{aligned}$$
(4)

where P With \(O \le P < 1\) is the pheromone trail evaporation rate which helps to avoid quick convergence and also helps the ants to continue their search process. Each ant K, deposits an amount \(\Delta \tau _{ij}^{k}\) on each edge it has visited and is given by

$$\begin{aligned}&\tau _{ij}(t) \leftarrow \tau _{ij}(t) + \sum _{k=1}^{n} \Delta \tau _{ij}^{k}(t) \text{ for } \text{ all }\ (i,j) \in L \end{aligned}$$
(5)
$$\begin{aligned}&\Delta \tau _{ij}^{k} = {\left\{ \begin{array}{ll} \frac{1}{C^{k}(t)} &{}\text{ if } arc(i,j) \in \tau ^{k} \\ 0 &{} \text{ otherwise } \end{array}\right. } \end{aligned}$$
(6)

where \(C^{k}(t)\) is the path length traveled by the kth ant and is computed as the sum of the length of the edges belonging to \(\tau ^{K}\). In the next section challenges in social network analysis are discussed that are based on ACO and its variants.

5.1.3 Connection analysis

Predicting connections between the nodes has become popular issue as the size of the online social network increases. Recently ACO,a population based meta heuristic, has been used for fixing NP Hard problems which are visualized in the form of graph. Chen and Chen (2014) proposed supervised link prediction algorithm. In this method a logical graph is formulated by considering all the missing edges. Initially Ants are placed randomly on the selected nodes. Ants construct their tour by selecting the next node with a probability \(P_{ij}\), which is function of pheromone trail on the edge and the set of common neighbours as heuristic information represented by (\(n_{ij}\)) on the edge between the nodes (\(V_i, V_j\)). After all the ants complete their forward tour, the pheromone values on each edge that has visited by the ant will be updated. Pheromone values are updated based on quality of the path travelled by the ant. Finally, the algorithm terminates at certain iterations or when the pheromone trials are stabilized. Final values of the pheromones on the edges are used as the similarity score between the nodes.But the algorithm converges quickly when the large amount of pheromone is deposited on the edges. Later Sherkat et al. (2015) proposed unsupervised link prediction in their approach, the solution construction is done initially by keeping each ant randomly on the selected graph node. The key objective for these ants is to find triangles (or a triad), i.e., a sub graph with 3 nodes and to update the pheromone of all edges of that triangle. In this method, ants construct the tour by selecting the next node which is inversely proportional to the pheromone trial laid on the edge that enable ants to explore more paths. After finding the triangles pheromones on the triangle edges are increased by one unit. In this way ants explore the triangles till all ants dies or until certain number of iterations is reached. This algorithm works better when the network is strongly connected. The ACO parameters used for analyzing the complex networks inorder to find the links between 2 nodes are listed in the Table. 3.

Table 3 ACO parameters for connection analysis

5.1.4 Path analysis

Based on the small world phenomenon, random networks has been investigated (Lawrence and Latha 2015). The proposed algorithm provides an approach to find shortest path of length six between source and the destination using the foraging behaviour of the ants.In this method the author used the greedy policy inorder to find the next decision node to build the shortest path. Rivero et al. (2011) proposed an algorithm for finding the relationships in social networks. In order to find relationships, the author investigated the chain of two people that are separated by 2 hops. The ants find the destination node by considering the path that goes through node that has higher centrality, i.e., through a person with lot of friends. Initially pheromone is deposited on edges, food is deposited on the selected nodes. Food odour helps the ant to find the path. The selection of the next node from the valid list of nodes depends on the pheromones on the edges. As a result, the time required to diffuse food odour and different values of threshold used for food odour intensity depends on the mentioned parameters. Honghao et al. (2013) proposed an algorithm to detect communities based on the path analysis. In this method the author detects the communities by using random walk as a heuristic rule. Each ant in the algorithm move to the next node randomly with in its neighbourhood. After completion of each iteration, collective behaviour of all the ants is used to produce the current solution and this process continues till the algorithm converges. The ACO parameters used for analyzing the complex networks inorder to detect the links in the network are listed in the Table. 4.

Table 4 ACO parameters for path analysis

5.1.5 Community analysis

Mandala et al. (2013) proposed the first approach to detect the community in social approach using min – max ant system. Their study relies on the idea of finding maximum cliques, i.e., the sub-graph where the maximum nodes are connected to each other. But once the clique is found, ants continue their travel on the graph to find other cliques by maintaining a tabu list for the visited nodes as well as the construction cliques. In order to avoid quick convergence and to get global optimum solution initial pheromone values are initialized to upper pheromone trail limit and the pheromone update is measured based on the best tour travelled by the ant in the current iteration. Later, Romdhane et al. (2013) proposed a method that detects the communities by optimizing modularity measure. But the algorithm will be able to partition the network only into two groups. Most of the existing approaches were devoted to undirected network like facebook etc. Thus, Poli et al. (2007) Proposed an algorithm for community mining in directed social networks. In order to maximize the modularity value i.e., the measure used to find the quality of the community, their study uses ant colony optimization. Their study starts with an empty solution i.e., a set of computed empty partitions. Then initialize the community by placing the ant on the most influential vertex and the ants construct the solution by selecting the vertex from the set free direct neighbours with a probability \(P_{ij}\) subjected to the importance of the vertex j and the purity of the vertex j belonging to the community \(C_i\). Finally the algorithm terminates when all the nodes are visited. Mandala et al. This method forms communities that have direct connections between the nodes but ignores the nodes with multiple connections. Mandala et al. (2013) proposed method for clustering dynamic graphs. Their methodology utilizes various ant colonies, every province has hive at its middle. Calculation introduces K hubs with the most noteworthy hub degrees at hive position. In every cycle ants will investigate graph almost a hive position by adding the hubs to same hive with most noteworthy part of pheromone. After the completion of one cycle, pheromones will get refreshed and the hive will reassigned so that hub with the minimal average briefest way length to all the hubs alloted to a similar hive. This strategy utilizes the nearby pursuit which prompts the neighborhood ideal and time utilization in genuine networks. The ACO boundaries utilized for investigating the complex networks inorder to identify the network communities are recorded in the Table. 5.

Table 5 ACO parameters for community analysis

5.2 Particle swarm optimization(PSO)

Particle Swarm Optimization is a population based stochastic optimization technique that simulates the behaviour of bird flocking or fish schooling. It is proposed by Kennedy and Eberhart (1995). PSO has been successfully applied to many engineering problems because of its unique nature in searching mechanism, efficiency in computation and easy implementation. Instead of chromosomes, PSO has particles that form its population called swarm. Unlike in genetic algorithm, there is no “survival of the fittest” selection process for the particles, but rather PSO uses mutation strategy where each particle is simply moved from one place to another place. The PSO algoritm utilizes potential arrangements called particles, that fly through the problem space by following the current ideal particles. At every cycle, the speed of every particle is stochastically refreshed by the best arrangement that the particle accomplished up until now and the local best position. The simple usage and brisk computational assembly makes PSO a famous enhancement technique for taking care of continuous and discrete improvement issues. The PSO algorithm is outlined as below:

figure c

In the following analysis of social networks are outlined based on Particle Swarm Optimization and its variance.

5.2.1 Community analysis

Many optimization methods are applied for solving the community detection problem which is considered as a NP – Hard problem. Typical clustering techniques includes Kernigh-Lin algorithm [57], genetic algorithm Leskovec et al. (2010), and PSO algorithm. Kernigh-Lin algorithm discover communities by determining the maximum value of modularity function. Genetic algorithms detect communities by optimizing single or multiple objective functions. However these both methods need prior information about the size and number of communities and too many parameters that need to be initialized. Basically PSO algorithms are designed for solving continuous optimization problem (Shi and Eberhart 1998, 2001). Later for updating the individual velocity and position, the researchers redefined the equation by using the swap operator (Mitrović and Tadić 2009), crisp - set method (Chen et al. 2010), fuzzy - matrix method (Liao et al. 2007). Because of its quick convergence and easy implementation they are extended for solving discrete optimization problem. Cai et al. (2015) proposed a discrete PSO algorithm for finding the communities in large scale social networks. In this method, the authors have carefully redefined the equations to generate the discrete values and also used greedy method for updating particle velocity and position. Chen and Qiu (2013) proposed a novel PSO based on modularity optimization for detecting the clusters in social networks. The algorithm uses modified PSO to optimize the modularity in new constructed weighted network which is compressed network taken from the original network. Later, Cao et al. (2015) proposed PSO for discrete optimization problem by considering the connections in the network when initializing the population and there is no need of prior information about the size and number of communities. In this method, authors used modulo operator and different operators which generates an integer value in \(IDPSO_{RO}\) algorithm. In order to increase the accuracy of the algorithm, the author introduce a new method called community correction strategy for updating individual velocity and individual position with best solution at 1/2, 1/2, 3/4 of total iterations. The PSO parameters used for analyzing the complex networks inorder to detect the communities are listed in the Table. 6.

Table 6 PSO parameters for community analysis

5.3 Artificial bee colony (ABC) algorithm

Artificial Bee Colony is an as of late created swarm knowledge based algorithm proposed by Karaboga and Basturk in 2006 to formulate NP-Hard problems Karaboga and Basturk (2007). It was enlivened by self organizing and aggregate intelligent behaviour of honey bee colonies. In ABC, artificial bees colony contains three kind of bees : Employee/worker honey bees, spectator honey bees and scout honey bees. Every worker honey bee fly around in multi dimensional quest space for deciding new arrangements with the area of the current one and assess them. The calculation utilizes avaricious choice cycle between the past one and the upgraded one. The worker honey bee whose arrangement has been disposed of will turn into a scout honey bee, at that point the scout honey bee caries out looking for additional new arrangements and constrained by a boundary called ”limit”. When the investigation of new arrangements is finished by worker honey bees, they return to hive and offer their data about gainfulness of the food source with the passerby honey bees in the hive. Presently the passerby honey bee pick an answer contingent upon bigger productivity esteem which are determined utilizing the wellness esteems gave by worker honey bees. At that point apply ravenous determination between new arrangement and past arrangement found by spectator honey bee by Memorizing the best one. This cycle proceeds until some end standards is met. Outline of ABC algorithm:

figure d

In the following sections analysis of social networks are outlined based on ABC algorithm.

5.3.1 Connection analysis

Friend recommendation is a fundamental problem in online social networks that aims to suggest new links usually done by analyzing the links connected between the nodes. Several algorithms are developed based on the structural properties of the network. But when the network grows, friend recommendation problem becomes NP Hard problem. Typically Bio Inspired algorithm works better than the conventional algorithm. One such Category is ABC algorithm. Akbari et al. (2013) recommends new friends using the structural properties of network. In the proposed approach ABC is applied to optimize the weight coefficients assigned to the topological features of the network. The fitness function for finding the quality of each solution is the error function which is defined as sum of differences between the actual value and expected values. This method effectively recommends new links over other methods such as KNN, SVM, MLP neural network and topological GA based approach. The ABC parameters used for analyzing the complex networks inorder to predict the connections among the nodes in the network are listed in the Table. 7.

Table 7 ABC parameters for Conection analysis

5.3.2 Community analysis

Network community discovery can be considered as optimization issue in which a target quality function called modularity, wellness score and so forth are picked to be improved. Hafez et al. (2014) to speak to a network structure, have applied locus based adjacency encoding plan proposed by Shi et al. (2009) and to speak to the every individual food source he utilizes a vector \(\overrightarrow{X}\) containing components that take a worth j in the range[1, n]. A worth j put away in the ith position is deciphered as connection between hub i and hub j, i.e., the creator instates the food sources from its hub neighborhood. The wellness of the arrangement is determined utilizing be diverse target functions, for example, modularity, network score, wellness score and so on. The presentation of the calculation utilized relies upon the quality function utilized. Further ABC, is applied in numerous domain, for example, information mining for performing grouping proposed in (Fathian et al. 2007; Marinakis et al. 2007). The ABC boundaries utilized for investigating the complex networks inorder to distinguish the networks are recorded in the Table. 8.

Table 8 ABC parameters for community analysis

5.4 Firefly algorithm

Firefly algorithm is an yet another important population based intelligence algorithm inspired by the flashing of fireflies and is developed by Yang (2010). The key idea of Fire fly algorithm is that firefly will be attracted to other mates that has higher intensity and attractiveness. Intensity of each firefly is proportional to quality of the solution and the attractiveness depends on the distance moved towards a more brighter firefly . Outline of Firefly algorithm is given below

figure e

5.4.1 Community analysis

Amiri et al. (2013) designed multi objective functions to discover communities in large scale networks. In this method, firefly algorithm is used to optimize multiple objective functions. The two objective functions were introduced in this paper, maximization of internal links and minimization of external links. Amiri et al. (2013) consider random movement of firefly as in Caponetto et al. (2003) and the absorption parameter is considerd as in Apostolopoulos (2010). The complexity of algorithm is high because it involves several heuristics such as external repository for storing on dominated Pareto optimal solutions, Niching strategy to find the best global solution from the list of Pareto optimal solutions, fuzzy based clustering and chaotic based tuning of the algorithm. However, the proposed algorithm cannot detect the overlapped communities and also not suitable for dynamic networks. The Firefly parameters used for analyzing the complex networks inorder to detect the communities are listed in the Table. 9.

Table 9 Firefly parameters for community analysis

6 Future directions

Some of the future directions to work on in the area of social network analysis is discussed in this section:

6.1 Employing new metaheuristic measures

Due to recent research in the field of meta heuristics lot of new promising optimization techniques are introduced in the literature such as bat algorithm, cuckoo search, salp swarm algorithm (Mirjalili et al. 2017b), krill herd algorithm (Gandomi and Alavi 2012), Whale Optimization Algorithm (Faramarzi et al. 2020) and they are popular in many diverse applications (Gill and Buyya 2019; Mirjalili et al. 2017b). Using these algorithms in the context of social network analysis problems particularly connection analysis, path analysis and community analysis seems to be fruitful.

6.2 Secure computation on social networks

Social Networks possess lot of vital information and for analysis only anonymized data or network is used. In order to compute data in real world without revealing identity of the users and without affecting privacy of the users is of at most importance. Privacy preserving computation, multi party computation or similar techniques have to be developed over social networks so that link prediction and community analysis can be performed over real time data. Performing link prediction and community analysis over real time data has significance in e–commerce and related applications apart from understanding the nature of evolution.

7 Conclusions

Swarm based optimization techniques used in social network analysis are compared in this paper with respect to community analysis, connection analysis and path analysis. From the comparison we can infer that firefly algorithm provides better result by exploring the solution space efficiently without getting struck at the global optima or local minima. Further it is interesting to explore the application of firefly algorithm in social networks with respect to different computational applications.As a future scope swarm based optimization techniques can be extended to the deep learning area especially for updating the gradients in designing the deep neural network models. Further, due to the problem of convergence speed which can be encountered in solving real challenges.These bio-inspired algorithms could be hybridized with other approaches and methods such as quantum computing and choas theory to enhance the performance of nature inspired optimization algorithms