1 Introduction

Peer-to-peer (P2P) networks [1] link end hosts (known as peers) in an ad-hoc manner. P2P networks are widely used in file-sharing applications that enable peers to share digitized files such as audio, video, and electronic books. More recently, advanced applications such as real-time conferencing, media streaming, and online gaming have also been built on them. Unlike traditional client-server networks, where files are held in a data center, every peer in a P2P network acts as both a client and a server. Early designs searched for files across the network through semi-centralized servers, but such a central node leaves the network vulnerable.

In a P2P network, nodes communicate with their neighbors without any central coordination. Through the exchange of data, the network performs cooperative positioning: each node is assisted by its neighbors in computing its position.

Optimization is the process of finding values of the variables that minimize or maximize an objective function while satisfying a set of constraints. An optimization problem is built around three components: (1) the objective function to be minimized or maximized; (2) a set of variables or unknowns that affect this objective function; and (3) constraints that allow the unknowns to take on some values while excluding others [2].

Most optimization problems have one or more local solutions, so it is important to choose an optimization method that is not greedy and does not merely search the neighborhood of a candidate solution: such a search can be misled and become stuck in a local optimum. A good optimization method therefore needs a mechanism for balancing local and global search. Several methods exist for solving mathematical and combinatorial optimization problems, but when the problem is hard or the search space is large they become impractical. For this reason, many meta-heuristic optimization methods have been developed.

Multi-objective optimization meets system requirements by trading off conflicting objectives, rather than addressing a single objective. It typically involves several conflicting goals, and its outcome is a compromise among them. Current multi-objective approaches include traditional mathematical techniques such as goal programming and intelligent algorithms such as the Ant Colony algorithm, Particle Swarm Optimization, and Genetic Algorithms.

The k-means protocol depends on Euclidean distances and residual energies as the basis for selecting cluster heads. All nodes therefore report their node id, position, and residual energy, and this information is stored at a central node. Once this information is procured, the k-means clustering algorithm is performed [3]. The technique partitions the dataset into clusters of objects with similar features [4]. To solve the clustering problem, the number of clusters that fits the dataset is identified and the objects are assigned to them.

K-means clustering is a well-studied and popular data-analysis technique. The standard version assumes that the data is available in a single location and is easy to access. But when the data sources are distributed over a large P2P network [5], collecting the data at a central location before clustering is neither attractive nor practical. There are many promising applications of k-means clustering over data distributed in a P2P network, so scalable and efficient distributed variants are highly desirable.

Heuristics use trial and error to produce acceptable solutions to a problem within a practical and reasonable time [6]. The complexity of a problem can make an exact search for a solution infeasible; the aim is to find an acceptable solution within the available timescale. A meta-heuristic is formally defined as an iterative generation process that guides a subordinate heuristic by intelligently combining exploration and exploitation of the search space, using learning strategies to structure information and find efficient, near-optimal solutions.

In this work, simulated annealing (SA) answers this need. It is not easily trapped by false minima and is easy to implement. Because it does not require a mathematical model, it can be applied to a wide range of problems. Simulated annealing imitates the annealing process used in metallurgy: a substance is heated until it reaches its fusion point and liquefies, and is then cooled back into a solid. The resulting properties depend on the cooling schedule applied; if cooling is fast, the structure is imperfect and brittle, and if it is slow, the structure is strong.

Simulated annealing [7] is a generic, probabilistic meta-algorithm for global optimization, used to find a good approximation to the global minimum of a given function in a large search space. It rests on the analogy with the annealing process and has a formal proof of convergence. The behavior of SA is controlled by the cooling scheme and does not depend on the initial solution. For some problems, simulated annealing is far more efficient than exhaustive enumeration when the objective is an acceptable solution within a reasonable time. The SA algorithm also has drawbacks: it can require many iterations to generate optimal or near-optimal solutions, and the initial temperature is hard to determine.

This work deals with data mining in P2P networks. The k-means clustering is optimized by a multi-objective fish swarm optimization combined with a simulated annealing algorithm that uses neighbour cooperation in the peer-to-peer network. Section 2 reviews related literature, Sect. 3 explains the techniques and methods used, Sect. 4 presents the results and discussion, and Sect. 5 concludes this work.

2 Literature survey

Multi-objective PSO (MOPSO), introduced in 1999, is an emerging field for solving MOO problems, with an extensive literature of applications, codes, variants, and software. Lalwani et al. [8] reviewed the applications of MOPSO in different areas and studied the MOPSO variants as well. Their survey, organised by objectives and variants, reviewed MOO and its key concepts.

Singh et al. [4] proposed a simulated annealing protocol for constrained multi-objective optimization (MOO). While traversing a feasible region, the protocol behaves like the existing archived multi-objective simulated annealing (AMOSA) algorithm, and while in an infeasible region it reduces the constraint violation by moving along an approximate descent direction (ADD). An archive of the non-dominated solutions found during the search is maintained. The probability of accepting a new point is determined by its feasibility status and by its domination status relative to the current point and the archive. The performance of the algorithm was reported on a group of seven constrained bi-objective test problems (CTP2 to CTP8) that pose difficulties to existing protocols, together with a comparative analysis against the widely used multi-objective evolutionary algorithm NSGA-II.

Jiang and Zhu [9] proposed an approach that uses the AFSA for solving multi-objective optimization problems. The Pareto dominance concept was used for evaluating the artificial fish (AF), whose advantages include a fast convergence rate, global search ability, and robustness. The artificial fish search the available solution space in parallel, and the Pareto optimal solutions found are saved in an external record set. The effectiveness of the algorithm was illustrated by simulation results on four benchmark test functions.

Jiang and Cheng [10] presented a stochastic approach known as the simulated annealing-artificial fish swarm algorithm (SA-AFSA) to solve multimodal problems. The hybrid algorithm applies simulated annealing within the AFSA to improve its performance, combining the strong local search capability of simulated annealing with the swarm intelligence of the AFSA. Experimental results showed that, across the test cases, SA-AFSA achieves good precision and convergence speed.

Many multi-objective algorithms for automatic clustering have been put forward in the literature. Abubaker et al. [11] proposed a multi-objective particle swarm optimization combined with a simulated annealing algorithm (MOPSOSA). MOPSOSA aims to estimate the right number of clusters and group the data set into them without knowing the actual cluster number in advance. Its efficiency was studied with respect to the velocity of the particles, using synthetic and real-life datasets. The results showed that suitable velocity parameters lie within the same range.

Fang et al. [12] proposed a multi-objective artificial fish swarm algorithm (MOAFSA) that imitates fish behaviour in local search, uses a quick-sort method to obtain the non-dominated solution set, and prunes the external set according to crowding distance. The paper applied MOAFSA to multi-objective test functions; the results showed that MOAFSA has a higher convergence speed and that the corresponding Pareto set is evenly distributed. MOAFSA was then applied to the scheduling optimization of a hydropower station reservoir.

K-means clustering partitions a collection of data tuples into K disjoint, exhaustive groups or clusters, where K is user specified. The goal is to find a clustering that minimizes the sum of distances between each data tuple and the centroid of its cluster. K-means begins with an initial set of K randomly selected centroids. The AFSA is an algorithm based on simulating the schooling behavior of fish. In a P2P system, service flows between nodes that request services and nodes that provide them: the quality of service is regarded as the food and its quality, and a node that enjoys a service can also provide one, so a node is both fish and food. A node sending a service request corresponds to a fish moving toward food. Because this interaction process closely resembles the artificial fish model, the P2P system maps naturally onto an artificial fish swarm model. The main disadvantages are the high time complexity, the lack of balance between global and local search, and the failure to exploit the experience of group members in subsequent movements [13, 14]. In this work, the proposed MOFSASA method addresses these issues in a P2P system.

3 Methodology

This section discusses the k-means clustering, the Artificial Fish Swarm Algorithm, simulated annealing, neighbour selection, and the multi-objective FSA.

3.1 Skin segmentation data set

The skin segmentation dataset (Table 1) is constructed over the B, G, R color space. The skin and non-skin samples are generated from the skin textures of facial images of people of diverse age, gender, and race.

Table 1 Skin segmentation dataset

Dataset information: The skin dataset is collected by randomly sampling B, G, R values from facial images across various age groups (young, middle-aged, and old), ethnicities (Caucasian, African, Asian, etc.), and genders, obtained from the FERET database and the PAL database. The total sample size is 245,057, comprising 50,859 skin samples and 194,198 non-skin samples.

Attribute information: The dataset has dimension 245,057 × 4. The first three columns are the B, G, R values (features x1, x2, and x3) and the fourth column is the class label (decision variable y).
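As a concrete starting point, a minimal Python sketch of loading this dataset follows. The file name Skin_NonSkin.txt, the tab separator, and the label coding are assumptions about the usual UCI distribution of the dataset, not details given here.

```python
import pandas as pd

# Minimal sketch of loading the skin segmentation dataset.
# File name, separator, and label coding (1 = skin, 2 = non-skin) are
# assumptions about the usual UCI distribution of this dataset.
columns = ["B", "G", "R", "y"]                 # features x1, x2, x3 and label y
data = pd.read_csv("Skin_NonSkin.txt", sep="\t", header=None, names=columns)

X = data[["B", "G", "R"]].to_numpy()           # 245,057 x 3 feature matrix
y = data["y"].to_numpy()                       # class labels
print(X.shape, y.shape)                        # expect (245057, 3) (245057,)
```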

3.2 Optimized k-means clustering algorithm

The standard k-means protocol spends considerable time recomputing the distance from every object to every cluster center [15]; the method described here saves this time. If the distance from an object to its new cluster center is no greater than its previously stored distance, the object stays in its cluster; otherwise the distances to the other centers are re-evaluated. This continues until the means no longer change, and is achieved by maintaining two data structures that store, for each object, its cluster label and its distance to the corresponding cluster centre. The stored distance is updated at every iteration until termination. The optimized k-means algorithm proceeds as follows:

Input: the desired number of clusters, k, and a dataset D = (d1, d2, ..., dn) containing n data objects.

Output: a set of k clusters.

Steps:

1) Randomly select k data objects from the dataset D as the initial cluster centers.

2) Compute the Euclidean distance d(di, cj) for every data object di (1 ≤ i ≤ n) and every cluster center cj (1 ≤ j ≤ k).

3) For every data object di, find the closest center cj and assign di to that cluster.

4) Store the label and the distance:

Set cluster[i] = j and dist[i] = d(di, cj),

where j is the label of the cluster in which the data object di resides and d(di, cj) is the distance from di to the cluster center labeled j.

5) For every cluster j (1 ≤ j ≤ k), recalculate the cluster center.

6) For every data object di, compute the distance to the new center of its cluster:

a) if this distance is less than or equal to dist[i], the data object stays in its cluster;

b) else, for each cluster center cj (1 ≤ j ≤ k), compute the distance d(di, cj) and assign di to the closest cluster:

Set cluster[i] = j;

Set dist[i] = d(di, cj).

Repeat from step 5 until the centers remain the same.

The optimized algorithm thus needs two data structures, cluster[] and dist[], to store each object's cluster label and distance at every iteration; the stored values are reused in the next iteration. A sketch of this procedure is given below.
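The following NumPy sketch illustrates the caching idea of the steps above, assuming Euclidean distance; the function and variable names are illustrative rather than taken from the paper.

```python
import numpy as np

def optimized_kmeans(D, k, max_iter=100, seed=0):
    """Sketch of the optimized k-means: cluster[] and dist[] cache each
    object's label and its distance to that cluster's centre, so a full
    distance recomputation is only done for objects whose cached bound
    is violated (step 6b)."""
    rng = np.random.default_rng(seed)
    n = len(D)
    centers = D[rng.choice(n, k, replace=False)]                # step 1
    d_all = np.linalg.norm(D[:, None] - centers[None], axis=2)  # step 2
    cluster = d_all.argmin(axis=1)                              # step 3
    dist = d_all[np.arange(n), cluster]                         # step 4
    for _ in range(max_iter):
        new_centers = np.array([D[cluster == j].mean(axis=0)    # step 5
                                if np.any(cluster == j) else centers[j]
                                for j in range(k)])
        if np.allclose(new_centers, centers):                   # centers unchanged
            break
        centers = new_centers
        d_own = np.linalg.norm(D - centers[cluster], axis=1)    # step 6
        moved = d_own > dist                                    # 6a fails for these
        dist[~moved] = d_own[~moved]
        if moved.any():                                         # 6b: full reassignment
            d_sub = np.linalg.norm(D[moved][:, None] - centers[None], axis=2)
            cluster[moved] = d_sub.argmin(axis=1)
            dist[moved] = d_sub.min(axis=1)
    return cluster, centers
```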

3.3 Multi-objective optimization based on artificial fish swarm algorithm (MOAFSA)

The AFSA is superior to other algorithms in terms of convergence speed, global search capacity, and robustness, and since it uses global information it is well suited to solving optimization problems. The algorithm is initiated at a single node [16], say N1. This node generates an initial set of centroids \(V_1^{(1)} =\{\vec {v}_{j,1}^{(1)} :1\le j\le K\}\) together with a termination threshold \(\gamma >0\) (a user-defined constant), sends them to its immediate neighbours \(\Gamma ^{(1)}\), and starts the first iteration. When a node receives the centroids and \(\gamma \) for the first time, it forwards them to its own immediate neighbours and begins its first iteration. Thereafter, all nodes iterate from the same initial centroids with the same termination threshold \(\gamma \). The k-means protocol runs through many iterations, updating at every node \(N_i \). In every iteration l, node \(N_i\) collects the centroids and cluster counts from its immediate neighbours; these, together with the local data of \(N_i \), are used to generate the centroids for the next iteration. If the new centroids differ substantially from the previous ones, \(N_i\) moves to the next iteration; otherwise it enters a finished state. The flowchart of this AFSA algorithm is depicted in Fig. 1.

Fig. 1 Flowchart of the proposed AFSA algorithm
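A minimal sketch of one such iteration at a single peer is given below; the function name, the (centroids, counts) message format, and the count-weighted merge of neighbour information are illustrative assumptions consistent with the description above.

```python
import numpy as np

def node_iteration(local_data, centroids, neighbor_stats, gamma):
    """One iteration at a peer: merge the neighbours' centroids and
    cluster counts with the local assignment. neighbor_stats is a list
    of (centroids, counts) pairs received from the immediate neighbours;
    names are illustrative."""
    K = len(centroids)
    # assign local points to the nearest current centroid
    d = np.linalg.norm(local_data[:, None] - centroids[None], axis=2)
    labels = d.argmin(axis=1)
    sums = np.zeros_like(centroids)
    counts = np.zeros(K)
    for j in range(K):
        members = local_data[labels == j]
        sums[j] = members.sum(axis=0)
        counts[j] = len(members)
    # fold in the neighbours' count-weighted centroids
    for nb_centroids, nb_counts in neighbor_stats:
        sums += nb_centroids * nb_counts[:, None]
        counts += nb_counts
    # empty clusters (count 0) fall back to a zero centroid in this sketch
    new_centroids = sums / np.maximum(counts, 1)[:, None]
    converged = np.linalg.norm(new_centroids - centroids) <= gamma
    return new_centroids, counts, converged
```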

3.4 Simulated annealing (SA)

SA [4] is a well-founded point-to-point search method. Its willingness to accept unfavorable solutions prevents premature convergence to local optima and makes it more dynamic. SA was originally designed for single-objective optimization because of its point-to-point search; earlier attempts handled multiple objectives by combining them into a single composite function with weights.

Simulated annealing [17] is a straightforward algorithm based on the Metropolis Monte Carlo technique, which suits SA well because only the states reachable at a particular temperature are sampled. The protocol starts as a Monte Carlo simulation at a very high temperature, analogous to large random fluctuations when modifying the variables. The temperature is then reduced carefully so that the search converges toward the optimum, analogous to ever smaller fluctuations in the variables. Using SA to search for optimal solutions requires determining the initial (high) and final (low) temperatures, analogous to kT (where k is Boltzmann's constant) in the acceptance check, as well as deciding what constitutes a Monte Carlo step.

The initial and final temperatures for a problem are found from the acceptance probability. In general, if the Monte Carlo simulation should permit an energy increase of dEi with probability Pi, the effective initial temperature is kTi = −dEi/ln(Pi). If at the final temperature a cost increase of 10 should be accepted with a probability of about 0.05 (5%), the final temperature is kTf = −10/ln(0.05) = 3.338.
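This temperature calculation can be checked in a few lines of Python; the initial-temperature example (accepting dE = 10 with probability 0.8) is an illustrative assumption, while the final-temperature value reproduces the figure above.

```python
import math
import random

def temperature_for(dE, p_accept):
    """Effective temperature kT at which a cost increase dE is accepted
    with probability p_accept: kT = -dE / ln(p_accept)."""
    return -dE / math.log(p_accept)

kT_final = temperature_for(10, 0.05)  # = 3.338, as in the text
kT_init = temperature_for(10, 0.80)   # ~44.8; illustrative choice of p

def accept(dE, kT):
    """Metropolis acceptance check used at each Monte Carlo step."""
    return dE <= 0 or math.exp(-dE / kT) > random.random()
```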


3.5 Fish swarm algorithm-simulated annealing (FSASA)

The observed swallow behaviour of swarming fish is incorporated into SA-AFSA to reduce its complexity. The global search ability of the AFSA depends on the AF population. After some iterations, an AF whose fitness is poor compared to a given threshold has little influence on the optimum; such weak fish consume computing time and storage and thereby increase the cost of the algorithm [18]. The swallow behaviour solves this issue: a weak AF is swallowed (removed) if its fitness value falls below the given threshold.

The implementation of SA-AFSA for a global minimum is shown below; Random(0, 1) generates a random number between 0 and 1. The variable af_total is the number of AF, AFS_iterate_time is the number of iterations in the AFSA, af_step is the step distance of an AF, af_visual is the visual distance of an AF, af_delta is the crowding factor, try_number is the maximum number of tries of an AF, T0 is the initial temperature, c_rate is the cooling rate, SA_iterate_time_max is the maximum number of iterations in the SA, and k denotes the number of iterations taking place at a given temperature.

The algorithm is explained below:

(a) Initialize the parameters: af_total, AFS_iterate_time, af_step, af_visual, af_delta, try_number, T0, c_rate, and SA_iterate_time_max.

(b) Create the af_total AFs within the feasible solution area.

(c) Update the location of each artificial fish dynamically, as follows. Execute the swallowing behavior first; then, if the follow condition is satisfied, move toward the fish to be followed as in (1):

$$\begin{aligned} X_i^{t+1} =X_i^t +\frac{X_j -X_i^t }{||X_j -X_i^t ||}.af\_step.Rand() \end{aligned}$$
(1)

\(X_j \) denote the fish which is to be followed

If the condition of the swarming behavior is satisfied, move toward the swarm center as in (2); otherwise go to the prey behavior:

$$\begin{aligned} X_i^{t+1} =X_i^t +\frac{X_c -X_i^t }{||X_c -X_i^t ||}.af\_step.Rand() \end{aligned}$$
(2)

\(X_c \) denotes the center position of the swarm of strong fish.

Default: the preying behavior is executed. The AF chooses a state \(X_j \) at random within its visual distance, as in (3):

$$\begin{aligned} X_j =X_i +Rand().af\_visual \end{aligned}$$
(3)

If \(X_j \) is better than the current state, the AF moves a random step forward in that direction, as in (4):

$$\begin{aligned} X_i^{t+1} =X_i^t +\frac{X_j -X_i^t }{||X_j -X_i^t ||}.af\_step.Rand() \end{aligned}$$
(4)

If a better location is not identified after try_number trials, the AF takes a random step.

(d) Repeat step (c) until the termination condition is met; the best solution \(X_{best} \) found by the AFSA is recorded.

(e) While i < SA_iterate_time_max, do:

(1) for j = 1 : k

(2) generate a new solution \(X_{new} \) from \(X_{best} \), compute its fitness, and track the variation \(\Delta F \) between \(X_{new} \) and \(X_{best} \) as in (5):

$$\begin{aligned} \Delta F=f(X_{new} )-f(X_{best} ) \end{aligned}$$
(5)

(3) if \(\Delta F<\) 0, set \(X_{best} =X_{new} \);

(4) if \(\Delta F>\) 0, compute p = exp(\(-\Delta F \)/T(i)); if p > Random(0, 1), set \(X_{best} =X_{new} \), otherwise \(X_{best} \) remains unchanged;

(5) end

(f) Increase the iteration counter, i = i + 1, and cool the system temperature: T(i+1) = T(i) × c_rate.

(g) Output the solution vector as the final solution.
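A condensed Python sketch of this hybrid follows. It is a simplified reading of steps (a)-(g) for a minimization problem: the crowding test and the exact follow/swarm conditions are reduced to simple fitness comparisons, so it illustrates the control flow rather than reproducing the original listing exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

def move_toward(xi, target, af_step):
    """Step from xi toward target, as in eqs. (1), (2) and (4)."""
    d = np.linalg.norm(target - xi)
    return xi if d == 0 else xi + (target - xi) / d * af_step * rng.random()

def fsasa(f, dim, lo, hi, af_total=30, AFS_iterate_time=100, af_step=0.5,
          af_visual=2.0, try_number=5, T0=100.0, c_rate=0.95,
          SA_iterate_time_max=50, k=10):
    """Sketch of FSASA for a global minimum: an AFSA phase with
    simplified follow/swarm/prey rules, then the SA refinement of
    steps (e)-(g). Parameter names follow the text."""
    X = rng.uniform(lo, hi, (af_total, dim))
    for _ in range(AFS_iterate_time):                       # steps (b)-(d)
        fit = np.apply_along_axis(f, 1, X)
        best = X[fit.argmin()].copy()
        center = X.mean(axis=0)
        for i in range(af_total):
            if f(center) < fit[i]:                          # swarm, eq. (2)
                X[i] = move_toward(X[i], center, af_step)
            elif f(best) < fit[i]:                          # follow, eq. (1)
                X[i] = move_toward(X[i], best, af_step)
            else:                                           # prey, eqs. (3)-(4)
                for _ in range(try_number):
                    Xj = X[i] + rng.uniform(-1, 1, dim) * af_visual
                    if f(Xj) < fit[i]:
                        X[i] = move_toward(X[i], Xj, af_step)
                        break
                else:                                       # random step
                    X[i] = X[i] + rng.uniform(-1, 1, dim) * af_step
            X[i] = np.clip(X[i], lo, hi)
    # steps (e)-(g): SA refinement of the best AFSA solution
    fit = np.apply_along_axis(f, 1, X)
    X_best, T = X[fit.argmin()].copy(), T0
    for _ in range(SA_iterate_time_max):
        for _ in range(k):                                  # k moves per temperature
            X_new = np.clip(X_best + rng.normal(0, af_step, dim), lo, hi)
            dF = f(X_new) - f(X_best)                       # eq. (5)
            if dF < 0 or np.exp(-dF / T) > rng.random():
                X_best = X_new
        T *= c_rate                                         # cooling, step (f)
    return X_best, f(X_best)
```

For example, fsasa(lambda x: np.sum(x**2), dim=2, lo=-5, hi=5) should return a point near the origin.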

3.6 Multi-objective fish swarm algorithm simulated annealing (MO-FSASA) algorithm

The MO-FSASA algorithm [11] integrates the benefits of quick computation and fast convergence of the swarm search with the capacity of simulated annealing to escape local solutions. The MO-FSASA protocol begins with the k-means scheme to create the initial positions of the swarm. Then the multi-objective fish swarm optimization (MFSA) is run, in which the swarm particles explore the search space guided by the current optimal particles so that the best solutions are identified. During the search, multi-objective simulated annealing (MOSA) is invoked whenever the position of a particle stops changing and fails to progress to a better position. Every iteration shares fitness updates with a repository of Pareto-optimal solutions. Figure 2 shows a conceptual diagram of the MO-FSASA algorithm and its mechanism along with the input and output datasets. The algorithm clusters automatically according to different cluster validity indices: the Davies-Bouldin index (DB-index), based on the Euclidean distance; the symmetry-based index (Sym-index), based on the point symmetry distance; and the connectivity-based index (Conn-index), based on the shortest distance. The mapping is carried out as in [19].

The MO-FSASA algorithm is summarized in Fig. 2.

Fig. 2 Framework for the MO-FSASA algorithm
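A minimal sketch of this outer loop is given below, assuming an evaluate function that returns the (DB-index, Sym-index, Conn-index) vector of a candidate solution; the Pareto-archive handling follows the usual dominance rules, and the scalarized MOSA acceptance is a simplification of the multi-objective acceptance test. All names are illustrative.

```python
import numpy as np

def dominates(a, b):
    """Pareto dominance for minimization: a dominates b."""
    a, b = np.asarray(a), np.asarray(b)
    return np.all(a <= b) and np.any(a < b)

def update_archive(archive, solution, objectives):
    """Keep only non-dominated (solution, objectives) pairs."""
    if any(dominates(o, objectives) for _, o in archive):
        return archive
    archive = [(s, o) for s, o in archive if not dominates(objectives, o)]
    archive.append((solution, objectives))
    return archive

def mo_fsasa(evaluate, init_positions, iters=100, T0=100.0, c_rate=0.95,
             af_step=0.5, seed=0):
    """Sketch of the MO-FSASA loop: positions come from k-means
    (init_positions), move by a fish-swarm rule toward a Pareto guide,
    and fall back to an SA perturbation when a particle stagnates."""
    rng = np.random.default_rng(seed)
    swarm = [np.asarray(p, dtype=float).copy() for p in init_positions]
    objs = [evaluate(p) for p in swarm]
    archive, T = [], T0
    for p, o in zip(swarm, objs):
        archive = update_archive(archive, p, o)
    for _ in range(iters):
        leader = archive[rng.integers(len(archive))][0]      # a Pareto guide
        for i, p in enumerate(swarm):
            cand = p + (leader - p) * rng.random() * af_step  # fish-swarm move
            c_obj = evaluate(cand)
            if dominates(c_obj, objs[i]):
                swarm[i], objs[i] = cand, c_obj
            else:                                            # stagnation: MOSA step
                cand = p + rng.normal(0, af_step, p.shape)
                c_obj = evaluate(cand)
                dF = np.sum(np.asarray(c_obj) - np.asarray(objs[i]))
                if dF < 0 or np.exp(-dF / T) > rng.random():
                    swarm[i], objs[i] = cand, c_obj
            archive = update_archive(archive, swarm[i], objs[i])
        T *= c_rate                                          # cooling
    return archive
```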

3.7 Collaborative distributed fuzzy C-means clustering (CDFCM) algorithm

The CDFCM algorithm under the P2P network environment is built on the novel objective function (6), obtained by combining a distributed weighted dissimilarity measure with an additional attribute-weight-entropy regularization term.

$$\begin{aligned} \min F(U,C,W)= & {} \sum _{j=1}^J {\sum _{n=1}^{N_j } {\sum _{k=1}^K {u_{jnk}^\alpha } } } \nonumber \\&\sum _{m=1}^M {w_{jkm} (x_{jnm} -c_{jkm} )^{2}} \nonumber \\&+\gamma \sum _{j=1}^J {\sum _{k=1}^K {\sum _{m=1}^M {w_{jkm} \log w_{jkm} } } } \nonumber \\ \text {subject to}\quad c_{jkm}= & {} c_{ikm} ,\quad i\in NB_j \nonumber \\ \sum _{k=1}^K {u_{jnk}}= & {} 1 ,\quad 0\le u_{jnk} \le 1 \nonumber \\ w_{jkm}= & {} w_{ikm} ,\quad i\in NB_j \nonumber \\ \sum _{m=1}^M {w_{jkm}}= & {} 1 ,\quad 0\le w_{jkm} \le 1 \end{aligned}$$
(6)

where U = [u\(_{jnk}\)] is the membership degree matrix and u\(_{jnk}\) denotes the membership degree of the n-th object belonging to the k-th cluster in the j-th peer. C = [c\(_{jkm}\)] is the cluster prototype matrix and c\(_{jkm}\) denotes the m-th dimension of the k-th cluster prototype in the j-th peer. W = [w\(_{jkm}\)] is the attribute weight matrix and w\(_{jkm}\) denotes the m-th dimension of the k-th cluster weight vector in the j-th peer. \(\alpha \) is the fuzzification coefficient and \(\gamma \) is a positive scalar.

The first, distributed weighted distance term in this new objective function controls the shape and size of the clusters [20] and facilitates their agglomeration. The second term is the negative entropy of the attribute weights; it regularizes the optimal distribution of the attribute weights in accordance with the available data. The aim is to minimize the intra-cluster dispersion while simultaneously maximizing the entropy of the attribute weights, so as to stimulate the significant attributes that contribute most to cluster identification. \(\gamma \) (\(\gamma > 0\)) is a positive, adjustable regularization parameter whose proper selection balances the two terms so that an optimal solution can be found. The consensus constraints c\(_{jkm}\) = c\(_{ikm}\) and w\(_{jkm}\) = w\(_{ikm}\), i \(\in \) NB\(_{j}\), ensure that the local cluster prototypes and the yielded attribute weights at each peer coincide with the global ones over all objects, i.e. the results obtained by the distributed clustering are similar to those obtained by a centralized clustering technique.

Minimizing F(U, C, W) subject to the constraints is a constrained nonlinear optimization problem. It is solved by Picard iteration, similar to the conventional FCM algorithm. First, C and W are fixed and the necessary conditions on U that minimize F(U) are found; then W and U are fixed to minimize F(C) with respect to C; finally, U and C are fixed so that F(W) is minimized with respect to W.

Corresponding to equations (7) through (11), the matrices U, C and W are respectively updated.

$$\begin{aligned} u_{jnk}= & {} \frac{1}{\sum _{h=1}^K {\left( {\frac{\sum _{m=1}^M {w_{jkm} (x_{jnm} -c_{jkm} )^{2}} }{\sum _{m=1}^M {w_{jhm} (x_{jnm} -c_{jhm} )^{2}} }} \right) } ^{\frac{1}{\alpha -1}}} \nonumber \\ for\,1\le & {} j\le J,1\le n\le N_j ,1\le k\le K \end{aligned}$$
(7)
$$\begin{aligned} c_{jkm}= & {} \frac{\sum _{n=1}^{N_j } {u_{jnk}^\alpha w_{jkm} x_{jnm} } -\sum _{i\in NB_j } {p_{jikm} } }{\sum _{n=1}^{N_j } {u_{jnk}^\alpha w_{jkm} } } \nonumber \\ for\,1\le & {} j\le J,1\le k\le K,1\le m\le M \end{aligned}$$
(8)
$$\begin{aligned} p_{jikm}= & {} p_{jikm} +\eta _1 (c_{jkm} -c_{ikm} ) \nonumber \\ for\,1\le & {} j\le J,1\le k\le K,1\le m\le M,i\in NB_j \end{aligned}$$
(9)
$$\begin{aligned} w_{jkm}= & {} \frac{\exp \left( {-\gamma ^{-1}\sum _{n=1}^{N_j } {u_{jnk}^\alpha (x_{jnm} -c_{jkm} )^{2}} -2\gamma ^{-1}\sum _{i\in NB_j } {q_{jikm} } } \right) }{\sum _{l=1}^M {\exp \left( {-\gamma ^{-1}\sum _{n=1}^{N_j } {u_{jnk}^\alpha (x_{jnl} -c_{jkl} )^{2}} -2\gamma ^{-1}\sum _{i\in NB_j } {q_{jikl} } } \right) } } \nonumber \\ for\,1\le & {} j\le J,1\le k\le K,1\le m\le M \end{aligned}$$
(10)
$$\begin{aligned} q_{jikm}= & {} q_{jikm} +\eta _2 (w_{jkm} -w_{ikm} ) \nonumber \\ for\,1\le & {} j\le J,1\le k\le K,1\le m\le M,i\in NB_j \end{aligned}$$
(11)

Here P = [p\(_{jikm}\)] and Q = [q\(_{jikm}\)] are two matrices of Lagrange multipliers corresponding to the consensus constraints c\(_{jkm}\) = c\(_{ikm}\) and w\(_{jkm}\) = w\(_{ikm}\), i \(\in \) NB\(_{j}\). They are defined for the iterative update of the cluster prototypes and attribute weights. \(\eta _{1}\) and \(\eta _{2}\) are positive scalars.
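One Picard iteration at a single peer j can be sketched as follows, transcribing Eqs. (7)-(11) directly; the array shapes and the (C_i, W_i) neighbour-message format are illustrative assumptions.

```python
import numpy as np

def cdfcm_step(X, C, W, P, Q, neighbor_CW, alpha, gamma, eta1, eta2):
    """One Picard iteration of CDFCM at peer j, following eqs. (7)-(11).
    X: local data (N x M); C, W: prototypes and attribute weights (K x M);
    P, Q: Lagrange multipliers, one (K x M) slice per neighbour;
    neighbor_CW: list of (C_i, W_i) pairs from the neighbours NB_j."""
    N, M = X.shape
    K = C.shape[0]
    # eq. (7): membership degrees from weighted distances
    D = np.array([[np.sum(W[k] * (X[n] - C[k]) ** 2) for k in range(K)]
                  for n in range(N)])
    D = np.maximum(D, 1e-12)
    U = 1.0 / np.sum((D[:, :, None] / D[:, None, :]) ** (1 / (alpha - 1)), axis=2)
    Ua = U ** alpha
    # eqs. (9) and (11): multiplier updates toward neighbour consensus
    for idx, (Ci, Wi) in enumerate(neighbor_CW):
        P[idx] += eta1 * (C - Ci)
        Q[idx] += eta2 * (W - Wi)
    # eq. (8): cluster prototypes
    num = (Ua.T @ X) * W - P.sum(axis=0)
    den = np.maximum(Ua.sum(axis=0)[:, None] * W, 1e-12)
    C = num / den
    # eq. (10): attribute weights, an entropy-regularized softmax over m
    E = np.array([[np.sum(Ua[:, k] * (X[:, m] - C[k, m]) ** 2) for m in range(M)]
                  for k in range(K)])
    logits = -(E + 2 * Q.sum(axis=0)) / gamma
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    W = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return U, C, W, P, Q
```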

The essence of collaborative clustering is exploring the structure of each peer's data through exchanges between peers. Its two main phases are clustering at the individual peer and interaction between neighboring peers through the exchange of findings; these intertwine and take place in a fixed sequence. Figure 3 gives a general view of the processing of the proposed collaborative distributed clustering algorithm.

Fig. 3 The block diagram of the proposed collaborative distributed clustering algorithm

At the beginning, every peer spawns its initial cluster prototypes and attribute weights and conveys these local findings to its neighbors. Then, at every peer, an FCM-type algorithm performs its optimization, focusing on the local data together with the findings communicated by the neighboring peers. After one iteration step, all peers engage in another communication phase: they exchange their findings once more and set up the conditions for the next phase of FCM-type clustering. Collaboration refers to this paired process of communication and clustering. After a finite number of collaboration iterations, the overall optimization stops when there is no further considerable improvement in the exposed structure (cluster prototypes and attribute weights) of any peer.


The algorithm processes its iterative steps (those marked with a "D") in distributed mode. Based on its own local findings and those coming from its neighbors, each peer updates the cluster memberships, cluster prototypes, and attribute weights accordingly. A peer sends a "convergence" message to its neighbors when the variation of the cluster prototypes and the attribute weights in two successive iterations is less than the preset threshold. When a peer has received the "convergence" notification from all its neighbors, its iteration terminates, and consensus on the cluster prototypes and attribute weights is obtained. Because its clustering performance coincides with that of centralized clustering techniques, the CDFCM algorithm reduces the data-transmission energy and evens it out among the peers.

4 Results and discussion

This work uses the MATLAB simulation environment to evaluate the proposed techniques.

Table 2 shows the parameters for the AFSA. Figures 4, 5, 6 and 7 show the accuracy and F-measure of the proposed MO-FSASA K-means; Fig. 8 shows the fitness and Fig. 9 the standard deviation.

Accuracy considers both true positives (TP) and true negatives (TN) over all instances; in other words, it is the ratio of correctly classified instances, as in (12).

$$\begin{aligned} \frac{TP+TN}{TP+FP+TN+FN} \end{aligned}$$
(12)

The F-measure is the harmonic mean of precision and recall, as in (13)-(15).

$$\begin{aligned} \frac{2\times (Precision\times Recall)}{Precision+Recall} \end{aligned}$$
(13)

where

$$\begin{aligned} Precision= & {} \frac{TP}{TP+FP} \end{aligned}$$
(14)
$$\begin{aligned} Recall= & {} \frac{TP}{TP+FN} \end{aligned}$$
(15)
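For reference, the metrics of Eqs. (12)-(15) compute as follows; the confusion-matrix counts in the usage line are illustrative only.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy and F-measure from confusion-matrix counts,
    following eqs. (12)-(15)."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)   # eq. (12)
    precision = tp / (tp + fp)                   # eq. (14)
    recall = tp / (tp + fn)                      # eq. (15)
    f_measure = 2 * precision * recall / (precision + recall)  # eq. (13)
    return accuracy, f_measure

# illustrative counts only
print(classification_metrics(tp=120, fp=10, tn=300, fn=20))
```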
Table 2 Parameters for AFSA

Fig. 4 Accuracy for proposed MO-FSASA K-means compared with distributed K-means

It is observed from Fig. 4 that the accuracy of the proposed MO-FSASA K-means is better than that of distributed K-means clustering by 6.67%, 4.49%, 3.65%, and 2.73% for network sizes of 500, 2000, 3000, and 4000 respectively.

Fig. 5 F-measure for proposed MO-FSASA K-means compared with distributed K-means

It is observed from Fig. 5 that the F-measure of the proposed MO-FSASA K-means is better than that of distributed K-means clustering by 9.94%, 6.67%, 5.52%, and 4.13% for network sizes of 500, 2000, 3000, and 4000 respectively.

Fig. 6 Accuracy for proposed MO-FSASA K-means compared with CDFCM

From Fig. 6, it can be observed that the proposed MO-FSASA K-means has higher classification accuracy than CDFCM: by 4.65% for a network size of 500, by 4.04% for 2000, by 3.98% for 3000, and by 4.82% for 4000.

Fig. 7 F-measure for proposed MO-FSASA K-means compared with CDFCM

Fig. 8 Fitness

Fig. 9 Standard deviation

From Fig. 7, it can be observed that the proposed MO-FSASA K-means has a higher F-measure than CDFCM: by 3.89% for a network size of 500, by 4.91% for 2000, by 3.4% for 3000, and by 4.18% for 4000.

From Fig. 8, it can be observed that the fitness improves over the number of iterations.

From Fig. 9, it can be observed that the standard deviation increases over the number of runs.

5 Conclusion

Properties of the peer-to-peer networking paradigm such as decentralization and self-organization help in building network applications. The MO-FSASA protocol was used to improve neighbour cooperation in the peer-to-peer network application, speeding up file transfer and sharing redundant computing power to enhance performance. SA hybridized with the AFSA, based on the swallow behaviour, greatly improved the stability and the global search ability, thus speeding up convergence. The approach offers further benefits such as high convergence speed, high accuracy, fault tolerance, and flexibility. The main aim of MOFSASA is to estimate a suitable number of clusters and partition the data set into those clusters without needing to know the number of clusters in advance. The ability to obtain accurate solutions is the most critical objective in the clustering process.

The results have shown that the proposed MO-FSASA K-means achieves better accuracy, by about 6.67%, 4.49%, 3.65%, and 2.73% for network sizes of 500, 2000, 3000, and 4000 respectively, than distributed K-means clustering. Likewise, the F-measure of the proposed MO-FSASA K-means is better by about 9.94%, 6.67%, 5.52%, and 4.13% for network sizes of 500, 2000, 3000, and 4000 respectively than that of distributed K-means clustering.

Future work will focus mainly on the influence of the proposed algorithm and on applying MOFSASA in other industrial fields such as stock prediction, weight optimization, and pattern recognition, as well as on implementing other meta-heuristic algorithms.