Introduction

Clustering is a well-known data analysis method that arranges data into different clusters: similar data are placed in the same cluster, while dissimilar data are put in different clusters. The dissimilarity between data is calculated using a distance function [1]. Clustering has been applied in a variety of fields, including text mining, social network analysis, data exploration, medical science, finance, and multimedia data [2, 3]. Clustering can be subdivided into (i) hard clustering and (ii) soft clustering. In hard clustering, each data object is assigned to a single cluster, while in soft clustering a data object can belong to several clusters depending on a probability function value [4]. The main issue associated with clustering is the quality of the partition. Based on an optimal partition (i.e., cluster centroids), a dataset is divided into numerous clusters, and clustering techniques are utilized to compute optimal centroids for obtaining the optimal partitioning (clusters). These partitions are validated using internal, external, and relative cluster validation measures [5], which define the quality of a clustering algorithm. Internal validation is based on properties of the created clusters, such as compactness, separation, and connectivity. Compactness measures the closeness between the data within a cluster and should be as small as possible. Separation is calculated from the distance between two or more clusters and should be as large as possible. Connectivity measures the extent to which neighbouring data are placed in the same cluster. External validation compares the clustering result against the class labels provided with the dataset; it includes performance indicators such as purity, rand index, and entropy. Relative validation evaluates the clustering structure through varying user-defined parameter values provided to the algorithms [6].
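To make the internal validation criteria concrete, the following minimal NumPy sketch (illustrative only, not taken from the cited works; the function names are ours) computes the compactness and separation of a hard partition from the data, the cluster labels, and the centroids.

```python
import numpy as np

def compactness(data, labels, centroids):
    """Internal validation: mean distance from each data object to the
    centroid of its own cluster (lower values mean more compact clusters)."""
    dists = np.linalg.norm(data - centroids[labels], axis=1)
    return dists.mean()

def separation(centroids):
    """Internal validation: minimum pairwise distance between cluster
    centroids (higher values mean better-separated clusters)."""
    k = len(centroids)
    return min(np.linalg.norm(centroids[i] - centroids[j])
               for i in range(k) for j in range(i + 1, k))
```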

Different clustering algorithms based on diverse methodologies have been presented in the literature for addressing clustering problems, considering various viewpoints for solving them [7,8,9,10,11,12]. Clustering techniques are divided into five categories: partitional, hierarchical, grid, density, and model-based [13], and each technique has its own advantages and disadvantages. Recently, the research community has focused on grouping uncertain and high-dimensional data [14, 15]. Current developments in clustering can also be described as fuzzy, evolutionary, meta-heuristic, and multimedia clustering [16, 17]. A few clustering studies have presented novel distance functions to improve clustering results, and some studies focus on validation metrics to evaluate the effectiveness of clustering algorithms [11]. However, no single approach can effectively handle the clustering problem across a wide range of datasets; every algorithm has advantages and disadvantages. A thorough literature review reveals that partitional clustering is used more frequently than other clustering techniques, such as hierarchical, grid, and model-based clustering, because it is less time-consuming [18]. Partitional clustering determines a distinct set of clusters and allocates the data to particular clusters based on distance measures. Prior knowledge of the number of clusters (K) is a prerequisite for this type of clustering. K-means is one of the most popular partitional clustering algorithms [19], and it requires the number of clusters to be anticipated in advance; otherwise, the clustering process becomes more extensive, and computing the optimal partitioning for a given dataset becomes even more difficult. Further, it has been noticed that clustering results do not always converge to the global optimum [20]. Therefore, either cluster information should be provided beforehand, or the number of clusters should be calculated automatically to achieve optimal partitioning. Several meta-heuristic algorithms have been reported for handling partitional clustering in the literature, including PSO [21, 22], MOA [23, 24], CSSA [25], BH [26], BA [27, 28], ABC [29, 30], ACOA [31], BB-BC [32], and BBO [33]. These algorithms have balanced search capabilities and produce prominent clustering results. However, they also have drawbacks, such as limited population diversity, exploration-exploitation trade-off issues, slow convergence rates, and occasional trapping in local optima [34]. These weaknesses can be alleviated with the assistance of additional meta-heuristic algorithms: to achieve superior clustering results, the weak point of one meta-heuristic algorithm is compensated by the strong point of another. For example, the slow convergence rate of the PSO algorithm is handled by hybridizing PSO with the k-harmonic mean [35]. Similarly, chaotic maps are incorporated into PSO to accelerate convergence [36]. To increase the performance of k-means and limit the effect of initial centroids on final clustering results, the k-means and PSO algorithms are combined [37]. The ABC trade-off problem is handled via a self-adaptive mechanism [37, 38], and the BB-BC local optima problem is solved by integrating chaotic maps [39]. Based on approximation functions, these methods also provide near-optimal solutions for clustering problems.
However, the No Free Lunch theorem [81] states that no single clustering approach can solve all clustering problems or be applicable to all datasets. Hence, there is scope to develop a new clustering algorithm that can generate better solutions for clustering problems and is applicable to a wide range of datasets; as a result, a new algorithm for obtaining optimal solutions and solving large-scale clustering problems can be devised. Recently, a new meta-heuristic algorithm, named the water flow optimizer (WFO), has been presented to handle a variety of constrained and unconstrained optimization problems [40, 41]. This method is inspired by the hydraulic processes of water particles, describing the flow of water from highland to lowland, and the laminar and turbulent flows are taken into account to devise the stochastic search operators for the optimization process. From an optimization viewpoint, an objective function can be designed as either a maximization or a minimization problem, and an iterative process can be used to study the best solution found and the convergence behaviour of the algorithm. Hence, this work aims to examine the efficacy of the WFO algorithm for solving clustering problems. However, before implementing WFO in the clustering field, several modifications are integrated into the WFO algorithm to make it more effective and to generate better clustering results.

Motivation and Objectives of the Work

This research work aims to examine the efficacy of the WFO algorithm for handling clustering problems. Clustering can be described as a constrained optimization problem that determines groups of similar data objects. The data are allocated to groups using a distance function, and the goodness of the groups is assessed using the centroids. Hence, the main objective of the WFO algorithm here is to compute optimal centroids for grouping the data objects into their respective clusters. However, some improvements are proposed and integrated into WFO before its implementation, in order to overcome the issues associated with the WFO algorithm. The literature mentions that WFO has good optimization capability, but the algorithm also suffers from several shortcomings [42, 43]: (i) a random distribution function is utilized to seed the population of WFO in a pseudo-chaotic manner, so the process of finding the optimal solution is sometimes blind; (ii) the laminar flow (pl) operator and turbulent flow (pe) operator use constant values, which cannot adequately balance the exploration and exploitation abilities of the algorithm; (iii) the laminar flow phase includes a parallel one-way search strategy that may result in search holes, and in turn WFO may fall into local optima. These shortcomings are handled through a Gaussian chaotic map, an improved balance mechanism, and a neighbourhood search strategy. The main objectives of the work are highlighted below.

  • A random distribution function is utilized to generate the initial population of the WFO algorithm in the search space, but this function cannot generate a uniformly distributed population. Hence, a Gaussian chaotic map is utilized to generate the population in a more systematic manner instead of randomly.

  • The exploration and exploitation abilities of the WFO algorithm cannot be well balanced due to the constant values of the laminar flow (pl) and turbulent flow (pe) operators. These constant values are substituted by dynamic values that change in each iteration. Hence, an improved balance mechanism is designed to improve the exploration and exploitation abilities of the WFO algorithm.

  • A one-way strategy is adopted in the laminar flow phase of the WFO algorithm to search for candidate solutions, and in turn the algorithm may get stuck in local optima. This issue is resolved by using a neighbourhood search process.

  • These improvements are integrated into the WFO algorithm, called the improved WFO (IWFO) algorithm, for generating higher-quality solutions. The proposed algorithm is adopted for solving clustering problems, and its task is to determine the optimal centroids for a given dataset.

  • The efficacy of the proposed IWFO algorithm is evaluated using several well-known clustering datasets downloaded from the UCI repository, based on SSE, AR, and DR rates. The simulation results are compared with conventional as well as meta-heuristic algorithms. The findings state that the proposed IWFO algorithm obtains superior clustering results compared to the other algorithms.

The rest of the manuscript is organized as follows: related works based on meta-heuristic algorithms for clustering are summarized in "Literature Review". The original water flow optimizer is discussed in "Water Flow Optimizer". "Proposed Improved Water Flow Optimizer Algorithm (IWFO)" presents the proposed improved water flow optimizer for clustering problems. The findings of the IWFO algorithm are discussed in "Experimental Results". The outcomes of this work are mentioned in "Conclusion".

Literature Review

The latest works on partitional clustering are discussed in this section.

Qtaish et al. [44] presented a hybrid capuchin search algorithm (HCSA) to deal with the local optima and initialization issues of the K-means clustering algorithm. The chameleon swarm (CS) algorithm is adopted to strengthen the search mechanisms of the capuchin search algorithm (CSA), and the CS-CSA combination is then used to generate the initial centroids for the K-means algorithm, called HCSA. Sixteen datasets are considered to evaluate the performance of the HCSA based on well-known clustering metrics, and the simulation results are compared with nine meta-heuristic algorithms including K-means. The results reveal that the CS-CSA-K-means combination successfully overcomes the issues of K-means.

Kuo et al. [45] combined three algorithms, PSO, GA, and GE, with possibilistic fuzzy c-means (PFCM) for effective cluster analysis. This study also integrates Atanassov's intuitionistic fuzzy sets (IFSs) into PFCM, called PIFCM. The resulting algorithms are summarized as MOGA-PIFCM, MOPSO-PIFCM, and MOGE-PIFCM. Their performance is evaluated using fifteen standard clustering datasets, and the simulation results are assessed using well-known clustering metrics. The results show that the MOGE-PIFCM algorithm outperforms the other clustering algorithms in terms of validation indices.

Premkumar et al. [46] introduced a K-means-based grey wolf optimizer (KCGWO) for handling clustering problems. In KCGWO, the K-means algorithm is used to enhance the optimization capabilities of the traditional GWO. This integration aims to improve the diversity and convergence rate of the GWO algorithm, and a new weight factor is also added to GWO to improve its performance. The performance of KCGWO is assessed over a set of benchmark clustering datasets, and the results are examined using well-defined metrics. The results state that KCGWO achieves more stable clustering results than other algorithms and also obtains optimal centroids.

Demirci et al. [47] presented a new meta-heuristic algorithm, called the electrical search algorithm (ESA), to solve clustering problems. This algorithm is inspired by the movement of electricity: the initialization process is defined through special structures called poles, and the search mechanism is described by the movement of electrons. Four benchmark datasets, iris, wine, seeds, and hepatitis C virus, are considered to examine the performance of the ESA algorithm, and the results are compared with seven existing meta-heuristic algorithms including K-means. This work also considers the Friedman rank and post hoc Wilcoxon signed-rank tests to evaluate the efficacy of ESA. The results show that ESA improves the clustering results significantly compared to the other algorithms.

Gharehchopogh and Khargoush [48] presented an improved version of the interactive autodidactic school (IAS) algorithm to solve data clustering problems. To improve the exploitation process and generate better populations, several chaotic maps are integrated into the IAS algorithm. The working of the IAS algorithm is described by three operators: individual training sessions, group training sessions, and new student challenges. The performance of the proposed chaotic IAS algorithm is examined over twenty clustering datasets. The results are evaluated using the best, average, and worst solutions, and compared with state-of-the-art meta-heuristic algorithms. It is revealed that the Chebyshev chaotic function-based IAS algorithm obtains superior clustering results compared to the other algorithms.

Ezgi combined the LaF and DE algorithms to discover the optimal centroids for data clustering problems [49]. The weak exploitation process of the LaF algorithm is improved using the DE/best/1 mutation operator. The performance of the proposed LaF-DE algorithm is evaluated using twelve clustering datasets based on the SSE and accuracy parameters. The results show that the proposed LaF-DE provides better clustering results on eight of the twelve datasets, and also obtains satisfactory clustering results on the remaining datasets compared to most of the algorithms being compared.

Duan et al. [50] handled the automatic clustering problem in high-dimensional data with an improved affinity propagation method based on an optimization algorithm. Initially, the dimensionality of the data is reduced using the t-distributed stochastic neighbour embedding method. An improved equilibrium optimizer is then utilized for optimizing the preference selection, and the local search and convergence efficiency are enhanced using a crisscross strategy. Finally, the performance of this combination is assessed using seven high-dimensional datasets, and the results are compared with four well-known clustering algorithms based on NMI and RI. It is stated that the performance of the improved affinity propagation is significantly improved by the above-mentioned improvements.

An effective data clustering algorithm based on the chimp optimization algorithm (ChOA), the generalized normal distribution algorithm (GNDA), and opposition-based learning (OBL) is reported for handling clustering problems [51]. Three different clustering algorithms are proposed: ChOA(I) and ChOA(II) based on chaotic maps, the ChOAGNDA-OBL combination, and SO-ChOAGNDA. The performance of these algorithms is evaluated using five clustering datasets based on SSE and error rate, and the simulation results are compared with a wide variety of meta-heuristic clustering algorithms. The analysis shows that the SO-ChOAGNDA algorithm obtains lower SSE and error rates than the other algorithms.

Singh et al. [52] presented an enhanced whale optimization algorithm (EWOA) for handling clustering problems effectively. The whale optimization algorithm (WOA) suffers from local optima, convergence rate, and trade-off issues. The trade-off issue of WOA is addressed through the searching behaviour of the water wave optimization algorithm, while the local optima and convergence issues are resolved through a neighbourhood mechanism and the tabu search algorithm. Eight benchmark datasets are considered to examine the performance of the EWOA based on average intra-cluster distance and F-measure. The results show that the EWOA obtains superior clustering results on most of the datasets.

A variable neighbourhood strategy-based firefly algorithm (VNS-FA) is presented for effective data clustering [53]. It is observed that the firefly algorithm (FA) converges prematurely due to a lack of exploitation capability, so a variable neighbourhood strategy (VNS) is incorporated into FA to resolve this issue. The efficacy of the proposed VNS-FA is evaluated using eight well-known clustering datasets, and the results are assessed using intra-cluster distance, the internal CH metric, entropy, and F-measure parameters. The results state that the proposed VNS-FA method obtains superior results on most of the datasets.

An improved grey wolf optimization (IGWO) algorithm is presented for clustering and dynamic social networks [54]. This work aims to improve the accuracy rate of clustering; to achieve this, a label propagation algorithm is integrated into the GWO algorithm. The performance of the IGWO is examined using six well-known datasets based on NMI, intra-cluster distance, and error rate parameters. The results confirm that the proposed IGWO algorithm obtains a better NMI rate than other clustering algorithms, and also achieves a lower error rate and intra-cluster distance than the other algorithms.

A cat-based meta-heuristic algorithm is reported for addressing partitional clustering [55]. Before implementing this algorithm, several modifications are incorporated into the cat algorithm concerning the trade-off mechanism between local and global searches, diversity issues, and premature convergence. In turn, an improved search mechanism, an accelerated velocity equation, and a neighbourhood mechanism are introduced for handling these issues. The efficiency of the cat algorithm is examined over eight clustering datasets based on intra-cluster distance and F-measure parameters. The analysis shows that the proposed cat algorithm yields the minimum intra-cluster distance on most of the datasets while obtaining a better F-measure rate than the other algorithms.

Kushwaha et al. [56] presented an electromagnetic field optimization (EFO) method for resolving the issues of the k-means algorithm. The k-means algorithm suffers from poor selection of the initial centroids and in turn becomes trapped in local optima. To overcome this issue, the EFO algorithm is utilized for generating optimal initial centroids for the k-means algorithm. It is also reported that the EFO algorithm is unlikely to get stuck in local optima due to its attraction and repulsion mechanisms. Several well-known datasets are considered for assessing the efficacy of the proposed clustering algorithm based on normalized mutual information (NMI), rand index (RI), and purity. The results show that the proposed algorithm achieves far better clustering results than the same class of algorithms.

To perform clustering on massive data, Hashemi et al. [57] designed an updated PSO method. The proposed method utilizes a multi-start pattern reduction mechanism to decrease the computation time for clustering: the clustering time is reduced using a reduction operator, while the multi-start operator is utilized for ensuring population diversity and avoiding local minima. Six clustering datasets are considered to evaluate the performance of the proposed method based on accuracy and execution time. The results confirm that better clustering results are obtained by the proposed PSO method.

Prior information on the number of clusters is the main prerequisite for partitional clustering, so the research community is paying close attention to automatic partitional clustering. This problem was also taken into account by Zhu et al. [58], who presented a dynamic parameter-based HS algorithm (AC-DPHS) for automatic data clustering. In the proposed AC-DPHS algorithm, the parameters are modified dynamically instead of statically. The efficiency of the proposed AC-DPHS is assessed using five datasets, and the results are evaluated using ARI, FM, and PBM parameters. It is stated that the proposed algorithm obtains superior results compared to the other algorithms.

A hybrid k-prototype clustering method based on an enhanced SCA is developed by Kuo and Wang [59]. The SCA algorithm is utilized to compute the optimal weights of attributes as well as the initial attribute selection. Furthermore, the k-prototype method incorporates different mutation strategies, such as Gaussian, Cauchy, Lévy, and single-point, to produce better clustering outcomes. Ten datasets are chosen from the UCI repository to examine the efficacy of the k-prototype algorithm based on accuracy and Cohen's kappa. The findings show that the proposed k-prototype algorithm produces better clustering results than the other algorithms.

A meta-heuristic clustering approach based on the behaviour of micro-bats is proposed by Kaur and Kumar [60]. Several modifications are designed to resolve issues related to such algorithms, namely convergence rate, local optima, and trade-off issues: an elitist process handles the slow convergence rate, while the initialization issue is handled by a collaborative approach. The effectiveness of the proposed bat-based meta-heuristic algorithm is evaluated using several healthcare and non-healthcare datasets with well-known performance metrics such as intra-cluster distance, standard deviation, accuracy, and rand index. The results of the proposed approach are compared to conventional clustering algorithms, and it is claimed that the proposed approach achieves a better accuracy rate than the conventional algorithms.

A learning automata-based hybrid MPA and JS algorithm is presented for handling the data clustering problem effectively [61]. The MPA and learning automata are utilized for enhancing the proficiency and reducing the complexity of the JS algorithm: MPA is applied to memorize the best solution achieved so far, while learning automata are used to improve the learning strategy of the JS algorithm. The efficiency of the hybrid MPA-JS algorithm is evaluated using ten clustering datasets based on SSE, average execution time, and CHC. The results are compared with several state-of-the-art meta-heuristic algorithms, and it is found that the proposed hybrid MPA-JS algorithm achieves better results than the other algorithms.

To handle automatic clustering problems, Ikotun and Ezugwu [62] hybridized the symbiotic organisms search (SOS) algorithm with K-means. A global threshold function is utilized for detecting outliers in the dataset, and such data points are eliminated from the dataset. Moreover, a three-way mutation mechanism is designed and integrated into the SOS algorithm to improve the performance of this hybridization. The efficacy of the hybrid SOS-KM algorithm is examined over forty-two datasets based on the DB index, CS index, and computational time. The findings state that the hybrid SOS-KM algorithm achieves superior results on most of the datasets in terms of the DB and CS indices.

A hybrid algorithm based on the firefly algorithm (FA) and the self-organizing map (SOM), called FA-SOM, is presented for the clustering task [63]. In the proposed FA-SOM algorithm, the initial cluster centroids are selected using the FA, and the weights of the SOM are then optimized using the optimal cluster centroids generated by FA. Six clustering datasets are utilized for evaluating the performance of FA-SOM based on the SSE parameter, and a statistical test is also adopted to investigate its efficacy. The findings state that the proposed FA-SOM algorithm obtains superior clustering results compared to the other methods.

To achieve better computational time, Suryanarayana et al. [64] developed a dynamic k-mode clustering algorithm for effective cluster analysis. The PSO algorithm is adopted to compute the optimal centroids for the k-mode algorithm. Further, a frequency-based technique is utilized for updating the modes and decreasing the cost function for clustering tasks. The efficacy of the dynamic k-mode algorithm is evaluated using six well-known clustering datasets based on accuracy, F-measure, and NMI parameters. The results demonstrate that the proposed dynamic k-mode algorithm obtains more accurate clustering results than the others (Table 1).

Table 1 Summary of the works reported on clustering using meta-heuristic algorithms

Water Flow Optimizer

Recently, a new meta-heuristic algorithm based on water flow theory, called WFO, has been developed for solving global optimization problems [40]. This algorithm is inspired by the shapes of water flow, which are described through laminar and turbulent flows; hence, the WFO algorithm comprises two corresponding evolutionary operators. Using these two operators, the WFO algorithm imitates the hydraulic phenomenon of water particles flowing from highland to lowland. Optimization problems are described as either minimization or maximization problems, so an objective function is defined in one of these terms and solved by the optimization steps of the WFO algorithm. Since WFO is inspired by the flow of water from highland to lowland, there exists a similar pattern between water flow and the search for solutions in optimization problems: a water particle acts as a candidate solution, the position of the water particle corresponds to the solution value, and the objective function is described in terms of the potential energy of the water particle. The laminar and turbulent operators are described below.
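To make this analogy concrete, the following minimal NumPy sketch (ours, for illustration; the names `decode_centroids` and `sse_objective` are not from the WFO papers) shows one common way a water particle can encode a candidate clustering solution: its position is a flattened vector of K centroids, and its potential energy is the clustering objective evaluated on the dataset.

```python
import numpy as np

def decode_centroids(position, n_clusters, n_features):
    """Interpret a particle's position vector as a K x D matrix of centroids."""
    return position.reshape(n_clusters, n_features)

def sse_objective(position, data, n_clusters):
    """Potential energy of a particle: sum of squared errors of the partition
    induced by the centroids it encodes (lower is better)."""
    centroids = decode_centroids(position, n_clusters, data.shape[1])
    # Euclidean distance from every data object to every centroid
    dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
    # Each object contributes the squared distance to its nearest centroid
    return np.sum(dists.min(axis=1) ** 2)
```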

Laminar Operator

The working of this operator is inspired by the laminar flow of water, in which water particles move in parallel straight lines, as shown in Fig. 1. The particle velocities can differ depending on the viscosity of the water: particles that are far from obstacles or walls move faster than those closer to them. Mathematically, this behaviour is demonstrated using Eq. (1), called the laminar operator.

Fig. 1 Flowchart of the proposed IWFO clustering algorithm

$${y}_{t}\left(i\right)={x}_{t}\left(i\right)+R\times \overrightarrow{V}, \quad \forall\, t\in \left\{1, 2, 3, \ldots, m\right\}$$
(1)

In Eq. (1), \({y}_{t}\left(i\right)\) is the moving position of the tth water particle after the ith iteration, \({x}_{t}\left(i\right)\) is the position of the tth water particle in the ith iteration, R is a random number in the range \(\left[0, 1\right]\), called the water coefficient, \(\overrightarrow{V}\) is a vector defining the common motional direction, and m is the total number of water particles. The common motional direction is computed using Eq. (2).

$$\overrightarrow{V}={x}_{e}\left(i\right)-{x}_{o}\left(i\right) \quad \text{such that} \quad e\ne o, \; f\left({x}_{e}\left(i\right)\right)\le f\left({x}_{o}\left(i\right)\right)$$
(2)

In Eq. (2), \(f\left({x}_{e}\left(i\right)\right)\) and \(f\left({x}_{o}\left(i\right)\right)\) describe the potential energies of the water particles, which are used to select the better particle in the ith iteration. If the eth particle's energy is lower than the oth particle's energy, the better particle is \({x}_{e}\left(i\right)\); otherwise, it is \({x}_{o}\left(i\right)\).
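A minimal sketch of the laminar operator of Eqs. (1)-(2) is given below (an illustration under our reading of the equations; in particular, drawing an independent water coefficient R for each particle is an assumption).

```python
import numpy as np

rng = np.random.default_rng()

def laminar_flow(positions, energies):
    """One laminar step (Eqs. 1-2): all particles drift along a common
    direction built from two randomly chosen particles, oriented from the
    worse particle (higher potential energy) towards the better one."""
    m, _ = positions.shape
    e, o = rng.choice(m, size=2, replace=False)
    if energies[e] > energies[o]:             # ensure f(x_e) <= f(x_o)
        e, o = o, e
    direction = positions[e] - positions[o]   # common motional direction, Eq. (2)
    coeffs = rng.random((m, 1))               # water coefficient R in [0, 1]
    return positions + coeffs * direction     # parallel movement, Eq. (1)
```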

Turbulent Operator

The working of this operator is inspired by the turbulent flow of water, which can be described as a contingent pushing of water particles. This inconsistent motion disrupts the bonding of water particles and produces a fast flow that destabilizes obstructions and causes a local oscillation, whose amplitude grows with time. This process produces a shearing force, known as torque, which is responsible for swirling the water particles. If each dimension of the problem to be solved is viewed as a layer of water, then a dimension transformation can be viewed as the irregular motion of water particles in turbulent flow. Mathematically, it can be understood as randomly selecting a dimension and position of particles during oscillation. The behaviour of the turbulent operator is described through Eq. (3).

$${y}_{t}\left(i\right)=\left({x}_{t}^{1}\left(i\right), \ldots, {x}_{t}^{d-1}\left(i\right), \; p, \; {x}_{t}^{d+1}\left(i\right), \ldots, {x}_{t}^{n}\left(i\right)\right), \quad d\in \left\{1, 2, 3, \ldots, n\right\}$$
(3)

In Eq. (3), p corresponds to the jostling operator, which is described through Eq. (4).

$$\text{p}=\left\{\begin{array}{ll}\gamma \left({\text{x}}_{\text{t}}^{\text{d}}\left(\text{i}\right), {\text{x}}_{\text{j}}^{\text{d}}\left(\text{i}\right)\right), & if \; r<pe\\ \vartheta \left({\text{x}}_{\text{t}}^{\text{d}}\left(\text{i}\right), {\text{x}}_{\text{j}}^{\text{d}+1}\left(\text{i}\right)\right), & otherwise\end{array}\right.$$
(4)

In Eq. (4), j describes the index of a randomly chosen particle such that \(j\in \left\{1, 2, 3,\ldots, m\right\}\) and \(j\ne t\); 'd + 1' denotes another randomly chosen dimension such that \(d+1\ne d\). \(pe\) is the eddying parameter, and r is a random number in the range of [0, 1]. Further, it is observed that the eddy shape is similar to an Archimedean spiral, which is expressed using Eq. (5).

$$\gamma\left({x}_{t}^{d}\left(i\right), {x}_{j}^{d}\left(i\right)\right) = {x}_{t}^{d}\left(i\right) + \varphi \times \theta \times \cos\left(\theta\right)$$
(5)

In Eq. (5), \(\theta\) is a random value in the range of \(\left[-\pi , \pi \right]\), and \(\varphi\) corresponds to the shearing force between the tth and jth particles, determined using Eq. (6).

$$\varphi = \left|{x}_{t}^{d}\left(i\right) - {x}_{j}^{d}\left(i\right)\right|$$
(6)

Further, the general behaviour of water particles is described through a transformation function and this function is summarized in Eq. (7).

$$\vartheta\left({x}_{t}^{d}\left(i\right), {x}_{j}^{d+1}\left(i\right)\right) = {lb}^{d} + \left({ub}^{d} - {lb}^{d}\right)\frac{{x}_{j}^{d+1}\left(i\right) - {lb}^{d+1}}{{ub}^{d+1} - {lb}^{d+1}}$$
(7)

In Eq. (7), \({ub}^{d}\) and \({lb}^{d}\) denote the upper and lower bounds of the dth dimension, and \({x}_{j}^{d+1}\left(i\right)\) denotes the position of the jth particle in dimension (d + 1).
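The turbulent operator of Eqs. (3)-(7) can be sketched as follows (again an illustration under our reading: the lower-bound offset in the transformation of Eq. (7) and the final clipping to the bounds are assumptions added here for safety).

```python
import numpy as np

rng = np.random.default_rng()

def turbulent_flow(positions, lb, ub, pe=0.7):
    """One turbulent step (Eqs. 3-7): each particle has one randomly chosen
    dimension d overwritten by the jostling value p of Eq. (4)."""
    m, dim = positions.shape
    out = positions.copy()
    for t in range(m):
        j = rng.choice([k for k in range(m) if k != t])  # partner particle
        d = rng.integers(dim)                            # perturbed dimension
        if rng.random() < pe:
            # Eddying along an Archimedean spiral (Eqs. 5-6)
            theta = rng.uniform(-np.pi, np.pi)
            shear = abs(positions[t, d] - positions[j, d])
            p = positions[t, d] + shear * theta * np.cos(theta)
        else:
            # Transformation from another random dimension d2 != d (Eq. 7)
            d2 = rng.choice([k for k in range(dim) if k != d])
            ratio = (positions[j, d2] - lb[d2]) / (ub[d2] - lb[d2])
            p = lb[d] + (ub[d] - lb[d]) * ratio
        out[t, d] = np.clip(p, lb[d], ub[d])
    return out
```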

Evolutionary Rule

This subsection discusses whether the water flow is laminar or turbulent. The distinction between laminar and turbulent flows is made with the help of the Reynolds number: a threshold is defined to determine the flow of water, and if the flow measure is less than the threshold, the flow is considered laminar, otherwise turbulent. The laminar flow probability is denoted by \({P}_{l}\), whereas the turbulent flow probability is \(1-{P}_{l}\). The laminar probability is a control parameter, a random number in the range of \([0, 1]\), and the flow of water is determined using \({P}_{l}\). Moreover, since water flows from highland to lowland, the positions of the water particles may change; this behaviour is characterized by the evolutionary rule. As per this rule, a water particle moves to its new position only if that position has lower potential energy than its current position; otherwise, there is no change in the current position. This behaviour is illustrated using Eq. (8).

$${x}_{t}\left(i+1\right)=\left\{\begin{array}{ll}{ y}_{t}\left(i\right), & if \; f\left({y}_{t}\left(i\right)\right)\le f\left({x}_{t}\left(i\right)\right)\\ {x}_{t}\left(i\right), & otherwise\end{array}\right.$$
(8)
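The greedy selection of Eq. (8) amounts to a few lines; the sketch below (ours) accepts a candidate position only when its potential energy does not worsen, where `objective` is any potential-energy function such as the SSE objective sketched earlier.

```python
import numpy as np

def evolve(positions, candidates, objective):
    """Evolutionary rule of Eq. (8): keep the candidate position only if its
    potential energy is less than or equal to that of the current position."""
    old_f = np.apply_along_axis(objective, 1, positions)
    new_f = np.apply_along_axis(objective, 1, candidates)
    accept = new_f <= old_f
    return np.where(accept[:, None], candidates, positions)
```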

Proposed Improved Water Flow Optimizer Algorithm (IWFO)

This section presents the improved water flow optimizer algorithm for solving hard clustering problems. Several limitations are associated with the water flow optimizer: (i) a random distribution function is utilized to generate the initial population, (ii) the balance between the exploration and exploitation mechanisms is inadequate, and (iii) due to the one-way strategy adopted in the laminar flow phase, WFO may get stuck in local optima. These limitations are handled as follows: a Gaussian chaotic map-based function is used for generating the initial population, the balance between exploration and exploitation is enhanced using improved solution search equations, and the local optima issue is handled through a multi-strategy neighbourhood search process. The proposed improvements are discussed below.

Chaotic Map-Based Distribution Function

In WFO, a random function is applied to generate the initial population. A rand() function provides random numbers between 0 and 1, but it cannot guarantee uniform coverage, so the population may not be spread uniformly throughout the search space. The initial population for WFO is computed using Eq. (9).

$$x=lb+rand\left(N, D\right)\cdot \left[ub-lb\right]$$
(9)

In Eq. (9), \(x\) denotes the initial population of water particles, \(lb\) and \(ub\) denote the lower and upper bounds of the search space, N represents the total number of water particles, and D is the dimension of the search space. Suppose the number of water particles (N) is 4, the dimension (D) is 3, \(lb\) is given as \((0, 10, 25)\), and \(ub\) is given as \((30, 50, 60)\). A \(rand\left(\cdot \right)\) function is used to generate numbers in the range of \([0, 1]\), and these numbers are combined with Eq. (9) to generate initial positions such as (24.6270, 27.4879, 29.0764), (27.2680, 19.3384, 56.8596), and (4.6826, 13.8287, 28.7677). By analyzing such a population, it is noticed that it may not explore the entire search space effectively. On the other hand, the search space can be examined more systematically using chaotic maps rather than the \(rand()\) function. This work employs the Gaussian chaotic map for generating the initial population of water particles, computed using Eq. (10).

$$x=lb+{c}_{k}\left(N, D\right)\cdot \left[ub-lb\right]$$
(10)

Here, \(x\) denotes the population of water particles, \(lb\) and \(ub\) are the lower and upper bounds, N is the number of water particles, D is the dimension, and \({c}_{k}\) denotes the kth value of the Gaussian chaotic map, computed using Eq. (11).

$${c}_{k+1}=\mu \times {e}^{{c}_{k}}, \quad \mu \in \left[-1, 1\right]$$
(11)

In Eq. (11), \(\mu\) is a user-defined parameter between −1 and 1, with an optimal value of −0.2; \({c}_{k}\) denotes the kth value of the Gaussian chaotic map, \({c}_{k+1}\) denotes the (k + 1)th value, and \({c}_{0}\) is set to 0.6.
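A sketch of the chaotic initialization of Eqs. (10)-(11) is given below. Note one assumption: the raw Gaussian-map values are rescaled into [0, 1] before being mapped onto the bounds, since Eq. (10) uses them in place of rand() coefficients.

```python
import numpy as np

def gaussian_chaotic_sequence(length, c0=0.6, mu=-0.2):
    """Gaussian chaotic map of Eq. (11): c_{k+1} = mu * exp(c_k)."""
    seq, c = np.empty(length), c0
    for k in range(length):
        c = mu * np.exp(c)
        seq[k] = c
    return seq

def chaotic_init(n_particles, dim, lb, ub):
    """Chaotic initial population of Eq. (10), replacing rand()."""
    c = gaussian_chaotic_sequence(n_particles * dim).reshape(n_particles, dim)
    c = (c - c.min()) / (c.max() - c.min())   # rescale into [0, 1] (assumption)
    return lb + c * (ub - lb)
```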

Balancing Exploration and Exploitation Mechanisms

The exploration and exploitation mechanisms are an integral part of every meta-heuristic algorithm. Exploration can be understood as a global search, while exploitation can be described as a local search, and these mechanisms should be balanced to achieve effective solutions. A weak local search cannot exploit the search space efficiently. On the other side, the global search is responsible for assessing the solutions found during the local search and determines whether a local solution can act as a global solution or not. Moreover, the global search directs the search towards the global optimum; to do so, the search process must also know the direction of the previous best solution. Hence, two nonlinear functions (sigmoid and tanh) are taken into consideration for obtaining a better trade-off between the search mechanisms. In WFO, the search mechanisms are described by the laminar operator \((pl)\) and turbulent operator \((pe)\), which are redefined in Eqs. (12-13).

$$pl=\left({c}_{1}-{c}_{2}\right)\times \left(\frac{1}{1+{e}^{{-t}_{curr}/{t}_{max}}}\right)$$
(12)
$$pe={c}_{2}\times \left(\frac{2}{1+{e}^{{-2t}_{curr}/{t}_{max}}}-1\right) +{c}_{1}$$
(13)

Here, \({c}_{1}\) and \({c}_{2}\) are two cognitive parameters whose values are 0.5 and 0.3, \({t}_{curr}\) is the current iteration, and \({t}_{max}\) is the maximum number of iterations. If \(\left(rand<pl\right)\), the updated position of the water particles (y) is computed by Eq. (14).

$${\text{y}}_{t}\left(\text{i}\right)=\left\{\begin{array}{ll}{x}_{t}\left(i\right)+r\times \overrightarrow{\text{d}}, & if \; rand<pl\\ {\text{x}}_{t}\left(i\right)+dt.({x}_{best}-{\text{x}}_{t}\left(i\right)), & otherwise\end{array}\right.$$
(14)

In Eq. (14), \({x}_{t}\left(i\right)\) is the current position of the water particle in the ith iteration, \({x}_{best}\) is the best position, r is a random number in the range [0, 1], \(\overrightarrow{d}\) is the direction of the water particles, and \(dt\) is a decrement operator inspired by the whale optimization algorithm. The direction of the water particles is computed using Eq. (15), while \(dt\) is determined by Eq. (16).

$$\overrightarrow{\text{d}}={x}_{t}\left(i\right)-{x}_{s}\left(i\right) \; such \; that \; \left(t\ne s, \text{f}\left({\text{x}}_{t}\left(\text{i}\right)\right)\le \text{f}\left({\text{x}}_{s}\left(\text{i}\right)\right)\right)$$
(15)

Here, \({x}_{t}\left(i\right)\) and \({x}_{s}\left(i\right)\) denote the tth and sth water particles in the ith iteration, and \(f\left({x}_{t}\left(i\right)\right)\) and \(f\left({x}_{s}\left(i\right)\right)\) are the potential energies of the tth and sth water particles.

$$dt=\frac{1}{1-cos\left(\pi l\right)}$$
(16)

In Eq. (16), \(l\) is a random number in the range of [0.6, 1]. Further, the potential energy \(\left(f\left({x}_{t}\left(i\right)\right)\right)\) is computed using Eq. (17).

$$f\left({x}_{t}\left(i\right)\right)=\frac{SSE\left({x}_{t}\left(i\right)\right)}{\sum_{t=1}^{n}SSE\left({x}_{t}\left(i\right)\right)}$$
(17)

Here, \(f\left({x}_{t}\left(i\right)\right)\) is the potential energy of the tth water particle \(({x}_{t})\), and \(SSE({x}_{t}\left(i\right))\) is the sum of the squared error. It can be defined as the sum of the squared error of the given particle divided by the sum of the squared error of all particles.
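The dynamic operators of Eqs. (12)-(13) and the decrement operator of Eq. (16) translate directly into code; a minimal sketch follows (ours, using the parameter values stated above).

```python
import numpy as np

def balance_operators(t_curr, t_max, c1=0.5, c2=0.3):
    """Dynamic laminar and turbulent operators of Eqs. (12)-(13): a sigmoid
    drives pl and a tanh-shaped curve drives pe, so both values change with
    the iteration counter instead of staying constant."""
    pl = (c1 - c2) / (1.0 + np.exp(-t_curr / t_max))                    # Eq. (12)
    pe = c2 * (2.0 / (1.0 + np.exp(-2.0 * t_curr / t_max)) - 1.0) + c1  # Eq. (13)
    return pl, pe

def decrement_operator(rng):
    """Decrement operator dt of Eq. (16), with l drawn from [0.6, 1]."""
    l = rng.uniform(0.6, 1.0)
    return 1.0 / (1.0 - np.cos(np.pi * l))
```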

Neighborhood Search Process

This subsection discusses the neighbourhood search process adopted to overcome the single search strategy of WFO; it also handles the local optima issue of the WFO algorithm effectively. In the literature, neighbourhood search mechanisms have been employed to overcome local optima issues [65, 66]. This work proposes three methods to overcome local optima: (i) a Gaussian local search strategy, (ii) a mutation strategy, and (iii) a crossover strategy (a combined sketch of the three strategies is given after this list).

  • Gaussian local search strategy: Local search is responsible for finding optimal solutions by exploring the search space, and such a solution is called a local optimal solution. However, local search sometimes converges quickly, and in turn the solution gets stuck in local optima. Hence, to avoid this, this work explores a Gaussian local search strategy for computing the local optimal solution, summarized below.

    $${{x}_{t}}^{\prime}\left(i\right)={\left(1-\epsilon \right)\times x}_{t}\left(i\right)+{\tau }_{c}$$
    (18)

In Eq. (18), \({{x}_{t}}^{\prime}\left(i\right)\) represents the new position of the water particle, \({x}_{t}\left(i\right)\) denotes the current position of the water particle that may be responsible for local optima, and \({\tau }_{c}\) is a Gaussian chaotic variable computed using Eq. (19).

$${\tau }_{c}={lb}^{d}+{c}_{k}\left({ub}^{d}-{lb}^{d}\right)\times {g}_{best}^{d}, \quad d\in \left\{1, 2, 3, \ldots, D\right\}$$
(19)

In Eq. (19), \({lb}^{d}\) and \({ub}^{d}\) denote the lower and upper constraints in the dth dimension, \({g}_{best}^{d}\) denotes the best position of the water particles, and \({c}_{k}\) denotes the Gaussian chaotic map value computed using Eq. (11).

  • Mutation-based strategy: The local optima issue arises due to similar populations in successive iterations, in which case there is no change in the allocation of data objects to the respective clusters. Hence, this work also explores the capability of the mutation operator to generate a diverse population to avoid the local optima condition. The new position of the water particles is generated by considering three randomly chosen previous personal best positions, \({x}_{e}, {x}_{f}, {x}_{g}\) with \(e\ne f\ne g\), as depicted in Eq. (20).

    $${{x}_{t}}^{\prime}\left(i\right)={x}_{e}\left(i\right)+ {c}_{k}\times \left({x}_{f}\left(i\right)-{x}_{g}\left(i\right)\right)$$
    (20)

In Eq. (20), \({c}_{k}\) denotes the Gaussian chaotic map value computed using Eq. (11), which is also utilized to control the mutation parameter chaotically.

  • Crossover-based strategy: This strategy is derived from the concept of the crossover operator. The new position is generated by performing a two-point crossover between two randomly chosen gbest water particles, as expressed in Eq. (21).

    $${{x}_{t}}^{\prime}\left(i\right)=\left\{\begin{array}{ll}{x}_{t}\left(i\right)+{c}_{k}\left({x}_{p}-{x}_{q}\right) & if, rand<\left(\text{f}\left({\text{x}}_{p}\right)||\text{f}\left({\text{x}}_{q}\right)\right)\\ {x}_{t}\left(i\right)+{c}_{k}\times \left(1-\frac{{t}_{curr}}{{t}_{max}}\right) & otherwise\end{array}\right.$$
    (21)

In Eq. (21), \({{x}_{t}}^{\prime}\left(i\right)\) denotes the new position of the water particle, \({x}_{t}\left(i\right)\) denotes the previous position of the water particle, \({x}_{p}\) and \({x}_{q}\) are two randomly chosen water particles from the list of gbest particles, \(f\left({x}_{p}\right)\) and \(f\left({x}_{q}\right)\) describe the potential energies of the pth and qth particles, \({c}_{k}\) denotes the Gaussian chaotic map value computed using Eq. (11), \({t}_{curr}\) denotes the current iteration, and \({t}_{max}\) denotes the maximum number of iterations.
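The three strategies can be sketched together as below (ours, under stated assumptions: \(\epsilon\) in Eq. (18) is treated as a small user-defined weight, and the condition of Eq. (21) is read as comparing rand with the smaller of the two normalized potential energies of Eq. (17)).

```python
import numpy as np

rng = np.random.default_rng()

def gaussian_local_search(x, g_best, lb, ub, ck, eps=0.1):
    """Gaussian local search of Eqs. (18)-(19); eps is assumed small."""
    tau = lb + ck * (ub - lb) * g_best     # chaotic perturbation, Eq. (19)
    return (1.0 - eps) * x + tau           # Eq. (18)

def mutation_strategy(pbest, ck):
    """Mutation of Eq. (20) over three distinct random personal-best particles."""
    e, f, g = rng.choice(len(pbest), size=3, replace=False)
    return pbest[e] + ck * (pbest[f] - pbest[g])

def crossover_strategy(x, gbest_pool, energies, ck, t_curr, t_max):
    """Crossover of Eq. (21) between two randomly chosen gbest particles;
    energies are the normalized potential energies of Eq. (17)."""
    p, q = rng.choice(len(gbest_pool), size=2, replace=False)
    if rng.random() < min(energies[p], energies[q]):   # our reading of Eq. (21)
        return x + ck * (gbest_pool[p] - gbest_pool[q])
    return x + ck * (1.0 - t_curr / t_max)
```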

Algorithmic Steps of the IWFO Algorithm

This subsection presents the algorithmic steps of the proposed IWFO algorithm for clustering problems. The IWFO algorithm aims to determine the optimal centroids for a given dataset. The working of the IWFO algorithm is defined through five major steps: (i) initialization, (ii) objective function and data allocation, (iii) laminar flow, (iv) turbulent flow, and (v) evolving and updation. These steps are described below.

(i) Initialization: This step corresponds to the user-defined parameters of the IWFO algorithm. The parameters include the number of particles, the number of clusters (K), the dimension (D), the laminar probability (\(pl\)), the eddying probability \((pe)\), lb, ub, and the maximum number of iterations. The initial positions of the water particles are computed by Eq. (10); these initial positions also serve as the initial centroids for the clustering problem.

(ii) Objective function and data allocation: This step defines a problem-related objective function. For clustering problems, the objective function is defined using a distance function, which is used to allocate the data to clusters. In this work, the objective function is defined in terms of the Euclidean distance, and data are allocated to their respective clusters based on the minimum Euclidean distance. This function is defined in Eq. (22).

    $$D\left({x}_{i},{c}_{j}\right)= \sqrt{\sum_{m=1}^{d}{\left({x}_{im}-{c}_{jm}\right)}^{2}}$$
    (22)

Here, \(D\left({x}_{i},{c}_{j}\right)\) is the Euclidean distance between the data object \(({x}_{i})\) and the centroid \({(c}_{j})\), and d denotes the dimension of the data, indexed by m \(=1, 2, 3,\ldots, d\).

(iii) Laminar flow: Once the data are allocated to their respective clusters, the next step is to determine the type of water flow. This is achieved by comparing the laminar probability \((pl)\) with a \(rand()\) value: if \(\left(rand\left(\cdot\right)<pl\right)\), the water flow is defined as laminar, and the positions of the water particles are updated using the laminar flow mechanism. The process of the laminar flow is highlighted in Algorithm 1.

Algorithm 1 Steps for the laminar flow phase

(iv) Turbulent flow: If the water flow is not laminar, then it is turbulent, and the positions of the water particles are updated using the turbulent flow mechanism, as given in Algorithm 2.

Algorithm 2 Steps for the turbulent flow phase

(v) Evolving and updation: This step determines the final updated positions of the water particles (cluster centroids) after the laminar and turbulent mechanisms. To determine the final position of a water particle, the potential energy of the new particle is compared with the potential energy of the previous particle: if \(f\left({x}_{t,new}\left(i\right)\right)\le f\left({x}_{t,old}\left(i\right)\right)\), then the new centroid is taken as \({x}_{t}\left(i+1\right)\leftarrow {x}_{t,new}\left(i\right)\); otherwise \({x}_{t}\left(i+1\right)\leftarrow {x}_{t,old}\left(i\right)\). This step also includes a limit operator, set to 5, for alleviating the local optima problem: if the potential energy \(\left(f\left({C}_{k}\left(i\right)\right)\right)\) does not improve in five successive iterations, it is assumed that the algorithm is stuck in local optima, and the neighbourhood search procedure is called to compute new positions for the water particles. If the termination condition is not met, steps (ii)-(v) are repeated; otherwise, the optimal positions of the water particles (optimal cluster centroids) are obtained. The algorithmic steps of the proposed IWFO algorithm are listed in Algorithm 3 (a simplified end-to-end sketch follows Algorithm 3), while the flowchart of the IWFO-based clustering algorithm is depicted in Fig. 1.

Algorithm 3 Proposed IWFO algorithm for cluster analysis
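For orientation, a self-contained and heavily simplified end-to-end sketch of steps (i)-(v) is given below (ours, not the authors' exact procedure). The chaotic initialization, the dynamic pl of Eq. (12), and the full laminar/turbulent updates are replaced by simple stand-ins so that the loop stays short; only the data allocation of Eq. (22), the greedy selection of Eq. (8), and the limit operator of step (v) are kept faithful in spirit.

```python
import numpy as np

rng = np.random.default_rng(0)

def assign_clusters(data, centroids):
    """Eq. (22): allocate each data object to the centroid at minimum
    Euclidean distance."""
    dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def sse(data, centroids):
    """Sum of squared errors of the partition induced by the centroids."""
    labels = assign_clusters(data, centroids)
    return sum(((data[labels == k] - c) ** 2).sum()
               for k, c in enumerate(centroids))

def iwfo_clustering(data, k, n_particles=20, max_iter=100, limit=5):
    d = data.shape[1]
    lb, ub = data.min(axis=0), data.max(axis=0)
    # Step (i): one particle = one candidate set of k centroids
    pop = lb + rng.random((n_particles, k, d)) * (ub - lb)
    energy = np.array([sse(data, p) for p in pop])
    best, stall = pop[energy.argmin()].copy(), 0
    for t in range(max_iter):
        pl = 0.2 + 0.3 * t / max_iter           # stand-in for Eq. (12)
        for i in range(n_particles):
            if rng.random() < pl:               # laminar-style drift, step (iii)
                j = rng.integers(n_particles)
                cand = pop[i] + rng.random() * (best - pop[j])
            else:                               # turbulent-style jolt, step (iv)
                cand = pop[i] + rng.normal(0.0, 0.1, (k, d)) * (ub - lb)
            cand = np.clip(cand, lb, ub)
            f = sse(data, cand)
            if f <= energy[i]:                  # greedy selection, Eq. (8)
                pop[i], energy[i] = cand, f
        if energy.min() < sse(data, best):
            best, stall = pop[energy.argmin()].copy(), 0
        else:
            stall += 1
        if stall >= limit:                      # limit operator: stand-in for the
            worst = energy.argmax()             # neighbourhood search call
            pop[worst] = lb + rng.random((k, d)) * (ub - lb)
            energy[worst] = sse(data, pop[worst])
            stall = 0
    return best, assign_clusters(data, best)
```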

Experimental Results

This section presents the simulation results of the proposed IWFO algorithm and the other algorithms. The efficacy of the IWFO algorithm is examined using twelve standard clustering datasets, downloaded from the UCI repository and described in Table 2. The simulation results of the proposed IWFO algorithm are compared with a wide variety of meta-heuristic algorithms: K-means [67], PSO [21], ACO [68], ABC [69], DE [70], GA [71], BB-BC [72], BAT [73], VS [74], MBOA [75], WOA [46], ICSO [76], TLBO [77], CS [78], GSA [79], LION [80], WWO [82], and GWO [52]. The simulation results are evaluated using the intra, SD, rank, AR, and DR parameters, and the reported results are averages over thirty independent runs. The intra parameter is defined as the sum of the distances between the data objects and their respective cluster centres, indicating the closeness between data objects and cluster centres. The standard deviation (SD) is also computed for validating the intra results. Further, AR denotes the accuracy rate, while DR denotes the detection rate. The maximum number of iterations is set to 100. The parameter settings of the algorithms other than the proposed IWFO are the same as recommended in the corresponding literature. The parameter settings of the proposed IWFO are: the population size (P) is defined in terms of the number of clusters (K), the laminar probability \(({P}_{la})\in \left(0.2, 0.5\right)\), the eddying probability \(({P}_{e})\in \left(0.5, 0.9\right)\), the limit operator \(\left({limt}_{op}\right)=5\), and the maximum number of iterations (max_iter) is set to 100. The experiments are conducted in a Matlab environment on a Core i5-based processor with 32 GB of RAM.

Table 2 Description of the datasets

Simulation Results

This subsection discusses the simulation results of the proposed IWFO and other clustering algorithms using the intra-cluster distance (intra), standard deviation (SD), rank, accuracy (AR), and detection rate (DR) parameters. Table 3 presents the simulation results of the proposed IWFO and other standard existing clustering algorithms based on the intra, SD, and rank parameters. The simulation results of the proposed algorithm are compared with the original WFO, K-means, PSO, ACO, ABC, DE, GA, BB-BC, and BAT algorithms over a variety of datasets. The proposed IWFO algorithm obtains the minimum intra value on most of the datasets: iris (9.21E+01), glass (2.03E+02), ionosphere (9.04E+02), vowel (1.56E+05), balance (5.78E+04), cancer (1.23E+03), and thyroid (1.27E+03), the exceptions being the wine, control, crude oil, CMC, and LD datasets. For the wine dataset, the DE algorithm obtains the minimum intra value (1.58E+04), while the proposed IWFO algorithm obtains the second minimum (1.59E+04) among all algorithms. It is also seen that the BB-BC algorithm obtains the minimum intra values (2.38E+04 and 3.13E+03) for the control and LD datasets, while the proposed IWFO algorithm achieves intra values of 2.41E+04 and 2.85E+03 on these datasets. For the crude oil dataset, the ACO algorithm achieves the minimum intra value (2.47E+02), whereas the proposed IWFO algorithm obtains 2.63E+02. For the CMC dataset, the K-means algorithm achieves the minimum intra-cluster distance (5.59E+03), while the proposed IWFO algorithm obtains the second minimum (5.64E+03) among all algorithms. It is also noticed that, for the LD dataset, the BB-BC algorithm achieves a minimum intra-cluster distance of 1.13E+03, while the proposed IWFO algorithm obtains the second minimum, 1.23E+03.

Table 3 Comparison of the simulation results of the proposed IWFO algorithm and other standard existing clustering algorithms using intra, SD and rank parameters

SD is also an important parameter, verifying the performance of the proposed IWFO algorithm across successive iterations. Analyzing the SD parameter reveals that the proposed IWFO algorithm obtains the minimum SD rate for most datasets compared to the other algorithms: wine (4.34E+01), ionosphere (1.53E+01), vowel (1.41E+01), balance (3.34E+02), crude oil (1.08E+01), CMC (2.14E+01), LD (1.27E+01), cancer (1.46E+02), and thyroid (1.12E+01). For the iris dataset, the ACO algorithm obtains the minimum SD rate (1.31E+00), while the proposed IWFO algorithm obtains 3.20E+00. For the glass dataset, the GA algorithm gets the minimum SD rate (4.14E+00), while the proposed IWFO algorithm achieves 9.98E+00. For the control dataset, the BB-BC algorithm achieves the minimum SD rate (1.09E+02), while the proposed algorithm obtains 1.42E+02. Overall, the proposed algorithm achieves comparable SD results on these datasets. Further, a rank parameter is used to rank each algorithm, based on the minimum intra parameter. Analyzing the rank parameter reveals that the proposed IWFO algorithm obtains the first rank on eight datasets (iris, glass, ionosphere, vowel, balance, CMC, cancer, thyroid), the second rank on the wine, control, and LD datasets, and the third rank on the crude oil dataset. The average rank of the proposed IWFO algorithm over all datasets is 1.5, while the DE algorithm exhibits the worst average rank (7.83) among all algorithms; the average rank of the original WFO algorithm is 3.83. Hence, the proposed algorithm obtains superior clustering results on most of the datasets.

Further, the performance of the proposed IWFO algorithm is also compared with recently developed meta-heuristic algorithms: VS, MBOA, WOA, ICSO, TLBO, CS, GSA, LION, and GWO. The results are evaluated using the intra, SD, and rank parameters, where the intra parameter is utilized to compute the compactness of the data objects within clusters. Table 4 presents the simulation results of the proposed IWFO and the other algorithms based on these parameters. The proposed IWFO algorithm obtains the minimum intra value on most of the datasets, the exceptions being the vowel, balance, crude oil, and thyroid datasets; it obtains the minimum intra values with iris (9.21E+01), wine (1.59E+04), glass (2.03E+02), ionosphere (9.04E+02), CMC (5.64E+03), cancer (1.23E+03), and LD (2.85E+03). For the vowel and thyroid datasets, the ICSO algorithm obtains the minimum intra values (1.55E+05 and 9.90E+02), while the proposed IWFO algorithm obtains 1.56E+05 and 1.27E+03. The TLBO algorithm gets the minimum intra values (5.36E+04 and 2.59E+02) for the balance and crude oil datasets, while the proposed IWFO algorithm achieves 5.78E+04 and 2.63E+02 on these datasets. Analyzing the SD parameter reveals that the proposed IWFO algorithm obtains the minimum SD rate for most of the datasets, except iris, wine, control, and balance, compared to the other algorithms: it gets the minimum SD rate with glass (9.98E+00), ionosphere (1.53E+01), vowel (1.41E+01), crude oil (1.08E+01), CMC (2.14E+01), LD (1.27E+01), cancer (1.46E+02), and thyroid (1.12E+01). For the iris dataset, the WOA algorithm obtains the minimum SD rate (1.06E+00), while the proposed IWFO algorithm obtains 3.20E+00. For the wine dataset, the TLBO algorithm gets the minimum SD rate (2.94E+01), while the proposed IWFO algorithm achieves 4.34E+01. For the control and balance datasets, the VS algorithm achieves the minimum SD rates (1.17E+02 and 1.04E+02), while the proposed algorithm obtains 1.42E+02 and 3.34E+02. For the crude oil dataset, the ABC algorithm gets the minimum SD rate (1.09E+01), while the proposed IWFO algorithm obtains 3.80E+01. Overall, the proposed algorithm achieves comparable SD results on these datasets. Analyzing the rank parameter, based on the minimum intra parameter, reveals that the proposed IWFO algorithm obtains the first rank on most of the datasets except the vowel, CMC, balance, crude oil, and thyroid datasets: it obtains the second rank on the vowel, crude oil, and thyroid datasets, and the third rank on the CMC and balance datasets among all algorithms being compared. Hence, the proposed algorithm obtains better clustering results on most of the datasets based on the intra, SD, and rank parameters.

Table 4 Comparison of the simulation results of the proposed IWFO algorithm and other recently developed clustering algorithms using intra, SD and rank parameters

The efficacy of the proposed IWFO algorithm is also assessed using the AR and DR parameters. The simulation results of the proposed IWFO and the standard existing algorithms, including the original WFO algorithm, are reported in Table 5. The proposed IWFO algorithm achieves superior clustering results on the AR parameter compared to the KM, PSO, ACO, ABC, DE, GA, BB-BC, and BAT algorithms, obtaining better AR results on most of the datasets: iris (0.961), glass (0.717), wine (0.873), ionosphere (0.789), control (0.801), vowel (0.878), balance (0.913), crude oil (0.796), CMC (0.613), LD (0.685), cancer (0.836), and thyroid (0.738). DR is another important parameter for analyzing the performance of clustering algorithms. On the DR parameter, the proposed IWFO algorithm again achieves superior results on most of the datasets: iris (0.968), glass (0.743), wine (0.896), ionosphere (0.804), control (0.819), vowel (0.904), balance (0.932), crude oil (0.823), CMC (0.646), LD (0.746), cancer (0.868), and thyroid (0.786). The proposed IWFO algorithm also obtains better AR and DR rates than the original WFO algorithm. Hence, the proposed IWFO algorithm obtains better AR and DR results than similar classes of algorithms.
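The AR and DR formulas are not restated in this section. As one plausible reading, offered purely for illustration, a cluster-to-class mapping can be found with the Hungarian algorithm, after which AR is the fraction of correctly assigned objects and DR the mean per-class recall under that mapping:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def contingency(true_labels, cluster_labels):
    """Contingency matrix: rows = clusters, columns = classes."""
    classes = np.unique(true_labels)
    clusters = np.unique(cluster_labels)
    counts = np.array([[np.sum((cluster_labels == cu) & (true_labels == cl))
                        for cl in classes] for cu in clusters])
    return classes, counts

def accuracy_rate(true_labels, cluster_labels):
    """AR sketch: match clusters to classes, then score the matched objects."""
    _, cont = contingency(true_labels, cluster_labels)
    row, col = linear_sum_assignment(-cont)   # maximise matched objects
    return cont[row, col].sum() / len(true_labels)

def detection_rate(true_labels, cluster_labels):
    """DR sketch (assumed definition): mean per-class recall under the mapping."""
    classes, cont = contingency(true_labels, cluster_labels)
    row, col = linear_sum_assignment(-cont)
    recalls = [cont[r, c] / np.sum(true_labels == classes[c])
               for r, c in zip(row, col)]
    return float(np.mean(recalls))
```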

Table 5 Simulation results of the proposed IWFO and other standard existing clustering algorithms using AR and DR parameters

Moreover, the performance of the proposed IWFO algorithm is compared with several recent clustering algorithms reported in the literature. The simulation results based on the AR and DR parameters are presented in Table 6. The proposed IWFO algorithm obtains the best AR results on every dataset except wine, control, balance, and cancer. Its AR values are iris (0.961), glass (0.717), wine (0.873), ionosphere (0.789), control (0.801), vowel (0.878), balance (0.913), crude oil (0.796), CMC (0.613), LD (0.685), cancer (0.836), and thyroid (0.738). For the wine dataset, the LION algorithm obtains the highest AR (0.878) among all algorithms, against 0.873 for the proposed IWFO. The GWO algorithm obtains the highest AR (0.814) for the control dataset, against 0.801 for the proposed algorithm. For the balance dataset, the CS algorithm achieves the highest AR (0.917), against 0.913 for the proposed algorithm, and the TLBO algorithm achieves the highest AR (0.921) for the cancer dataset, against 0.836 for the proposed algorithm. Thus, the proposed algorithm obtains comparable AR results on these datasets. Analysis of the DR parameter shows the same pattern: the proposed IWFO algorithm obtains the best DR results on every dataset except wine, control, balance, and cancer. Its DR values are iris (0.968), glass (0.743), wine (0.896), ionosphere (0.804), control (0.819), vowel (0.904), balance (0.932), crude oil (0.823), CMC (0.646), LD (0.746), cancer (0.868), and thyroid (0.786). For the wine dataset, the LION algorithm obtains the highest DR (0.908), against 0.896 for the proposed IWFO; the GWO algorithm obtains the highest DR (0.833) for the control dataset, against 0.819 for the proposed algorithm; the CS algorithm achieves the highest DR (0.938) for the balance dataset, against 0.932 for the proposed algorithm; and the TLBO algorithm achieves the highest DR (0.952) for the cancer dataset, against 0.868 for the proposed algorithm. Finally, the proposed IWFO algorithm achieves superior clustering on most datasets in terms of the intra, SD, rank, AR, and DR parameters.

Table 6 Simulation results of the proposed IWFO and recent clustering algorithms based on AR and DR parameters

Apart from the well-known clustering measures, several researchers also adopt execution time for comparing the performance of clustering algorithms. This work therefore also compares the proposed IWFO and the other clustering algorithms using the execution time parameter. Tables 7, 8 and 9 report the execution time of each clustering algorithm on the different datasets, measured in seconds. Tables 7 and 8 present the execution times of the proposed IWFO and the other clustering algorithms, showing that the proposed IWFO has a competitive execution time. Table 9 reports the average execution time of each technique: the KM algorithm obtains the minimum average execution time across all datasets and therefore ranks first, while the average rank of the proposed IWFO algorithm is 7. Nonetheless, the proposed IWFO has a better average execution time than most of the clustering algorithms.
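The execution-time figures can be reproduced with a simple wall-clock harness. A minimal sketch, where run_clustering is a hypothetical stand-in for any of the compared algorithms:

```python
import time

def timed_run(run_clustering, X, k, runs=10):
    """Average wall-clock execution time (seconds) over independent runs."""
    elapsed = []
    for _ in range(runs):
        start = time.perf_counter()
        run_clustering(X, k)                 # hypothetical clustering call
        elapsed.append(time.perf_counter() - start)
    return sum(elapsed) / len(elapsed)
```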

Table 7 Execution time of clustering algorithms (in seconds)
Table 8 Execution time of clustering algorithms (in seconds)
Table 9 Average ranking of each algorithm based on execution time using all datasets

Figure 2 illustrates how the proposed IWFO clustering algorithm groups the data of the twelve benchmark datasets (iris, glass, wine, ionosphere, control, vowel, balance, crude oil, CMC, LD, cancer, and thyroid) into clusters. The clustering results obtained using the optimal centroids are presented in Fig. 2, which shows that the proposed algorithm allocates the data to the appropriate clusters accurately.

Fig. 2 Demonstrates the different clusters of data in a variety of datasets using the proposed IWFO algorithm

Statistical Test Results

This subsection presents the results of the Wilcoxon signed-rank test and the Friedman test. These statistical tests are used to determine whether the performance of the newly proposed algorithm differs significantly from that of the existing clustering algorithms. The Wilcoxon signed-rank test is a non-parametric test that checks whether two paired samples come from the same population, i.e., whether their mean ranks are equal. In this work, the simulation results of the proposed IWFO are compared with eighteen clustering algorithms, so eighteen pairs of data samples are formed for this test. Two hypotheses are formulated beforehand: H0 states that there is no significant difference between the median/mean values of the paired samples, and H1 states that a significant difference exists. The significance level is set to 0.05. The statistical results of the Wilcoxon signed-rank test are presented in Table 10 in terms of the sum, mean, median, z-value, p-value, and hypothesis (H0) decision. The p-value for each pair of data samples is less than the significance level of 0.05; hence, the null hypothesis (H0) is rejected, and it is concluded that a significant difference exists between the performance of the proposed IWFO and that of the other clustering algorithms.
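A minimal sketch of one such pairwise comparison with SciPy, using the per-dataset IWFO AR values reported above and hypothetical values for a competing algorithm:

```python
from scipy.stats import wilcoxon

# Per-dataset accuracy of IWFO (from Table 5) and one competitor (hypothetical)
iwfo_acc  = [0.961, 0.717, 0.873, 0.789, 0.801, 0.878,
             0.913, 0.796, 0.613, 0.685, 0.836, 0.738]
other_acc = [0.912, 0.652, 0.841, 0.744, 0.779, 0.835,
             0.861, 0.751, 0.571, 0.640, 0.801, 0.695]

stat, p_value = wilcoxon(iwfo_acc, other_acc)
if p_value < 0.05:
    print(f"p = {p_value:.4g} < 0.05: reject H0, the difference is significant")
else:
    print(f"p = {p_value:.4g} >= 0.05: fail to reject H0")
```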

Table 10 Statistical results of the Wilcoxon rank test using proposed IWFO and other clustering algorithms based on accuracy parameter

The performance of the proposed IWFO algorithm is also validated using the Friedman statistical test. This test likewise requires two hypotheses: H0 claims that the performances of the proposed IWFO and the other clustering algorithms are similar, while H1 claims that they are dissimilar. The test first ranks each algorithm on each dataset and then computes the average rank of each algorithm using the Friedman statistic. Table 11 lists the average ranking of each algorithm based on the accuracy parameter. The proposed IWFO algorithm obtains the first rank (1.58), whereas GA exhibits the worst rank (16.04) among all algorithms; the rank of the original WFO algorithm is 6.25. Hence, the proposed IWFO outperforms and differs significantly from the other algorithms.

Table 11 Illustrates the average ranking of each algorithm based on accuracy parameter

Moreover, Table 12 reports the Friedman test statistics at a significance level of 0.05. The degrees of freedom for this test are 18, and the corresponding critical value is 28.8693. The computed Friedman statistic is 114.6065 with a p-value of 4.339e−16. Since the Friedman statistic exceeds the critical value (equivalently, the p-value is far below the 0.05 significance level), the null hypothesis (H0) is rejected, which indicates that the performances of the proposed IWFO and the other clustering algorithms are significantly dissimilar.
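This decision rule can be verified directly from the chi-square distribution with k − 1 = 18 degrees of freedom (scipy.stats.friedmanchisquare can likewise compute the statistic from raw per-dataset results). A minimal sketch:

```python
from scipy.stats import chi2

critical = chi2.ppf(0.95, df=18)      # critical value ≈ 28.8693, as in Table 12
statistic = 114.6065                  # Friedman statistic reported above
p_value = chi2.sf(statistic, df=18)   # ≈ 4.3e-16, matching the reported p-value

# H0 is rejected when the statistic exceeds the critical value
# (equivalently, when the p-value falls below the 0.05 significance level)
print("reject H0" if statistic > critical else "fail to reject H0")
```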

Table 12 Friedman test statistical results using intra-cluster distance (intra) parameter

This work employs two statistical tests (the Wilcoxon signed-rank test and the Friedman test) to validate the performance of the proposed IWFO in the clustering field. These tests evaluate the proposed IWFO algorithm against the existing clustering algorithms and validate it statistically. Their findings show that the proposed IWFO algorithm differs significantly from the existing clustering algorithms. Hence, the proposed IWFO algorithm is an effective algorithm for solving clustering problems, and its efficacy is certified both experimentally and statistically in this work.

Conclusion

This work presents an improved WFO algorithm for handling clustering problems. The source of inspiration for the algorithm is the flow of water. Several improvements are incorporated into WFO to address its shortcomings: the initialization issue is handled by a Gaussian-based chaotic map, nonlinear functions improve the balance between local and global search, and the local optima issue is alleviated by a neighbourhood search mechanism. A set of twelve benchmark datasets is used to examine the performance of the proposed IWFO algorithm. The results are evaluated using well-known clustering measures (intra, rank, SD, AR, and DR) and compared with popular clustering algorithms. The results show that the proposed IWFO algorithm is superior to most of the clustering algorithms and obtains better results on most of the datasets. Further, two statistical tests are adopted to validate the performance of the proposed IWFO algorithm in the clustering field; the statistical results show that the proposed IWFO algorithm is significantly different from the other clustering algorithms. Hence, both the simulation and the statistical results certify the efficacy of the IWFO algorithm in the clustering field, and IWFO can be regarded as an efficient and robust clustering algorithm. Future work will explore IWFO for spherical-shaped clusters, feature engineering, weight optimization, and multi-objective clustering.