1 Introduction

Metaheuristic optimization algorithms have been proposed over the past decades and applied extensively to complicated problems [1, 2]. The essential target of optimization is to find values of the problem variables that minimize or maximize the objective function through global and local search [3, 4]. To reach state-of-the-art results on a given problem, most of such algorithms were designed as approximate techniques for attaining a near-optimal solution [5, 6]. A number of well-known nature-inspired algorithms include the Invasive Weed Optimization (IWO) [7], the Butterfly Optimization Algorithm (BOA) [8], the Artificial Bee Colony (ABC) [9], the Fruit Fly Optimization Algorithm (FOA) [10], the Firefly Algorithm (FA) [11], the Krill Herd (KH) algorithm [12], the Differential Evolution (DE) algorithm [13], the Flower Pollination Algorithm (FPA) [14], etc. The differences in their underlying natural inspiration are an essential reason why these algorithms deliver results of different quality [15, 16]. This factor may also explain why some algorithms can produce a good solution for specific problems while others cannot. Thus, because of this limitation, no single algorithm is good enough to solve every kind of problem.

During the past decades, a mathematical framework and scientific branch, namely chaos, has been developed and connected deeply with different scientific fields. Chaos involves three major dynamic properties: the quasi-stochastic property, sensitivity to initial conditions, and ergodicity. The application of chaos theory to optimization has attracted a lot of attention in recent years. The Chaotic Optimization Algorithm (COA) [17] is among the applications of chaos, and it exploits the nature of chaotic sequences. It has been shown that if random variables are replaced with chaotic variables, the performance of COA can be enhanced. Therefore, the literature contains a number of studies on hybridizing chaos with other algorithms in order to improve their performance. Some instances include the chaotic ACO [18], chaotic DE algorithm [19, 20], chaotic KH algorithm [21, 22], chaotic FPA [23], chaotic genetic algorithm [24, 25], chaotic PSO [26,27,28], chaotic gravitational search [29,30,31], chaotic bat algorithm [32], etc.

FS is the procedure of selecting a subset of features from an original feature set; it may be considered the most important pre-processing instrument for solving classification problems [33]. Finding a superior subset of features is a quite complicated challenge and is decisive for the final classification error rates. The finalized feature subset should retain a high classification accuracy. The purpose is to choose an applicable subset of d features from a set of D features (d < D) in a given dataset [34]. D comprises all features that are present in the dataset; it can contain redundant, noisy, and misleading features. An exhaustive search over the whole solution space usually takes a lot of time and often cannot be carried out in practice. To remedy this, FS strategies aim to maintain the best subset of d relevant features. Inappropriate features are not only useless, but can also worsen the classification performance. If irrelevant features are deleted, the computational efficiency can be improved and the classification accuracy increased.

According to the search techniques for feature subsets, current FS strategies can be classified into two classes: the filter-based approach and the wrapper-based approach. The filter method depends fundamentally on general characteristics of the dataset to assess and choose feature subsets, without involving a specific learning approach. Thus, the effectiveness of this methodology depends predominantly on the dataset itself instead of on the classifier [35, 36]. The wrapper method utilizes a classification algorithm to assess feature subsets and employs a search strategy to look for ideal subsets. It often leads to better results since the wrapper approach involves a classifier in the evaluation or search process [37].

Each meta-heuristic algorithm has a unique search strategy. Meta-heuristic algorithms can find optimal solutions based on their own strategies, such as balancing exploration and exploitation. Furthermore, the VSA has advantages such as a small number of parameters and easy implementation. In this work, the VSA is embedded with chaotic maps to obtain a better compromise between exploitation and exploration. This paper uses hybrid methods based on CMs with the VSA for FS. The major contribution of the current paper is a CM-based model of the VSA proposed to enhance the performance of the VSA. In the proposed methods, a chaotic search is followed to choose the ideal feature subset that maximizes the classification accuracy and minimizes the feature subset length. Ten one-dimensional CMs are adopted and exchanged with the random movement parameters of the VSA. The performance of the proposed methods is tested on 24 benchmark datasets. Similarly, the performance of the VSA is compared with that of seven other metaheuristic algorithms. Based on the mean criterion, the proposed method obtains better solutions using the Tent map in comparison with the other metaheuristic algorithms.

The main contributions of this paper are as follows:

  • VSA and Chaotic Maps are applied to FS.

  • The proposed method converges faster than the other algorithms and achieves better convergence results on different datasets.

  • The proposed method has been evaluated with 24 UCI standard datasets.

  • The best VSA variant is State2 with the VSAC101 mode, obtained by using the Tent map.

  • The proposed method has been tested on author identification datasets.

  • The obtained results confirmed the validity and superiority of the proposed method in comparison to other algorithms.

The organization of this paper is as follows: Sect. 2 gives related works about chaotic and FS. Section 3 provides an introduction to VSA. The detailed description of the proposed method has been provided in Sect. 4, while the experimental results and discussion of the proposed VSA have been provided in Sect. 5. In Sect. 6, the proposed method has been applied on a real application (i.e., author identification). Finally, the conclusion and future work have been discussed in Sect. 7.

2 Related works

The Moth Swarm Algorithm (MSA) is among the most recently developed nature-inspired heuristics for solving optimization problems. However, its shortcoming is a slow convergence rate, and chaos theory has been incorporated into it to eliminate this drawback. In [38], ten CMs were embedded within the MSA to find the ideal number of prospectors and thereby enhance the exploitation of the most promising solutions. The proposed method was applied to seven famous benchmark test functions. The simulation results showed that CMs can enhance the performance of the original MSA with regard to convergence speed. In addition, the sinusoidal map was found to be the best map for enhancing the performance of MSA.

The Cuckoo Search Algorithm (CSA) is a nature-inspired metaheuristic that imitates the obligate brood parasitic behavior of cuckoo species. The method has been proven to have promising performance in solving optimization problems. Chaotic mechanisms were incorporated into CSA to exploit the dynamic features of chaos theory and further improve its search performance. However, in the chaotic CSA (CCSA) [39], only one CM was applied in a single search per iteration, which limited the exploitation capability of the search. The researchers considered utilizing multiple CMs at the same time to perform the local search within the neighborhood of the global best solution found by CSA. To attain this goal, three kinds of multiple chaotic CSAs (MCCSA) were proposed by incorporating several CMs into the chaotic local search (CLS) in parallel, in a random or selective manner. The overall performance of MCCSA was validated using 48 widely used benchmark optimization functions. The experimental results indicated that MCCSAs are generally better than CCSAs, and that MCCSA-P, which uses the CMs in parallel, has the best quality among all sixteen variants of the CSAs.

In [40], a chaos-based Crow Search Algorithm (CCSA) was proposed to solve fractional optimization problems (FOPs). The proposed CCSA integrated chaos theory (CT) into the CSA to refine the global convergence speed and enhance the exploration/exploitation tendencies. CT was utilized to tune the standard CSA parameters, which yielded four versions, and the best chaotic variant was investigated. The incorporation of CT improved the overall performance of the proposed CCSA and allowed the search process to run at higher speed. The performance of the CCSA method was demonstrated on twenty fractional benchmark problems. Furthermore, it was tested on a fractional economic-environmental power dispatch problem by attempting to minimize the ratio of the total emissions to the total fuel cost. Ultimately, the proposed CCSA was compared with the PSO, standard CSA, FA, Dragonfly Algorithm (DA), and GWO. In addition, the efficiency of the proposed CCSA was justified by the non-parametric Wilcoxon signed-rank test. The experimental results showed that the proposed CCSA performs better than similar algorithms with regard to efficiency and reliability.

In [41], a new hybrid algorithm for solving optimization problems based on chaotic ABC and chaotic simulated annealing was proposed. The chaotic ABC explores new locations chaotically, and chaos may additionally improve the exploration of the search space. In effect, the proposed hybrid method combines the local search accuracy of simulated annealing with the global search capabilities of ABC. Moreover, the authors used a distinct method for producing the initial population. The initial population is of great significance for population-based techniques, because it directly influences the rate of convergence and the quality of the results. The method was evaluated on 12 benchmark functions, and the results were compared with those of the artificial bee colony algorithm, the hybrid algorithm of ABC and simulated annealing, and PSO. Simulation results demonstrate the performance of the proposed method.

In [42], an adaptive chaotic Bacterial Foraging Optimization (BFO) is presented. The improved BFO consists of two new features: an adaptive chemotaxis step setting and a chaotic perturbation operation in all chemotactic events. The former yields a fast convergence rate and acceptable convergence accuracy, while the latter allows the search to avoid local optima and attain better convergence accuracy. First, an adaptive exponentially decreasing chemotaxis step is presented, in which the natural exponential function variable is a function of the iteration count and of the nutritive ratio between the current bacterium position and the best bacterium position in each iteration. Second, when each bacterium reaches a new position through swim behavior, a chaotic perturbation is applied to avoid entrapment in local optima. On five benchmark functions, chaotic BFO is shown to perform better than the original BFO and BFO with a linearly decreasing chemotaxis step (BFO-LDC).

Jia et al. [43] proposed an effective memetic DE algorithm (DECLS), which makes use of a CLS with a 'shrinking' strategy; the shrinking strategy for the CLS search space was introduced in that paper. In addition, the local search length was determined dynamically according to the fitness feedback of the objective functions in order to save function evaluations. Furthermore, the parameter settings of the DECLS were adapted during evolution so as to further enhance the optimization efficiency. The combination of DE with a CLS and a parameter adaptation mechanism is very reasonable: the CLS helps improve the optimization performance of the canonical DE by exploring a large search space in the early phases, so as to avoid premature convergence, and by exploiting a small region in later phases to refine the final solutions, whereas the parameter adaptation improves the global optimization quality. To assess the efficiency and effectiveness of the proposed DECLS algorithm, it was compared with four state-of-the-art DE variants and the IPOP-CMA-ES algorithm on a set of 20 selected benchmark functions. The findings showed that the DECLS is significantly superior, or at least comparable, to the other optimizers with regard to convergence performance and solution accuracy. Furthermore, the DECLS was shown to have certain advantages in solving high-dimensional problems.

In [44], a modified DE algorithm based on Opposition-based Learning (OBL) and a chaotic sequence, named the OBL Chaotic DE (OBL-CDE), was proposed. The proposed OBL-CDE algorithm differs from the basic DE in two ways: the first is the generation of the initial population following the OBL rules, while the second is the dynamic adaptation of the scaling factor F through the chaotic sequence. The numerical results obtained by the OBL-CDE on 18 benchmark functions, compared to the results of the DE and opposition-based DE algorithms, indicated that the OBL-CDE is capable of finding superior solutions while maintaining reasonable convergence rates.

The standard Glowworm Swarm Optimization (GSO) shows poor global search ability and easily gets trapped in local optima. A Quantum GSO algorithm based on CMs was proposed [45] in order to solve such problems. First of all, a chaotic sequence was generated to initialize the population. This results in a higher probability of covering more local optimal areas and provides the ground for further optimization and tuning. Next, quantum behavior was applied to the elite population, which made it possible for individuals to locate any position of the solution space randomly with a certain probability; this greatly enhanced the capability of the algorithm in global search and in avoiding local optima. Finally, it adopted single-dimension loop swimming instead of the original fixed-step movement mode. This not only improved the solution precision and convergence speed, but also removed the GSO's excessive sensitivity to the step size, indirectly enhancing the robustness of the algorithm. The simulation results indicated that the proposed method was feasible and effective.

The Fruit Fly Optimization Algorithm (FOA) has recently been proposed as a metaheuristic technique inspired by the behavior of fruit flies. Mitic et al. [46] improved the standard FOA by introducing a novel parameter in combination with chaos. The performance of this chaotic FOA (CFOA) was studied on ten famous benchmark problems using 10 different CMs. In addition, comparison studies with the basic FOA, FOA with Levy flight distribution, and other recently published chaotic algorithms were made. Statistical findings on each optimization task showed that the CFOA achieves a very high convergence rate. In addition, CFOA was compared with recently developed chaos-enhanced algorithms such as the chaotic bat algorithm, chaotic accelerated PSO, chaotic FA, chaotic ABC, and chaotic CSA. The findings generally indicate that FOA with the Chebyshev map shows superiority over similar methods in terms of the reliability of global optimality and the algorithm success rate.

In addition, Gandomi et al. [47] proposed a chaos-enhanced version of the accelerated PSO. Some other instances of chaos-enhanced metaheuristic algorithms include the chaotic Genetic Algorithm [48], Chaotic PSO [49, 50], Chaotic Salp Swarm Algorithm [51], Chaotic Elephant Herding Optimization (EHO) algorithm [52], Chaotic Bat Algorithm [53], Chaotic FOA [46], Chaotic GSO Algorithm [45, 54], Chaotic Black Hole algorithm [55], Chaotic Simulated Annealing PSO Algorithm (CSAPSO) [56], Chaotic Social Spider Optimization Algorithm [57], Chaotic Bean Optimization Algorithm [58], Chaotic Quantum CSA [59], Chaotic Antlion Algorithm [60], Chaotic Hybrid Cognitive Optimization Algorithm [61], Chaotic Simulated Annealing [62], Chaotic-Based Quantum Genetic Algorithm [63], Chaotic Teaching Learning Algorithm [64], Chaotic DE algorithm [65], Chaotic Grey Wolf Optimization Algorithm [66], Chaotic Fractal Search [67], Chaotic Brain Storm Optimization Algorithm [68], Multi-Objective CCSA [69], Chaotic Grasshopper Optimization Algorithm [70], Chaotic Krill Herd [21, 71, 72], Chaotic DE [73], Chaotic Firefly Algorithm [74, 75], Chaotic Starling PSO Algorithm [76], Chaotic CCSA [77], Chaotic Grey Wolf Optimization Algorithm [78], etc. Table 1 shows a comparison of different models of meta-heuristic algorithms based on chaotic maps.

Table 1 A Comparison of Different Models of Meta-heuristic Algorithms based on Chaotic Maps

3 Vortex search algorithm

The VSA is a recent metaheuristic optimization algorithm inspired by the vortex pattern created by the vertical flow of stirred fluids. Its process consists of simple generation phases, similar to other single-solution algorithms. At each iteration, the VSA generates a new population of candidate solutions entirely from the current single solution. Balancing exploration and exploitation of the search space during every update is an essential part of a single-solution algorithm. In the VSA, this balance is achieved by using a vortex-like search pattern, which is simulated through a number of nested circles. The details of the VSA can be briefly described in four steps as follows [79].

3.1 Generating the initial solution

The preliminary procedure initializes the 'center' μ0 and the 'radius' r0. In this phase, the initial center (μ0) can be calculated using Eq. (1).

$$ \mu_{0} = \frac{upperlimit + lowerlimit}{2} $$
(1)

where \(upperlimit\) and \(lowerlimit\) are the bound constraints of the problem, defined as vectors in the d × 1 dimensional space. In addition, the initial radius r0, denoted σ0, is generated with Eq. (2).

$$ \sigma_{0} = \frac{{\max \left( {upperlimit} \right) - {\text{min}}\left( {lowerlimit} \right)}}{2} $$
(2)

3.2 Generating the candidate solutions

The procedure of producing candidate solutions renders the generation of populations \(C_{t} \left( s \right)\) at each iteration, where t is the iteration index. The candidate solutions are randomly generated around the initial center μ0 by using a Gaussian distribution, where \(C_{0} \left( s \right) = \left\{ {s_{1} ,s_{2} , \ldots ,s_{m} } \right\},\; m = 1,2,3, \ldots ,n\) represents the solutions and n is the total number of candidate solutions. The multivariate Gaussian distribution is given in Eq. (3).

$$ p\left( {x|\mu ,\Sigma } \right) = \frac{1}{{\sqrt {\left( {2\pi } \right)^{d} \left| \Sigma \right|} }}\exp \left\{ { - \frac{1}{2}\left( {x - \mu } \right)^{T} \Sigma^{ - 1} \left( {x - \mu } \right)} \right\} $$
(3)

In Eq. (3) d indicates the dimension, while x is the d × 1 vector of a random variable, μ indicates the d × 1 vector of the sample mean (i.e., center), and Σ indicates the covariance matrix. Equation (4) indicates that when the diagonal elements (i.e., variances) of the Σ values are equal and the off-diagonal elements (i.e., covariance) equal zero (uncorrelated), the resulting shape of the distribution will be spherical. Thus, the value of Σ is computed through utilizing equal variances with zero covariance.

$$ \Sigma = \sigma^{2} \cdot \left[ I \right]_{d \times d} $$
(4)

where, in the representation of Eq. (4), σ2 is the variance of the distribution, I represents the \(d \times d\) identity matrix, and σ0 is the initial radius (r0), as can be seen in Eq. (2).
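As an illustration of Eqs. (1)–(4), the following sketch draws an initial candidate set around the initial center. The function and variable names, as well as the two-dimensional example bounds, are ours and not part of the original description.

```python
import numpy as np

def initial_center_and_radius(lower, upper):
    mu0 = (upper + lower) / 2.0                      # Eq. (1)
    sigma0 = (np.max(upper) - np.min(lower)) / 2.0   # Eq. (2)
    return mu0, sigma0

def generate_candidates(mu, sigma, n_candidates):
    d = mu.shape[0]
    cov = (sigma ** 2) * np.eye(d)                   # Eq. (4): spherical covariance
    return np.random.multivariate_normal(mu, cov, size=n_candidates)  # Eq. (3)

lower = np.array([-5.0, -5.0])
upper = np.array([5.0, 5.0])
mu0, sigma0 = initial_center_and_radius(lower, upper)
C0 = generate_candidates(mu0, sigma0, n_candidates=50)   # C_0(s)
```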

3.3 Replacement of the current solution

The replacement of the current solution is conducted during the selection process. The best solution \(s^{\prime} \in C_{0} \left( s \right)\) is selected and memorized from \(C_{0} \left( s \right)\) in order to replace the current circle center (μ0). Before the selection process, it must be ensured that the candidate solutions are inside the search space (Eq. (5)).

$$ s_{k}^{i} = \left\{ {\begin{array}{*{20}l} {rand \cdot \left( {upperlimit^{i} - lowerlimit^{i} } \right) + lowerlimit^{i} ,} & {s_{k}^{i} < lowerlimit^{i} } \\ {s_{k}^{i} ,} & {lowerlimit^{i} \le s_{k}^{i} \le upperlimit^{i} } \\ {rand \cdot \left( {upperlimit^{i} - lowerlimit^{i} } \right) + lowerlimit^{i} ,} & {s_{k}^{i} > upperlimit^{i} } \\ \end{array} } \right. $$
(5)

where \(k = 1, 2, \ldots , n\), \(i = 1, 2, \ldots , d\), and rand is a uniformly distributed random number. The VSA uses \(s^{\prime}\) as the new center, reduces the vortex size, and generates the next solutions around it using Eq. (3). Thus, the new set of solutions \( C_{1} \left( s \right)\) can be generated. If the chosen solution is better than the best solution found so far, it is taken as the new best solution and memorized.
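A minimal sketch of this replacement step, assuming a minimization problem: components that violate Eq. (5) are re-sampled uniformly inside the bounds, and the best candidate becomes the new center \(s^{\prime}\). The names are illustrative.

```python
import numpy as np

def shift_into_bounds(candidates, lower, upper):
    """Eq. (5): re-sample out-of-range components uniformly inside the bounds."""
    out = (candidates < lower) | (candidates > upper)
    resampled = np.random.rand(*candidates.shape) * (upper - lower) + lower
    return np.where(out, resampled, candidates)

def select_new_center(candidates, objective):
    """Pick the best candidate (assuming minimization) as the new center s'."""
    fitness = np.apply_along_axis(objective, 1, candidates)
    return candidates[np.argmin(fitness)]
```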

3.4 The radius decrement process

In the VSA, the inverse incomplete gamma function is applied for the purpose of decreasing the radius value during each iteration pass. The incomplete gamma function provided in Eq. (6) often arises in probability theory, especially in applications that involve the chi-square distribution.

$$ \gamma \left( {x,a} \right) = \int_{0}^{x} {e^{ - t} t^{a - 1} dt} ,\quad a > 0 $$
(6)

where a > 0 is the shape parameter while x ≥ 0 is a random variable. Similar to the incomplete gamma function, its complementary \(\Gamma \left( {x,a} \right)\) is usually also introduced (Eq. (7)); the two functions satisfy \(\gamma \left( {x,a} \right) + \Gamma \left( {x,a} \right) = \Gamma \left( a \right)\), where \(\Gamma \left( a \right)\) is the complete gamma function.

$$ \Gamma \left( {x,a} \right) = \int_{x}^{\infty } {e^{ - t} t^{a - 1} dt} ,\quad a > 0 $$
(7)
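The following is a hedged sketch of the radius-decrement step. The schedule (resolution x = 0.1 and a shape parameter that decreases linearly over the iterations) follows our reading of the original VSA [79] and should be treated as an assumption; scipy's gammaincinv inverts the regularized lower incomplete gamma function.

```python
from scipy.special import gammaincinv

def radius_at_iteration(sigma0, t, max_iter, x=0.1):
    # shape parameter decreases from ~1 towards 0 as iterations proceed;
    # keep it strictly positive to stay in the valid domain
    a_t = max(1.0 - t / max_iter, 1e-3)
    # gammaincinv(a, y) returns v such that P(a, v) = y, i.e. it inverts
    # the regularised lower incomplete gamma function
    return sigma0 * (1.0 / x) * gammaincinv(a_t, x)

# the radius shrinks from roughly sigma0 at t = 0 towards 0 at the last iteration
```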

Table 2 describes the pseudocode of the VSA.

Table 2 A description of the VSA algorithm

4 Proposed methods

In this section, the hybrid form of the VSA and CMs will be explained. The basic form of the VSA rests on two key quantities: the center and the radius. First, the center is the current position from which the VSA is evaluated within the problem search space as iterations pass; the best solution found so far is used as the 'center' that replaces the current position of the population. Secondly, the 'radius' is the mechanism that simplifies the problem, turning a large-radius problem into a small-radius one as the search proceeds. In addition, a Gaussian distribution is used in the VSA to balance exploration and exploitation at every iteration. However, the VSA uses only a single center, i.e., a single strategy to generate candidate solutions around the current best solution. As a consequence, the VSA can miss the global optimum when it faces problems that have several local minima. At the same time, the radius used to update the best solution decreases with the iterations through the Gaussian distribution, which makes it easier for the VSA to become trapped in local optima. These are the main drawbacks of the VSA. The present study focuses on hybridizing the VSA with CMs. This hybridization is referred to as the chaotic VSA, in which 10 CMs are used. These 10 maps are used in three different locations of the VSA [74]. Figure 1 shows the flowchart of the proposed method. In the first step, the parameters are initialized. In the second step, Eqs. (9), (10), and (11) of the VSA are optimized based on the chaotic maps for FS. In the third step, the samples are classified and, at the end, the accuracy percentage is displayed.

Fig. 1 Flowchart of the proposed method

In the proposed model, we combine the CM formulas of Table 3 with Eqs. (3), (5), and (6). The goal is to find the best CMs to optimize the VSA. These places can be expressed as follows:

Table 3 CMs and proposed methods
State 1:

The production of candidate solutions inside the search circle [Eq. (9)].

State 2:

If a solution is out of range, the chaotic maps are used to move it back into the desired range [Eq. (10)].

State 3:

The search radius is reduced using the inverse incomplete gamma function and CMs [Eq. (11)].

In Table 3, the CM formulas and the proposed methods are shown. The optimization of the VSA is carried out using three methods (State1, State2, and State3). In each method, 10 CMs are used. Thus, in each run there are 30 different modes for a given dataset.

Chaos is described as a phenomenon in which any change in the initial conditions may cause a non-linear change in future behavior. Chaos optimization is one of the optimization models for search algorithms. The primary idea behind it is to map parameters/variables from the chaotic space to the solution space. Its search for the global optimum relies on properties of chaotic motion such as ergodicity, regularity, and quasi-stochasticity. The major advantages of chaos are fast convergence and the ability to avoid local minima. CMs are deterministic; no random factors are applied. In this paper, 10 well-known non-invertible one-dimensional maps were adopted to obtain chaotic sets. The adopted CMs are defined in Table 3, where q denotes the index of the chaotic sequence p, and \(p_{q}\) is the \(q^{th}\) number in the chaotic sequence. The remaining parameters, including d, c, and μ, are the control parameters determining the chaotic behavior of the dynamic system. The initial point p0 was set to 0.7 for all CMs, as the initial values of CMs may greatly influence their fluctuation patterns. In this paper, ten different CMs were applied for the optimization process. These maps are Chebyshev, circle, gauss/mouse, iterative, logistic, piecewise, sine, singer, sinusoidal, and tent [74].
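For illustration, the sketch below iterates two of the maps from Table 3 (logistic and tent) from the paper's initial point p0 = 0.7; the exact map constants used here are common textbook values and may differ slightly from Table 3.

```python
def logistic_map(p):
    return 4.0 * p * (1.0 - p)

def tent_map(p):
    # a tent-map form often used in chaotic metaheuristics; note that a start
    # of exactly 0.7 collapses this particular form, so perturb p0 slightly for it
    return p / 0.7 if p < 0.7 else (10.0 / 3.0) * (1.0 - p)

def chaotic_sequence(map_fn, length, p0=0.7):
    seq, p = [], p0
    for _ in range(length):
        p = map_fn(p)
        seq.append(p)
    return seq

cm = chaotic_sequence(logistic_map, 100)   # chaotic values in (0, 1)
```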

Descriptions of State 1, State 2, and State 3 are as follows:

State 1:

The VSA generates candidate solutions using just a single 'center' (μ). The center is then transformed into a new center as iterations pass, within the upper and lower bounds of the problem. This mechanism has some drawbacks; one of them is that the VSA tends to become trapped in local minima when facing problems with many local minima. To overcome this, the chaotic generation of candidate solutions for the VSA was proposed.

In this method, chaotic maps are used to generate the candidate solutions. Several neighbor solutions \( C_{t} \left( s \right)\) (t indicates the iteration index; t = 0 at the initial stage) are generated randomly around the initial center µ0 in the d-dimensional space by using a Gaussian distribution and CMs. Here, \(C_{0} \left( s \right) = \left\{ {s_{1} ,s_{2} , \ldots ,s_{m} } \right\},\; m = 1,2,3, \ldots ,n\) represents the solutions, and n represents the total number of candidate solutions. Equation (9) gives the formula of the proposed method.

$$ p\left( {x|\mu } \right) = \frac{1}{{\sqrt {\left( {2\pi } \right)^{d} } }}\exp \left\{ { - \frac{1}{2}\left( {cm - \mu } \right)^{T} \Sigma^{ - 1} \left( {cm - \mu } \right)} \right\} $$
(9)

where d represents the dimension, cm is the \(d \times 1\) vector of chaotic map values, µ is the d × 1 vector of the sample mean (center), and Σ is the covariance matrix.
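One plausible reading of Eq. (9) is sketched below: the chaotic sequence scales the Gaussian offsets when candidates are drawn around the current center. The exact coupling between the map value cm and the Gaussian term is our assumption, not a statement of the original implementation.

```python
import numpy as np

def generate_candidates_state1(mu, sigma, n_candidates, cm_values):
    d = mu.shape[0]
    offsets = np.random.multivariate_normal(np.zeros(d),
                                            (sigma ** 2) * np.eye(d),
                                            size=n_candidates)
    # each Gaussian offset is scaled by a chaotic value instead of a uniform one
    cm = np.asarray(cm_values[:n_candidates]).reshape(-1, 1)
    return mu + cm * offsets
```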

State 2:

If a solution is out of range, the chaotic maps are used to move it back into the desired range. During the selection phase, the best solution \(s^{\prime} \in C_{0} \left( s \right)\) is selected and memorized from C0(s) in order to replace the current circle center µ0. Before the selection phase, it must be ensured that the candidate solutions are inside the search boundaries. To attain this goal, the solutions that exceed the boundaries are shifted back into the boundaries, as in Eq. (10), which describes the VSA combined with chaotic sequences. In Eq. (10), \(Cm_{i}\) is the value of the chaotic map at the corresponding iteration.

$$ s_{k}^{i} = \left\{ {\begin{array}{*{20}l} {Cm_{i} \cdot \left( {{\text{upperlimit}}^{i} - {\text{lowerlimit}}^{i} } \right) + {\text{lowerlimit}}^{i} ,} & {s_{k}^{i} < {\text{lowerlimit}}^{i} } \\ {s_{k}^{i} ,} & {{\text{lowerlimit}}^{i} \le s_{k}^{i} \le {\text{upperlimit}}^{i} } \\ {Cm_{i} \cdot \left( {{\text{upperlimit}}^{i} - {\text{lowerlimit}}^{i} } \right) + {\text{lowerlimit}}^{i} ,} & {s_{k}^{i} > {\text{upperlimit}}^{i} } \\ \end{array} } \right. $$
(10)
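A short sketch of Eq. (10): out-of-range components are shifted back using the chaotic value \(Cm_{i}\) in place of the uniform random number of Eq. (5). The names are illustrative.

```python
import numpy as np

def shift_into_bounds_chaotic(candidates, lower, upper, cm_value):
    out = (candidates < lower) | (candidates > upper)
    shifted = cm_value * (upper - lower) + lower   # Eq. (10)
    return np.where(out, shifted, candidates)
```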
State 3:

The search radius is reduced using the inverse incomplete gamma function and CMs. In the VSA, the inverse incomplete gamma function is used to decrease the value of the radius at each iteration. The chaotic version of the incomplete gamma function is given in Eq. (11).

$$ \gamma \left( {x,a} \right) = \int_{0}^{cm} {e^{ - t} t^{a - 1} dt} ,\quad a > 0 $$
(11)

where a > 0 is known as the shape parameter and cm ≥ 0 is a CMs variable.
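A sketch of Eq. (11), under the same assumptions as the radius sketch in Sect. 3.4: the chaotic value cm replaces the fixed resolution parameter when the radius is shrunk.

```python
from scipy.special import gammaincinv

def radius_state3(sigma0, t, max_iter, cm_value):
    a_t = max(1.0 - t / max_iter, 1e-3)
    cm = min(max(cm_value, 1e-3), 1.0 - 1e-3)   # keep the chaotic value inside (0, 1)
    return sigma0 * (1.0 / cm) * gammaincinv(a_t, cm)
```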

In the current study, the chaotic VSA has been implemented as an FS algorithm based on the wrapper method. In VSA, a chaotic sequence is embedded in the search iterations, and the optimal feature subset that describes the dataset is selected using VSA. The FS strategy is aimed at improving the classification efficiency, reducing the length of feature subset, and reducing the computational costs.

4.1 Fitness function

At each iteration, every position is evaluated using a special fitness function Fit. The data are randomly divided into separate parts, namely training and testing datasets, using the m-fold technique. Two criteria are used for the assessment: the classification accuracy and the number of selected features. The adopted fitness function combines the two criteria into one by means of a weight factor, as in Eq. (12). a is the classification accuracy, calculated by dividing the number of correctly classified instances by the total number of instances. K-nearest neighbor (KNN) [80] is the classifier used, with k equal to three and the absolute (Manhattan) distance. KNN is a supervised learning algorithm that classifies a new instance based on its distances to the training samples. The KNN classifier predicts the class of the testing sample by calculating and sorting the distances between the testing sample and each of the training samples. This process is repeated until each datum in the dataset has been selected once as the testing sample. The classification accuracy of a feature subset is the ratio of the number of samples that have been predicted correctly to the number of all samples. In this paper, KNN has been used for determining the fitness of the selected features. The values of K and the distance measure were decided by trial and error. Ls is the length of the selected feature subset, Ln is the total number of features, and β is the weight factor, which takes a value in [0, 1]. β controls the relative importance of the classification accuracy and the number of selected features. Since improving accuracy is the primary goal for any classifier, the weight factor is usually set to values near 1 [81]. In this paper, β was set to 0.8. The best solution maximizes the classification accuracy and minimizes the number of selected features [81].

$$ Fit = maximize\left( {a + \beta \times \left( {1 - \frac{{L_{s} }}{{L_{n} }}} \right)} \right). $$
(12)
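A hedged sketch of this wrapper fitness: the accuracy of a 3-NN classifier is combined with a penalty on the subset length (β = 0.8 as in the paper). The cross-validation details are illustrative; the paper's m-fold split may differ.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def fitness(feature_mask, X, y, beta=0.8, folds=5):
    selected = np.flatnonzero(feature_mask)
    if selected.size == 0:
        return 0.0                               # reject empty subsets
    knn = KNeighborsClassifier(n_neighbors=3)
    acc = cross_val_score(knn, X[:, selected], y, cv=folds).mean()   # term a
    ls, ln = selected.size, X.shape[1]           # L_s and L_n
    return acc + beta * (1.0 - ls / ln)          # Eq. (12), larger is better
```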

5 Result and discussion

In this section, first a summary of the main characteristics of the implemented datasets will be discussed. Second, the proposed methods (State1, State2 and State3) using different CMs will be investigated. Third, comparisons will be made between VSA and the proposed method based on FS. Finally, to emphasize the advantages of the proposed method compared to other algorithms, different experiments will be described and the obtained results will thoroughly be discussed.

5.1 Datasets description

Twenty-four benchmark datasets of different types, including medical/biology and business, were used in the experiments. Four datasets (21, 22, 23, and 24) are related to the identification and classification of text authors. The datasets were collected from the UCI machine learning repository [82]. A short description of each of the adopted datasets is presented in Table 4. As can be observed, the used datasets contain missing values in some records. In the current study, every missing numerical value was replaced by the median of all known values of that feature within the same class. The mathematical definition of the median method is given in Eq. (13), where \(s_{i,j}\) is the missing value of the \(j^{th}\) feature of the \(i^{th}\) sample of a given class W. For missing categorical values, the most frequent value of the feature within the given class replaces the missing value.

$$ \overline{s}_{i,j} = \mathop {median}\limits_{{i:\,s_{i,j} \in W}} \left( {s_{i,j} } \right) $$
(13)
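A small sketch of this class-wise median imputation: a missing value of a feature is replaced by the median of the known values of that feature within the same class. The pandas usage and names are illustrative.

```python
import pandas as pd

def impute_by_class_median(df, class_col):
    feature_cols = [c for c in df.columns if c != class_col]
    df[feature_cols] = (df.groupby(class_col)[feature_cols]
                          .transform(lambda s: s.fillna(s.median())))
    return df
```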
Table 4 Dataset description

Four statistical measurements—the worst, the best, and the mean fitness value, together with the standard deviation (SD)—were adopted. In the current study, these measures were used to evaluate the performance of each CM and to determine the best one. The worst, the best, the mean fitness value, and the SD are mathematically defined as follows:

$$ Best = \mathop {\max }\limits_{i = 1}^{tMax} BS_{i} $$
(14)
$$ Worst = \mathop {\min }\limits_{i = 1}^{tMax} BS_{i} $$
(15)
$$ Mean = \frac{1}{tMax}\mathop \sum \limits_{i = 1}^{tMax} BS_{i} $$
(16)
$$ SD = \sqrt {\frac{{\mathop \sum \nolimits_{i = 1}^{tMax} \left( {BS_{i} - Mean} \right)^{2} }}{tMax}} $$
(17)

where \(BS_{i}\) is the best score obtained so far at iteration i and tMax is the maximum number of iterations.
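The four statistics of Eqs. (14)–(17) reduce to simple aggregates over the collected best scores; a short numpy version is sketched below with illustrative values.

```python
import numpy as np

bs = np.array([0.91, 0.93, 0.90, 0.94])        # illustrative best scores BS_i
best, worst = bs.max(), bs.min()               # Eqs. (14)-(15)
mean, sd = bs.mean(), bs.std()                 # Eqs. (16)-(17), population SD
```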

5.2 Analysis and discussion

For the evaluation of the methods on the different datasets, four criteria (worst, best, mean, and SD) have been used. In Table 5, the 30 modes and the VSA are investigated with respect to these criteria. The Proposed Method (State1) corresponds to modes VSAC11 to VSAC101, the Proposed Method (State2) to modes VSAC12 to VSAC102, and the Proposed Method (State3) to modes VSAC13 to VSAC103. With regard to the results, it can be stated that the Proposed Method (State2) gives better results. The Proposed Method (State2) with the VSAC101 mode, which uses the Tent map, offers better results than the other modes. The main target of this test is to evaluate the efficiency of the VSA with different chaotic maps and to identify the optimal chaotic map (Tables 6, 7, 8).

Table 5 Comparison of results of methods with VSA
Table 6 Comparison of results of methods with VSA (continuance)
Table 7 Comparison of results of methods with VSA (continuance)
Table 8 Comparison of Results of methods with VSA (continuance)

5.3 Comparisons between VSA and proposed method based on FS

In Table 9 and Fig. 2, the results of the VSA and the Proposed Method based on FS are shown. We chose the Proposed Method for FS because it had a high percentage of accuracy. Based on the results, it can be said that the Proposed Method is better than the VSA on 19 datasets.

Table 9 Results of the VSA and the proposed method based on the FS
Fig. 2 a Feature count (total). b Accuracy (%). c Time (s)

5.4 Comparison and evaluation

A comparison of the Proposed Method with the GA, PSO, ABC, BOA, IWO, FPA, and FA algorithms has been performed to evaluate its efficiency. In Table 10, the control parameters of the algorithms are given.

Table 10 Control parameters of the algorithms

The comparison of the Proposed Method with the PSO, ABC, BOA, IWO, GA, FA, FPA, and VSA was performed according to the worst criterion. According to Table 11 and Fig. 3, it is clear that the results of the other algorithms are worse than those of the Proposed Method.

Table 11 Comparison of the proposed method with other algorithms based on the worst criterion
Fig. 3 Result of comparison: a the worst criterion, b the best criterion, c the mean criterion

In Table 12, the comparison of the Proposed Method with PSO, ABC, BOA, IWO, GA, FA, FPA, and VSA was performed based on the best criterion. According to Table 12 and Fig. 3, it is clear that the results of the Proposed Method are better than those of the other algorithms.

Table 12 Comparison of the proposed method with other algorithms based on the best criterion

In Table 13, the comparison of the Proposed Method with PSO, ABC, BOA, IWO, GA, FA, FPA, and VSA was performed based on the mean criterion. According to Table 13 and Fig. 3, it is clear that the results of the Proposed Method are better than those of the other algorithms.

Table 13 Comparison of the Proposed Method with other algorithms based on the mean criterion

To sum up, the results and discussion of this paper demonstrate that integrating CMs into the VSA is definitely beneficial. The reason why the Proposed Method outperforms all the other algorithms is that the Tent chaotic map helps the algorithm to strongly emphasize exploration in the initial steps of the optimization and to reduce the search radius afterwards.

6 Real application: author identification

Author identification is a \(stylometric\) problem that tries to attribute a text to its original author [85, 86]. With the ever-increasing volume of documents uploaded to the internet, new methods for analyzing and extracting data and knowledge are needed. In order to prevent plagiarism and the copying of copyrighted materials, the best solution is authorship identification. Every writer has his/her own writing style, and this style can be identified in other papers [87]. Authorship identification is one of the up-to-date problems in the field of natural language processing. Author identification is an effort to reveal the writer's personal characteristics based on a piece of linguistic information [88], such that manuscripts written by different authors can be distinguished. Humans possess certain writing patterns when using a language, which act like fingerprints of the writer (writer print); these patterns are specific to the writers [89].

Authors in [90] have proposed an approach known as the \(stylometric\) approach to deal with the problem of Author Identification. There are four different steps in this approach:

  • Calculation of word frequencies to find the most frequent words in the entire corpus.

  • Calculation of normalized frequencies. This is done by dividing the frequency of the most frequent word in a document by the total number of words in the entire corpus.

  • Using Z-score method.

  • Calculation of distance table by finding distance between two matrices.

Since the text is converted into a numeric representation (feature extraction), machine learning classification and clustering techniques can be applied to it. The Reuter_50_50 dataset is used for the experiments. There are 50 authors and 50 documents per author in this dataset; thus, the training corpus and the test corpus each contain 2500 texts, and the two corpora do not overlap. By applying the \(stylometry\) approach and n-gram features to the author identification problem, an accuracy of about 85% is achieved with the SVM classifier, which is higher than that of the Delta and KNN classifiers.

The Dissimilarity Counter Method (DCM), DCM-Voting, and DCM-Classifier have been applied in [91] to the problem of author identification. Once the representation spaces are selected, similarity measures such as the Euclidean distance, the correlation coefficient, and the cosine similarity can be used to compare the documents; the document author is then identified using one of the above-mentioned approaches (DCM, DCM-Voting, or DCM-Classifier). DCM only uses the similarities between the vector representations of documents in one space to solve a problem p of P. In the other two DCM-based approaches, it is possible to combine different representation spaces. In the DCM-Voting approach, this is done using a voting technique, and in the DCM-Classifier, it is performed through a supervised learning method which requires the definition of predictive features. In the evaluation of the PAN-CLEF 2013 challenge, the DCM-Classifier achieved the best performance only on the Greek corpus with 85%, and the other approaches obtained results that were the best, or equivalent to those of the winner of the competition, for all evaluation measures (F1, precision, and recall) on all the corpora.

The General Impostors Method (GENIM), which took part in the PAN'13 authorship identification competition, has been evaluated in [92]. The basis of this model is a comparison between the given documents and a number of external (impostor) documents, and since there are two stages in the method, the performance had to be measured and the parameters optimized at each step. 25–33 percent of the training documents of each language were used for measuring and optimizing the IM, whereas the rest were used for the evaluation of GENIM. For the IM evaluation set, 3 or 4 documents were used as seed documents to retrieve the web impostors. The test accuracy is 75.3%.

In [93], blocks containing 140, 280, and 500 characters were investigated. The feature set contains conventional features such as syntactic, lexical, and application-specific features, as well as some new features extracted from n-gram analysis. Moreover, the proposed approach has a mechanism for handling issues related to unbalanced datasets. It uses a Support Vector Machine (SVM) for data classification and Information Gain and Mutual Information as FS strategies. The proposed approach was evaluated experimentally using the Enron email and Twitter corpora. The results of this evaluation were very promising, including an Equal Error Rate (EER) ranging between 9.98% and 21.45% for the different block sizes.

In [94], a model for email authorship identification (EAI) is presented using a cluster-based classification approach. The contributions of that paper are as follows: (a) developing a new model for email authorship identification; (b) evaluating the use of additional features together with basic \( {\text{stylometric}}\) features for email authorship identification, as well as content features based on Info-Gain FS. On the Enron dataset, the proposed model achieved accuracies of 94, 89, and 81 percent for 10, 25, and 50 authors, respectively, whereas on a real email dataset constructed by the authors it attained an accuracy of 89.5%.

A large number of studies focus only on enhancing predictive accuracy and do not pay much attention to the intrinsic value of the collected evidence. In [95], a customized associative classification approach, a well-known data mining technique, is applied to the authorship attribution problem. This method models the features of writing style that are unique to a person, measures the associativity level of these features, and generates an intuitive classifier. That research also concluded that a more accurate write print can be obtained by modifying the rule pruning and ranking system described in the popular Classification by Multiple Association Rule (CMAR) algorithm. More convincing evidence can be provided for a court of law by eliminating patterns common among different authors, since this leads to fairly unique and easy-to-understand write prints. Since this customized associative classification method is helpful in solving the problem of e-mail authorship attribution, it can be used as a powerful tool against cybercrime. The effectiveness of the presented approach is verified by the experimental results [95].

An effort is made by the authors in [96] to identify the authors of articles written in Arabic. They introduced a new dataset composed of 12 features and 456 samples from 7 authors. Furthermore, to distinguish the different authors from each other, powerful classification techniques were combined with the proposed dataset in their approach. The obtained results revealed that the proposed dataset was very successful and achieved a classification accuracy of 82% in the hold-out tests. They also conducted experiments with two well-known classifiers, namely the SVM and functional trees (FT), in order to show the efficiency of the proposed feature set. They reported an accuracy of 82% with the FT approach and holdout testing, which confirmed the robustness of the proposed feature set. Moreover, an accuracy of 100% was achieved in one of the classes. They also conducted tests on FT using tenfold cross-validation, and the proposed approach retained its accuracy to some extent.

One of the classifiers that has been extensively used for language processing is the Naive Bayes classifier. Nevertheless, the event model used, which can remarkably affect the classifier performance, is not often mentioned. Naive Bayes (NB) classifiers had never been used for authorship attribution in Arabic; thus, in [97] these classifiers were applied to this problem, taking into consideration various event models such as simple NB, multinomial NB (MNB), multi-variant Bernoulli NB (MBNB), and multi-variant Poisson NB (MPNB). The MBNB probability estimation depends on whether a feature exists or not, whereas in MNB and MPNB the probability estimation depends on the frequency of the feature. The mean and standard deviation of the features form the basis of the probability estimation in the NB model. The performances of these models are evaluated using a large Arabic dataset taken from books written by 10 different authors, and then compared with other methods. The obtained results reveal that MBNB outperforms the other techniques and is able to identify the author of a text with an accuracy of 97.43%. In addition, these results show that MNB and MBNB can be considered a good choice for authorship attribution [97].

In [98], authorship identification methods were applied to messages from Arabic web forums. In this study, syntactic, lexical, structural, and content-specific writing style features were used to identify the authors. Some of the problematic characteristics of the Arabic language were addressed in order to present a model with an acceptable degree of classification accuracy for authorship identification. SVM performed better than C4.5, and compared to English, the overall accuracy for Arabic was lower. These results were consistent with previous research. Finally, as future work, the authors proposed analyzing the differences between the two languages by evaluating the key features as determined by decision trees. Highlighting the linguistic differences between the English and Arabic languages provides further insight into possible techniques for enhancing the performance of authorship discrimination methodologies in an online, multilingual setting. The results showed accuracies of 85.43% and 81.03% for SVM and C4.5, respectively.

In [99], the authors developed an authorship visualization known as Writeprints, which can be used to identify individuals based on their writing style. Unique writing style patterns are created through this visualization; these patterns can be distinguished in a similar way to how fingerprint biometric systems work. Writeprints provides an approach that is based on component analysis and utilizes a dynamic feature-based sliding window algorithm, which makes it very suitable for visualizing authorship across larger groups of messages. The performance of the visualization on messages taken from three different Arabic and English forums was evaluated and compared with the performance of SVM. This comparison indicated that Writeprints shows an excellent classification performance and provides better results than SVM in many instances. They also concluded that the visualization can be used to identify cyber criminals and can help users authenticate fellow online members to prevent cyber fraud. Accuracies of 68.92% and 87.00% were obtained for Writeprints and SVM, respectively.

In [100], approaches were introduced to deal with imbalanced multi-class textual datasets. The main idea behind the approach is to divide the training texts into text samples based on the class size, so that a fairer classification model can be generated. It thus becomes possible to divide majority classes into fewer and longer samples and minority classes into many shorter samples. The authors used text sampling techniques to form a training set with a desirable distribution over the classes. By text sampling, they generated new synthetic data that artificially increased the training size of a class. A series of authorship identification experiments were conducted on different multi-class imbalanced cases belonging to two text corpora in two languages: newspaper reportage in Arabic and newswire stories in English. The properties of the presented techniques were revealed by the results obtained through these experiments. They also tested four methods to deal with the problem of class imbalance [100]:

  • The first method: under-sampling the majority classes based on training texts. The same amount of text, equal to the base, was used, and no modification was applied to the length of each text.

  • The second method: under-sampling the majority classes based on training text lines. All the training texts of a particular author were merged to form one big text. Assuming that xmin represents the size (in text lines) of the shortest big file, the first xmin text lines of each big file were segmented into text samples of length a (in text lines). It is worth noting that there was at least one complete sentence in each text line in both corpora. It was concluded that smaller values of a (such as 2 or 3) lead to better results.

  • The third method: re-balancing the dataset by text samples of varying length. As was mentioned earlier, one big file is generated for each author by concatenating the training texts. The length of each text sample is then xi/k (where k is a predefined parameter). Short text samples belong to minority authors and long text samples belong to majority authors. Therefore, a balanced dataset is generated which consists of k text samples per class. Experiments were conducted for k = 10, 20, and 50. It is noteworthy that each text line of the training corpus is used exactly once in the text samples.

  • The fourth method: re-balancing the dataset through text re-sampling. A big file is generated for each author once again. Assuming that xi represents the text length (in text lines) of the \({\text{ith}}\) author and xmax is the length of the longest file, then for each author k + xmax/xi text samples are generated, each consisting of xi/k text lines. Therefore, a variable number of text samples is generated for each author based on the length of the big file; however, the relationship is now inverted. Longer text samples are generated for the majority classes, but a larger number of short text samples is generated for the minority classes.

Using a dataset extracted from Arabic novels, the authors of [101] derived two sets of function words, AFW54 and AFW65, with 11 words eliminated. These two sets were used to convert several Arabic texts into frequency vectors. They carried out a performance evaluation on these word sets through experiments which used a hybridization of an EA and LDA to generate a classifier, and then fed unseen data to that classifier in order to test it. The obtained performance was consistent with the results of authorship attribution research performed on other languages. It is arguable that AFW54 is the more suitable choice; nevertheless, such a claim cannot be made with any statistical significance. For the cases considered, only a small number of investigations were reported for evaluating the appropriate 'chunk' size. In real-world applications this will probably depend on several factors, but chunks of at least about 1,000 words appear to give a usable characterization of function word usage for Arabic authors. Through this work, they confirmed that the concept of function words translates properly into the Arabic language: various authors use this set of words in various ways, and this enables us to recognize the stylistic features of individual authors and use them to distinguish between different authors [101].

High-dimensional datasets bring about more computational challenges. One of the problems with high-dimensional datasets is that, in most cases, not all features of the data are crucial for the knowledge implicit in the data [85, 102]. Consequently, in most occasions, reducing the dimensionality of the data is desirable. Often, many candidate features for learning are irrelevant and superfluous and degrade the efficiency of the learning algorithm [103, 104]. Learning accuracy and training speed may deteriorate with superfluous features. Therefore, choosing the necessary features in the preprocessing phase is essentially important. In this section, for identifying the author, at the first stage the frequency of words is obtained using the TF-IDF method [105]. At the second stage, each feature is weighted [106]. At the third stage, FS is performed using the metaheuristic algorithms. At the fourth stage, classification is performed via KNN [106]. A sketch of this pipeline is given below.
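The following is a hedged sketch of the four-stage pipeline: TF-IDF term weighting, an externally supplied binary feature mask standing in for the metaheuristic FS step, and a 3-NN classifier. The vectorizer settings and helper names are ours, not the paper's.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier

def author_id_pipeline(train_texts, train_authors, test_texts, feature_mask=None):
    vec = TfidfVectorizer(max_features=5000)            # stages 1-2: TF-IDF weighting
    X_train = vec.fit_transform(train_texts).toarray()
    X_test = vec.transform(test_texts).toarray()
    if feature_mask is not None:                        # stage 3: selected features
        cols = np.flatnonzero(feature_mask)
        X_train, X_test = X_train[:, cols], X_test[:, cols]
    knn = KNeighborsClassifier(n_neighbors=3)           # stage 4: KNN classification
    knn.fit(X_train, train_authors)
    return knn.predict(X_test)
```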

Furthermore, we used the accuracy as the evaluation measure. This accuracy is calculated as:

$$ Accuracy = \frac{TP + TN}{{TP + TN + FP + FN}}*100 $$
(22)

Here, TP represents the number of authors correctly assigned to the positive class, TN indicates the number of authors correctly assigned to the negative class, FP is the number of authors falsely assigned to the positive class, and FN is the number of authors falsely assigned to the negative class although they were positive.

6.1 Reuter_50_50 dataset

In this subsection, the Proposed Method and the other algorithms are applied to the Reuter_50_50 dataset. The dataset contains 2500 documents and 50 writers (https://archive.ics.uci.edu/ml/datasets/reuter_50_50). The results of the discussed algorithms and the results from other papers are presented in Table 14 and Fig. 4. The results show that the proposed method has a better identification accuracy than the other algorithms. Moreover, BOA and FPA also have better identification accuracy than the remaining algorithms.

Table 14 Comparison of proposed method with other algorithms on Reuter_50_50 datasets
Fig. 4 Result of comparison: a Reuter_50_50 dataset, b PAN dataset, c Enron email dataset, d Arabic scripts

6.2 PAN dataset

These datasets consist of scientific documents in Greek, English, and Spanish, and since 2011 a new dataset has been added to the existing ones every year (https://pan.webis.de). The results of the discussed algorithms and the results from other papers on these datasets are evaluated in Table 15 and Fig. 4. The identification accuracies of the proposed method for PAN11, PAN12, PAN13, PAN14, PAN15, and PAN16 are 84%, 80.9%, 81.3%, 82.12%, 83.25%, and 81.79%, respectively. Moreover, the identification accuracies of the DCM models are lower than those of the other algorithms. The algorithms BOA, ABC, and IWO have better identification accuracies than the algorithms GA, PSO, FPA, and FA.

Table 15 Comparison of proposed method with other algorithms on PAN datasets

6.3 Enron email dataset

This dataset was collected and prepared by the CALO project (a cognitive assistant that learns and organizes). It includes the emails of 150 users, mostly senior managers of Enron (https://www.cs.cmu.edu/~enron/). The results of the proposed method and the results from other papers on the Enron email dataset are presented in Table 16 and Fig. 4. The results show that the accuracy and error rate of the proposed method are 95.04 and 11.68, respectively. The accuracies of the PSO, BOA, and FPA algorithms are 91.02, 93.01, and 90.78, respectively. The accuracy and error rate of the ABC algorithm are 90.02 and 15.2, respectively. Among the other models, the CCM-10 model has a better accuracy, and the lowest accuracies are seen in the Naïve Bayes and Bayes Net models.

Table 16 Comparison of proposed method with other algorithms on enron email dataset

6.4 Arabic scripts

This dataset consists of 30 documents from 10 authors. The authors were chosen from the website (http://www.alwaraq.net), and their names are: Aljahedh, Alghazali, Alfarabi, Almas3ody, Almeqrezi, Altabary, Altow7edy, Ibnaljawzy, Ibnrshd, and Ibnsena. The results of the proposed method and the results from other papers on this dataset are presented in Table 17 and Fig. 4. The identification accuracy of the proposed method is 93.24%, which is better than that of the other models.

Table 17 Comparison of proposed method with other algorithms on Arabic scripts

According to the experimental results, it is concluded that the Proposed Method has a better performance than the other models in terms of identification accuracy. According to Tables 15, 16, and 17, the proposed method is the closest to the optimum on the benchmark functions compared with the algorithms FPA, IWO, BOA, ABC, PSO, GA, and FA. Moreover, the proposed method has a better accuracy in the author identification problem. The accuracy rates of ABC, BOA, and the proposed method are indicated in Table 17. The results reveal that the proposed method outperforms the other models, i.e., the ABC and BOA models: the accuracy of the proposed method is 93.24%, while it is 91.00% for ABC and 92.51% for the BOA model.

7 Conclusion and future works

In this paper, we proposed three states based on hybridizing chaotic maps with the VSA for FS. State2 was compared with State1, State3, and the VSA, and it obtained better values. We also used State2 for FS and text author identification. In this paper, 10 CMs are used to enhance the overall performance and precision of the VSA. The VSA is applied to one of the challenging problems, namely FS. The proposed methods have been evaluated on 24 benchmark datasets. Four precise evaluation criteria are followed in this paper: worst, best, mean, and SD. Similarly, the performance of the Proposed Method is compared with popular and recent algorithms, namely PSO, ABC, BOA, IWO, GA, FA, FPA, and VSA. The experimental results show that State2 outperforms the other algorithms in terms of best and mean fitness.

Moreover, the results showed that the Proposed Method (State2) with the Tent map can drastically enhance the VSA in terms of classification performance, solution stability, number of selected features, and convergence speed, and that the Tent map was the most suitable map. Therefore, the following conclusions can be drawn:

  • The CMs improve the exploration phase because they perturb the search radius, helping trapped solutions to escape from local minima.

  • The CMs allow the proposed method to adaptively adjust exploration and exploitation. In other words, the Proposed Method (State1) encourages the VSA to transition gradually from the exploration stage to the exploitation stage.

Future work will consider the integration of CMs with other metaheuristic algorithms. In addition, the performance of the VSA will be verified on more challenging scientific and real-world engineering problems.