1 Introduction

The cemented paste backfill (CPB) technique, one of the most widely used methods for treating and disposing of mining waste, has been successfully applied to alleviate the pressure caused by the accumulation of tailings [1,2,3]. Fresh CPB material is fluid and can be transported to underground stopes by gravity or pumping [4]. After a period of chemical reaction, the mechanical properties of the CPB material are strengthened; CPB therefore plays a crucial role in supporting the stability of the surrounding stopes and in providing working faces for excavation equipment and operators [5]. To satisfy these requirements, the properties of CPB must be designed carefully [6,7,8], and the uniaxial compressive strength (UCS) of CPB is usually adopted as an effective parameter for evaluating its load-bearing capability [2, 9]. The UCS of CPB is therefore considered a representative parameter for estimating the stability of a mining stope [10, 11].

For the conventional composition of CPB, tailings, binders and water are indispensable. In addition, several studies have demonstrated the advantages of adding fibers [12,13,14,15] or blast furnace slag [16] to CPB. These additives have been shown to improve the mechanical properties of backfill bodies and to reduce binder consumption.

To characterize the influence of different components on the UCS of CPB, several researchers [16, 17] designed various scenarios and performed numerous UCS tests. In these traditional approaches, unconfined compressive tests (UCTs) cannot be conducted until the curing time has elapsed, so they tend to be time-consuming; in addition, an insufficient number of test samples may yield misleading results. Considering this, other studies proposed alternative methods based on electrical resistivity (ER) and ultrasonic pulse velocity (UPV) measurements. According to the literature [18,19,20], these measurements have been widely used for determining the strength of CPB or rock materials because they are quick, easy to implement, inexpensive and, most importantly, nondestructive. However, their results are significantly influenced by material homogeneity, meaning that CPB samples with identical compositions may produce different readings. In particular, for large samples, nondestructive methods hardly reflect the actual strength.

Recently, artificial intelligence (AI)-based models have frequently been proposed to solve various problems in mining and geotechnical engineering [21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40]. In particular, many researchers have applied AI techniques to investigate the effects of different components on the strength development of CPB and to predict its UCS [3, 5, 9, 11, 41]. Among these methods, neural network-based models have shown outstanding prediction performance for the UCS of CPB. However, existing methods have disadvantages that need to be addressed; the most prominent drawback of neural networks is that the number of hidden layers is difficult to determine [42]. Moreover, to the best of the authors' knowledge, no study in the literature has predicted the UCS of fiber-reinforced cemented paste backfill. Novel methods are therefore worth exploring to understand the relationships between the UCS of CPB and its governing factors, and predicting the strength of fiber-reinforced CPB is a good way to examine the generalization capability of AI techniques in this specific field.

The support vector machine (SVM) [43], one of the most effective regression tools for predicting system outputs, has been applied to mining problems such as blasting-related environmental issues [44], hard rock pillar stability [45], rockburst phenomena [46] and blast-induced rock movement [47]. SVM involves few hyper-parameters and is particularly effective on small datasets; it is therefore employed in this study to predict the strength of fiber-reinforced CPB. Four optimization algorithms, combined with cross-validation, are used to tune its hyper-parameters: the grid search (GS), the genetic algorithm (GA), particle swarm optimization (PSO) and the salp swarm algorithm (SSA). The resulting hybrid models are used to evaluate the strength of fiber-reinforced CPB. This is an innovative work, as the strength prediction of fiber-reinforced CPB has not been investigated before, and the potential of optimizing the hyper-parameters of SVM models on fiber-reinforced CPB datasets is worth investigating. This study will promote the application of SVM in mining operations and other geotechnical areas and will help to interpret the strength development of fiber-reinforced CPB.

2 Materials, testing procedure and input parameters

2.1 Fiber properties

The Fankou lead–zinc mine, located in Shaoguan, Guangdong Province, China, is one of the largest lead–zinc mining operations in Asia and produces enormous quantities of tailings every year. In the current production process, raw tailings are classified and only tailings with a particle size larger than 19 μm are used for underground backfilling, mainly because fine tailings decrease the strength of backfill bodies [12]. Consequently, most of the tailings are discharged into tailings reservoirs, causing many pollution and safety problems. To address these problems, some studies have suggested that adding fibers to the backfilling slurry can improve the mechanical properties of backfill bodies [12, 48,49,50,51,52]. The main objective of the experimental section is therefore to explore the influence of novel fiber additives on the UCS of CPB. To this end, all tailings were taken from the ore processing plant without filtration or sedimentation, and the experiments were carried out in the laboratory of the Fankou lead–zinc mine. Two kinds of fibers were employed: straw fibers and polypropylene (PP) fibers. PP fibers have stable chemical properties and strong mechanical properties, and they have been shown to improve the mechanical properties of CPB [12]. The straw fibers were collected from farms; they consist largely of cellulosic substances with stable chemical properties and adequate mechanical strength. They are also environmentally friendly and thus create no additional pollution in the underground environment. For these reasons, these two kinds of fibers were adopted in this study. The main properties of the PP fibers, the main components of the straw fibers and the main properties of the straw fibers are presented in Tables 1, 2 and 3, respectively. Figure 1a, b shows the shape and appearance of the PP and straw fibers.

Table 1 Main properties of polypropylene fibers
Table 2 Main components of straw fibers
Table 3 Main properties of straw fibers
Fig. 1
figure 1

a The appearance of PP fibers. b The appearance of straw fibers

2.2 Specimen preparation and important factors

Because the solids concentration of the raw tailings slurry is too low, the tailings were dried in an oven at approximately 100 °C, a temperature low enough to prevent their chemical properties from changing. Next, traditional gravitational sedimentation and filtering methods were used to measure the tailings size distribution, as shown in Fig. 2. The results indicated that tailings larger than 74 μm, 37 μm, 19 μm, 15 μm and 10 μm account for 20.4%, 36.7%, 47.5%, 54.9% and 60.2% of the total, respectively. In other words, tailings finer than 10 μm account for about 40% of the total, so the raw tailings can be classified as ultra-fine tailings, which greatly reduces their filling performance. As mentioned before, only tailings coarser than 19 μm can currently be used for underground backfilling; thus, more than 50% of the total tailings are wasted and discharged into reservoirs. According to the relevant laws and regulations of Guangdong Province, all mine tailings reservoirs must be shut down within the next few years, so all tailings must be recycled. Consequently, the current backfilling scenario must be modified and raw tailings, rather than classified tailings, will be used for underground backfilling. Achieving this requires detailed mechanical tests to analyze the impact of different factors on the strength of CPB.

Accordingly, several important factors were considered, i.e., fiber properties, cement type, curing time, cement–tailings ratio and concentration. The fiber properties were described by two appearance parameters (fiber length and fiber weight) and one mechanical parameter (tensile strength). The cement–tailings ratios and concentrations were chosen with reference to previous practical scenarios at the Fankou lead–zinc mine: the cement–tailings ratio took the values 1:2, 1:3, 1:4, 1:5, 1:6, 1:8 and 1:10, and the concentration took the values 66%, 68%, 70%, 72% and 73%. Different cement–tailings ratios were planned for different stope requirements, and the concentrations were set higher than in the previous mining scheme because the current tailings are finer than before. The curing time was set to 3, 7 and 28 days. To reflect engineering practice, two kinds of cement, Fankou Dachang cement and Portland 42.5 R cement, were selected as binders because both are used at the Fankou lead–zinc mine; the 28-day minimum compressive strength of the cement was used to characterize the cement type. As mentioned before, PP fibers and straw fibers were chosen to improve the CPB strength. These parameters were used to capture the coupled effects governing the strength of fiber-reinforced CPB in the SVM-based prediction models; an intuitive schematic diagram is shown in Fig. 3.

Fig. 2
figure 2

The particle size test of raw tailings by gravitational and filtering methods

Fig. 3
figure 3

Parameters selected for developing the SVM-based models

3 Uniaxial compression strength tests

UCTs were conducted on each specimen to compare and analyze the influence of different components on CPB strength. Specimens were tested with a semi-automatic pressure testing machine (TYE-300D, Wuxi Test Instruments Corporation, China), as shown in Fig. 4. During testing, the upper surface of each specimen was kept parallel to the lower surface of the pressure sensor. For specimens with high compressive strength, the instrument stops loading automatically, whereas for specimens with low compressive strength, loading has to be stopped manually. To ensure precise results, three identical specimens were tested for each scenario and the average value was retained. In total, 144 test results were obtained and used to establish the CPB strength datasets. Of these, 44 results involved fiber additions: 36 with straw fibers and 8 with PP fibers, according to the different design requirements. The test results show that PP fibers play an important role in improving CPB strength, which agrees with previous studies [12]. Straw fibers, however, do not guarantee an increase in CPB strength; in some scenarios they even decrease it. The detailed results are presented in Fig. 5 and Table 4.

Fig. 4
figure 4

The conducted UCT

Fig. 5
figure 5

The UCT results of 3-day and 28-day curing time for fiber-reinforced and no-fiber CPB

Table 4 The compression test results of 3-day and 28-day curing time for fiber-reinforced and no-fiber CPB

3.1 Input and output variables

As mentioned in Sect. 2.2, several important factors were selected to explore the UCS of ultra-fine tailings-based CPB. The fiber length was 12 mm for PP fibers and 10 mm for straw fibers, the dosages of PP and straw fibers were 5 g/kg and 3 g/kg, respectively, and the tensile strengths of PP and straw fibers were 500 MPa and 5.4 MPa, respectively. The 28-day minimum compressive strengths of the two cements, 42.5 MPa for Portland 42.5 R cement and 39.5 MPa for Fankou Dachang cement, were also used as input values. As a result, seven parameters, namely cement–tailings ratio (CtT), concentration (Co), curing time (T), fiber length (Fl), fiber weight (Fw), fiber tensile strength (Fs) and 28-day minimum strength of cement (S), were set as input variables, and the UCS of CPB was the output variable. Figure 6 shows all input and output parameters used for developing the CPB strength prediction models, with their ranges, means and outliers.

Fig. 6
figure 6

Boxplot of input and output variables used in modeling process

4 Principles of the used techniques

4.1 Support vector machine

SVM was first proposed and developed by Vapnik [43] on the basis of statistical learning theory and is often preferred among the many available supervised learning methods for pattern recognition problems [53]. As mentioned before, many uniaxial compressive tests were conducted to examine the compressive properties of different backfilling scenarios. In the future, however, different backfilling scenarios will be applied to satisfy different design requirements, and conducting further experiments for each of them would be very time-consuming. To address the CPB strength prediction problem, this study therefore applies SVM as a regression tool. This section presents a concise description of how SVM can be used to solve regression problems; more details about SVM theory and its implementation can be found in other studies [54,55,56].

SVM was initially proposed for classification problems; by introducing the ε-insensitive loss function, it can also be used for linear or nonlinear regression tasks, as shown in Fig. 7a, b. For linear regression, the training data are described as (\(x_{c} ,y_{c}\)), \(c = 1,2,3, \ldots ,n\), where the input \(x \in R^{M}\) is M-dimensional and \(y \in R\) is the corresponding output. The goal of SVM is to approximate the functional relationship between the inputs and the outputs through a repeated training optimization process.

Fig. 7
figure 7

a Data space transformation by SVM [57]. b Linear SVM regression by means of the ε-insensitive loss function [58]

The training set is then fitted with the SVR nonlinear regression model:

$$f(x) = \left( {\omega \cdot \phi (x)} \right) + e$$
(1)

where ω and e refer to the weight vector and the bias term, respectively, and \(\phi \left( x \right)\) is introduced to transform the nonlinear problem into a linear one [59]. Requiring the difference between the predicted values \(f(x_{a}^{p} )\) and the corresponding actual values \(f(x_{a}^{r} )\) to stay within ε, the regression task can be written as the optimization problem of Eq. 2 by means of the ε-insensitive loss function.

$$\left\{ {\begin{array}{*{20}l} {\hbox{min} \phi \left( \omega \right) = \frac{1}{2}\left\| \omega \right\|^{2} } \hfill \\ { - \varepsilon \le f(x_{a}^{p} ) - f(x_{a}^{r} ) \le \varepsilon ,\;a = 1,2, \ldots ,n} \hfill \\ \end{array} } \right.$$
(2)

In order to ensure the constraint condition feasible, two slack variables \(\xi\), \(\xi^{*}\) and the punishment coefficient C are introduced. Then, Eq. (2) is transformed into the following convex optimization problem.

$$\begin{aligned} & \hbox{min} \;\phi \left( {\omega ,\xi } \right) = \frac{1}{2}\left\| \omega \right\|^{2} + C\sum\limits_{a = 1}^{n} {(\xi_{a} + \xi_{a}^{*} )} \\ & s.t. \\ & \left\{ {\begin{array}{*{20}l} {\xi_{a} ,\xi_{a}^{*} \ge 0} \hfill \\ {f(x_{a}^{r} ) - \left( {\omega \cdot \phi (x_{a} ) + e} \right) \le \varepsilon + \xi_{a} } \hfill \\ {\left( {\omega \cdot \phi (x_{a} ) + e} \right) - f(x_{a}^{r} ) \le \varepsilon + \xi_{a}^{*} } \hfill \\ \end{array} } \right. \\ \end{aligned}$$
(3)

The first term in Eq. (3) reflects the complexity of the proposed model, and the second term the empirical error. The constant C is a regularization factor used to adjust the tradeoff between the complexity of the model and the empirical deviations.

Further, by introducing the Lagrange multipliers \(\alpha\) and \(\alpha^{*}\) [60, 61], the constrained optimization problem of Eq. (3) can be transformed into the form shown in Eq. (4):

$$\begin{aligned} & g(x) = e + \sum\limits_{a = 1}^{n} {h(x_{a} ,x) \cdot (\alpha_{a} - \alpha_{a}^{*} )} \\ & s.t.\quad 0 \le \alpha_{a} ,\;\alpha_{a}^{*} \le C \\ \end{aligned}$$
(4)

where \(h\left( {x_{a} ,x} \right)\) represents the kernel function, which maps the input data into a high-dimensional feature space [43]. There are four principal kernel functions for SVM, i.e., the radial basis function (RBF), the linear, the sigmoid and the polynomial kernels [62,63,64,65]. Among them, the RBF (Gaussian) kernel yields good generalization ability over a wide range of datasets and is adopted here. For the linear case, the regression function can be written as:

$$f\left( x \right) = e + \sum\limits_{a}^{n} {\left( {\alpha_{a} - \alpha_{a}^{*} } \right)\left( {x_{a} \cdot x} \right)} .$$
(5)

For nonlinear regression, a kernel function satisfying the Mercer condition is introduced to replace the inner product of the linear case, and the nonlinear regression function can be represented as:

$$f\left( x \right) = e + \sum\limits_{a = 1}^{n} {h(x_{a} ,x) \cdot \left( {\alpha_{a} - \alpha_{a}^{*} } \right)}.$$
(6)

As mentioned above, the Gaussian kernel function is employed in the SVM model; hence, the two important parameters C (penalty factor) and g (RBF kernel parameter) were chosen as the parameters to optimize. To perform this parametric optimization and reduce the search time, many search and metaheuristic algorithms have been widely used, such as GS [46, 66], GA [67,68,69], PSO [70, 71], and the recently developed fruit fly optimization algorithm [72, 73] and imperialist competitive algorithm [74, 75]. To compare the performance of different optimization algorithms, GS, GA, PSO and SSA were selected in this study to improve the prediction capability of SVM for the strength of fiber-reinforced CPB. The SVM regression program was implemented in MATLAB R2016b. Figure 8 depicts the general implementation process of the SVM-based models with the SSA, GS, GA and PSO optimization scenarios.
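To make the regression setup concrete, the sketch below fits an ε-SVR with an RBF kernel and the two hyper-parameters C and gamma (the g above). The original models were built in MATLAB with the LIBSVM toolbox; this Python/scikit-learn version is only an illustrative equivalent, with random placeholder data standing in for the seven CPB features of Sect. 3.1 and the (C, g) pair borrowed from the GS result in Sect. 6.1.

```python
# Illustrative epsilon-SVR with an RBF kernel (a Python/scikit-learn
# stand-in for the MATLAB/LIBSVM setup used in the paper).
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Placeholder data: 144 samples x 7 features (CtT, Co, T, Fl, Fw, Fs, S);
# in the study these come from the UCT results, not random numbers.
X = rng.uniform(-1.0, 1.0, size=(144, 7))
y = rng.uniform(-1.0, 1.0, size=144)

# C is the penalty factor; gamma corresponds to the RBF parameter g.
# The values below are the GS-SVM optimum reported in Sect. 6.1.
model = SVR(kernel="rbf", C=125.0, gamma=0.03125, epsilon=0.1)
model.fit(X, y)
print(model.predict(X[:5]))
```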

Fig. 8
figure 8

The process architectures of the proposed SVM-based models with GS, GA, PSO and SSA optimization strategies

4.2 Grid search algorithm

The GS, a typical global optimization method, has been applied to many parametric algorithms [46]. The main parameters of the SVM model are the penalty factor C and the RBF parameter g, which largely determine the model's learning, fitting and generalization performance. As an exhaustive search method, GS lists all parameter combinations on a grid, where each cell represents a candidate solution, and the optimal combination is obtained by looping over the grid. A detailed description of obtaining the best C and g with the grid search method is given in the literature [55, 66]. Although GS is often described as time-consuming, it is effective for complicated nonlinear problems because it converts the problem of optimizing the evaluation function into one of finding the optimal combination of the coupled parameters (C, g).
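A minimal sketch of this procedure is given below, using the log2 search bounds (−8, 8) and step 0.5 reported in Sect. 6.1 together with fivefold cross-validation; the data are random placeholders, and scikit-learn's GridSearchCV stands in for the original MATLAB implementation.

```python
# Grid search over (C, g) with fivefold cross-validation; the log2 bounds
# (-8, 8) and the step of 0.5 follow Sect. 6.1, the data are placeholders.
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(144, 7))
y = rng.uniform(-1.0, 1.0, size=144)

exponents = np.arange(-8.0, 8.0 + 0.5, 0.5)          # grid step of 0.5
grid = {"C": 2.0 ** exponents, "gamma": 2.0 ** exponents}
search = GridSearchCV(SVR(kernel="rbf"), grid, cv=5,
                      scoring="neg_mean_squared_error")
search.fit(X, y)
print(search.best_params_, -search.best_score_)      # best (C, g), CV MSE
```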

4.3 Particle swarm optimization

The PSO, an evolutionary population-based search method, is inspired by simulations of fish schooling and bird flocking and is a powerful technique for parametric optimization problems [76,77,78]. In PSO, the bird flock represents the swarm of particles and the food source represents the objective function. To locate the food source, individual birds exchange information about their distance from it; through such cooperation, the whole flock filters the information about the food source location and eventually gathers around the food. In other words, the optimal solution is found through this process. To this end, PSO initializes a number of particles, each with an equal probability of being accepted as a candidate solution. In the search space, each particle is endowed with two properties, velocity (V) and position (X), where the velocity describes the speed of movement and the position the direction of movement. Each particle thus represents a potential solution and evaluates the objective function at its current position. After the fitness evaluation, particles move toward their next locations, which are influenced by their present location, the locations of the other particles and some random perturbations [79]. Eventually, the particle swarm converges iteratively toward the optimum of the user-defined fitness function.

The equations of updating the velocity and position of each particle can be defined as follows:

$$\begin{aligned} & V^{'} = wV + \delta_{1} (p_{\text{best}} - X) + \delta_{2} (g_{\text{best}} - X) \\ & X^{'} = X + V^{'} \\ & \delta_{1} = L_{1} R_{1} \\ & \delta_{2} = L_{2} R_{2} \\ \end{aligned}$$
(7)

where R1 and R2 denote random numbers in the range of (0, 1); pbest and gbest signify the single particle’s best position and the best position of the particle swarms, respectively; L1 and L2 denote positive acceleration constants; the current position and velocity of particle are denoted by X and V; X′ and V′ specify the updated position and velocity of particles, respectively; and w denotes the inertia weight which is used for controlling the particle velocity [79].

In SVM, the purpose of the PSO method is to optimize the parameters C and g. Through the velocity and position update formulas, the particle swarm iteratively converges toward an optimum. The main process of optimizing the SVM parameters with PSO is described below (a code sketch follows the list):

1. Data preparation: separate the datasets into training and testing sets in a proper proportion.
2. Parameter initialization: set the PSO parameters, including the swarm size, the maximum number of iterations, the inertia weight and the initial positions and velocities of the particles.
3. Fitness evaluation: compute the fitness function and evaluate the fitness of each particle before optimizing the target parameters.
4. Update: update the velocities and positions of the particles until the parameters C and g satisfy the requirement of the SVM model.
5. Stopping check: if the maximum number of iterations is reached or the accuracy meets the stopping requirement, stop the iterative process; the optimal parameters are then obtained.
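The sketch below implements this loop for the (C, g) pair of an RBF SVR, with the velocity and position updates of Eq. (7) and a fivefold cross-validation MSE as the fitness function. The acceleration constants of 2 follow Sect. 6.2; the inertia weight, swarm size and iteration budget are deliberately small illustrative values, not the tuned settings of Sect. 6.2.

```python
# PSO tuning of (C, g) for an RBF SVR, following the updates of Eq. (7);
# fitness is the fivefold cross-validation MSE (lower is better).
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(144, 7))            # placeholder data
y = rng.uniform(-1.0, 1.0, size=144)

def fitness(p):
    C, g = p
    return -cross_val_score(SVR(kernel="rbf", C=C, gamma=g), X, y,
                            cv=5, scoring="neg_mean_squared_error").mean()

low = np.array([2.0 ** -8, 2.0 ** -8])               # search bounds for (C, g)
high = np.array([2.0 ** 8, 2.0 ** 8])
n_particles, n_iter, w, L1, L2 = 15, 20, 0.7, 2.0, 2.0  # small, illustrative
P = rng.uniform(low, high, size=(n_particles, 2))    # positions
V = np.zeros_like(P)                                 # velocities
pbest, pval = P.copy(), np.array([fitness(p) for p in P])
gbest = pbest[pval.argmin()]

for _ in range(n_iter):
    R1, R2 = rng.random(P.shape), rng.random(P.shape)
    V = w * V + L1 * R1 * (pbest - P) + L2 * R2 * (gbest - P)  # Eq. (7)
    P = np.clip(P + V, low, high)
    vals = np.array([fitness(p) for p in P])
    better = vals < pval
    pbest[better], pval[better] = P[better], vals[better]
    gbest = pbest[pval.argmin()]

print("best (C, g):", gbest, "CV MSE:", pval.min())
```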

4.4 Genetic algorithm

The GA, one of the most widely used global optimization algorithms, was first developed by Holland [80]. It originated from computer simulation of biological systems and draws inspiration from the theory of natural reproduction and genetics, following Darwin's principle of survival of the fittest. In GA, each individual represents a candidate solution to a specific problem. GA is a stochastic global search and optimization method: through an adaptive search process, a better solution replaces the previous search result, and by iterating this process the best strategy is obtained. The search involves four stages, namely population initialization, chromosome selection, chromosome crossover and mutation [81, 82].

The first stage is population initialization, in which each genetic structure (called a chromosome) is given an equal opportunity to be chosen as a solution. After initialization, the selection of chromosomes determines which chromosomes are preserved and propagated. A critical step is to introduce a user-defined fitness function by which the performance of each chromosome is scored numerically; chromosomes corresponding to solutions with higher scores acquire the right to replicate and generate offspring. The optimization continues with these high-performing chromosomes, whereas low-performing chromosomes do not participate in the next search round. By iterating this selection process, the chromosomes that yield better performance come to occupy the majority of the population. To maintain the evolutionary process, crossover randomly selects two well-performing chromosomes as "parents" to generate a "child": the parental chromosomes recombine and exchange the strings that encode their characteristics, thereby exploring various attractive solutions. Different crossover methods can be implemented: one-point, two-point and uniform crossover [83]. After the crossover operation, the evolution moves toward adaptively optimal solutions. As in biological evolution, mutation is also critical: it prevents the search from falling into local convergence by arbitrarily introducing new chromosomes, which improves the flexibility and diversity of the population. If a mutant individual improves the performance, it substitutes the original individual. Through mutation, GA fulfills the global search and thus ensures the generality of the final results.
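A compact real-coded GA for the same (C, g) tuning task is sketched below. The crossover and mutation probabilities (0.4 and 0.01) and the population size of 15 follow Sect. 6.3; the truncation selection, uniform crossover and generation budget are simplifying assumptions for illustration.

```python
# Real-coded GA tuning of (C, g) for an RBF SVR; crossover and mutation
# probabilities (0.4, 0.01) and population size 15 follow Sect. 6.3.
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(144, 7))            # placeholder data
y = rng.uniform(-1.0, 1.0, size=144)

def fitness(ind):
    C, g = ind
    return -cross_val_score(SVR(kernel="rbf", C=C, gamma=g), X, y,
                            cv=5, scoring="neg_mean_squared_error").mean()

low = np.array([2.0 ** -8, 2.0 ** -8])
high = np.array([2.0 ** 8, 2.0 ** 8])
pop, p_cross, p_mut = rng.uniform(low, high, size=(15, 2)), 0.4, 0.01

for _ in range(30):                                  # generation budget
    vals = np.array([fitness(ind) for ind in pop])
    parents = pop[vals.argsort()[: len(pop) // 2]]   # truncation selection
    children = []
    while len(children) < len(pop) - len(parents):
        a, b = parents[rng.integers(len(parents), size=2)]
        child = np.where(rng.random(2) < p_cross, b, a)  # uniform crossover
        if rng.random() < p_mut:                         # mutation
            child = rng.uniform(low, high)
        children.append(child)
    pop = np.vstack([parents, np.array(children)])

vals = np.array([fitness(ind) for ind in pop])
print("best (C, g):", pop[vals.argmin()], "CV MSE:", vals.min())
```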

4.5 Salp swarm algorithm

The SSA, a novel swarm-based intelligent optimizer, originates from the foraging behavior of salp swarms [84]. Salps are transparent, barrel-shaped creatures similar to jellyfish. To reach more food sources, they connect to one another in salp chains so that information about food sources can be conveyed and the food located quickly [85]. Within a swarm, salps play different roles, leaders and followers: the leaders guide the whole population, while the followers follow the leaders' direction.

The ultimate purpose of the salp swarm is to find the best food source, denoted F, in the search domain. As with other swarm-based optimizers, an initial population is predefined, including the number and positions of the individuals, and each individual represents a potential candidate solution. The whole set of solutions can be denoted by a two-dimensional matrix U, defined in Eq. (8):

$$U_{i} = \left[ {\begin{array}{*{20}c} {u_{1}^{1} } & {u_{2}^{1} } & \cdots & {u_{d}^{1} } \\ {u_{1}^{2} } & {u_{2}^{2} } & \cdots & {u_{d}^{2} } \\ \vdots & \vdots & \cdots & \vdots \\ {u_{1}^{n} } & {u_{2}^{n} } & \cdots & {u_{d}^{n} } \\ \end{array} } \right].$$
(8)

The initialized salps are then updated according to the mathematical model, in which leaders and followers obey different equations. As mentioned before, the salp leader plays a significant role in foraging and navigation; its position is updated through Eq. (9):

$$u_{m}^{1} = \left\{ {\begin{array}{*{20}l} {F_{m} + c_{1} \left( {\left( {kb_{m} - jb_{m} } \right)c_{2} + jb_{m} } \right)\quad c_{3} \ge 0} \hfill \\ {F_{m} - c_{1} \left( {\left( {kb_{m} - jb_{m} } \right)c_{2} + jb_{m} } \right)\quad c_{3} \le 0} \hfill \\ \end{array} } \right.$$
(9)

where \(u_{m}^{1}\) represents the position of the leader in the mth dimension, \(F_{m}\) denotes the position of the food source in the mth dimension, and \(kb_{m}\) and \(jb_{m}\) represent the upper and lower bounds of the mth dimension, respectively. The variables \(c_{2}\) and \(c_{3}\) are random numbers in the range [0, 1]; they determine the search direction (toward \(+ \infty\) or \(- \infty\)) and the step size of the next move. \(c_{1}\) acts as a tradeoff between exploration of the space and depth of exploitation; it generally dwindles as the iterations increase and is obtained from Eq. (10):

$$c_{1} = 2e^{{ - \left( {\frac{4j}{J}} \right)^{2} }}$$
(10)

where j and J represent current iteration number and maximum iteration number, respectively.

For salp followers, their positions are updated according to Eq. (11):

$$u_{m}^{i} = \frac{1}{2}\left( {u_{m}^{i} + u_{m}^{i - 1} } \right)$$
(11)

where \(i \ge 2\) and \(u_{m}^{i}\) represents the position of ith salp for the mth dimension.

Similar to other metaheuristic algorithms, the SSA avoids local optima with powerful flexibility. In the SVM context, the aim of hybridizing with SSA is to optimize the hyper-parameter combination (C, g) according to the objective function. Before running the SSA-SVM program, the salp population is generated initially. The fitness function is then defined as the food source F, and all salps explore the search space under the lead of the salp leaders. By updating the positions of the salp swarm, the search results are gradually improved. This process is repeated, without re-initialization, until the results satisfy the stopping criteria.
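The sketch below follows Eqs. (8)–(11) for the same (C, g) tuning task, with the bounds 0.01 and 100 taken from Sect. 6.4. The first salp acts as the leader; testing c3 against 0.5 to choose the sign in Eq. (9) is a common implementation choice, and the population and iteration sizes are illustrative rather than the tuned values.

```python
# SSA tuning of (C, g) for an RBF SVR per Eqs. (8)-(11); the bounds 0.01
# and 100 follow Sect. 6.4, other settings are illustrative.
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(144, 7))            # placeholder data
y = rng.uniform(-1.0, 1.0, size=144)

def fitness(u):
    C, g = u
    return -cross_val_score(SVR(kernel="rbf", C=C, gamma=g), X, y,
                            cv=5, scoring="neg_mean_squared_error").mean()

jb, kb = np.array([0.01, 0.01]), np.array([100.0, 100.0])  # lower/upper bounds
n_salps, J = 20, 30
U = rng.uniform(jb, kb, size=(n_salps, 2))           # initial chain, Eq. (8)
vals = np.array([fitness(u) for u in U])
F, Fval = U[vals.argmin()].copy(), vals.min()        # food source = best so far

for j in range(1, J + 1):
    c1 = 2.0 * np.exp(-((4.0 * j / J) ** 2))         # Eq. (10)
    for i in range(n_salps):
        if i == 0:                                   # leader, Eq. (9)
            c2, c3 = rng.random(2), rng.random(2)
            step = c1 * ((kb - jb) * c2 + jb)
            U[i] = np.where(c3 >= 0.5, F + step, F - step)
        else:                                        # follower, Eq. (11)
            U[i] = 0.5 * (U[i] + U[i - 1])
        U[i] = np.clip(U[i], jb, kb)
    vals = np.array([fitness(u) for u in U])
    if vals.min() < Fval:
        F, Fval = U[vals.argmin()].copy(), vals.min()

print("best (C, g):", F, "CV MSE:", Fval)
```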

5 Data pre-processing and evaluation metrics

In this study, the CPB strength prediction models were developed in the MATLAB environment, with computation code adapted from the open-source LIBSVM toolbox developed by Chang and Lin [55]. The code was run on a computer with an Intel(R) Core(TM) i7-7500U CPU running at 2.70 GHz and 2.90 GHz. To construct the CPB strength prediction models, seven indices were selected, namely cement–tailings ratio, concentration, curing time, fiber length (mm), fiber weight (g/kg), fiber tensile strength (MPa) and 28-day minimum strength of cement (MPa); these seven parameters are used as inputs in every SVM-based prediction model, and the UCS of the specimens is the output parameter. To remove the interference of the different dimensions and magnitudes of these indices, the input and output data were scaled into the range [− 1, 1] using the normalization of Eq. (12):

$$x^{*} = \frac{{2(x - x_{\hbox{min} } )}}{{(x_{\hbox{max} } - x_{\hbox{min} } )}} + ( - 1)$$
(12)

where x* denotes the normalized value, x the original value, and xmax and xmin the maximum and minimum values of the corresponding variable, respectively.
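A small self-contained sketch of Eq. (12) and its inverse (needed to map predictions back to physical units) is given below; the UCS values are hypothetical.

```python
# Eq. (12) scaling to [-1, 1] and its inverse transform.
import numpy as np

def scale(x, x_min, x_max):
    """Eq. (12): map x from [x_min, x_max] into [-1, 1]."""
    return 2.0 * (x - x_min) / (x_max - x_min) - 1.0

def unscale(x_star, x_min, x_max):
    """Inverse of Eq. (12): map back to the original units."""
    return (x_star + 1.0) * (x_max - x_min) / 2.0 + x_min

ucs = np.array([0.5, 1.2, 3.8, 2.4])                 # hypothetical UCS (MPa)
ucs_n = scale(ucs, ucs.min(), ucs.max())
assert np.allclose(unscale(ucs_n, ucs.min(), ucs.max()), ucs)
print(ucs_n)
```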

To evaluate the difference between the actual and predicted values of all CPB strength samples, two mathematical measures were employed in this study, namely the mean squared error (MSE) and the coefficient of determination (R2). In general, an MSE closer to 0 and an R2 closer to 1 indicate higher prediction performance:

$${\text{MSE}} = \frac{1}{m}\sum\limits_{i = 1}^{m} {(y_{i} - y_{i}^{*} )^{2} }$$
(13)
$$R^{2} = 1 - \frac{{\sum\nolimits_{i} {(y_{i}^{*} - y_{i} )^{2} } }}{{\sum\nolimits_{i} {(y_{i}^{\text{avr}} - y_{i} )^{2} } }}$$
(14)

where m denotes the number of samples and \(y_{i}\), \(y_{i}^{*}\) and \(y_{i}^{\text{avr}}\) represent the actual values, the predicted values and the average of the actual values, respectively.
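For completeness, the two metrics can be computed directly from Eqs. (13) and (14); equivalent helpers also exist in scikit-learn (mean_squared_error, r2_score). The values below are hypothetical.

```python
# MSE and R^2 exactly as defined in Eqs. (13) and (14).
import numpy as np

def mse(y, y_pred):
    return np.mean((y - y_pred) ** 2)                # Eq. (13)

def r2(y, y_pred):
    # Eq. (14): 1 - sum((y_pred - y)^2) / sum((mean(y) - y)^2)
    return 1.0 - np.sum((y_pred - y) ** 2) / np.sum((y.mean() - y) ** 2)

y_true = np.array([1.0, 2.0, 3.0, 4.0])              # hypothetical strengths
y_hat = np.array([1.1, 1.9, 3.2, 3.8])
print(mse(y_true, y_hat), r2(y_true, y_hat))
```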

After the normalization process, principal component analysis (PCA) was employed. PCA, also known as the Karhunen–Loève transform, is a technique for exploring high-dimensional data structures and is often used for the exploration and visualization of high-dimensional datasets. In this study, it was used to extract the important features from the input variables and to reduce computation time and complexity. Five principal components, accounting for 95% of the total variance, were retained. More details about PCA can be found in the literature [86, 87].
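In scikit-learn terms, retaining the components that explain 95% of the variance is a one-argument call, as sketched below on placeholder data (the original analysis was performed in MATLAB).

```python
# PCA retaining the components that explain 95% of the total variance.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(144, 7))            # placeholder inputs

pca = PCA(n_components=0.95)                         # keep 95% of variance
X_reduced = pca.fit_transform(X)
print(X_reduced.shape, pca.explained_variance_ratio_.sum())
```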

The overall robustness and generalization ability of the training model is essential because it influences the prediction accuracy on the testing sets. Two mainstream methods for assessing regression performance on training datasets are k-fold [88] and leave-one-out cross-validation [89]. In k-fold cross-validation, the training samples are separated into k disjoint subsets of equal size. One subset is then used as the validation set while the other k − 1 subsets are used for training, and this process is repeated k times. In this way, k prediction results are obtained, and the final training accuracy is their average. The appropriate number of folds varies with the characteristics of the dataset; five- or tenfold cross-validation typically performs well [90]. In this study, fivefold cross-validation was employed. By combining the aforementioned performance metrics with k-fold cross-validation, the final prediction performance of every model can be obtained.
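The sketch below runs this fivefold loop for an RBF SVR and reports the fold-averaged MSE and R2; the data are placeholders and the (C, g) pair is again borrowed from Sect. 6.1.

```python
# Fivefold cross-validation of an RBF SVR with fold-averaged MSE and R^2.
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(115, 7))            # placeholder training set
y = rng.uniform(-1.0, 1.0, size=115)

mses, r2s = [], []
for tr, va in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = SVR(kernel="rbf", C=125.0, gamma=0.03125).fit(X[tr], y[tr])
    y_hat = model.predict(X[va])
    mses.append(np.mean((y[va] - y_hat) ** 2))                        # Eq. (13)
    r2s.append(1.0 - np.sum((y_hat - y[va]) ** 2)
                     / np.sum((y[va].mean() - y[va]) ** 2))           # Eq. (14)
print(np.mean(mses), np.mean(r2s))
```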

Before developing the SVM-based prediction models, the datasets must be partitioned into training and testing sets. The training set is used to fit the models and select the SVM hyper-parameters; the optimal parameters are then used to examine the goodness of fit on the testing set. A training set that is too large can cause overfitting and weaken the predictive ability of the models, whereas one that is too small will fail to provide reliable configuration parameters. The testing set is used to validate the robustness of the proposed models; it is kept completely independent, meaning that it does not participate in model development. A training-to-testing ratio of 8:2 is generally recommended [91]. Accordingly, the whole dataset was randomly shuffled using the RAND() function in Excel and then divided in the proportion 8:2: 115 of the 144 samples formed the training set, and the remaining 29 samples were used as the testing set. All models comply with this split in the subsequent calculations.
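The following sketch reproduces the 8:2 partition (115 training and 29 testing samples out of 144) in Python rather than with Excel's RAND(); the random seed is arbitrary.

```python
# 8:2 random split (115 training / 29 testing of the 144 samples).
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(144, 7))            # placeholder dataset
y = rng.uniform(-1.0, 1.0, size=144)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=29, random_state=0)
print(X_tr.shape, X_te.shape)                        # (115, 7) (29, 7)
```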

6 SVM-based model development

6.1 GS-SVM

As one of the most classical parameter optimization strategies, the grid search method has been employed effectively to solve parametric optimization problems in SVM models [92,93,94]. As an exhaustive search method, it explores the influence of different hyper-parameter combinations on prediction accuracy. To reduce local-minimum and overfitting problems, the results of the grid search can be strengthened by cross-validation; based on the size of the dataset, a fivefold cross-validation process was carried out to improve the robustness of the prediction models.

The main routine of the parameter search in GS-SVM is described as follows:

1. Determine the bounds of the grid search and the grid step through experiments and testing.
2. To obtain the best parameter combination, preprocess the training datasets: divide the original training data into five equal subsets, of which four are used to fit models and the remaining one is used for parameter selection. Following the exhaustive method, all combinations are tried and tested, and this preprocessing is repeated five times.
3. Finally, select new parameter pairs (C, g), taking the previously obtained optimum parameters as a reference point, to further improve the accuracy of the model.

To tune the two significant parameters with GS, the search bounds and the grid step must be determined carefully, and fivefold cross-validation was employed to reduce the contingency of the search results. In GS, each grid cell corresponds to one search result; decreasing the search step generates more grid cells and consumes more search time, while expanding the search bounds likewise yields more search results at a greater time cost. However, if the search bounds are too narrow or the search step too large, the search easily gets stuck in a local minimum. By comparison and calculation, it was observed that with search bounds of (\(2^{-8}\), \(2^{8}\)) for both C and g and a grid step of 0.5, GS produces good results at a reasonable calculation time. The search results are shown in Fig. 9, where yellow lines correspond to worse search results and blue lines to better ones. The GS-SVM optimization curve shows that the surface is concave with respect to \(\log_{2} C\) and \(\log_{2} g\). When C equals 125 and g equals 0.03125, the GS-SVM model achieves its best prediction results, with MSE values of 0.01464 and 0.01655 and R2 values of 0.9165 and 0.9445 for the training and testing sets, respectively.

Fig. 9
figure 9

GS-SVM optimal curve for the best parameters of C and g

6.2 PSO-SVM optimization

According to previous investigations, the PSO parameter optimization strategy has a strong ability to assist in searching for and optimizing hyper-parameters [91, 95]. It was therefore applied to optimize the hyper-parameters of the SVM-based prediction models. The PSO algorithm has several crucial parameters that influence the optimization speed and results and need to be selected carefully. Among them, the coefficients δ1 and δ2 of the velocity equation govern the local and global search abilities, respectively; setting the corresponding acceleration constants to 2 has been reported to give better prediction performance [91, 96, 97], and these values were adopted in all PSO-SVM models. In addition, the two parameters controlling the update speed of the particle velocity and position were both set to 1, their default values. The remaining parameters, swarm size and iteration number, were determined by a trial-and-error, variable-controlling approach. A series of tests showed that once the number of iterations exceeds 1000, the swarm size has little further influence on the results; the number of iterations was therefore set to 1000. The swarm size also needs careful selection: too small a swarm produces unstable fitness values as the iterations increase, while too large a swarm increases the calculation time. Models with swarm sizes of 50, 100, 150, 200, 250, 300, 350, 400, 450 and 500 were therefore tested, and the corresponding fitness values were used to evaluate their performance, as shown in Fig. 10. The figure shows that all fitness values become invariant after about 500 iterations, indicating that the selected parameters produce stable prediction results; however, different swarm sizes output different MSE values. To compare the swarm sizes further, the ranking system proposed by Zorlu et al. [98] was used. Based on the training (TR) accuracy, testing (TS) accuracy, training MSE and testing MSE of each swarm size, scores were assigned, with better performance receiving higher scores, as shown in Table 5. A swarm size of 250 achieves the highest total rank value (40). An intuitive display of the ranking results is given in Fig. 11.

Fig. 10
figure 10

PSO-SVM optimizations with different swarm sizes

Table 5 Ranking values of the PSO-SVM models with different swarm sizes
Fig. 11
figure 11

Intuitive ranking with different swarm sizes for PSO-SVM algorithm

6.3 GA-SVM

To realize effective parametric adjustment and optimization, several crucial GA parameters must be determined first. The crossover and mutation probabilities were set to 0.4 and 0.01, respectively. The number of generations and the population size were then selected through iterative calculation and comparison. Unlike the PSO algorithm, the fitness values stop changing for all population sizes once the number of generations reaches 200; accordingly, the number of generations was set to 200. To choose the best population size, values of 5, 15, 20, 25, 30, 35, 40, 45 and 50 were tested, with the corresponding fitness values used as the performance index. As shown in Fig. 12, 200 generations were allocated as the stopping criterion. Interestingly, for every population size the best fitness value stops changing after 75 generations, which means the optimization has converged by then. The resulting R2 and MSE values are very close to one another, making it difficult to select the best model; therefore, the ranking system was again applied to select the best population size, as shown in Table 6, and a ranking score chart is drawn in Fig. 13. As a result, a population size of 15 shows comparatively better overall performance.

Fig. 12
figure 12

GA-SVM optimizations with different population size values

Table 6 Ranking values of the GA-SVM models with different population sizes
Fig. 13
figure 13

Intuitive ranking display with different population sizes for GA-SVM algorithm

6.4 SSA-SVM

To procure the best optimization results, the upper and lower bounds controlling the parametric exploration range were set to 100 and 0.01, respectively, in the initialization stage. As with other swarm intelligence techniques, two crucial parameters need to be tuned carefully, namely the swarm size and the number of iterations. To determine them, various models were established and evaluated by their R2 and MSE values. Because more iterations cost more calculation time and, beyond 400 iterations, changing the swarm size no longer noticeably changes the best fitness values, the number of iterations was set to 400. To identify the salp swarm size, ten values were tested, i.e., 20, 40, 60, 80, 100, 120, 140, 160, 180 and 200; the optimization results are presented in Fig. 14. It can be observed that after about 175 iterations the results become stable. As in the previous sections, the ranking system was used to select the best predictive SSA-SVM model (Table 7). The most satisfactory parameter group was 400 iterations with a swarm size of 180, with R2 and MSE values of 0.9245 and 0.01309 for the training sets and 0.9475 and 0.01555 for the testing sets, respectively. A detailed analysis of the results is given in the next section, and an intuitive display of the ranking results is shown in Fig. 15.

Fig. 14
figure 14

SSA-SVM optimizations with different swarm size values

Table 7 Ranking values of the SSA-SVM models with different swarm sizes
Fig. 15
figure 15

Intuitive ranking display with different swarm sizes for SSA-SVM algorithm

7 Results and discussion

To predict the strength of fiber-reinforced CPB, four optimization algorithms were combined with SVM, i.e., GS, GA, PSO and SSA. According to the optimization results above, different parameter configurations and algorithms achieved different prediction performances. Table 8 shows the performance indices and ranking results of the four selected models, GS-SVM, GA-SVM, PSO-SVM and SSA-SVM, in predicting the CPB strength.

Table 8 Ranking results for the selected SVM-based models

It can be found that, compared with the other three optimization strategies, SSA-SVM achieved the best prediction accuracy for the training and testing sets, with R2 values of 0.9245 and 0.9475, respectively. This indicates that, for the fiber-reinforced CPB datasets, the SSA-SVM network fits the sophisticated relationship between the CPB parameters and CPB strength better and generalizes more strongly; it also suggests that the SSA algorithm is more capable and flexible.

The GS-SVM approach provided inferior results, in terms of both predictive accuracy and MSE, compared with the three metaheuristic algorithms. As an exhaustive search, it may miss more effective parameter combinations and can easily get stuck in local optima; GS-SVM therefore tends to be strengthened by cross-validation, which means its predictive results are strongly influenced by the number of cross-validation folds.

For the GA-SVM and PSO-SVM algorithms, the prediction accuracy is lower than that of SSA-SVM, but GA-SVM achieves the best MSE on the testing sets. Its prediction capability therefore cannot be ignored, because the performance of supervised machine learning methods is influenced by the properties and dimensionality of the datasets; on different datasets, GA-SVM and PSO-SVM may well perform better. It is worth mentioning that, for GA-SVM and PSO-SVM, there is a complex nonlinear relationship between predictive capability and swarm (population) size at a fixed number of iterations (generations), which indicates that the swarm (population) size needs to be set carefully.

Finally, it can be concluded that all predictive models obtained satisfactory prediction results. According to the ranking system, the overall model performances rank, in descending order, as SSA-SVM, GA-SVM, PSO-SVM and GS-SVM. These results indicate that the proposed metaheuristic algorithms were superior to the exhaustive grid search method and that the selected optimal parameters successfully improved the prediction capability of the SVM-based models.

Predicted CPB strengths, together with the corresponding actual values, for GS-SVM, GA-SVM, PSO-SVM and SSA-SVM are shown in Figs. 16, 17, 18 and 19, respectively. These figures show that each model has an outstanding ability to predict CPB strength, while SSA-SVM displays a slightly better fit; SSA-SVM is therefore preferable for evaluating CPB strength, particularly because on large datasets even slight deviations can magnify prediction errors.

Fig. 16
figure 16

Results of GS-SVM optimization strategy for evaluating CPB strength

Fig. 17
figure 17

Results of GA-SVM optimization strategy for evaluating CPB strength

Fig. 18
figure 18

Results of PSO-SVM optimization strategy for evaluating CPB strength

Fig. 19
figure 19

Results of SSA-SVM optimization strategy for evaluating CPB strength

8 Sensitivity analysis

To explore and compare the sensitivity of CPB strength to the different influencing factors, the cosine amplitude method [99] was employed in this section. Each input variable and the output variable were expressed as single-column vectors, giving eight vectors in total:

$$x_{a} = \left\{ {x_{a1} ,x_{a2} , \ldots ,x_{an} } \right\}$$
(15)

where the length of each vector equals the number of samples (144). The sensitivity of CPB strength to each influencing factor can then be calculated as:

$$s_{ab} = \frac{{\sum\nolimits_{n = 1}^{144} {x_{an} x_{bn} } }}{{\sqrt {\sum\nolimits_{n = 1}^{144} {x_{an}^{2} } } \times \sqrt {\sum\nolimits_{n = 1}^{144} {x_{bn}^{2} } } }}.$$
(16)

According to the results (Fig. 20), the most sensitive factor is the cement–tailings ratio, and among the three fiber factors, fiber length is the most important. Although the sensitivities of the fiber parameters are less prominent than those of the other parameters, these results still provide a constructive guideline for selecting fiber properties. In descending order, the sensitivity of CPB strength to the different parameters is: CtT, T, Co, S, Fl, Fw, Fs.
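A direct implementation of Eq. (16) is sketched below; random placeholder data stand in for the 144 measured samples, so the printed ranking is illustrative only.

```python
# Cosine amplitude method of Eq. (16): relation strength between each
# input vector and the UCS vector.
import numpy as np

rng = np.random.default_rng(0)
names = ["CtT", "Co", "T", "Fl", "Fw", "Fs", "S"]
X = rng.uniform(0.1, 1.0, size=(144, 7))             # placeholder inputs
y = rng.uniform(0.1, 1.0, size=144)                  # placeholder UCS values

def cosine_amplitude(a, b):
    """Eq. (16): s_ab = sum(a*b) / (sqrt(sum a^2) * sqrt(sum b^2))."""
    return np.sum(a * b) / (np.sqrt(np.sum(a ** 2)) * np.sqrt(np.sum(b ** 2)))

scores = {nm: cosine_amplitude(X[:, k], y) for k, nm in enumerate(names)}
for nm, s in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{nm}: {s:.3f}")
```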

Fig. 20
figure 20

Sensitivity analysis of different factors on CPB strengths

9 Conclusions

In total, 144 CPB strength results were obtained from the experiments. PP fibers were found to play an important role in improving the strength of CPB. Straw fibers can also increase the backfill strength, although under some conditions they decrease the compressive strength of CPB. According to the sensitivity analysis, fiber length plays the most important role among the three fiber parameters in improving CPB strength; although this finding is limited by the number of specimens, it provides a valuable guideline for future study. The most important parameters of CPB strength were then used to develop predictive models by means of the GS-SVM, GA-SVM, PSO-SVM and SSA-SVM optimization scenarios. Based on previous investigations and practical engineering conditions, seven input parameters, namely cement–tailings ratio, concentration, curing time, fiber length, fiber weight, fiber tensile strength and 28-day minimum strength of cement, were used to develop the training models, and the four optimization strategies were employed to tune the training networks and search for ideal hyper-parameters. Through iterative tests and comparison, the best-optimized CPB strength prediction model for each algorithm was procured, and each model was scored using the ranking system. The ranking scores indicated that the proposed metaheuristic algorithms performed better than the GS method. The SSA-SVM algorithm was found to be the most effective and flexible, with R2 values of 0.9245 and 0.9475 and MSE values of 0.01309 and 0.01555 for the training and testing samples, respectively. Although the other two metaheuristic algorithms did not reach the performance level of SSA-SVM, their predictive capability is still satisfactory and acceptable, because the performance of intelligent predictive models depends partly on the characteristics of the dataset, and on other datasets they may perform better. In conclusion, the SSA-SVM algorithm should be given priority in view of its prominent robustness and predictive capability for CPB strength.

Nevertheless, some limitations need to be addressed in future work. First, more UCS tests could be undertaken to enrich the CPB datasets and further clarify the influence of different factors on CPB strength; in particular, this study did not cover the 7-day strength of fiber-reinforced CPB, and other essential variables such as tailings type and size, curing temperature and sulfur content were not considered. Second, the optimization ability of other novel heuristic algorithms combined with SVM models is worth analyzing, such as gray wolf optimization [47], the whale optimization algorithm [100], the firefly algorithm [101], the Harris hawks optimization algorithm [102] and the artificial bee colony algorithm [103, 104]. Finally, more advanced supervised learning methods, such as the Cubist algorithm [105] and random forest [106], which have shown satisfactory predictive capability in other geotechnical engineering problems, have not yet been applied to and compared on CPB strength prediction.