1 Introduction

Blasting is an indispensable activity in opencast mines, especially quarries, where the energy of explosives is used to fragment and displace the rock mass. However, undesirable phenomena that occur during blasting (e.g., fly-rock, misfires, ground vibration, premature blasts, and air over-pressure) are of particular concern to engineers, mining businesses, and neighboring residents. Of these, fly-rock (Fig. 1) is considered the most dangerous [1]. It is regarded as the leading cause of human injuries and property loss in open-pit mining [2]. The primary factors responsible for fly-rock are incorrect loading and layout of blast-holes, inadequate burden, anomalies in the rock mass and geological structures, insufficient firing delay, and incomplete stemming. Moreover, damage caused by a lack of security in the blast area, such as failure to use proper blasting shelters, bad connections, and insufficient guarding of the blast area, is also a concern for engineers and managers [3].

Fig. 1 Fly-rock induced by blasting. Source: https://www.lakecountrycalendar.com

According to previous studies, more than 85% of the total explosive energy is wasted because the explosive energy is used improperly [4,5,6,7,8]. This wasted energy causes undesirable incidents, especially fly-rock [9,10,11]. Therefore, the proper use of explosive energy and the accurate prediction of fly-rock distance are challenges for blasting engineers. Previous researchers have suggested that both controllable factors (i.e., burden, delay timing, stemming, drilling parameters, and powder factor) and uncontrollable factors (i.e., geotechnical and geological conditions) should be used in predicting fly-rock because they affect both its occurrence and its intensity [12, 13]. However, because geotechnical and geological conditions are difficult to characterize, uncontrollable factors are rarely used in predicting blast-induced issues (e.g., fly-rock, ground vibration, air over-pressure) [14,15,16]. Thus, controllable parameters are usually investigated and used in estimating the distance of fly-rock.

2 Related works

To predict blast-induced fly-rock in open-pit mines, empirical and artificial intelligence (AI) techniques have been the most popular approaches over the past three decades [17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32]. Of these, AI techniques have been highly recommended because of their advantages and high accuracy, and many have been used to predict the distance of fly-rock in bench blasting. Rezaei et al. [33] developed a fuzzy system to predict fly-rock in an iron mine in Iran, with promising results. Amini et al. [14] used a support vector machine (SVM) model to estimate fly-rock, with positive results. An artificial neural network (ANN) was introduced by Monjezi et al. [34] as an alternative AI technique to predict fly-rock with high accuracy. Marto et al. [35] proposed a novel approach for estimating fly-rock based on the imperialist competitive algorithm (ICA) and ANN, called the ICA–ANN model. Trivedi et al. [36] compared ANN and the adaptive neuro-fuzzy inference system (ANFIS) for predicting fly-rock and found that the ANFIS model was the superior technique for estimating fly-rock distance in their study. Saghatforoush et al. [37] proposed a combination of ANN and the ant colony optimization (ACO) algorithm for estimating fly-rock. In another study, Hasanipanah et al. [38] applied the particle swarm optimization (PSO) algorithm to predict fly-rock distance with high accuracy. Faradonbeh et al. [39] studied the prediction and minimization of fly-rock distance, with promising results; in their study, the firefly algorithm was used to optimize a gene expression programming model for fly-rock prediction. A new computational intelligence model, the recurrent fuzzy neural network-genetic algorithm (RFNN-GA) model, was introduced by Rad et al. [40] for fly-rock prediction in mine blasting with high reliability.
Using another optimization algorithm (the whale optimization algorithm, WOA) and deep learning (a deep neural network, DNN), Guo et al. [41] built a novel intelligent technique, WOA–DNN, to predict the distance of fly-rock with promising accuracy (i.e., \(R^{2} = 0.983\), RMSE = 8.269). Asl et al. [42] developed the FFA–ANN model for estimating fly-rock by combining an ANN with the firefly algorithm (FFA). Zhou et al. [43] simulated fly-rock using the Monte Carlo technique. Building on the advantages of AI techniques, Zhou et al. [44] reduced the distance of fly-rock using the PSO-ANN model. From a geological point of view, Mohamad et al. [45] predicted the distance of fly-rock and minimized it during blasting operations by taking geological structures into account. Hudaverdi and Akyildiz [46] predicted fly-rock with a new classification approach, multiple discriminant analysis, and reported positive results. Other studies on the prediction of fly-rock in open-pit mines can be found in refs. [1, 13, 47,48,49,50,51,52,53,54].

To the best of the authors' knowledge, many AI techniques have been developed and proposed for estimating fly-rock distance; however, their effectiveness varies. Furthermore, the distance of fly-rock and its effects differ depending on the blast design parameters, the geological conditions, and the location of each mine. In this study, a new technique for predicting fly-rock in bench blasting is proposed based on an ensemble of support vector regression (SVR) models and the Lasso and elastic-net regularized generalized linear model (GLMNET), called the SVRs–GLMNET model.

3 Principle of the artificial intelligence techniques used

3.1 Support vector regression (SVR)

The support vector machine (SVM) was introduced in [55] and is widely applied as a benchmark machine learning technique for forecasting problems. It has two primary branches: support vector regression (SVR) and support vector classification (SVC). Of these, SVR is the form of SVM most commonly used in engineering [56]. In essence, SVR uses the target values to find a function, built on a mapping \(\varphi (x)\) of the input data into a feature space, that fits the data while remaining as flat as possible. It can solve complex problems in both linear and non-linear forms of regression.

For linear regression problems, SVR can be implemented as a convex optimization problem whose solution and constraints are shown in Fig. 2.

Fig. 2 Linear SVR
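For reference, the linear SVR problem is usually written in the following \(\varepsilon\)-insensitive form. This is the textbook formulation and is assumed, not confirmed, to match Fig. 2; the symbols \(w\), \(b\), \(\varepsilon\), \(\xi_{i}\), and \(\xi_{i}^{*}\) are introduced here only for illustration, while \(C\) plays the role of the cost parameter tuned in Sect. 5.2:

$$\mathop {\min }\limits_{{w,b,\xi ,\xi^{*} }} \;\frac{1}{2}\left\| w \right\|^{2} + C\sum\limits_{i = 1}^{n} {(\xi_{i} + \xi_{i}^{*} )} \quad {\text{s.t.}}\quad \left\{ {\begin{array}{*{20}l} {y_{i} - w^{T} x_{i} - b \le \varepsilon + \xi_{i} ,} \\ {w^{T} x_{i} + b - y_{i} \le \varepsilon + \xi_{i}^{*} ,} \\ {\xi_{i} ,\;\xi_{i}^{*} \ge 0.} \\ \end{array} } \right.$$

Under this form, deviations smaller than \(\varepsilon\) are not penalized, and minimizing \(\left\| w \right\|^{2}\) is what keeps the fitted function as flat as possible.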

For non-linear regression problems, SVR can likewise be implemented as a convex optimization problem, using kernel functions to transform the dataset into a high-dimensional feature space. The two most commonly used kernel functions (i.e., the polynomial and radial basis functions) are also shown in Fig. 3.

Fig. 3 Non-linear SVR
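As a rough illustration of the two kernel families named above, the following Python sketch evaluates a radial basis function (RBF) kernel and a polynomial kernel for two input vectors. The parameterization \(\exp ( - \sigma \left\| {x - z} \right\|^{2} )\) and the names `sigma`, `degree`, `scale`, and `offset` are assumptions made for illustration; the exact forms given in Fig. 3 may differ.

```python
import math


def rbf_kernel(x, z, sigma):
    """RBF kernel exp(-sigma * ||x - z||^2); `sigma` is an assumed width parameter."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sigma * sq_dist)


def polynomial_kernel(x, z, degree=2, scale=1.0, offset=1.0):
    """Polynomial kernel (scale * <x, z> + offset) ** degree (assumed form)."""
    dot = sum(a * b for a, b in zip(x, z))
    return (scale * dot + offset) ** degree


# Hypothetical, already-scaled input vectors (not taken from the fly-rock database)
x = [0.2, 0.5, 0.1]
z = [0.3, 0.4, 0.1]
k_rbf = rbf_kernel(x, z, sigma=0.011)
k_poly = polynomial_kernel(x, z)
```

Both functions return a similarity score between two blasts; the RBF value approaches 1 as the two input vectors become identical.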

3.2 Lasso and elastic-net regularized generalized linear model (GLMNET)

The Lasso and elastic-net regularized generalized linear model (GLMNET) is a machine learning algorithm introduced by Friedman et al. [57]. In GLMNET, each parameter is optimized in turn by minimizing the objective function while the remaining parameters are held fixed. In other words, GLMNET optimizes one model parameter at a time and repeats this process continuously. It uses cyclical coordinate descent and runs until convergence [58]. For predicting blast-induced fly-rock, GLMNET can be described as follows.

Let \(y_{\text{fr}}\) be the value to forecast, i.e., the fly-rock distance, and let \(x_{\text{fr}} = (x_{{{\text{fr}}1}} ,x_{{{\text{fr}}2}} , \ldots ,x_{{{\text{fr}}j}} , \ldots ,x_{{{\text{fr}}k}} )^{T}\) be the corresponding vector of input variables, such as B, S, ST, W, and PF, where k denotes the number of descriptors. A linear model for each predicted fly-rock result is assumed as follows:

$$y_{\text{fr}} = x_{\text{fr}}^{T} \beta + \varepsilon_{\text{fr}} ,$$
(1)

where \(\beta = (\beta_{1} ,\beta_{2} , \ldots ,\beta_{j} , \ldots ,\beta_{k} )^{T}\) is the vector of coefficients and \(\varepsilon_{\text{fr}}\) is the error between the actual and the predicted fly-rock values. The coefficients \(\beta\) are determined so that \(\varepsilon_{\text{fr}}\) is minimized, that is, by minimizing the residual sum of squares:

$$E(\beta ) = \sum\limits_{{{\text{fr}} = 1}}^{n} {(y_{\text{fr}} - x_{\text{fr}}^{T} \beta )^{2} } .$$
(2)

The minimizing coefficients are defined by the ordinary least squares method [59] as follows:

$$\hat{\beta } = (X^{T} X)^{ - 1} X^{T} y,$$
(3)

where \(X = (x_{1}^{T} ,x_{2}^{T} , \ldots ,x_{i}^{T} , \ldots ,x_{n}^{T} )\) and \(y = (y_{1} ,y_{2} , \ldots ,y_{i} , \ldots ,y_{n} )^{T}\).

It should be noted that Eq. (3) cannot be solved when \(k > n\) because \(X^{T} X\) becomes singular. A regularized regression technique can be employed instead. The loss function for one type of regularized regression, the elastic net, is defined as follows:

$$E(\beta ) = \sum\limits_{i = 1}^{n} {(y_{i} - x_{i}^{T} \beta )^{2} } + \lambda \sum\limits_{j = 1}^{k} {\left[ {(1 - \alpha )\beta_{j}^{2} + \alpha \left| {\beta_{j} } \right|} \right]} .$$
(4)

By minimizing the elastic-net loss function in Eq. (4), the coefficients \(\beta\) can be estimated, and the factors that do not affect the predictive model can be eliminated. Herein, \(\lambda\) controls the overall strength of the penalty and \(\alpha\) \((0 \le \alpha \le 1)\) controls the mix between its two terms; both can be used to adjust the accuracy of the model. If \(\alpha = 0\), this model corresponds to ridge regression [60]. If \(\alpha = 1\), it corresponds to LASSO regression [61]. For each value of \(\alpha\), the \(\lambda\) and \(\beta\) parameters are chosen so that the loss function \(E(\beta )\) is minimized. The values of \(\lambda\) are determined by the leave-one-out cross-validation (LOOCV) method [62].

By repeatedly optimizing the objective function with respect to each parameter while the other parameters are fixed, GLMNET achieves high computational speed and can exploit sparsity in the input data \(x_{\text{fr}}\) [58] when predicting blast-induced fly-rock.
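The cyclical coordinate descent described above can be sketched in a few lines of Python. The sketch minimizes the loss exactly as written in Eq. (4), for which the update of a single coefficient has the closed form \(\beta_{j} = S(\rho_{j} ,\lambda \alpha /2)/(\sum\nolimits_{i} {x_{ij}^{2} } + \lambda (1 - \alpha ))\), where \(S\) is the soft-thresholding operator and \(\rho_{j}\) is the correlation of column \(j\) with the partial residual. It is an illustration under stated assumptions, not the implementation of [57]: there is no intercept, no standardization, and no \(\lambda\) path, and the function names and the small dataset are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink `value` toward zero by `threshold` (exactly zero inside the band)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net_cd(X, y, lam, alpha, n_sweeps=500, tol=1e-10):
    """Cyclical coordinate descent for the loss written in Eq. (4), without intercept."""
    n, k = len(X), len(X[0])
    beta = [0.0] * k
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(k):
            column = [X[i][j] for i in range(n)]
            z = sum(v * v for v in column)
            denom = z + lam * (1.0 - alpha)
            if denom == 0.0:
                continue  # all-zero column with a pure lasso penalty: leave beta_j at 0
            # Correlation of column j with the residual that excludes beta_j
            rho = sum(column[i] * residual[i] for i in range(n)) + z * beta[j]
            new_value = soft_threshold(rho, lam * alpha / 2.0) / denom
            delta = new_value - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= column[i] * delta
                beta[j] = new_value
            max_change = max(max_change, abs(delta))
        if max_change < tol:
            break  # no coefficient moved during this sweep: converged
    return beta


# Hypothetical toy data generated from y = 2 * x1 - 3 * x2
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [2.0, -3.0, -1.0, 1.0]
beta_ols = elastic_net_cd(X, y, lam=0.0, alpha=0.5)       # no penalty
beta_lasso = elastic_net_cd(X, y, lam=1000.0, alpha=1.0)  # heavy lasso penalty
```

With no penalty the sketch recovers the generating coefficients, and with a heavy lasso penalty every coefficient is driven to exactly zero, which is how uninformative factors are eliminated.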

3.3 Ensemble of SVR and GLMNET (SVRs–GLMNET)

The ultimate goal of this study is to propose a new technique for estimating the distance of fly-rock caused by bench blasting using an ensemble of SVR models and a GLMNET model, namely the SVRs–GLMNET model. Accordingly, the fly-rock database was divided into three parts: training (70%), validating (20%), and testing (10%) datasets. These proportions were recommended by Güera et al. [63] and Knox [64] to ensure the reliability of the dataset during data analysis.

In the first step, the training dataset, comprising 150 blasting events, was used to develop six SVR models as sub-models. In the second step, the 40 experimental blasts of the validating dataset were used to validate the performance of the six SVR models. In the third step, the predictions of these six sub-models were used as the six input variables of a new training dataset for developing the GLMNET model. In other words, the new training dataset includes 40 observations with six input variables and one output variable (i.e., fly-rock distance). The GLMNET model developed from the predictions of the six SVR models is called the SVRs–GLMNET model. Finally, the 20 blasting events of the testing dataset were used to check the accuracy of the developed SVRs–GLMNET model. They were also used to verify the accuracy of the six SVR models so that these could be compared fully with the proposed SVRs–GLMNET model. Figure 4 presents the ensemble of SVR models and the GLMNET model for predicting fly-rock distance in the present study.

Fig. 4 Ensemble of SVR models and GLMNET model for predicting the fly-rock distance
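The data flow of this ensemble can be sketched as follows. The Python code only illustrates the 150/40/20 split and how the predictions of six sub-models become the six inputs of the meta-level dataset. The rows are synthetic, and the six sub-models are simple stand-in functions (an assumption, not trained SVR models); all function and variable names are hypothetical.

```python
import random


def split_dataset(rows, n_train=150, n_val=40, n_test=20, seed=0):
    """Randomly split rows into training, validating, and testing parts."""
    if n_train + n_val + n_test != len(rows):
        raise ValueError("split sizes must add up to the number of rows")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def build_meta_dataset(sub_models, rows):
    """Map each (inputs, target) row to (list of sub-model predictions, target)."""
    return [([model(inputs) for model in sub_models], target)
            for inputs, target in rows]


# Synthetic stand-in for the database: 210 rows of (five inputs, fly-rock distance)
rng = random.Random(1)
rows = []
for _ in range(210):
    inputs = [rng.uniform(0.0, 1.0) for _ in range(5)]
    rows.append((inputs, 100.0 + 50.0 * sum(inputs)))

train, val, test = split_dataset(rows)

# Stand-ins for six sub-models that would be fitted on `train` (NOT real SVRs)
sub_models = [lambda x, w=w: 100.0 + w * sum(x) for w in (40, 45, 48, 50, 52, 55)]

meta_train = build_meta_dataset(sub_models, val)   # 40 rows, 6 inputs: fits the meta-model
meta_test = build_meta_dataset(sub_models, test)   # 20 rows, 6 inputs: final accuracy check
```

Keeping the meta-level training rows separate from the rows used to fit the sub-models means the meta-model learns from predictions the sub-models made on data they had not seen.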

4 Case study

To apply the AI techniques described above, a quarry in central Vietnam was selected as a case study. It lies between latitudes 11°55′30″N and 11°55′45″N and longitudes 109°05′55″E and 109°06′13″E (Fig. 5).

Fig. 5 Location of the study site in this work

Blasting is the primary method used to break rock at this mine. ANFO (ammonium nitrate/fuel oil) and emulsion explosives are used to break dry rock and wet rock, respectively (Fig. 7b). Blast holes with a diameter of 75 mm and delay times of 17 ms and 42 ms were used for all types of rock at the study site (Fig. 6). The residential areas, at a distance of 450–500 m from the blast sites, were considered to be at risk (Fig. 7a), and the distance from the blast sites to the mine office is about 250–300 m. The maximum recorded fly-rock range was 290.1 m, which is comparable to the distance to the mine office. Fly-rock is therefore a serious threat to the neighborhood and to workers at the mine.

Fig. 6 Scheme of blast network used in the mine

Fig. 7 (a) Blast site and residential area, (b) explosive used in the mine, and (c) iGeoTrans app for measuring the fly-rock distance

To carry out this study, 210 blasting events were investigated, each with its blasting design and measured fly-rock distance. The blasting parameters, namely burden (B), spacing (S), stemming (ST), the capacity of the explosive charge (W), and powder factor (PF), were collected from the blast patterns. To determine the distance of fly-rock, the iGeoTrans app, a product of Hanoi University of Mining and Geology, Hanoi, Vietnam, was used, as shown in Fig. 7c. This app determines the positions of blast sites and fly-rock using the global positioning system (GPS), assisted GPS, GLONASS, Wi-Fi, and cellular networks [65]. Finally, a database of 210 observations was established with five input variables (i.e., B, S, ST, W, PF) and one output (i.e., fly-rock distance, FR). The characteristics and range of the dataset used in this study are shown in Fig. 8.

Fig. 8 Box and whisker plots of the fly-rock database used

5 Development of the models

As a necessary step in training the AI models, the original dataset was divided into three parts, as described above (i.e., 70/20/10). Of these, 70% (~ 150 observations) of the original dataset was selected randomly to build the predictive models. Note that all the predictive models developed in this work use the same training dataset. To avoid over-fitting or under-fitting of the models, the data were normalized by the Box-Cox transformation technique [66].
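For readers unfamiliar with the Box-Cox transformation, a minimal Python sketch is given below. The power parameter (called `lam` here, and unrelated to the GLMNET \(\lambda\)) is not reported in this study, so the value used in the example is an arbitrary assumption, as are the sample distances.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a given power `lam`."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def inverse_box_cox(values, lam):
    """Map transformed values back to the original scale."""
    if lam == 0:
        return [math.exp(v) for v in values]
    return [(lam * v + 1.0) ** (1.0 / lam) for v in values]


# Hypothetical fly-rock distances in metres; lam = 0.5 is an arbitrary choice
distances = [120.0, 185.5, 240.0, 290.1]
transformed = box_cox(distances, lam=0.5)
restored = inverse_box_cox(transformed, lam=0.5)
```

Predictions made on the transformed scale would need `inverse_box_cox` to be reported in metres again.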

5.1 GLMNET model

As stated above, GLMNET is one of the AI techniques used in this study for predicting the fly-rock distance at the mine. It represents the linear regression methods. For the GLMNET model, the mixing percentage (\(\alpha\)) and the regularization parameter (\(\lambda\)) were used as the key parameters for tuning the accuracy of the model. One hundred GLMNET models were established through a “trial and error” procedure over these hyper-parameters (Fig. 9). Tenfold cross-validation was used as the resampling technique to increase the accuracy of the models. Ultimately, an optimal GLMNET model was obtained with \(\alpha = 0.433\) and \(\lambda = 0.003\).

Fig. 9 Performance of 100 GLMNET models with a “trial and error” procedure
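The tuning procedure (many candidate hyper-parameter settings, each scored by tenfold cross-validation, with the lowest RMSE winning) can be sketched generically in Python. To keep the example short and self-contained, the model being tuned is a one-input ridge regression, which is a stand-in and not GLMNET; the data, the candidate values, and all function names are hypothetical.

```python
import math
import random


def k_fold_indices(n_rows, k=10, seed=0):
    """Shuffle the row indices and deal them into k folds of near-equal size."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_rmse(fit, predict, X, y, params, k=10, seed=0):
    """Average hold-out RMSE of one hyper-parameter setting over k folds."""
    fold_rmse = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx], **params)
        squared = [(y[i] - predict(model, X[i])) ** 2 for i in held_out]
        fold_rmse.append(math.sqrt(sum(squared) / len(squared)))
    return sum(fold_rmse) / len(fold_rmse)


def select_best(fit, predict, X, y, candidates, k=10):
    """Return the candidate setting with the lowest cross-validated RMSE."""
    return min(candidates,
               key=lambda p: cross_validated_rmse(fit, predict, X, y, p, k))


# Stand-in model (assumption): one-input ridge regression without intercept
def fit_ridge_1d(X, y, lam):
    return sum(x * t for x, t in zip(X, y)) / (sum(x * x for x in X) + lam)


def predict_ridge_1d(beta, x):
    return beta * x


X = [i / 10.0 for i in range(1, 151)]   # hypothetical single input, 150 rows
y = [3.0 * x for x in X]                # hypothetical noise-free target
candidates = [{"lam": value} for value in (0.0, 1.0, 10.0, 100.0)]
best = select_best(fit_ridge_1d, predict_ridge_1d, X, y, candidates)
```

Because the toy target is noise-free, the unpenalized setting wins here; on real blasting data a non-zero penalty may give the lowest cross-validated RMSE.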

5.2 SVR models

Similar to the GLMNET model, one hundred SVR models were established to estimate fly-rock distance in the present work. In contrast to GLMNET, the SVR models in this section represent non-linear regression techniques. Because the main purpose of this study is to develop a new hybrid model based on an ensemble of six SVR models and a GLMNET model (i.e., the SVRs–GLMNET model), the six best SVR models were selected from among the one hundred developed. Note that the same techniques used to develop the GLMNET model were also used to develop the SVR models. A review of the literature shows that many types of kernel functions can be applied in SVR development [67]; however, the radial basis function (RBF) kernel is the most commonly used [5]. Therefore, the RBF kernel was applied in developing the SVR models. Accordingly, sigma (\(\delta\)) and cost (C) were used as the key hyper-parameters of the SVR models. The performance of the one hundred SVR models is shown in Fig. 10, and the six best SVR models selected are listed in Table 1.

Fig. 10 Performance of one hundred SVR models with a “trial and error” procedure

Table 1 The six selected SVR models with their hyper-parameters and performances

5.3 SVRs–GLMNET model

To develop the SVRs–GLMNET model for estimating the distance of fly-rock in this mine, the framework in Fig. 4 was applied. Accordingly, six SVR models were developed based on 70% of the original dataset, as described above. Then, 20% of the dataset (~ 40 observations) was used to validate the performance of the constructed SVR models. The predictions of the six SVR models on this validating dataset, together with their accuracy, are shown in Fig. 11. Finally, these predictions were combined with the output of the validating dataset to generate a new dataset with 40 observations, six input variables, and one output variable. The properties of the new dataset are shown in Fig. 12.

Fig. 11 The outcome predictions of the six developed SVR models and their accuracy level

Fig. 12 Properties of the new dataset with 40 observations (i.e., six inputs and one output)

After the six SVR models had been developed and the new dataset created, a GLMNET model, called SVRs–GLMNET, was built on the new dataset. The SVRs–GLMNET model was developed in the same way, and with the same techniques, as the GLMNET model. Eventually, an optimal SVRs–GLMNET model was found with the lowest RMSE (i.e., RMSE = 3.695) (Fig. 13). The parameters of the developed SVRs–GLMNET model are as follows: \(\delta_{1} = 0.011\), \(C_{1} = 11.889\); \(\delta_{2} = 0.012\), \(C_{2} = 53.792\); \(\delta_{3} = 0.014\), \(C_{3} = 2.831\); \(\delta_{4} = 0.019\), \(C_{4} = 3.901\); \(\delta_{5} = 0.032\), \(C_{5} = 5.517\); \(\delta_{6} = 0.013\), \(C_{6} = 383.617\); \(\alpha = 0.259\), and \(\lambda = 0.007\).

Fig. 13 Performance of the proposed SVRs–GLMNET model based on the new dataset

6 Results and discussion

In this section, the effectiveness and accuracy of the models are evaluated, with a focus on the proposed SVRs–GLMNET ensemble. As mentioned above, the remaining 10% of the original dataset (~ 20 observations) was used to confirm the accuracy of the developed models (i.e., GLMNET, SVR1, SVR2, SVR3, SVR4, SVR5, SVR6, SVRs–GLMNET). Note that these 20 blasting events were not used in building the models or in the ensembling process. Several model evaluation criteria were applied, including RMSE, \(R^{2}\), MAE, MAPE, and VAF, which were calculated as

$${\text{RMSE}} = \sqrt {\frac{1}{m}\sum\limits_{{{\text{fr}} = 1}}^{m} {(y_{\text{fr}} - \hat{y}_{\text{fr}} )^{2} } }$$
(5)
$$R^{2} = 1 - \frac{{\sum\nolimits_{{{\text{fr}} = 1}}^{m} {(y_{\text{fr}} - \hat{y}_{\text{fr}} )^{2} } }}{{\sum\nolimits_{{{\text{fr}} = 1}}^{m} {(y_{\text{fr}} - \bar{y})^{2} } }}$$
(6)
$${\text{MAE}} = \frac{1}{m}\sum\limits_{{{\text{fr}} = 1}}^{m} {\left| {y_{\text{fr}} - \hat{y}_{\text{fr}} } \right|}$$
(7)
$${\text{MAPE}} = \frac{100\% }{m}\sum\limits_{{{\text{fr}} = 1}}^{m} {\left| {\frac{{y_{\text{fr}} - \hat{y}_{\text{fr}} }}{{y_{\text{fr}} }}} \right|}$$
(8)
$${\text{VAF}} = \left( {1 - \frac{{\text{var} \,(y_{\text{fr}} - \hat{y}_{\text{fr}} )}}{{\text{var} \,(y_{\text{fr}} )}}} \right) \times 100,$$
(9)

where m denotes the number of samples, and \(y_{\text{fr}}\), \(\hat{y}_{\text{fr}}\), and \(\bar{y}\) are the actual values, the forecasted values, and the mean of the actual values, respectively.
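The five criteria in Eqs. (5)–(9) translate directly into code. The Python sketch below follows the equations as written, returning MAPE and VAF as percentages; the function names and the four sample values are hypothetical and are not taken from the testing dataset.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _variance(values):
    mu = _mean(values)
    return sum((v - mu) ** 2 for v in values) / len(values)


def rmse(actual, predicted):
    """Root-mean-square error, Eq. (5)."""
    return math.sqrt(_mean([(a - p) ** 2 for a, p in zip(actual, predicted)]))


def r2(actual, predicted):
    """Coefficient of determination, Eq. (6)."""
    mu = _mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mu) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mae(actual, predicted):
    """Mean absolute error, Eq. (7)."""
    return _mean([abs(a - p) for a, p in zip(actual, predicted)])


def mape(actual, predicted):
    """Mean absolute percentage error in percent, Eq. (8)."""
    return 100.0 * _mean([abs((a - p) / a) for a, p in zip(actual, predicted)])


def vaf(actual, predicted):
    """Variance accounted for in percent, Eq. (9)."""
    errors = [a - p for a, p in zip(actual, predicted)]
    return (1.0 - _variance(errors) / _variance(actual)) * 100.0


# Hypothetical actual and predicted fly-rock distances in metres
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
scores = {
    "RMSE": rmse(actual, predicted),
    "R2": r2(actual, predicted),
    "MAE": mae(actual, predicted),
    "MAPE": mape(actual, predicted),
    "VAF": vaf(actual, predicted),
}
```

Lower RMSE, MAE, and MAPE and higher \(R^{2}\) and VAF indicate a better model.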

In addition, a ranking method was used to rank the developed models. The performance of the models on the testing dataset, together with their ranking, is listed in Table 2.

Table 2 Confirmation of the accuracy of developed models for estimating fly-rock distance in this study
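The ranking method is not spelled out in the text, so the following Python sketch is an assumption: for each criterion, the worst model scores 1 and the best scores the number of models, and the scores are summed over the five criteria. This reading is consistent with a maximum total of 40 for eight models and five criteria. The three models and their values below are made up for illustration and are not the entries of Table 2.

```python
# True means that a lower value of the criterion is better
LOWER_IS_BETTER = {"RMSE": True, "MAE": True, "MAPE": True, "R2": False, "VAF": False}


def rank_models(scores):
    """scores: {model: {criterion: value}} -> {model: total rank score}.

    Ties are not handled specially; tied models keep their input order.
    """
    names = list(scores)
    totals = {name: 0 for name in names}
    for criterion, lower_is_better in LOWER_IS_BETTER.items():
        # Order from worst to best, so that position + 1 is the rank score
        ordered = sorted(names, key=lambda n: scores[n][criterion],
                         reverse=lower_is_better)
        for position, name in enumerate(ordered):
            totals[name] += position + 1
    return totals


# Made-up values for three hypothetical models
example = {
    "model_a": {"RMSE": 12.0, "MAE": 9.0, "MAPE": 6.0, "R2": 0.90, "VAF": 90.0},
    "model_b": {"RMSE": 8.0, "MAE": 6.0, "MAPE": 4.0, "R2": 0.95, "VAF": 95.0},
    "model_c": {"RMSE": 4.0, "MAE": 3.0, "MAPE": 2.0, "R2": 0.99, "VAF": 99.0},
}
totals = rank_models(example)
best_model = max(totals, key=totals.get)
```

With three models, the best possible total is 15 (a score of 3 on each of the five criteria).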

The results in Table 2 show that the GLMNET model is the worst model for the current problem, which suggests that the linear regression technique (i.e., GLMNET) is not suitable for predicting fly-rock in this study. Meanwhile, the SVR models worked very well, with quite stable performance on both the validating and testing datasets. Therefore, the predictions from the six SVR models were highly reliable. Based on these predictions, a new GLMNET model was developed (i.e., SVRs–GLMNET). The proposed SVRs–GLMNET model provided the highest accuracy, with the lowest RMSE, MAE, and MAPE and the highest \(R^{2}\) and VAF in Table 2. These results confirm that the ensemble of six SVR models and a GLMNET model is a powerful technique for predicting fly-rock in this case, with a total ranking of 40 and a sort order of 1. Figure 14 shows the accuracy of the models in predicting the fly-rock distance on the testing dataset.

Fig. 14 Accuracy of individual models on the testing dataset

As demonstrated above, the accuracy of the proposed SVRs–GLMNET model is significantly improved; however, it is also necessary to determine how strongly each independent variable influences the performance of the model, in order to explain the relationship between the independent variables and the dependent variable. The Sobol sensitivity analysis technique [68] was applied for this task. The results of the sensitivity analysis of the input variables are illustrated in Fig. 15.

Fig. 15 The main and total effect of the independent variables
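To make the main and total effects concrete, the Python sketch below estimates first-order and total Sobol indices by Monte Carlo sampling, using commonly cited estimators (a Saltelli-type estimator for the main effect and a Jansen-type estimator for the total effect). It assumes independent inputs scaled to [0, 1] and uses a two-input toy function in place of the fitted SVRs–GLMNET model; how the analysis was actually configured in this study is not stated, so every name and setting here is an assumption.

```python
import random


def sobol_indices(model, n_inputs, n_samples=40000, seed=0):
    """Monte Carlo estimates of main (first-order) and total Sobol indices."""
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    B = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    f_a = [model(row) for row in A]
    f_b = [model(row) for row in B]
    pooled = f_a + f_b
    mean = sum(pooled) / len(pooled)
    variance = sum((v - mean) ** 2 for v in pooled) / len(pooled)
    main, total = [], []
    for i in range(n_inputs):
        # Evaluate the model on A with only column i replaced by column i of B
        f_ab = [model(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        main_sum = sum(fb * (fab - fa) for fa, fb, fab in zip(f_a, f_b, f_ab))
        total_sum = sum((fa - fab) ** 2 for fa, fab in zip(f_a, f_ab))
        main.append(main_sum / n_samples / variance)
        total.append(total_sum / (2 * n_samples) / variance)
    return main, total


def toy_model(x):
    """Stand-in for a fitted predictor: additive, second input twice as strong."""
    return x[0] + 2.0 * x[1]


main_effect, total_effect = sobol_indices(toy_model, n_inputs=2)
```

For this additive toy function the analytic indices are 0.2 and 0.8 for both the main and total effects, so the estimates should land close to those values; a total effect noticeably larger than the main effect would indicate interactions between inputs.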

Figure 15 shows that ST, W, and PF are the main independent variables, which have a significant effect on the dependent variable (i.e., fly-rock). The other variables (i.e., B and S) have only a small impact on the accuracy of the model.

7 Conclusion

Fly-rock is one of the most dangerous phenomena for people and equipment in open-pit mines, as well as for neighboring residential areas. Accurately predicting the distance of fly-rock helps to minimize the risks that fly-rock poses in bench blasting. This study developed and proposed a novel AI model based on an ensemble of SVR models and a GLMNET model, called the SVRs–GLMNET model. It proved to be a new technique with high reliability in predicting the distance of fly-rock (i.e., MAE of 3.214, RMSE of 3.737, MAPE of 0.018, VAF of 99.207, and \(R^{2}\) of 0.993). Although linear regression techniques alone do not provide a satisfactory level of accuracy in fly-rock prediction because of the non-linear relationships among the input variables, combining multiple non-linear regression models with a linear regression model is an innovative way to improve the accuracy of the predictive model. This approach should be investigated and extended to other AI models in future work on estimating and controlling the distance of fly-rock.