1 Introduction

Blasting is the process of using explosive material to fragment or displace rock masses. Achieving an appropriate particle size distribution is one of the most important aims in the mining and tunneling industries: an optimum size distribution increases crusher and mill throughput, improves loader and excavator productivity, and reduces the energy consumed in the size reduction process [1,2,3]. Accurate prediction of rock fragmentation is therefore essential, especially for optimizing the overall mine/plant economics. In the literature, many parameters, in particular the blast design parameters, are considered to affect fragmentation [4,5,6]. Burden, spacing, stemming, sub-drilling, number of blast-holes, number of rows in the blasting pattern, bench height, blast-hole depth, delay time between rows, type of explosive (e.g., ANFO or dynamite), charge weight per delay and powder factor are all blast design parameters.

In recent years, soft computing methods have been widely applied to various engineering problems [7,8,9,10,11,12,13], and several such methods have been proposed for rock fragmentation prediction. Artificial neural networks (ANN) and multiple regression (MR) were employed by Monjezi et al. [5] to predict rock fragmentation from input parameters including the burden-to-spacing ratio, charge weight per delay, stemming and blast-hole depth. Based on their results, ANN was a suitable method for forecasting rock fragmentation and was more accurate than MR. Xiu-zhi et al. [1] employed support vector machines (SVM), ANN and MR for estimating rock fragmentation; their results showed a significant advantage of SVM over ANN and MR. In another study, Monjezi et al. [4] proposed a fuzzy inference system (FIS) for predicting rock fragmentation, with burden, spacing, charge weight per delay, stemming, powder factor and rock density adopted as the input parameters, and showed that the accuracy of FIS was superior to that of MR. A comprehensive study of rock fragmentation forecasting at the Chadormalu iron mine, Iran, was carried out by Esmaeili et al. [14] using the adaptive neuro-fuzzy inference system (ANFIS), SVM and the Kuz-Ram empirical model; the coefficients of determination (R^2) for the ANFIS, SVM and Kuz-Ram models were 0.89, 0.83 and 0.38, respectively, indicating that ANFIS outperformed both SVM and the Kuz-Ram model. Recently, Hasanipanah et al. [6] proposed a hybrid ANFIS model optimized by particle swarm optimization (PSO) for predicting rock fragmentation; for comparison, ANFIS, SVM and MR models were also employed. Their results indicated that ANFIS-PSO possessed superior predictive ability to ANFIS, SVM and MR, since a very close agreement between the measured and predicted values was obtained.

The main objective of the present research is to investigate the ability of Gaussian process regression (GPR) to forecast rock fragmentation in the Shur river dam region, Iran.

2 Case study

In the present research work, the blast database is taken from the results of Hasanipanah et al. [6], compiled in the Shur river dam area, Iran. The Shur river dam is the tallest asphaltic concrete core dam in Iran; it is located in Kerman Province, near the Sarcheshmeh copper mine. The dam is 85.5 m high from the foundation and has a crest length of 450 m. To construct the dam, two mines in the surrounding area were extracted using the bench blasting method. In the blasting process, ANFO was used as the explosive material for charging the drilled holes, and the holes were then stemmed with fine gravel. As mentioned in the introduction, an appropriate particle size distribution after mine blasting is one of the most important aims in mining, and accurate prediction of rock fragmentation is necessary to optimize the overall mine/plant economics. To achieve these aims, a comprehensive study of rock fragmentation forecasting was carried out. A database including 72 datasets was prepared, in which the values of burden, spacing, stemming, charge weight per delay and powder factor were measured as the parameters affecting fragmentation. The quality of fragmentation was evaluated on the basis of the 80% passing size (D80) using an image processing method: a digital camera was used to capture images, which were analyzed with the Split Desktop software. As an example, a size distribution curve obtained with Split Desktop is shown in Fig. 1, and more details on the measured datasets are given in Table 1. To develop the predictive models, the datasets were divided into two sets: (1) training datasets, used to build the predictive models (58 datasets); and (2) testing datasets, used to test the built models (the remaining 14 datasets). Table 2 summarizes the basic statistics of the training and testing sets.

Fig. 1
figure 1

Sample of a size distribution curve obtained using Split Desktop software

Table 1 The parameters used in this research for D80 estimation
Table 2 Basic statistics of the training and testing sets in the present study
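For illustration, the following minimal sketch shows one way such a database could be loaded and split into 58 training and 14 testing cases. The file name and column labels (B, S, ST, W, PF, D80) are assumed here for convenience, since the actual database is only summarized in Tables 1 and 2.

```python
# Minimal sketch of the data preparation step (assumed file name and column
# labels; the original 72-record database is only described in Tables 1 and 2).
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)                      # fixed seed for reproducibility
data = pd.read_csv("shur_river_blasts.csv")          # hypothetical file with columns B, S, ST, W, PF, D80

X = data[["B", "S", "ST", "W", "PF"]].to_numpy()     # burden, spacing, stemming,
                                                     # charge per delay, powder factor
y = data["D80"].to_numpy()                           # 80% passing size

idx = rng.permutation(len(data))                     # shuffle the 72 records
train_idx, test_idx = idx[:58], idx[58:]             # 58 training / 14 testing cases
X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]
```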

3 Gaussian process regression (GPR)

In the present study, GPR is proposed for forecasting D80. A Gaussian process (GP) is a probabilistic non-parametric model in which the observations occur in a continuous domain [15]. It can be used to solve non-linear regression [16] and classification [17] problems. GPR directly places a GP prior probability distribution over a latent function, and the GP is fully specified by its mean function and covariance (kernel) function:

$$f({\mathbf{x}})\sim GP(m({\mathbf{x}}),~k({\mathbf{x}},{\mathbf{x^{\prime}}})).$$
(1)

The mean function is often assumed to be zero, since it only encodes the central tendency of the function [18]. The covariance function encodes information about the shape and structure that we expect the function to have. The relationship between the input and output variables is expressed as:

$$y=f({\mathbf{x}})+~\varepsilon .$$
(2)

The noise \(\varepsilon\) is assumed to be independent and Gaussian distributed with zero mean and variance \(\sigma _{n}^{2}\):

$$\varepsilon \sim \mathcal{N}~(0,~\sigma _{n}^{2}).$$
(3)

According to Eq. (2), the likelihood is given by

$$p({\varvec{y}}|{\varvec{f}})=\mathcal{N}({\varvec{y}}|{\varvec{f}},~\sigma _{n}^{2}I~),$$
(4)

where \({\varvec{y}}={[{y_1},~{y_2}, \ldots ,~{y_M}]^T},\) \({\varvec{f}}={[f({{\mathbf{x}}_1}),f({{\mathbf{x}}_2}), \ldots ,~f({{\mathbf{x}}_M})]^T}\) and \(I\) is the \(M \times M\) identity matrix.

According to the definition of a Gaussian process [19], the marginal distribution \(p({\varvec{f}})\) is a Gaussian with zero mean and covariance given by the Gram matrix \(K\), so that

$$p({\varvec{f}})=\mathcal{N}({\varvec{f}}{\text{|}}0,~K),$$
(5)

where \(K\) is the Gram matrix with elements \({K_{ij}}=k({x_i},~{x_j})\). Since both Eqs. (4) and (5) are Gaussian, the marginal distribution of \({\varvec{y}}\) is given by

$$p({\varvec{y}})=\mathop \smallint \nolimits^{} p({\varvec{y}}{\text{|}}{\varvec{f}})p({\varvec{f}}){\text{d}}{\varvec{f}}=\mathcal{N}({\varvec{y}}|0,{K_y}),$$
(6)

where \({K_y}=K+\sigma _{n}^{2}I.\)

To predict the target variable \({y_*}\) for a new input \({{\mathbf{x}}_*},\) the joint distribution over \({y_1},{y_2}, \ldots ,{y_M},{y_*}\) is given by

$$\left[ {\begin{array}{*{20}{c}} {\varvec{y}} \\ {{y_*}} \end{array}} \right]=\left[ {\begin{array}{*{20}{c}} {\varvec{f}} \\ {{f_*}} \end{array}} \right]+\left[ {\begin{array}{*{20}{c}} \varepsilon \\ {{\varepsilon _*}} \end{array}} \right]\sim \mathcal{N}\left( {0,~\left[ {\begin{array}{*{20}{c}} {{K_y}}&{{{\mathbf{k}}_*}} \\ {{\mathbf{k}}_{*}^{T}}&{{k_{**}}+\sigma _{n}^{2}} \end{array}} \right]} \right),$$
(7)

where \({f_*}=f({{\mathbf{x}}_*})\) is the latent function value at the new input \({{\mathbf{x}}_*}\) and \({\varepsilon _*}\) is the corresponding noise; \({{\mathbf{k}}_*}={\left[ {k({{\mathbf{x}}_*},{{\mathbf{x}}_1}), \ldots ,~k({{\mathbf{x}}_*},{{\mathbf{x}}_M})} \right]^T}\) and \({k_{**}}=k({{\mathbf{x}}_*},{{\mathbf{x}}_*}).\) Using the rules for conditioning Gaussians [20], the predictive distribution \(p({y_*}|{\varvec{y}})\) is Gaussian with mean and variance given by

$$m({x_*})={\mathbf{k}}_{*}^{T}K_{y}^{{ - 1}}{\varvec{y}},$$
(8)
$${\sigma ^2}({{\mathbf{x}}_*})={k_{**}} - {\mathbf{k}}_{*}^{T}K_{y}^{{ - 1}}{{\mathbf{k}}_*}+\sigma _{n}^{2}.$$
(9)
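Eqs. (8) and (9) follow from the standard rule for conditioning jointly Gaussian variables [20]: if \(\left[ {\begin{array}{*{20}{c}} {\varvec{a}} \\ {\varvec{b}} \end{array}} \right]\sim \mathcal{N}\left( {0,~\left[ {\begin{array}{*{20}{c}} A&C \\ {{C^T}}&B \end{array}} \right]} \right),\) then \({\varvec{a}}|{\varvec{b}}\sim \mathcal{N}(C{B^{ - 1}}{\varvec{b}},~A - C{B^{ - 1}}{C^T}).\) Applying this rule to Eq. (7) with \({\varvec{a}}={y_*},\) \({\varvec{b}}={\varvec{y}},\) \(A={k_{**}}+\sigma _{n}^{2},\) \(B={K_y}\) and \(C={\mathbf{k}}_{*}^{T}\) gives Eqs. (8) and (9) directly.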

The Cholesky decomposition [21] can be used to compute the inverse of the covariance matrix \({K_y}.\) The covariance (kernel) function is a critical component of Gaussian process regression: it defines the notion of similarity between data points on which the predictions rely [21]. In this research, the following covariance functions were used:

  • Squared Exponential Kernel

$$k({x_i},~{x_j}|\theta )=~\sigma _{f}^{2}\exp \left[ { - \frac{1}{2}\frac{{{{({x_i} - {x_j})}^T}({x_i} - {x_j})}}{{\sigma _{l}^{2}}}} \right].$$
(10)
  • Exponential Kernel

$$k({x_i},~{x_j}|\theta )=~\sigma _{f}^{2}\exp \left[ { - \frac{r}{{{\sigma _l}}}} \right].$$
(11)
  • Matern 3/2

$$k({x_i},~{x_j}|\theta )=~\sigma _{f}^{2}\left( {1+\frac{{\sqrt {3~} r}}{{{\sigma _l}}}} \right)\exp \left[ { - \frac{{\sqrt {3~} r}}{{{\sigma _l}}}} \right].$$
(12)
  • Matern 5/2

$$k({x_i},~{x_j}|\theta )=~\sigma _{f}^{2}\left( {1+\frac{{\sqrt {5~} r}}{{{\sigma _l}}}+\frac{{5{r^2}}}{{3\sigma _{l}^{2}}}} \right)\exp \left[ { - \frac{{\sqrt {5~} r}}{{{\sigma _l}}}} \right].$$
(13)
  • Rational Quadratic Kernel

$$k({x_i},~{x_j}{\text{|}}\theta )=~\sigma _{f}^{2}{\left( {1+\frac{{{r^2}}}{{2\alpha \sigma _{l}^{2}}}} \right)^{ - \alpha }},$$
(14)

where \(r=\sqrt {{{({x_i} - {x_j})}^T}({x_i} - {x_j})}\) is the Euclidean distance between \({x_i}\) and \({x_j},\) \({\sigma _l}\) is the characteristic length scale, \({\sigma _f}\) is the signal standard deviation and \(\alpha >0\) is the scale-mixture parameter of the rational quadratic kernel. The hyperparameters of the covariance function, \(\theta =({\sigma _l},~{\sigma _f}),\) can be estimated by maximizing the marginal likelihood with a gradient-based algorithm [21]. The performance of the Squared Exponential, Exponential, Matern 3/2, Matern 5/2 and Rational Quadratic kernels in predicting D80 is evaluated in the next section.
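As an illustrative sketch only (the hyperparameter values below are placeholders rather than the fitted ones, and this is not necessarily the implementation used in this study), Eqs. (8)–(10) can be realized with a Cholesky factorization of \({K_y}\) as follows:

```python
# Minimal GP regression sketch for Eqs. (8)-(10): squared exponential kernel,
# Cholesky-based solve of K_y, zero prior mean. Hyperparameters are placeholders.
import numpy as np

def sq_exp_kernel(A, B, sigma_f=1.0, sigma_l=1.0):
    """Squared exponential kernel of Eq. (10) between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)        # squared Euclidean distances
    return sigma_f**2 * np.exp(-0.5 * d2 / sigma_l**2)

def gp_predict(X_train, y_train, X_test, sigma_n=0.1, sigma_f=1.0, sigma_l=1.0):
    """Posterior mean (Eq. 8) and variance (Eq. 9) at the test inputs."""
    K_y = sq_exp_kernel(X_train, X_train, sigma_f, sigma_l) \
          + sigma_n**2 * np.eye(len(X_train))                  # K + sigma_n^2 I
    L = np.linalg.cholesky(K_y)                                # K_y = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))  # K_y^{-1} y
    K_star = sq_exp_kernel(X_test, X_train, sigma_f, sigma_l)  # k_* for each test point
    mean = K_star @ alpha                                      # Eq. (8)
    v = np.linalg.solve(L, K_star.T)                           # gives k_*^T K_y^{-1} k_*
    var = sigma_f**2 - np.sum(v**2, axis=0) + sigma_n**2       # Eq. (9), with k_** = sigma_f^2
    return mean, var
```

In practice, \({\sigma _l},\) \({\sigma _f}\) and \({\sigma _n}\) would be tuned by maximizing the marginal likelihood, as noted above.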

4 Results and discussion

In the present research work, GPR models with five covariance functions, i.e. the Squared Exponential, Exponential, Matern 3/2, Matern 5/2 and Rational Quadratic kernels, were employed for forecasting D80. The D80 values predicted by the GPR models are summarized in Table 3. To evaluate the performance of the proposed models, the following expressions were used:

Table 3 Comparison between the actual and predicted D80 values for the testing datasets
$${\text{RMSE}}=\sqrt {\frac{{\mathop \sum \nolimits_{{i=1}}^{N} {{({O_i} - {P_i})}^2}}}{N}} ,$$
(15)
$${\text{RRMSE}}=\frac{{{\text{RMSE}}}}{{{{\bar {O}}_i}}},$$
(16)
$${\text{MBE}}=\frac{1}{N}\mathop \sum \limits_{{i=1}}^{N} ({O_i} - {P_i}),$$
(17)
$${\text{MAPE}}=\frac{1}{N}\mathop \sum \limits_{{i=1}}^{N} \left| {\frac{{{O_i} - {P_i}}}{{{O_i}}}} \right|,$$
(18)
$${R^2}=\frac{{{{\left( {\mathop \sum \nolimits_{{i=1}}^{N} ({O_i} - {{\bar {O}}_i})({P_i} - {{\bar {P}}_i})} \right)}^2}}}{{\mathop \sum \nolimits_{{i=1}}^{N} {{({O_i} - {{\bar {O}}_i})}^2}\mathop \sum \nolimits_{{i=1}}^{N} {{({P_i} - {{\bar {P}}_i})}^2}}},$$
(19)

where \({O_i}\) is the actual value, \({P_i}\) is the predicted value, \({\bar {O}_i}\) is the mean of the actual values, \({\bar {P}_i}\) is the mean of the predicted values, \(i\) is the index of the data and \(N\) is the total number of data. The RMSE describes the average difference between the predicted and measured values. The mean bias error (MBE) shows whether the models overestimate or underestimate the measured values. The mean absolute percentage error (MAPE) describes the accuracy of the models as an error percentage. The coefficient of determination (R^2) describes the degree of association between the predicted and measured values. The performance of a model according to RRMSE is classified as follows [22]:

Excellent if: RRMSE < 10%

Good if: 10% < RRMSE < 20%

Fair if: 20% < RRMSE < 30%

Poor if: RRMSE > 30%
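For reference, a minimal sketch of Eqs. (15)–(19) and the RRMSE classification is given below; it assumes the observed and predicted D80 values are available as NumPy arrays.

```python
# Performance measures of Eqs. (15)-(19) for observed (O) and predicted (P) values.
import numpy as np

def evaluate(O, P):
    O, P = np.asarray(O, float), np.asarray(P, float)
    rmse = np.sqrt(np.mean((O - P) ** 2))                        # Eq. (15)
    rrmse = rmse / O.mean()                                      # Eq. (16)
    mbe = np.mean(O - P)                                         # Eq. (17)
    mape = np.mean(np.abs((O - P) / O))                          # Eq. (18)
    r2 = (np.sum((O - O.mean()) * (P - P.mean())) ** 2 /
          (np.sum((O - O.mean()) ** 2) * np.sum((P - P.mean()) ** 2)))  # Eq. (19)
    if rrmse < 0.10:
        grade = "Excellent"
    elif rrmse < 0.20:
        grade = "Good"
    elif rrmse < 0.30:
        grade = "Fair"
    else:
        grade = "Poor"
    return {"RMSE": rmse, "RRMSE": rrmse, "MBE": mbe, "MAPE": mape, "R2": r2, "grade": grade}
```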

The values of these measures for the predictive models are given in Table 4, and Figs. 2, 3, 4, 5 and 6 show the scatter plots of the predicted versus measured D80 for the testing datasets. As shown in Table 4 and Figs. 2, 3, 4, 5 and 6, the values predicted by the developed GPR models are in good agreement with the actual data, which demonstrates the reliability of GPR for forecasting D80. The R^2 values for the Squared Exponential, Exponential, Matern 3/2, Matern 5/2 and Rational Quadratic kernels were 0.948, 0.939, 0.942, 0.943 and 0.936, respectively, indicating that the GPR-Squared Exponential model performs better than the other models.

It should be mentioned that the datasets used in this research had already been used by Hasanipanah et al. [6], who employed SVM, ANFIS and ANFIS-PSO models for forecasting D80. They concluded that the ANFIS-PSO model, with an R^2 of 0.89 on the testing set, was a useful model for predicting D80 and was more accurate than the SVM and ANFIS models. As shown in Table 4, the performance achieved by Hasanipanah et al. [6] is improved to an R^2 of 0.948 in the present study; in other words, the GPR-Squared Exponential model presented here is a more acceptable model for D80 prediction than the SVM, ANFIS and the other GPR models, and the method employed could be a useful estimator of D80. A sensitivity analysis was also performed, as shown in Table 5. When the powder factor (PF) was omitted from the modeling process, the performance of the GPR-Squared Exponential model decreased significantly; in other words, PF was the most effective independent parameter in the GPR modeling carried out in this study.
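As one possible, illustrative reproduction path (this is not necessarily the software or the hyperparameter settings used in the present study), the five covariance functions can be fitted with scikit-learn, where the exponential kernel corresponds to a Matern kernel with \(\nu =0.5\); the variables X_train, y_train, X_test and y_test follow the data-split sketch in Sect. 2.

```python
# Hypothetical reproduction of the kernel comparison with scikit-learn;
# the original study does not specify its GPR implementation or hyperparameters.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern, RationalQuadratic

kernels = {
    "Squared Exponential": RBF(length_scale=1.0),
    "Exponential":         Matern(length_scale=1.0, nu=0.5),   # exponential kernel = Matern 1/2
    "Matern 3/2":          Matern(length_scale=1.0, nu=1.5),
    "Matern 5/2":          Matern(length_scale=1.0, nu=2.5),
    "Rational Quadratic":  RationalQuadratic(length_scale=1.0, alpha=1.0),
}

for name, kernel in kernels.items():
    gpr = GaussianProcessRegressor(kernel=kernel, alpha=1e-2,   # alpha plays the role of sigma_n^2
                                   normalize_y=True, random_state=0)
    gpr.fit(X_train, y_train)                                   # 58 training cases
    pred = gpr.predict(X_test)                                  # 14 testing cases
    r2 = np.corrcoef(y_test, pred)[0, 1] ** 2                   # R^2 as in Eq. (19)
    print(f"{name:20s}  R2 = {r2:.3f}")
```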

Table 4 The values of RMSE, RRMSE, MBE, MAPE and R^2 for the various GPR models using the testing datasets
Fig. 2
figure 2

The performance of the squared exponential model for forecasting the D80

Fig. 3
figure 3

The performance of the exponential kernel model for forecasting the D80

Fig. 4
figure 4

The performance of the matern 3/2 model for forecasting the D80

Fig. 5
figure 5

The performance of the matern 5/2 model for forecasting the D80

Fig. 6
figure 6

The performance of the rational quadratic model for forecasting the D80

Table 5 The results of sensitivity analysis for selecting the most effective parameter on D80
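As an illustration of a leave-one-input-out procedure of the kind summarized in Table 5 (not necessarily the exact computation performed in this study), each input can be dropped in turn and the GPR-Squared Exponential model refitted; variable names follow the earlier sketches.

```python
# Leave-one-input-out sensitivity sketch for the GPR-Squared Exponential model:
# each input is dropped in turn and the model is refitted (names follow the earlier sketches).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

features = ["B", "S", "ST", "W", "PF"]

def fit_r2(cols):
    """Fit GPR on the selected input columns and return R^2 (Eq. 19) on the test set."""
    gpr = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-2,
                                   normalize_y=True, random_state=0)
    gpr.fit(X_train[:, cols], y_train)
    pred = gpr.predict(X_test[:, cols])
    return np.corrcoef(y_test, pred)[0, 1] ** 2

baseline = fit_r2(list(range(len(features))))                  # full model with all five inputs
for i, name in enumerate(features):
    reduced = [j for j in range(len(features)) if j != i]      # drop one input
    r2 = fit_r2(reduced)
    print(f"without {name:3s}: R2 = {r2:.3f}  (drop from full model: {baseline - r2:.3f})")
```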

5 Conclusion

Precise estimation of rock fragmentation is necessary to optimize the overall mine/plant economics. In the present study, GPR models with various covariance functions were proposed for forecasting rock fragmentation: the Squared Exponential, Exponential, Matern 3/2, Matern 5/2 and Rational Quadratic kernels were employed and their performances compared. To develop the predictive models, 72 datasets were gathered from the Shur river dam region, Iran, with five independent (input) parameters, i.e. powder factor (PF), charge weight per delay (W), spacing (S), burden (B) and stemming (ST), and one dependent (output) parameter, namely D80. First, 58 datasets were used to construct the predictive models, and the remaining 14 datasets were used to test them. The performance of the GPR models was evaluated with several statistical measures, i.e. RMSE, RRMSE, MAPE, MBE and R^2. Based on the obtained results, the GPR-Squared Exponential model predicted D80 more accurately than the other GPR models. It is important to note that the datasets used in this study had already been utilized by Hasanipanah et al. [6], who employed SVM, ANFIS and ANFIS-PSO models for forecasting D80; hence the values predicted by the GPR models can be compared directly with their results. The GPR-Squared Exponential model, with R^2 = 0.948, performed better than the GPR-Exponential (R^2 = 0.939), GPR-Matern 3/2 (R^2 = 0.942), GPR-Matern 5/2 (R^2 = 0.943), GPR-Rational Quadratic (R^2 = 0.936), ANFIS (R^2 = 0.81), SVM (R^2 = 0.83) and ANFIS-PSO (R^2 = 0.89) models. In addition, a sensitivity analysis was carried out and, based on its results, PF was found to be the most effective parameter on D80 in the present study.