Introduction

Rocks, being solid materials, have the ability to undergo deformation when subjected to stress. Stress can be caused by various factors such as tectonic forces, gravitational forces, temperature changes, and human activities like mining and excavation. When stress is applied to a rock mass, it results in rock strain, which is the measurable change in the shape or size of the rock. Measuring and analyzing rock strain is essential for understanding rock behavior and stability. Various techniques are used to quantify and monitor strain in rocks, including strain gauges, extensometers, and geological surveys (Petr et al. 2016; Xu et al. 2020). These measurements provide valuable data for assessing the mechanical properties of rocks, predicting rock failure, and designing safe structures. Numerous studies on the identification of strain have been undertaken thus far as a result of the requirement for strain identification (Gundewar 2014; Laghaei et al. 2018; Isah et al. 2020; Li et al. 2020). In the study of rock, strain tensors are evaluated for two main reasons: the assessment of intricate stress fields and the estimation of complex strain fields through loading tests (Kikumoto and Togashi 2022). Monitoring of strain or displacement is frequently used to study the origin, development, and destruction of rock fractures. Rock failure could be forecast using the variation coefficient, based on the relation between the variation coefficient of rock surface displacement and the peak stress (Kong et al. 2023).

Rock strain prediction plays a crucial role in various fields such as geotechnical engineering, mining, and natural hazard assessment (Liu et al. 2013). Understanding the deformation behavior of rocks under different loading conditions is essential for ensuring the safety and stability of underground structures, optimizing mining operations, and assessing the potential risks associated with geological activities. Traditionally, rock strain prediction has relied on empirical and analytical models that require extensive data collection and manual analysis (Vergara et al. 2020). Conducting experiments in a lab can be expensive, requiring resources such as equipment, materials, and personnel. In some scientific experiments, obtaining representative samples or cores can be challenging, especially when dealing with large or inaccessible areas. Certain experiments may require long observation periods or involve time-sensitive processes. Analyzing the large volumes of data generated in the lab can also be a laborious and time-consuming task. However, advancements in machine learning techniques have opened up new possibilities for more accurate and efficient prediction models (Liu et al. 2017). Numerous academic institutions, research organizations, and industry companies have ongoing research projects and publications on this topic (Liu et al. 2017; Cioffi et al. 2020). Many researchers in the fields of geology, geotechnical engineering, and computational science are exploring the use of machine learning for rock strain prediction, among other applications (Tariq et al. 2017; Cui et al. 2019; Kumar and Samui 2019; Sun et al. 2020; Asteris et al. 2021; Jahed Armaghani et al. 2021; Yu et al. 2021; Koopialipoor et al. 2022; Kumar et al. 2022c; Abdi et al. 2023a).

It is important to note that while machine learning can be a powerful tool, it is not a one-size-fits-all solution. The successful application of machine learning in the lab requires careful consideration of the specific problem, the availability and quality of data, appropriate algorithm selection, and continuous validation and refinement of models to ensure accurate and reliable results (Indraratna et al. 2023; Medawela et al. 2023). Several researchers have utilized machine learning algorithms to solve complex and non-linear problems in different fields, especially in engineering and science (Al-Jeznawi et al. 2023; Tran et al. 2024; Jitchaijaroen et al. 2024). The estimation of soil resilient modulus using several machine learning algorithms has been proposed by various researchers in recent years (Sadik 2023; Sadik et al. 2024). Compared with single models, ensemble methods require more computational resources, can be more complex to implement, and may have higher training times; in addition, their effectiveness depends on the diversity and quality of the individual models. Nevertheless, when appropriately applied, ensemble learning algorithms have demonstrated significant advantages in various domains and have become a valuable tool in machine learning (Dietterich 2000; Sagi and Rokach 2018; Kunapuli 2023). Momeni et al. (2015) predicted the uniaxial compressive strength of rock samples using hybrid artificial neural networks augmented by particle swarm optimization techniques. Mohamad et al. (2015) predicted the unconfined compressive strength (UCS) of soft rocks using an ANN hybridized with a swarm-based optimization algorithm. Mohamad et al. (2018) proposed a back-propagation (BP) artificial neural network (ANN) to estimate the uniaxial compressive strength (UCS) of rocks quite accurately, with R values equal to 0.988 for the training phase and 0.999 for the testing phase. Nazir et al. (2013a) proposed a new, reliable correlation between UCS and Brazilian tensile strength (BTS) for limestone, achieving a strong determination coefficient of 0.9. Several machine learning techniques have been utilized to predict the UCS of rock and the elastic modulus of weak rock mass samples (Nazir et al. 2013b; Momeni et al. 2015; Armaghani et al. 2020; Abdi et al. 2023b; Dowlatshahi et al. 2023).

In this study, ensemble learning algorithms are used to predict rock strain. "XGBoost: A Scalable Tree Boosting System" (Chen and Guestrin 2016) is the original paper introducing XGB and provides an overview of the algorithm as well as its advantages over other gradient boosting implementations. Its main strengths are its speed and scalability, as it can effectively handle large datasets with a high number of features. XGB is highly robust to noise and outliers in the data, which makes it suitable for real-world applications where data quality may be suboptimal. It also provides a feature importance metric that allows users to identify the most important features in their dataset. Overall, XGB is a powerful machine-learning library that offers significant advantages over traditional single-model approaches, including improved accuracy, speed, scalability, and interpretability (Chen and Guestrin 2016; Ma et al. 2022; Su et al. 2022). Extra Trees Regression (ETR) is a powerful ensemble learning method that can be used to improve the accuracy of regression tasks. The algorithm is based on decision trees and uses multiple random subsets of the data and features to construct the trees. There have been many research papers on Extra Trees Regression, and the results have shown that the algorithm can achieve high accuracy while requiring less training time than other methods (Geurts et al. 2006; Ahmad et al. 2018; Kumar et al. 2023c). KNN is a widely used and well-studied machine learning algorithm that has been applied to various domains. The literature on KNN includes both theoretical and practical aspects, and there have been many developments and improvements to the algorithm over the years (Bhatia and Vandana 2010; Stevens et al. 2011). These algorithms also have limitations. Extra Trees can suffer from high variance, especially when the number of trees in the forest is large, which might lead to overfitting, especially on smaller datasets. While decision trees are generally interpretable, the ensemble nature of Extra Trees makes it harder to interpret the model's decision-making process compared to a single decision tree. It also requires tuning of hyperparameters such as the number of trees, maximum depth, and minimum samples per leaf, which can be time-consuming and computationally expensive. XGBoost has several hyperparameters that need to be tuned, such as the learning rate, maximum depth, and minimum child weight. The performance of XGBoost can be sensitive to these parameters, and finding the optimal set might require extensive computational resources. Similar to other gradient-boosting algorithms, XGBoost can be prone to overfitting, especially when the dataset is noisy or when the number of trees (iterations) is too large. Prediction with KNN involves calculating distances between the query instance and all training instances, which can be computationally expensive, especially with large datasets or high-dimensional data. KNN also typically requires storing the entire training dataset in memory, which can be memory-intensive for large datasets and limits its scalability.

Predicting rock strain using machine learning is a promising area of research that has the potential to improve our understanding of rock deformation behavior and aid in the exploration and exploitation of geological resources. In this study, ensemble models are applied to estimate the strain in a rock sample from data collected in a laboratory experiment. The data consist of pairs of input and output values, where the input data are used to make a prediction and the output data represent the correct or desired output for that input. The input features are the height of the strain gauge, the angle of the strain gauge, and the stress. The models are developed, trained, and tested on these data. Overall, model result analysis is a crucial step in the machine learning process, as it helps to evaluate the performance of a model. The results of all models are compared in order to select the most robust model using the performance parameters, the actual-versus-predicted curves, rank analysis, sensitivity analysis, the error matrix, and the OBJ criteria.

Research significance

Many regions are prone to natural hazards such as earthquakes, landslides, and rock falls. Climate change is causing significant shifts in environmental conditions, which can impact rock behavior and stability. By studying rock strain, researchers can gain insights into the behavior of rocks under high-stress conditions, which contributes to the understanding and prediction of earthquakes, landslides, and other geological hazards. Rock strain also affects the stability and performance of civil infrastructure, such as dams, bridges, and underground structures. This knowledge is crucial for developing strategies to mitigate the risks associated with such events. Predicting rock strain using ensemble models can aid in assessing the potential for these hazards and developing effective mitigation strategies. This research can contribute to enhancing the resilience of communities globally, minimizing the loss of lives and infrastructure, and improving disaster preparedness and response.

Details of data

In this study, a cylindrical granite rock sample with a fixed diameter of 40 mm and a height of 108 mm was subjected to uniaxial compression loading to measure the load and deformation. Data were acquired in the laboratory during experimental testing to predict strain in the granite rock material. Measurements of the load and deformation along the longitudinal axis and around the circumference of the rock sample were made using a uniaxial compression load cell and several strain gauge transducers. Figure 1 depicts the specifics of the strain gauge setup. The outer perimeter was equipped with a total of 48 electronic strain gauges in both the lateral and longitudinal directions. The load and deformation readings were gathered in a data acquisition system. The collected dataset from the experiments was utilized to calculate stress and strain based on the rock's dimensions. The average Poisson's ratio of this granite rock is 0.27. To develop the machine learning models, a total of 3000 data records of the granite rock sample were acquired, comprising the height and angle of the strain gauge locations, the applied stress, and the strain in the lateral and longitudinal directions. Measuring strain at different heights within the rock mass offers insights into the vertical stress distribution and identifies potential zones of weakness or concentration. Rocks often exhibit anisotropic behavior, signifying variation in mechanical properties with direction. Orienting strain gauges at specific angles aids in understanding stress distribution along different axes, characterizing rock behavior under varying loading conditions. Therefore, to train and test the machine learning models, the height, angle, and stress of the strain gauge are considered input variables, whereas the lateral and longitudinal strains are considered output variables. Soft computing techniques were used to predict the strain in the rock sample using these input and output data. The sample datasets are shown in Table 1.

Fig. 1

Illustration of loading and strain gauges under uniaxial load

Table 1 Sample dataset

After the input and output selection, the whole dataset is normalized to the range [0, 1] to standardize the formatting of the dataset. The main aim of normalization is to reduce the dimensional effect of the different variables on the output. The mathematical expression in Eq. (1) has been utilized to normalize the input and output variables (Singh and Singh 2020). Out of the 3000 data records, 2100 were utilized for model training (TR) (about 70% of the total), while 900 were used for model testing (TS) (30%). To ensure proper learning and validation of the models, this widely adopted 70/30 training-testing split from the literature was selected.

$$N=\frac{N_{a}-N_{min}}{N_{max}-N_{min}}$$
(1)

where N is the normalized value, \({N}_{a}\) is the actual value, \({N}_{min}\) is the minimum value, and \({N}_{max}\) is the maximum value.
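As an illustration, the normalization of Eq. (1) and the 70/30 split can be reproduced with a few lines of Python; the arrays below are random placeholders standing in for the laboratory measurements, not the study's actual data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

def min_max_normalize(column):
    """Scale a 1-D array to [0, 1] following Eq. (1): (N_a - N_min) / (N_max - N_min)."""
    column = np.asarray(column, dtype=float)
    return (column - column.min()) / (column.max() - column.min())

# Placeholder arrays standing in for the laboratory data:
# X holds [height, angle, stress]; y holds the measured strain.
rng = np.random.default_rng(0)
X = rng.random((3000, 3))
y = rng.random(3000)

# Normalize each input column and the output independently, then reproduce the 70/30 split.
X_norm = np.apply_along_axis(min_max_normalize, 0, X)
y_norm = min_max_normalize(y)
X_train, X_test, y_train, y_test = train_test_split(
    X_norm, y_norm, test_size=0.30, random_state=42)
print(X_train.shape, X_test.shape)  # (2100, 3) (900, 3)
```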

Statistical description of the dataset

Statistical analysis is a method of analyzing numerical data using statistical techniques to draw conclusions from the data. Compared to many previous studies, a more comprehensive range of the database has been considered in this study. Table 2 summarizes the dataset of 3000 data points with five variables. The provided statistics offer insights into the central tendency (mean, median), spread (standard deviation), and distribution (quartiles) of the data for each variable.

Table 2 Statistical view of the dataset

Performance indices

Several performance indices, including the coefficient of determination (\({R}^{2}\)), variance account factor (VAF), performance index (PI), mean absolute error (MAE) (Chai and Draxler 2014), and root mean square error (RMSE), were evaluated to check the performance of the proposed models (Kumar et al. 2022a, 2023a, b). The mathematical expressions of these performance indices, along with their ideal values, are presented in Table 3.

Table 3 Performance evaluation parameter with the ideal value

where P indicates the predicted value of the target variable, A indicates the experimental value of the target variable, and n indicates the total number of datasets used in this study.
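For reference, a minimal sketch of how a few of these indices could be computed with NumPy and scikit-learn is given below; the VAF expression follows its common definition, and the input arrays are dummy values rather than the study's data.

```python
import numpy as np
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

def performance_indices(actual, predicted):
    """Compute a subset of the indices in Table 3 for actual (A) and predicted (P) values."""
    A = np.asarray(actual, dtype=float)
    P = np.asarray(predicted, dtype=float)
    return {
        "R2": r2_score(A, P),
        "RMSE": np.sqrt(mean_squared_error(A, P)),
        "MAE": mean_absolute_error(A, P),
        # Variance account factor, expressed as a percentage.
        "VAF": (1.0 - np.var(A - P) / np.var(A)) * 100.0,
    }

# Example with dummy strain values (not the study's data).
print(performance_indices([0.10, 0.42, 0.81], [0.12, 0.40, 0.79]))
```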

Machine learning algorithms

Extreme gradient boosting (XGB)

Extreme Gradient Boosting (XGBoost) is a powerful and widely used machine learning algorithm for supervised learning problems, particularly in the fields of predictive modeling and data analysis. XGBoost is based on the concept of boosting, where weak learners are combined to create a stronger, more accurate model. It is an extension of the gradient boosting method, which iteratively adds decision trees to a model to improve its performance (Chen and Guestrin 2016; Bentéjac et al. 2021). The complexity of the trees is regulated by a regularized variation of the loss function, as shown by Eqs. (2) and (3), where T represents the number of leaves in the tree and w represents the output scores of the leaves. XGBoost improves upon traditional gradient boosting by incorporating regularization to prevent overfitting and by optimizing the objective function using second-order gradients, which leads to faster convergence and better accuracy.

$${\mathrm{L}}_{xgb}=\sum_{i=1}^{N}L\left({y}_{i},F\left({x}_{i}\right)\right)+{\sum }_{m=1}^{M}\Omega \left({h}_{m}\right)$$
(2)
$$\Omega\left(h\right)=\gamma T+\frac12\lambda\left\|w\right\|^2$$
(3)

The split criterion of decision trees in XGBoost can incorporate a user-defined loss function, which allows the algorithm to optimize for a specific metric that is relevant to the problem at hand. This can be used as a pre-pruning strategy by setting a threshold on the gain in the loss function that must be achieved by a potential split for it to be considered. If the gain is below the threshold, the split is not made, and the tree is pruned.

The loss function used in XGBoost is typically a sum of two terms: a term that measures the goodness of fit of the model to the training data and a regularization term that penalizes complex models to prevent overfitting. The regularization term includes a parameter γ, which controls the complexity of the tree. Trees built with higher values of γ are simpler, as they are penalized more heavily for having many splits or a large number of leaf nodes.

In XGBoost, the optimal value of γ can be selected using cross-validation or other methods, and this can help prevent overfitting and improve the generalization performance of the model. By combining regularization with a pre-pruning strategy based on the loss function, XGBoost can produce accurate and interpretable models that are robust to noise and outliers in the data (Kumar et al. 2022b).
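As a minimal sketch (not the authors' exact configuration), an XGBoost regressor whose objective corresponds to Eqs. (2) and (3) can be set up with the xgboost scikit-learn API as follows; the hyperparameter values and the synthetic data are purely illustrative, and the values actually tuned in this study are reported in Table 4.

```python
import numpy as np
from xgboost import XGBRegressor

# Synthetic stand-in for the normalized [height, angle, stress] inputs and strain output.
rng = np.random.default_rng(0)
X, y = rng.random((3000, 3)), rng.random(3000)

model = XGBRegressor(
    n_estimators=500,              # number of boosted trees (illustrative value)
    learning_rate=0.1,             # shrinkage applied to each tree's contribution
    max_depth=6,                   # limits tree complexity (number of leaves T in Eq. (3))
    gamma=0.0,                     # minimum loss reduction required to split (gamma in Eq. (3))
    reg_lambda=1.0,                # L2 penalty on leaf weights w (lambda in Eq. (3))
    objective="reg:squarederror",  # squared-error loss L in Eq. (2)
)
model.fit(X, y)
print(model.predict(X[:5]))
```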

Extra Tree Regressor (ETR)

The Extra Trees Regression (ETR) technique is a tree-based ensemble machine learning approach designed to mitigate the overfitting issue inherent in the original random forest algorithm (Geurts et al. 2006). The mathematical formulation used in this approach is the same as that of the decision tree regression (DTR) algorithm, as described by Ahmad et al. (2018). Instead of utilizing the bagging approach to build a training subset for each tree, all datasets are used in this strategy to train all trees in the ensemble. By merging a certain number of estimators, ensemble algorithms aim to lower the model's variance relative to the variance of a single tree. The result is a model with greater stability and generalization potential for the geotechnical practitioner. Such models predict a final output that is simply the mean of the outputs of all individual trees in the ensemble. A distinguishing feature of this algorithm is the degree of randomness incorporated into the model, which is introduced in two ways: (i) a random subset of features is selected from the whole collection of features to be used in producing a split, and (ii) the thresholds of the selected features are chosen at random. These two random choices together determine the split adopted at each node. One benefit of utilizing such ensembled models is that, in most cases, no considerable hyperparameter adjustment is required to create a good-quality model, even though such models cannot be regarded as single DTR models. However, the most crucial factor that should not be overlooked while creating ETR models is the selection of the number of estimators to be utilized.
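A minimal ETR sketch using scikit-learn's ExtraTreesRegressor is shown below; the settings and synthetic data are illustrative only, and the optimum values adopted in this study are given in Table 5.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

# Synthetic stand-in for the normalized [height, angle, stress] inputs and strain output.
rng = np.random.default_rng(0)
X, y = rng.random((3000, 3)), rng.random(3000)

etr = ExtraTreesRegressor(
    n_estimators=100,      # number of trees whose outputs are averaged
    max_features="sqrt",   # random subset of features considered at each split
    bootstrap=False,       # Extra Trees uses the whole dataset for every tree
    random_state=42,
    n_jobs=-1,             # train trees in parallel
)
etr.fit(X, y)
print(etr.predict(X[:5]))
```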

K-nearest neighbor

The k-nearest neighbors (k-NN) algorithm was introduced by Fix and Hodges (Fix and Hodges Jr 1952) as a non-parametric classification technique and was later extended to regression by Altman (Altman 1992). It belongs to the family of instance-based learning methods, where the algorithm doesn't explicitly learn a model but instead uses the training instances themselves to make predictions on new instances. The outcome of k-NN regression is the object's property value. As k-NN predictions are based on the intuitive premise that neighboring objects may be similar, it makes sense to discriminate between the K nearest neighbors when making predictions. As a lazy learning method, neighbors-based regression simply stores instances from the training data rather than attempting to build a generic internal model (Roh et al. 2020). The choice of k can have a significant impact on the performance of the k-NN algorithm. A small value of k can lead to overfitting, where the algorithm captures the noise in the data and produces overly complex decision boundaries. On the other hand, a large value of k can lead to underfitting, where the algorithm fails to capture the local structure of the data and produces overly simplistic decision boundaries. One of the advantages of the k-NN algorithm is that it doesn't require a training phase to determine the model parameters, unlike many other machine learning models. Instead, the training dataset serves as a reference against which the k nearest neighbors are identified when making predictions. The straightforward mathematical application of KNN regression is founded on calculating the average of the output strain in the lateral and longitudinal directions over the K nearest neighbors. A straightforward KNN regression algorithm generally employs one of three distance functions: (1) the Euclidean distance (\({E}_{d}\)), (2) the Manhattan distance (\({M}_{d}\)), and (3) the Minkowski distance (\({M}_{id}\)), whose mathematical expressions are presented in Eqs. (4), (5), and (6), respectively.

$${E}_{d}=\sqrt{{\sum }_{i=1}^{K}{\left({x}_{i}-{y}_{i}\right)}^{2}}$$
(4)
$${M}_{d}={\sum }_{i=1}^{K}\left|{x}_{i}-{y}_{i}\right|$$
(5)
$${M}_{id}={\left\{{\sum }_{i=1}^{K}{\left(\left|{x}_{i}-{y}_{i}\right|\right)}^{q}\right\}}^{1/q}$$
(6)
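For illustration, a k-NN regressor with K = 1 (the value adopted in this study) can be configured with scikit-learn as sketched below; the Minkowski metric with p = 2 corresponds to the Euclidean distance of Eq. (4) and p = 1 to the Manhattan distance of Eq. (5). The data are random placeholders.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for the normalized [height, angle, stress] inputs and strain output.
rng = np.random.default_rng(0)
X, y = rng.random((3000, 3)), rng.random(3000)

# K = 1 as adopted in this study; p = 2 makes the Minkowski metric of Eq. (6) Euclidean.
knn = KNeighborsRegressor(n_neighbors=1, metric="minkowski", p=2)
knn.fit(X, y)
print(knn.predict(X[:5]))
```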

Result and discussion

Modeling parameters of proposed models

In this study, three advanced machine learning models, i.e., ETR, XGB, and KNN, were proposed to predict the strain in the lateral and longitudinal directions. It is important to note that the optimum models were developed by following a trial-and-error approach for selecting the hyperparameters of the proposed models in Python using the scikit-optimize package. In Table 4, the optimum hyperparameters of the XGBoost model, such as the number of estimators, number of iterations, learning rate, and number of boosting rounds, are listed along with their corresponding values.

Table 4 Result of optimal hyperparameters value for XGB model

The same dataset was used for the construction of the ETR and KNN models, and the optimum hyperparameters with their corresponding values for the ETR model are mentioned in Table 5. These parameters include the number of estimators, maximum depth, maximum number of features, minimum number of leaf nodes, minimum number of samples to split a node, and the number of jobs to run in parallel. In the application of the KNN model, the number of neighbors was set to one (i.e., K = 1) in this study, and the other hyperparameters were kept at their default values.

Table 5 Result of optimal hyperparameters value for ETR model
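The exact search procedure is not reported here, but one way such a trial-and-error hyperparameter search could be run with scikit-optimize is sketched below using BayesSearchCV; the search space, iteration count, and data are assumptions for illustration only.

```python
import numpy as np
from skopt import BayesSearchCV          # scikit-optimize
from xgboost import XGBRegressor

# Synthetic stand-in for the normalized training data.
rng = np.random.default_rng(0)
X, y = rng.random((3000, 3)), rng.random(3000)

# Hypothetical search space; the ranges explored in the study are not reported here.
search = BayesSearchCV(
    estimator=XGBRegressor(objective="reg:squarederror"),
    search_spaces={
        "n_estimators": (100, 1000),
        "max_depth": (2, 10),
        "learning_rate": (0.01, 0.3, "log-uniform"),
    },
    n_iter=25,          # number of parameter settings sampled
    cv=5,               # 5-fold cross-validation on the training set
    random_state=42,
)
search.fit(X, y)
print(search.best_params_)
```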

Performance evaluation

To evaluate and compare the performance of the proposed models, ten performance indices, namely R-squared (R2), adjusted R-squared (Adj R2), weighted mean absolute percentage error (WMAPE), Nash–Sutcliffe efficiency (NS), root mean square error (RMSE), variance accounted for (VAF), performance index (PI), RMSE-observations standard deviation ratio (RSR), Willmott's index of agreement (WI), and mean absolute error (MAE), were determined for both the lateral and longitudinal dimensions and are presented in Tables 6 and 7, respectively. The results presented in Tables 6 and 7 quantify the proposed models' performance for both the lateral (x) and longitudinal (y) dimensions. For an ideal model, the performance indices should equal their respective ideal values mentioned in Table 3. Generally, the models that achieve higher accuracy parameters and lower error parameter values are considered the best-performing models. The proposed XGB model attained the highest coefficient of determination and the lowest RMSE values \({(R}_{\mathrm{TR}}^{2}=0.991, RMS{E}_{TR}=0.042,{ R}_{\mathrm{TS}}^{2}=0.963, RMS{E}_{TS}=0.077)\), followed by the KNN \({(R}_{\mathrm{TR}}^{2}=0.974, RMS{E}_{TR}=0.065,{ R}_{\mathrm{TS}}^{2}=0.961, RMS{E}_{TS}=0.082)\) and ETR \({(R}_{\mathrm{TR}}^{2}=0.888, RMS{E}_{TR}=0.142,{ R}_{\mathrm{TS}}^{2}=0.878, RMS{E}_{TS}=0.150)\) models, during both the training and testing phases in the lateral (x) dimension. Similarly, the XGB model outperformed the KNN and ETR models in the longitudinal (y) dimension. Finally, from the performance indices presented for the proposed machine learning models in both the lateral (x) and longitudinal (y) dimensions in Tables 6 and 7, it can be concluded that the XGB model outperformed the KNN and ETR models.

Table 6 Performance indices in lateral (x) dimension
Table 7 Performance indices in longitudinal (y) dimension

Actual and predicted curve

The "actual vs predicted curve" in regression analysis pertains to a visual depiction that contrasts the observed or factual values of the dependent variable against the predicted values produced by a regression model. Visual assessment is a technique utilized to evaluate the degree of alignment between the predictions of a model and the actual data points. The Actual vs Predicted Curve displays a comparison between the observed values of the dependent variable in both the training and testing sets and the corresponding predicted values generated by the regression model. Each data depicted on the graph corresponds to an individual instance of observation. The comparison between the actual and predicted curve holds significance in regression analysis owing to its ability to offer a graphical evaluation of the model's efficacy. This enables one to comprehend the degree of precision with which the model captures the fundamental association between the independent and dependent variables. Through a visual examination of the curve, it is possible to recognize discernible patterns, trends, or outliers, which may offer valuable insights into the efficacy and constraints of the regression model.

Moreover, the curve facilitates the assessment of the model's predictive precision. When the predicted values exhibit a high degree of similarity with the actual values, it suggests that the model is performing effectively. However, major outliers or patterns in the curve may point to places where the model should be strengthened or where more research is required. In general, the comparison between the actual and predicted curve is a valuable visual aid for evaluating a model, offering a rapid and intuitive means of gauging the efficacy and prognostic potential of a regression model.
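As a generic illustration (not a reproduction of Figs. 2 and 3), such a plot can be produced with matplotlib as sketched below, using placeholder predictions:

```python
import matplotlib.pyplot as plt
import numpy as np

# Placeholder actual and predicted strain values standing in for a model's testing-phase output.
rng = np.random.default_rng(0)
y_actual = rng.random(900)
y_pred = y_actual + rng.normal(0, 0.05, 900)

plt.scatter(y_actual, y_pred, s=8, alpha=0.5)
lims = [min(y_actual.min(), y_pred.min()), max(y_actual.max(), y_pred.max())]
plt.plot(lims, lims, "k--", label="1:1 line")   # perfect-agreement diagonal
plt.xlabel("Actual strain (normalized)")
plt.ylabel("Predicted strain (normalized)")
plt.legend()
plt.show()
```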

In Figs. 2 and 3 (c and d), for both datasets and in both the training and testing phases, the XGB model exhibits a curve that closely follows the diagonal line, indicating a strong agreement between the predicted and actual values. The performance of the KNN model on the Y-direction dataset is on par with the XGB model, which is the best-performing model overall. The curve for the ETR model in the Y direction for both the training and testing phases exhibits deviations from the diagonal line, suggesting discrepancies or errors in the model's predictions. In the X direction, too, the ETR model's scatter plot does not indicate satisfactory performance. Thus, the performance of XGB and KNN is satisfactory in both cases, of which XGB is the most robust model, while the performance of ETR is not satisfactory on either dataset.

Fig. 2

Actual vs. predicted; Lateral dimension a ETR (Training) b ETR (Testing) c XGB (Training) d XGB (Testing) e KNN (Training) f KNN (Testing)

Fig. 3

Actual vs. predicted; longitudinal dimension a ETR (Training) b ETR (Testing) c XGB (Training) d XGB (Testing) e KNN (Training) f KNN (Testing)

Rank analysis

The evaluation of soft computing models' performance is facilitated by score or rank analysis. Through the analysis of the scores produced by the model, it is possible to evaluate the model's capacity to effectively classify or forecast occurrences. Soft computing models have the potential to offer valuable insights into the level of confidence or uncertainty associated with their predictions, as inferred from the generated scores. Higher scores are indicative of a more robust correlation or probability of inclusion within a specific category, while reduced scores may suggest equivocation or indeterminacy. The analysis of scores is a useful tool for comprehending the degree of assurance exhibited by a model. This information is pertinent in various contexts, such as decision-making, risk evaluation, and the detection of cases that necessitate additional examination. For each index, the technique assigns the highest rank to the model with the best value and the lowest rank to the model with the poorest value, for both the training and testing results. The individual ranks are then summed, and the ultimate score for each model is determined by aggregating the ranks obtained from both the training and testing phases (Fig. 4).
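A minimal sketch of this rank-then-sum procedure, using hypothetical index values rather than the study's results, is shown below:

```python
import pandas as pd

# Hypothetical index values for three models; flags mark whether a higher value is better.
scores = pd.DataFrame(
    {"R2": [0.99, 0.96, 0.88], "RMSE": [0.04, 0.08, 0.15]},
    index=["XGB", "KNN", "ETR"],
)
higher_better = {"R2": True, "RMSE": False}

# Best value in each column receives the highest rank (here 3), the poorest receives 1.
ranks = pd.DataFrame(
    {col: scores[col].rank(ascending=higher_better[col]) for col in scores.columns}
)
ranks["total_score"] = ranks.sum(axis=1)  # sum of individual ranks = final score
print(ranks)
```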

Fig. 4
figure 4

Rank analysis for lateral (x) and longitudinal (Y) dimensions in both training and testing

The results of the rank analysis are presented in Table 8 and Fig. 4 for the X-direction dataset, where TR stands for training and TS stands for testing. The XGB model is the top scorer in both the training phase (30) and the testing phase (30), followed by KNN (20 in both training and testing). In the overall score as well, XGB scored highest (60), followed by KNN and ETR with scores of 40 and 20, respectively. Thus, for the X-direction dataset, XGB is the first-ranked model, KNN is second-ranked, and ETR is the lowest, third-ranked. The score analysis for the Y-direction dataset follows the same pattern as that of the X-direction. Overall, XGB is the best-performing model, followed by KNN and ETR, respectively.

Table 8 Result of score analysis for lateral (x) and longitudinal (Y) dimension

Sensitivity Analysis

Sensitivity analysis is a technique used to assess the impact of variations or changes in the input variables (i.e., height, angle, and stress) of a model on the output (i.e., lateral and longitudinal strain) of that model. It helps in understanding how sensitive the model is to different factors and allows for the identification of critical variables that have the most influence on the results. The purpose of sensitivity analysis is to gain insights into the relationships between the inputs and outputs of a system, evaluate the robustness of a model, and assess the risks and uncertainties associated with decision-making. By systematically varying the input variables within a defined range and observing the corresponding changes in the output, sensitivity analysis provides a quantitative measure of the impact of each input variable. In this study, the cosine amplitude method (CAM) has been used to determine the strength of the relation between the input and output variables. The impact strength is calculated using the following mathematical expression:

$$R_{ij}=\frac{\sum_{k=1}^{n}x_{ik}x_{jk}}{\sqrt{\sum_{k=1}^{n}x_{ik}^{2}\sum_{k=1}^{n}x_{jk}^{2}}}\times 100$$
(7)

where \({R}_{ij}\) denotes the impact strength, i.e., the strength of the correlation between the input data pair \({x}_{i}\) and \({x}_{j}\). A value of \({R}_{ij}\) closer to 100 means a greater impact of the corresponding input variable on the output strain in both the lateral (x) and longitudinal (y) directions. The impact of the individual input parameters on the output parameters for the actual dataset and all proposed models is presented in Figs. 5 and 6 for the lateral and longitudinal directions, respectively. From the presented results of the sensitivity analysis, it can be concluded that stress has a greater impact on predicting the lateral and longitudinal strain than angle and height; in this study, stress is the most influential parameter, followed by angle and height.
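For illustration, Eq. (7) can be evaluated directly with NumPy as sketched below; the arrays are random placeholders rather than the study's measurements.

```python
import numpy as np

def cam_strength(x_i, x_j):
    """Cosine amplitude method (Eq. (7)): strength of relation between two series, in percent."""
    x_i, x_j = np.asarray(x_i, dtype=float), np.asarray(x_j, dtype=float)
    return 100.0 * np.sum(x_i * x_j) / np.sqrt(np.sum(x_i**2) * np.sum(x_j**2))

# Hypothetical normalized columns standing in for the study's inputs and the lateral strain.
rng = np.random.default_rng(0)
height, angle, stress = rng.random(3000), rng.random(3000), rng.random(3000)
lateral_strain = 0.8 * stress + 0.1 * rng.random(3000)

for name, col in [("height", height), ("angle", angle), ("stress", stress)]:
    print(name, round(cam_strength(col, lateral_strain), 2))
```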

Fig. 5

Impact of input and output data with the predicted model (Lateral)

Fig. 6

Impact of input and output data with the predicted model (Longitudinal)

Error matrix

An error matrix is a table that summarizes the performance of a predictive model. For classification, it provides a detailed breakdown of the model's predictions by showing the number of correct and incorrect predictions for each class; for regression, it provides a detailed and intuitive representation of the model's errors, enabling deeper analysis, error identification, and informed decision-making. It is a valuable tool in evaluating, improving, and fine-tuning machine learning models.

The error matrices for both datasets are provided in Figs. 7 and 8. It can be observed that the XGB model in the training phase has an error of almost zero for the datasets of both the X and Y directions. In the testing phase as well, the errors exhibited by the XGB model are the lowest compared to the other two models. Conversely, ETR exhibits errors as large as 10% for the MAE criterion in both training and testing, and its errors for R2 and RMSE are 12% and 15%, respectively, in the testing phase. Thus, the matrix provides error recognition and comparison between the models, identifying XGB and ETR as the best-performing and the poorest models, respectively (Fig. 9).

Fig. 7

Error matrix of lateral dimension

Fig. 8

Error matrix of longitudinal dimension

Fig. 9

Illustration of OBJ value for lateral (x) and longitudinal (y) dimension

OBJ criteria

To determine the degree to which the generated solution approaches the genuine optimal solution, Hossein et al. (2013) suggested a set of objective (OBJ) criteria. The values of the objective function are used as a yardstick to determine the solution's quality. The mathematical equation used to characterize the OBJ value is presented as follows:

$$OBJ=\left(\frac{N_{TR}-N_{TS}}{N_{Total}}\right)\times \left(\frac{MAE_{TR}}{R_{TR}^{2}}\right)+\left(\frac{2N_{TS}}{N_{Total}}\right)\times \left(\frac{MAE_{TS}}{R_{TS}^{2}}\right)$$
(8)

where \({N}_{TR}\) and \({N}_{TS}\) are the numbers of training and testing data, \({R}_{TR}^{2}\) and \({R}_{TS}^{2}\) are the coefficients of determination for the training and testing phases, and \(MA{E}_{TR}\) and \(MA{E}_{TS}\) are the mean absolute errors for the training and testing phases, respectively. For the best model, the OBJ value should be near zero, while for the ideal model it should be zero. From Fig. 9 it can be concluded that XGB is the best-performing model. The values for the Y-direction dataset are particularly encouraging, which indicates that the performance of the models varies with the dataset and that it is important to train and test the models before applying them to field datasets. The performance of KNN is close to that of XGB, whereas that of ETR is not satisfactory.
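Equation (8) is simple enough to evaluate directly; a small helper with illustrative (not reported) values is sketched below.

```python
def obj_criterion(n_tr, n_ts, mae_tr, r2_tr, mae_ts, r2_ts):
    """OBJ value of Eq. (8); a value closer to zero indicates a better-generalizing model."""
    n_total = n_tr + n_ts
    return ((n_tr - n_ts) / n_total) * (mae_tr / r2_tr) + (2 * n_ts / n_total) * (mae_ts / r2_ts)

# Illustrative values only (not the study's reported metrics):
print(obj_criterion(n_tr=2100, n_ts=900, mae_tr=0.03, r2_tr=0.99, mae_ts=0.06, r2_ts=0.96))
```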

Conclusion

In conclusion, this study successfully implemented three advanced machine learning models, namely the Extra Trees Regressor (ETR), eXtreme Gradient Boosting (XGB), and K-Nearest Neighbors (KNN), to predict the strain in the lateral and longitudinal directions. The optimal models were developed using a trial-and-error approach to select hyperparameters via the scikit-optimize package in Python. The XGB model, in particular, demonstrated superior performance, with its optimal hyperparameters, such as the number of estimators and learning rate, carefully tuned to achieve the best results as listed in Table 4. Similarly, the optimal parameters for the ETR model are detailed in Table 5, and the KNN model was configured with the number of neighbors set to one. The performance of these models was evaluated using ten different indices, including R2, RMSE, and MAE, for both the lateral (x) and longitudinal (y) dimensions. The percentages used for model training and testing were, respectively, 70% and 30% of the main dataset. The predicted results were examined using the performance evaluation, rank analysis, the actual vs. predicted curves, the error matrix, and the OBJ values. According to the experimental data, for each analysis, the XGB model (R2 = 0.991; 0.963 and 0.999; 0.997) was more accurate than the ETR and KNN models in the training and testing of the lateral and longitudinal dimensions. The actual vs. predicted curves further corroborated the superior performance of the XGB model, showing a strong alignment with the diagonal line, indicating accurate predictions. In contrast, the ETR model exhibited significant deviations, suggesting less accurate predictions. The rank analysis further confirmed the XGB model as the top performer, with the highest overall scores in both the training and testing phases, followed by the KNN model, while the ETR model ranked lowest. Sensitivity analysis revealed that stress was the most influential parameter in predicting strain, followed by angle and height. Error matrix evaluations highlighted the XGB model's minimal errors compared to the other models, reinforcing its robustness and reliability. The OBJ criteria analysis also identified the XGB model as the closest to the ideal solution, particularly for the longitudinal (y) dimension dataset. Overall, the XGB model emerged as the most robust and reliable for predicting strain in both the lateral and longitudinal directions, making it a valuable tool for practical applications in fields requiring precise strain predictions. The study underscores the importance of hyperparameter optimization and thorough model evaluation to achieve the best predictive performance. Further investigations are required to assess its performance under different loading conditions and with diverse rock types. Future directions might involve a fuller evaluation of the proposed ETR, XGB, and KNN models as well as hybrid models that combine deep learning, additional optimization methods, and neural networks.