Introduction

Uniaxial compressive strength (UCS) is one of the most important strength parameters of rock, and its determination is required in almost all civil, mining, geotechnical, and geological projects. It is also a key input in slope mass and rock mass classification systems such as Q-slope (Bar and Barton 2017), Rock Mass Rating (Bieniawski 1973), and Slope Mass Rating (Romana 1985).

Many researchers have attempted to estimate the UCS by indirect methods because its determination using the standard methods suggested by ISRM (1979) and ASTM (2000) is difficult, time-consuming, destructive, and expensive, cannot be carried out in the field, and requires expert handling. Indirect tests that are easier to perform than the UCS test, such as P-wave velocity (VP), point load strength index (PLSI), Schmidt hammer rebound number (Rn), Brazilian tensile strength (BTS), and physical properties (density, ρ, and porosity), have been used in different predictive models to estimate the UCS (Kahraman 2001; Chatterjee and Mukhopadhyay 2002; Yasar and Erdogan 2004; Sousa et al. 2005; Çobanoglu and Çelik 2008; Khandelwal and Singh 2009; Yagiz 2011; Sharma et al. 2017; Rahman et al. 2020; Rahman and Sarkar 2021, 2022, 2023 etc.). Among these indirect methods, the determination of ρ is the easiest, quickest, and cheapest; it is non-destructive, can be carried out in the field, and requires no sample preparation.

The relationship between UCS and ρ has been investigated in this work on the basis of lithology. Data for six rock types (pyroclastic, sandstone, shale, carbonate, plutonic, and volcanite), compiled from 26 previous studies, have been statistically analysed. The lithological control on the correlations for different rock types has been evaluated using simple regression (SR). Two artificial neural network (ANN) models were also trained to estimate the UCS: one using ρ and rock type information for the overall dataset (ANN-1), and one using ρ only as the input parameter for each rock type separately (ANN-2). The UCS values estimated by the SR and ANN models were then compared with the measured values of UCS. Because the statistical significance and validation of the proposed estimation models are important, the models were assessed using the mean absolute percentage error (MAPE), root mean square error (RMSE), and correlation coefficient (R).

A number of researchers have used ρ and related parameters to estimate the UCS for multiple rock types (Kahraman et al. 2000; Chatterjee and Mukhopadhyay 2002; Karakuş et al. 2005; Moradian and Behnia 2009 etc.), while others have studied this relationship on the basis of lithology (Tugrul and Zarif 1999; Singh and Dubey 2000; Yasar and Erdogan 2004; Mishra and Basu 2013; Rahman et al. 2020; Rahman and Sarkar 2021, 2022 etc.). Similarly, using supervised and unsupervised learning, Rahman and Sarkar (2021) proposed regression equations between UCS and VP for 12 rock types. The regressions to estimate UCS from ρ presented in earlier works are displayed in Fig. 1, which indicates that no universal regression can be used to estimate UCS for diverse rock types. A lithology-based study of this relationship is therefore needed.

Fig. 1 Empirical regression curves between UCS and ρ obtained from previous studies' databases

Previous studies

A number of researchers have studied the relationship between the mechanical properties and ρ of rocks (Tugrul and Zarif 1999; Singh and Dubey 2000; Kahraman 2001; Chatterjee and Mukhopadhyay 2002; Moradian and Behnia 2009; Anikoh and Olaleye 2013; Awang and Naru 2015; Rahman and Sarkar 2021 etc.) (Table 1). Some researchers have provided datasets of UCS and ρ but did not propose a correlation between these two parameters. Therefore, the datasets of all 26 previous studies were analysed by bivariate regression (Fig. 1). Tugrul and Zarif (1999) collected 19 granitic rock samples from different parts of Turkey and correlated the UCS with the dry and saturated unit weight, reporting good R² values. Kahraman et al. (2000) used a new drillability index to estimate the penetration rate of rotary blast-hole drills and provided different physical and mechanical properties, including UCS and ρ, of 22 different rock types. Singh and Dubey (2000) used eight coal measure sandstone samples to suggest a correlation between UCS and ρ with an excellent R² value of 0.96. Kahraman (2001) used 27 samples of coal measure rocks and suggested an exponential correlation to estimate UCS as a function of Rn and ρ with an average R² value of 0.78. Chatterjee and Mukhopadhyay (2002) suggested separate linear correlations between UCS and ρ for K-G basin and Cauvery basin rocks of India with excellent R² values of 0.92 and 0.96, respectively. Yasar and Erdogan (2004) studied the carbonate rocks of Turkey to suggest correlation equations between sound velocity and UCS, ρ, and Young’s modulus (E). Karakuş et al. (2005) used index tests to estimate the elastic properties of 9 rock samples by multiple regression modelling. With a good R² value of 0.84, Agustawijaya (2007) developed a linear correlation to estimate the UCS from ρ. Moradian and Behnia (2009) performed basic regression analysis to estimate the static elastic modulus using dynamic elastic modulus.
Sarkar et al. (2010) employed an artificial neural network to estimate the strength characteristics from 40 samples of four different rock types. Kurtulus et al. (2010) collected 12 andesitic rock samples from the Cape Kaskaval location of Turkey and suggested a correlation between VP and ρ. Sarkar et al. (2012) used 94 samples of 13 rock types to develop regression models for estimating the UCS from VP. They also proposed equations to estimate the ρ from VP. For sandstone, schist, and granite rocks, Mishra and Basu (2013) proposed a correlation equation between UCS and ρ. Based on the lithology, their predicted regressions differ from one another. For shale samples, Anikoh and Olaleye (2013) proposed a linear regression equation. The mechanical and physical properties of hydrothermally altered volcanic rocks from New Zealand were investigated by Wyering et al. (2014). Majeed and Bakar (2016) studied the dependence of CERCHAR Abrasivity Index on the petrography and mechanical properties of 46 rock samples from Pakistan. Momeni et al. (2015) used a hybrid particle swarm optimisation based artificial neural network to estimate the UCS from 4 indirect tests, including dry ρ. Awang and Naru (2015) studied 29 samples of granite and 44 samples of limestone and suggested separate linear regression equations between UCS and ρ. Kurtulus et al. (2016) studied 32 samples of limestone and suggested regression equations between ρ and VP, UCS and VP, etc. Ince and Fener (2016) developed a predictive model for UCS using several different parameters, including ρ, for ten pyroclastic rock samples. Madhubabu et al. (2016) trained an artificial neural network to estimate the UCS and elastic modulus using five different parameters, including ρ. Broome and Lee (2018) obtained results for different physical and mechanical properties of granites, including UCS and ρ. González et al. (2019) employed regression analysis to calculate the UCS of 13 saturated limestones based on VP and porosity, and also suggested ρ and S-wave velocity values. Teymen and Mengüç (2020) estimated the UCS using 93 samples of various rock types and performed a comparative investigation of different estimation models. The dynamic elastic characteristics and mechanical properties of coal measure rocks in India were correlated with the VP by Rahman et al. (2020). For sandstone and shale samples individually, Rahman and Sarkar (2022) proposed a correlation equation between UCS and ρ in dry and saturated conditions.

Table 1 Previous studies with their proposed regression equation, rock types, and database identifier

Data processing

The data points collected by earlier researchers for various rock types have been separated and categorised into six rock types based on lithology (Fig. 2). A total of 800 data points were used in the analysis. An overall trend including all rock types was obtained as an exponential regression (Eq. 1) with an R-value of 0.731.

$$\mathrm{UCS}={0.536}\mathrm{e}^{1.8238\uprho }$$
(1)

Fig. 2 Lithology-based classification of data points with an overall regression trendline for the correlation between UCS and ρ

After separation of the data from the previous studies, the data points were categorised on the basis of lithology into six rock types: pyroclastic, sandstone, shale, carbonate, plutonic, and volcanite.

Result and discussion

Regression analysis

Simple regression analysis was carried out in this study for six different rock types. On the basis of the best R-value, the best-fit curve was selected from linear (\(y = mx + c\)), exponential (\(y = m\mathrm{e}^{cx}\)), and power (\(y = mx^{c}\)) forms, where x is the independent variable (ρ), y is the dependent variable (UCS), and m and c are constants. The lithological groups identified during data processing are discussed below. An overall regression equation between UCS and ρ has also been proposed so that it can be compared with the lithology-based regression equations in terms of strength and accuracy.
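The curve-selection step described above can be sketched as follows. This is a minimal illustration and not the authors' procedure: it assumes that the exponential and power forms are fitted by ordinary least squares on log-transformed data and that R is computed between measured and fitted UCS. The function names and the demonstration values are hypothetical.

```python
import math

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

def fit_candidates(rho, ucs):
    """Fit the three candidate forms; return {name: (m, c, R)}."""
    fits = {}
    # Linear: y = m*x + c
    m, c = fit_line(rho, ucs)
    fits["linear"] = (m, c, lambda x, m=m, c=c: m * x + c)
    # Exponential: y = m*exp(c*x), fitted as ln(y) = ln(m) + c*x (assumption)
    slope, icpt = fit_line(rho, [math.log(v) for v in ucs])
    m, c = math.exp(icpt), slope
    fits["exponential"] = (m, c, lambda x, m=m, c=c: m * math.exp(c * x))
    # Power: y = m*x**c, fitted as ln(y) = ln(m) + c*ln(x) (assumption)
    slope, icpt = fit_line([math.log(v) for v in rho], [math.log(v) for v in ucs])
    m, c = math.exp(icpt), slope
    fits["power"] = (m, c, lambda x, m=m, c=c: m * x ** c)
    # R between measured UCS and the UCS predicted by each fitted curve
    return {name: (m, c, pearson_r(ucs, [f(x) for x in rho]))
            for name, (m, c, f) in fits.items()}

def best_fit(rho, ucs):
    """Return the name and (m, c, R) of the candidate with the highest R."""
    fits = fit_candidates(rho, ucs)
    name = max(fits, key=lambda k: fits[k][2])
    return name, fits[name]

# Hypothetical demonstration data generated from an exponential law
rho_demo = [1.6, 1.9, 2.2, 2.4, 2.6, 2.8]
ucs_demo = [0.5 * math.exp(2.0 * r) for r in rho_demo]
name, (m, c, r) = best_fit(rho_demo, ucs_demo)
print(name, round(m, 3), round(c, 3), round(r, 3))
```

Fitting in log space weights the small UCS values more heavily than a direct nonlinear fit would, so the constants obtained this way may differ slightly from those of other fitting tools.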

Pyroclastic

A total of three previous study datasets were analysed in order to propose a general regression equation for pyroclastic rocks, and an exponential regression equation with an excellent R-value of 0.86 was developed (Fig. 3a).

$$\mathrm{UCS}={0.1216\mathrm{e}}^{3.1116\uprho }$$
(2)

Fig. 3 Correlation between UCS and ρ for a pyroclastic, b sandstone, c shale, d carbonate, e plutonic, and f volcanite rocks

All the three previous studies' databases complement each other. The obtained regression for the T18 database is almost parallel and overlaps with the overall trend. The trendline obtained for the T24 database lies above and parallel to the overall trend, while the trendline for the T28 database overlaps the overall trendline.

Sandstone

To produce a characteristic regression trendline for sandstone rocks, 11 previous study datasets were employed (Fig. 3b). With an R-value of 0.87, an exponential regression equation was found.

$$\mathrm{UCS}={0.3007}\mathrm{e}^{1.9927\uprho }$$
(3)

The trendlines obtained for databases T3, T14, and T28 are parallel to each other and steeper than the overall trend plotted at higher ρ values. The trendlines for databases T4, T5, T6, T9, T10, T13, T30, and T31 lie almost parallel to the obtained overall trend.

Shale

An exponential overall trendline was obtained using four previous studies databases with an excellent R-value of 0.89 (Fig. 3c).

$$\mathrm{UCS}={0.0333\mathrm{e}}^{3.05\uprho }$$
(4)

The regression trendlines for the T29 and T32 databases lie parallel and close to the overall regression trendline. The trendlines obtained for the T13 and T17 databases have gentler gradients than the overall trend.

Carbonate

This rock group contains all the carbonate group rocks such as limestone, marlstone, marble and dolostone. To propose a characteristic regression to estimate the UCS from ρ for carbonate rocks, a total of 13 prior studies databases were used (Fig. 3d). With a moderate R-value of 0.63, an exponential overall regression trendline was formed.

$$\mathrm{UCS}={0.1896\mathrm{e}}^{2.1225\uprho }$$
(5)

The derived overall regression trendline is gentler than the trendlines for databases T10, T22, T27, and T27. The T4 and T11 trendlines are nearly parallel to the general trendline, but they predict greater UCS values for corresponding ρ values; the T25 trendline, on the other hand, predicts lower UCS values for comparable ρ values. The derived overall regression trendline is quite near to the T2, T7, T8, T20, and T23 database trendlines.

Plutonic

Granite and other plutonic rocks from the T28 prior study database, such as diorite, granodiorite, syenite, and gabbro, make up this rock group. An exponential regression curve with a moderate R-value of 0.70 was proposed using data from seven previous studies (Fig. 3e).

$$\mathrm{UCS}={0.0046\mathrm{e}}^{3.7241\uprho }$$
(6)

The T1, T14, and T26 regression trendlines have a steeper gradient than the overall regression curve, while T19, T20, T21, and T28 regression trendlines have more gentle gradients than the overall trendline.

Volcanites

The rock group consists of different volcanic rocks such as basalt, andesite, dacite, and rhyolite. An overall power regression trendline has been proposed using the databases of five previous studies, with a good R-value of 0.80 (Fig. 3f).

$$\mathrm{UCS}=2.3131{\uprho }^{4.0974}$$
(7)

All the previous studies' regression trendlines (T8, T13, T18, and T28) follow the overall regression trendline except the T12 database, which has a steeper regression trendline.

Comparative analysis

Table 2 and Fig. 4 present all of the lithology-based regression equations and trendlines that were obtained using prior studies' databases. Figure 4 shows the lithological control on the correlation between UCS and ρ. Each of the rock types addressed in this paper has its own characteristic regression trendline. This lithological control was also observed in the correlation between UCS and VP by Rahman and Sarkar (2021). Similar to Rahman and Sarkar (2021), principal component analysis (PCA) has been applied to the present dataset. PCA is an unsupervised machine learning technique that simplifies a dataset by reducing its dimension, so that a minimum number of features explains the maximum amount of variance. Here, we have only two features, i.e. UCS and ρ, and the PCA has reduced the dimension of the dataset from 2 to 1 (Fig. 5). The linear results of PC1 have been plotted for each rock type, and it can be observed that the rock types separate according to lithology. Hence, we can conclude from this analysis that there is a clear control of the lithology on the relationship between UCS and ρ.
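The reduction from two features to one principal component can be illustrated with a short sketch. It assumes that both features are standardized before the analysis, which the text does not state, and it uses the closed-form eigen-decomposition of a 2 × 2 covariance matrix. The function names and the sample values are hypothetical.

```python
import math

def standardize(values):
    """Scale a feature to zero mean and unit sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def pca_first_component(rho, ucs):
    """Return (PC1 scores, explained variance ratio) for two features."""
    zx, zy = standardize(rho), standardize(ucs)
    n = len(zx)
    # Entries of the symmetric 2x2 covariance matrix [[a, b], [b, d]]
    a = sum(x * x for x in zx) / (n - 1)
    d = sum(y * y for y in zy) / (n - 1)
    b = sum(x * y for x, y in zip(zx, zy)) / (n - 1)
    # Largest eigenvalue of the 2x2 matrix (closed form)
    lam1 = (a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    # Corresponding eigenvector; fall back to an axis if b is ~0
    if abs(b) > 1e-12:
        vx, vy = b, lam1 - a
    else:
        vx, vy = (1.0, 0.0) if a >= d else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    vx, vy = vx / norm, vy / norm
    # Project each standardized sample onto the first principal axis
    scores = [x * vx + y * vy for x, y in zip(zx, zy)]
    return scores, lam1 / (a + d)

# Hypothetical samples (rho in g/cc, UCS in MPa)
rho_demo = [1.5, 1.8, 2.2, 2.5, 2.6, 2.7]
ucs_demo = [15.0, 28.0, 40.0, 75.0, 90.0, 120.0]
scores, ratio = pca_first_component(rho_demo, ucs_demo)
print([round(s, 2) for s in scores], round(ratio, 3))
```

Plotting the PC1 scores separately for each rock type is one way to produce a comparison of the kind shown in Fig. 5.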

Table 2 Lithology-based regression equations with their R values for 6 different rock types
Fig. 4 Obtained regression curves for six rock types in the present study

Fig. 5 The lithological classification of the dataset using the PCA

The regression for pyroclastic rocks lies in the very low-ρ zone but estimates the highest UCS for the corresponding ρ values. Sandstone and carbonate have similar regression trendlines; however, the sandstone regression estimates higher UCS values for the corresponding ρ values. At a ρ of 2.8 g/cc, their regression trendlines converge. The regressions for shale and plutonic rocks have the highest gradients and are parallel to one another. For similar values of ρ, the regression for shale predicts greater UCS values than the regression for plutonic rocks. Volcanite rocks have a regression trendline that is parallel to that of carbonate rocks, but it estimates substantially higher UCS values for the corresponding ρ values. At a ρ of 2.8 g/cc, the volcanite and plutonic trendlines converge and cross-cut each other.
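The comparison above can be checked numerically by evaluating Eqs. 1 to 7 at a chosen ρ. The sketch below simply encodes the equations as printed (ρ in g/cc; UCS taken to be in MPa, as in the statistics reported later); the dictionary and function names are hypothetical.

```python
import math

# Regression equations as printed in Eqs. 1-7 (rho in g/cc, UCS in MPa)
REGRESSIONS = {
    "overall":     lambda rho: 0.536 * math.exp(1.8238 * rho),
    "pyroclastic": lambda rho: 0.1216 * math.exp(3.1116 * rho),
    "sandstone":   lambda rho: 0.3007 * math.exp(1.9927 * rho),
    "shale":       lambda rho: 0.0333 * math.exp(3.05 * rho),
    "carbonate":   lambda rho: 0.1896 * math.exp(2.1225 * rho),
    "plutonic":    lambda rho: 0.0046 * math.exp(3.7241 * rho),
    "volcanite":   lambda rho: 2.3131 * rho ** 4.0974,
}

def estimate_ucs(rock_type, rho):
    """Estimate UCS from density for one of the rock types above."""
    return REGRESSIONS[rock_type](rho)

# Compare the lithology-based estimates at the density discussed in the text
for rock in ("sandstone", "carbonate", "plutonic", "volcanite"):
    print(rock, round(estimate_ucs(rock, 2.8), 1))
```

Each equation should only be used within the ρ range covered by the data for that rock type, since extrapolating an exponential or power curve can produce unrealistic UCS values.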

Statistical evaluation

Descriptive statistics were used to deduce the lithological control on the UCS and ρ of six different rock types. The statistical parameters in Tables 3 and 4 for UCS and ρ, respectively, such as mean, median, mode, standard deviation, variance, skewness, and kurtosis, show that each rock type classified in this study has its own characteristics.

Table 3 Descriptive statistical parameters for UCS of overall and individual rock types
Table 4 Descriptive statistical parameters for ρ of overall and individual rock types

In Fig. 6, the maximum frequency for UCS was observed to be in the range of 50 to 60 MPa, while the maximum frequency for ρ was observed to be in the range of 2.7 to 2.8 g/cc for all the data points used in this study. The mean UCS for overall rock type was 61.43 MPa, and the mean ρ was 2.43 g/cc (Tables 3 and 4, respectively). Pyroclastic rocks were observed to be the weakest, with a mean UCS of 24.31 MPa, while volcanite rocks were observed to be strongest, with a mean UCS of 113.04 MPa. Similarly, pyroclastic rocks were also observed to be the least dense rock with a mean ρ of 1.59 g/cc, while plutonic rocks were observed to be the densest with a mean ρ of 2.64 g/cc.
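Descriptive parameters of the kind listed in Tables 3 and 4 can be computed as in the sketch below. It is only an illustration: the skewness and kurtosis use simple moment-based definitions (with excess kurtosis), which is an assumption because the text does not say which estimator was used, and the mode is omitted because it is ambiguous for continuous measurements. The function name and the sample values are hypothetical.

```python
import statistics

def describe(values):
    """Return basic descriptive statistics for a list of measurements."""
    n = len(values)
    mean = statistics.mean(values)
    # Central moments about the mean (population form, assumption)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "std": statistics.stdev(values),          # sample standard deviation
        "variance": statistics.variance(values),  # sample variance
        "skewness": m3 / m2 ** 1.5,               # > 0 means a long right tail
        "kurtosis": m4 / m2 ** 2 - 3.0,           # excess kurtosis
    }

# Hypothetical UCS values (MPa) for one rock type
ucs_demo = [12.0, 18.0, 21.0, 25.0, 30.0, 55.0]
for key, value in describe(ucs_demo).items():
    print(key, round(value, 2))
```

Applying the same function to each lithological group separately gives the per-rock-type rows that such tables contain.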

Fig. 6 Overall frequency bar graph including all rock types for a UCS, and b ρ

The frequency distribution plot for UCS is shown in Fig. 7a for six different lithology groups. The distribution was observed to be positively skewed, which is characteristic of the UCS of rocks. The main distinguishing criteria for different rock types were observed to be the mean and standard deviation (Table 3). Conversely, the frequency distribution plot for ρ of the six lithology groups was negatively skewed (Fig. 7b).

Fig. 7 Lithology-based frequency bar graph including different rock types for a UCS, and b ρ

Rocks and rock masses have been classified on the basis of different rock properties, such as UCS, ρ, φ, and VP, of which the UCS is the most significant and widely used. Classifications of rocks on the basis of UCS have been suggested by many international standard agencies and groups, such as the Geological Society Engineering Group Working Party (1977), the Commission of Engineering Geological Mapping of the IAEG (1979), and the ISRM Commission on the Classification of Rocks and Rock Masses (1981). The classifications of rocks proposed by the IAEG on the basis of UCS and ρ are given in Tables 5 and 6, respectively. The IAEG (1979) classification scheme was applied to the dataset of dry UCS and ρ compiled in the present study for six rock types from 26 previous studies. The box chart in Fig. 8 shows that the mean UCS of sandstone, shale, and pyroclastic rocks lies in the ‘moderately weak’ class, while the mean UCS of carbonate, volcanite, and plutonic rocks lies in the ‘strong’ class. The box chart for ρ in Fig. 9 shows that the mean ρ of pyroclastic rock lies in the ‘very low’ ρ class. The mean ρ of sandstone, volcanite, and shale lies in the ‘moderate’ ρ class, while the mean ρ of carbonate and plutonic rocks lies in the ‘high’ ρ class.

Table 5 IAEG (1979) classification on the basis of UCS
Table 6 IAEG (1979) classification on the basis of ρ
Fig. 8 Box-plot for the 6 rock types studied in this paper and the IAEG (1979) classification on the basis of UCS

Fig. 9 Box-plot for the 6 rock types studied in this paper and the IAEG (1979) classification on the basis of ρ

Artificial neural network (ANN)

An artificial neural network (ANN) is a robust supervised machine learning technique that mimics the functioning of a biological brain. This soft computing technique has been used since the 1940s in different fields of science and technology, and it has also been used widely in engineering geology by many researchers (Ghabousi et al. 1991; Singh et al. 2001; Sharma et al. 2017).

In this study, a two-layered feed-forward network with sigmoid hidden neurons and linear output neurons was used to estimate the UCS using the ρ and rock-type information. The ANN-1 network consists of two input neurons (800 \(\times\) 2 input data), a single target neuron (800 \(\times\) 1), and three hidden neurons (Fig. 10a), while the ANN-2 network consists of one input neuron (800 \(\times\) 1), a single target neuron (800 \(\times\) 1), and three hidden neurons (Fig. 10b).
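The network structure described above can be sketched as a plain forward pass. This is not the trained model: the weights are random placeholders, the hidden activation is assumed to be the hyperbolic-tangent sigmoid, the numeric rock-type code is a hypothetical encoding, and no training is performed. All function names are illustrative.

```python
import math
import random

def init_network(n_inputs, n_hidden, seed=0):
    """Create a feed-forward network with placeholder random weights."""
    rng = random.Random(seed)
    return {
        "w_hidden": [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                     for _ in range(n_hidden)],
        "b_hidden": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "w_out": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b_out": rng.uniform(-1, 1),
    }

def forward(net, x):
    """Sigmoid (tanh) hidden layer followed by a single linear output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(net["w_hidden"], net["b_hidden"])]
    return sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"]

def count_parameters(net):
    """Total number of weights and biases in the network."""
    return (sum(len(row) for row in net["w_hidden"])
            + len(net["b_hidden"]) + len(net["w_out"]) + 1)

ann1 = init_network(n_inputs=2, n_hidden=3)  # inputs: rho and rock-type code
ann2 = init_network(n_inputs=1, n_hidden=3)  # input: rho only
print(count_parameters(ann1), count_parameters(ann2))
# Untrained output for a hypothetical scaled sample (rho, rock-type code)
print(forward(ann1, [0.4, 0.2]))
```

With three hidden neurons both networks are very small, which limits how much lithology-specific behaviour a single network can represent from a rock-type code alone.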

Fig. 10 The neural structures of a ANN-1, and b ANN-2 models to predict the UCS using ρ and rock type information

The network was trained with the Bayesian regularization (trainbr) algorithm using the neural fitting application of MATLAB, which is effective for such datasets (Rahman and Sarkar 2021). For the ANN-1 model, the performance of the trained network was 715.71 at epoch 37. The regressions for training and testing of the ANN-1 model produced R-values of > 0.8 (Fig. 11). The R-values for training, testing, and the whole dataset, together with the performance and epoch of the ANN-2 models for each rock type, are given in Table 7.

Fig. 11 The obtained R-values for training, test, and all dataset, and the performance of the ANN-1 model to predict the UCS using the ρ and rock type information

Table 7 Showing the obtained R-values for training, test, and all dataset, and the performance of the ANN-2 models to predict the UCS using the ρ for each rock type

Assessment of the predictive performance of the ANN and SR models

The relationships between the measured UCS and the UCS estimated by the ANN and SR models for six different rock types are shown in Fig. 12. The performance of the models was assessed using the mean absolute percentage error (MAPE), root mean square error (RMSE), and correlation coefficient (R). The MAPE measures the accuracy of a predictive model and is defined by Eq. 8. MAPE values generally range from 0 to 100%, but they can exceed 100% when the estimated values are much higher than the measured values; values closer to 0 indicate lower estimation error.

Fig. 12 Comparison in a 1:1 scatter plot of the measured UCS values with the estimated UCS values obtained from ANN and SR models for a pyroclastic, b sandstone, c shale, d carbonate, e plutonic, and f volcanite rocks

$$MAPE= \frac{1}{n}{\sum }_{i=1}^{n}\left\{\frac{\left|{M}_{i}-{P}_{i}\right|}{{M}_{i}}\times 100\right\}$$
(8)

The RMSE measures the differences between the measured and estimated values and is calculated using Eq. 9. The RMSE values range from 0 to \(\infty\), with values closer to 0 indicating lower estimation error.

$$RMSE= \sqrt{\frac{1}{n}{\sum }_{i=1}^{n}{\left({M}_{i}-{P}_{i}\right)}^{2}}$$
(9)

where Mi is the ith measured value, Pi is the ith predicted value, and n is the total number of observations.

The correlation coefficient (Eq. 10) quantifies the strength of the linear relationship between two variables. The values of R range from -1 to +1, indicating the strength and the direction of the slope of the correlation. An absolute R value closer to 1 indicates a good correlation, while an R value closer to 0 indicates a poor correlation.

$$R= \frac{cov\left(X,Y\right)}{{\sigma }_{X}{\sigma }_{Y}}$$
(10)

where cov is the covariance, X and Y are measured and estimated values respectively, and σ is the standard deviation.
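Equations 8 to 10 translate directly into code. The sketch below is a minimal illustration with hypothetical function names and sample values; the covariance and standard deviations in R use the same normalisation, so the choice between sample and population forms does not affect the result.

```python
import math

def mape(measured, predicted):
    """Mean absolute percentage error (Eq. 8), in percent."""
    n = len(measured)
    return sum(abs(m - p) / m for m, p in zip(measured, predicted)) / n * 100.0

def rmse(measured, predicted):
    """Root mean square error (Eq. 9), in the units of the data."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)

def correlation(x, y):
    """Correlation coefficient R = cov(X, Y) / (sigma_X * sigma_Y) (Eq. 10)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)

# Hypothetical measured and estimated UCS values (MPa)
measured = [10.0, 20.0, 40.0]
estimated = [12.0, 18.0, 40.0]
print(mape(measured, estimated), rmse(measured, estimated),
      correlation(measured, estimated))
```

Because MAPE divides each error by the measured value, the same absolute error is penalised more heavily for weak rocks than for strong ones, whereas RMSE is dominated by the largest absolute errors. This is one reason the two metrics can rank models differently.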

The performance of the SR and ANN models was analysed in a 1:1 scatter plot of measured versus estimated UCS (Fig. 12). The MAPE values of the ANN models were greater than those of the SR model for all rock types. On the other hand, the RMSE values were better for the ANN models than for the SR model in all rock types except pyroclastic and shale, and the R values were better for the ANN-1 model in all rock types except pyroclastic and sandstone (Table 8). These results indicate that the ANN models were better at fitting the dataset, while the SR model was better in predictive accuracy. The inaccuracy in the prediction of UCS using the ANN-1 model can be explained by the lithological control on the correlations: the simple neurons of the ANN-1 model were unable to deduce the lithological control on the estimation of UCS from the ρ and rock type data, whereas in the ANN-2 model, which was trained for each rock type separately, the accuracy was maximized. Since the SR model was developed for each lithology separately, it shows greater accuracy in the prediction of the UCS. Hence, the regression equations provided in this section are useful for the prediction of UCS from ρ.

Table 8 Statistical assessment of the estimated values of UCS from ρ using ANN and SR models for different rock types

Conclusion

The study aimed to establish characteristic regression equations for six rock types between UCS and ρ. The lithological control on the correlation between UCS and ρ has been evaluated using simple regression analysis and descriptive statistics.

The six rock types studied are pyroclastic, sandstone, shale, carbonate, plutonic, and volcanite. It was observed from the simple regression analysis that the regression for each rock type is characteristic of that particular lithology. Hence, a separate regression equation was obtained for each of the six rock types, with a better R-value (except for carbonate and plutonic rocks) than that obtained for the overall regression including all the rock types. The PCA was also used to classify the datasets on the basis of lithology, where it was observed that the PC1 components show characteristic trends on the basis of lithology.

Descriptive statistics were used as a robust tool to simplify the datasets of UCS and ρ in a meaningful and understandable way. The statistical parameters of central tendency (mean, median, and mode), standard deviation, sample variance, kurtosis, and skewness were used to classify the datasets into different rock types. The box plots for UCS and ρ with reference to the IAEG classification of rocks have also been shown to understand the statistical significance of the dataset for different lithological groups.

An ANN model was also developed using the Bayesian regularization (BR) algorithm to estimate the UCS using the ρ and rock types as inputs. The SR and ANN models were compared in a 1:1 scatter plot and statistically tested. It was observed that the SR model was much better in terms of accuracy, while the ANN model showed a stronger correlation between measured and estimated UCS values.