Introduction

The suitability and durability of rocks used in construction are a function of their mineral assemblage and framework, which govern their crucial physico-mechanical properties [1]. An intuitive and statistically significant regression model can be extremely valuable where quick and reliable prediction of mechanical parameters is needed or where conventional laboratory testing of rock samples is not possible. Index tests are far more convenient to perform than refined laboratory tests [2]. Properties such as density, uniaxial compressive strength (UCS), porosity (n), Young’s modulus (E) and tensile strength (TS) have a direct bearing on the workability and durability of rock structures in engineering projects [3]. Furthermore, these properties are critical for the design and stability assessment of a super- or sub-structure on the rocks. Several authors have reported various equations to predict the mechanical parameters of different rocks [4,5,6,7,8,9]. Vindhyan sandstones have been used as building and construction material for many monuments, and the correlation model can be representative for predicting the mechanical attributes of rocks of the Ganga Plain.

In the present work, rock samples from Kaimur district in Eastern India (Fig. 1a) have been assessed through laboratory testing for various strength parameters, followed by statistical treatment. The rocks belong to the Kaimur Group of the Vindhyan Supergroup. The Kaimur Group is 1210 ± 52 million years old and comprises distinctive sedimentary deposits of the Vindhyan Supergroup [10]. It is divided into two subdivisions: the Lower and Upper Kaimur. The Sasaram Formation, Ghurma shale and Markundi sandstone belong to the Lower Kaimur Group, whereas the Bijagarh shale, Mangesar Formation and Dhandraul quartzite compose the Upper Kaimur Group [11,12,13,14]. In the present research, an attempt has been made to test, investigate and reliably predict UCS, E and TS from basic properties such as dry density and porosity.

Fig. 1 (a) Geological map of the study area [14]; testing setup for (b) UCS, (c) TS and (d) E

Physico-Mechanical Properties

An extensive field investigation was carried out to extract fresh sandstone samples from Bhagwanpur and Chainpur villages, Kaimur district, India. A total of 10 nearby localities were selected, from which blocks measuring 30 × 30 × 30 cm³ were taken. A total of 30 core samples of NX size and 30 disks with a diameter of 54.7 mm were retrieved. The rock samples were then tested according to the prescribed ISRM codes [15,16,17].

The samples were dried by heating for 24 h at a temperature of 105 ± 3 °C, and the dry density was determined using Eq. (1). Afterward, the samples were submerged in water for 24 h, the saturated mass (Msat) was measured, and the porosity was calculated through Eq. (2) [15].

$$\rho_{d} = \frac{M_{\text{dry}}}{V_{\text{total}}}$$
(1)
$$n = \frac{V_{\text{void}}}{V_{\text{total}}} \times 100 = \frac{\left( M_{\text{sat}} - M_{\text{dry}} \right)/\rho_{w}}{V_{\text{total}}} \times 100$$
(2)

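where \(\rho_{d}\) is the dry density, n is the porosity (%), \(V_{\text{void}}\) is the volume of pore spaces, \(M_{\text{dry}}\) is the mass of the dried sandstone sample, \(M_{\text{sat}}\) is the saturated mass, \(V_{\text{total}}\) is the total volume and \(\rho_{w}\) is the density of water.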
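As an illustration of Eqs. (1) and (2), the short Python sketch below computes dry density and porosity for a single specimen. The function names and the specimen masses and volume are hypothetical values chosen for the example, not measurements from this study, and the density of water is assumed to be 1.0 g/cm³.

```python
def dry_density(m_dry, v_total):
    """Eq. (1): dry mass / total volume (g/cm3 if inputs are in g and cm3)."""
    return m_dry / v_total


def porosity_percent(m_sat, m_dry, v_total, rho_w=1.0):
    """Eq. (2): porosity in percent; rho_w is assumed to be 1.0 g/cm3."""
    v_void = (m_sat - m_dry) / rho_w  # pore volume from the mass of absorbed water
    return v_void / v_total * 100.0


# Hypothetical specimen, not a measurement from this study
m_dry, m_sat, v_total = 520.0, 540.0, 200.0  # g, g, cm3
rho_d = dry_density(m_dry, v_total)
n = porosity_percent(m_sat, m_dry, v_total)
print(f"dry density = {rho_d:.2f} g/cm3, porosity = {n:.1f} %")
```

Keeping the masses in grams and the volume in cm³ means the water density of 1.0 g/cm³ needs no unit conversion.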

The UCS of a rock sample is the maximum axial compressive stress that an NX size cylindrical core can withstand before failure, i.e. the failure load divided by the cross-sectional area of the core [Eq. (3)] (Fig. 1b). Strain gauges were attached to the samples to measure the strain (ε) as a function of the stress (σ), and E of the rock samples was obtained using Eq. (4) (Fig. 1d). In addition, disks of NX sized cores were placed in a Brazilian cage and tested for their indirect tensile strength (TS) (Fig. 1c) [16]. The failure load (P) in kN and the thickness (t) and diameter (d) of the disks in mm were recorded to calculate the tensile strength [Eq. (5)].

$$UCS = \frac{4P}{\pi d^{2}}$$
(3)
$$E = \frac{\sigma}{\varepsilon}$$
(4)
$$TS = \frac{2P}{\pi d t}$$
(5)
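Eqs. (3) to (5) can be sketched in Python as follows. The sketch assumes loads in kN and dimensions in mm, so a factor of 1000 converts kN/mm² to MPa. The function names, loads, disk thickness and stress-strain pair are hypothetical illustrations, and the point on the stress-strain curve at which E is evaluated is likewise an assumption, since the text does not state it. Only the 54.7 mm diameter is taken from the text.

```python
import math


def ucs_mpa(p_kn, d_mm):
    """Eq. (3): failure load over the circular cross-section, in MPa."""
    return 4.0 * p_kn * 1000.0 / (math.pi * d_mm ** 2)


def youngs_modulus_gpa(stress_mpa, strain):
    """Eq. (4): stress / strain, converted from MPa to GPa."""
    return stress_mpa / strain / 1000.0


def brazilian_ts_mpa(p_kn, d_mm, t_mm):
    """Eq. (5): indirect (Brazilian) tensile strength of a disk, in MPa."""
    return 2.0 * p_kn * 1000.0 / (math.pi * d_mm * t_mm)


# Hypothetical readings, not data from this study
ucs = ucs_mpa(p_kn=150.0, d_mm=54.7)
e_mod = youngs_modulus_gpa(stress_mpa=30.0, strain=0.005)
ts = brazilian_ts_mpa(p_kn=30.0, d_mm=54.7, t_mm=27.0)
print(f"UCS = {ucs:.1f} MPa, E = {e_mod:.1f} GPa, TS = {ts:.1f} MPa")
```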

The results of the tests are summarized in Table 1. The measured Young’s modulus values lie between 3.14 and 9.6 GPa, with a median of 6.6 GPa. The maximum and minimum tensile strengths of the tested samples are 16.6 MPa and 6.1 MPa, respectively.

Table 1 Data summary of the tested physico-mechanical properties of the sandstone

Statistical Analysis

The laboratory-tested physico-mechanical properties have been analysed for correlation and prediction, for use in design and stability assessment. Linear regression is a statistical method that estimates the relationship between a response variable y and one or more explanatory variables x [Eq. (6)]. The regression line is fitted by the least-squares method, i.e. by minimizing the sum of the squared differences between observed and predicted values [18].

$$y = \beta_{0} + \beta_{1} x_{1} + \beta_{2} x_{2} + \cdots + \beta_{n} x_{n} + \epsilon$$
(6)

where y is the response variable, n is the number of explanatory variables, (x1, x2, …, xn) are the explanatory variables, (β1, β2, …, βn) are the coefficients, β0 is the intercept on the y-axis and \(\epsilon\) is the error [19].
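A minimal least-squares fit of Eq. (6) can be sketched in pure Python by solving the normal equations \((X^{T}X)\beta = X^{T}y\). This is an illustrative implementation, not the software used in this study; the function name `fit_linear` and the synthetic data (generated exactly from y = 2 + 3x1 − x2) are assumptions made for the example.

```python
def fit_linear(X, y):
    """Ordinary least squares: returns [b0, b1, ..., bn] for Eq. (6)."""
    rows = [[1.0] + list(r) for r in X]  # prepend a column of ones for the intercept
    p = len(rows[0])
    # Augmented normal-equation matrix [X^T X | X^T y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Synthetic, noise-free data: y = 2 + 3*x1 - 1*x2 (not data from this study)
X = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 5.0)]
y = [2.0 + 3.0 * x1 - x2 for x1, x2 in X]
beta = fit_linear(X, y)
print([round(b, 6) for b in beta])
```

With noise-free data the fit recovers the generating coefficients up to rounding; with measured data the same call returns the coefficients that minimize the sum of squared residuals.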

A correlation matrix is presented to illustrate the distribution of the samples, the pairwise scatter plots and the correlation coefficient values together (Fig. 2).

Fig. 2 Coefficient of correlation (r) matrix of the tested physico-mechanical properties [20]

Simple linear regression investigates the relation of a dependent variable with a single predictor (Fig. 3). The strength of the correlation is quantified through Pearson’s r [Eq. (7)] [21].

$$r = \frac{{\Sigma \left( {x_{i} - \overline{x}} \right)\left( {y_{i} - \overline{y}} \right)}}{{\sqrt {\Sigma \left( {x_{i} - \overline{x}} \right)^{2} \Sigma \left( {y_{i} - \overline{y}} \right)^{2} } }}$$
(7)

where xi and \(\bar{x}\) are the value of the ith sample and the sample mean of the independent variable (x), whereas yi and \(\bar{y}\) are the value of the ith sample and the sample mean of the dependent variable (y).
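Eq. (7) translates directly into a few lines of Python. The sketch below is illustrative only; the function name `pearson_r` and the porosity and UCS values are hypothetical and are not taken from Table 1.

```python
import math


def pearson_r(x, y):
    """Eq. (7): Pearson correlation coefficient of two equal-length samples."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical porosity (%) and UCS (MPa) values, not data from this study
porosity = [8.0, 10.0, 12.0, 14.0, 16.0]
ucs = [80.0, 74.0, 65.0, 55.0, 46.0]
r = pearson_r(porosity, ucs)
print(f"r = {r:.3f}")
```

In this hypothetical data set strength falls as porosity rises, so r is negative; the sign of r gives the direction of the relationship and its magnitude gives the strength.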

Fig. 3 Simple linear regression plots between UCS, tensile strength, E, density and porosity

The values of r range between 0.68 and 0.83, revealing a strong correlation between the parameters. The significance of the analysis is assessed by the p value, which represents the probability of obtaining a correlation at least as strong as the observed one purely by chance if the variables were in fact unrelated [22]. The asterisk symbols *, ** and *** indicate p values < 0.05, < 0.01 and < 0.001, respectively [23].

Multiple regression predicts a response y from multiple predictor variables x1, x2, …, xn (Fig. 4). The value of r2 never decreases when new predictors are added, even if they carry little explanatory power. Hence, r2 is not a reliable parameter for multivariate analysis, and overfitting may arise. Adjusted r2 is more reliable for the analysis of multiple variables [24]. If n is the number of samples and k is the number of predictor variables, then Adjusted r2 is defined by Eq. (8).

$${\text{Adjusted}}\,\,r^{2} = 1 - \frac{{\left( {1 - r^{2} } \right)\left( {n - 1} \right)}}{n - k - 1}$$
(8)

Fig. 4 3D scatterplot of multi-variables
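The penalty that Eq. (8) applies for extra predictors can be seen in a small Python sketch. The function name `adjusted_r2` and the r2 value of 0.70 are hypothetical; n = 30 matches the number of core samples reported above, and k = 2 corresponds to a bivariate model.

```python
def adjusted_r2(r2, n, k):
    """Eq. (8): adjusted r2 for n samples and k predictor variables."""
    if n - k - 1 <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Hypothetical r2 of 0.70 with n = 30 samples and k = 2 predictors
adj = adjusted_r2(0.70, n=30, k=2)
print(f"adjusted r2 = {adj:.3f}")
```

For any r2 below 1 and at least one predictor, the adjusted value is smaller than r2, and the gap widens as k grows relative to n.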

Residuals are defined as the difference between the observed and predicted values. Standardized residuals are the ratio of the residuals to their standard deviation. Quantile–Quantile (Q–Q) analysis of the standardized residuals is performed to verify the model assumptions on the random error term (ɛ) (Fig. 5). Normally distributed residuals are consistent with the model assumptions, namely that the errors are independent and homoscedastic.

Fig. 5 Normal Q–Q plots of residuals against theoretical quantiles
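The coordinates of a normal Q–Q plot such as those in Fig. 5 can be generated with the Python standard library, as sketched below. Residuals are standardized by their sample standard deviation, following the definition given above. The plotting positions (i + 0.5)/n, the function name `qq_points` and the observed and predicted values are assumptions made for illustration, not the procedure or data of this study.

```python
from statistics import NormalDist, stdev


def qq_points(observed, predicted):
    """Return (theoretical quantile, standardized residual) pairs, sorted."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    s = stdev(residuals)  # sample standard deviation of the residuals
    standardized = sorted(r / s for r in residuals)
    n = len(standardized)
    # Standard normal quantiles at the assumed plotting positions (i + 0.5) / n
    theoretical = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, standardized))


# Hypothetical observed and predicted UCS values (MPa), not data from this study
observed = [62.0, 55.5, 71.2, 48.9, 66.3, 59.8]
predicted = [60.1, 57.0, 69.0, 50.5, 67.9, 58.2]
points = qq_points(observed, predicted)
for q, z in points:
    print(f"{q:+.3f}  {z:+.3f}")
```

If the residuals are close to normal, the printed pairs lie near the line z = q when plotted.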

Results and Discussion

Univariate linear regression analysis was performed to determine the correlation of the mechanical parameters with the index properties (porosity and density). The r2 values of the simple linear equations range between 0.50 and 0.74, indicating a reliable correlation between the parameters, as at least 50% of the variance in the observed results is explained by the regression model (Table 2). Values of r2 > 0.5 are considered moderately effective and reliable [25]. Rocks with the same mineralogical composition and density may have different mechanical properties owing to variation in moisture content and weathering grade [26,27,28]. Since the samples were collected from 10 nearby locations, heterogeneity in weathering and moisture content may explain why the r2 values are not higher. Each simple linear equation has a p value less than 0.001, which implies that the correlation is highly significant and not accidental.

Table 2 Results of univariate linear regression analysis

The equation relating UCS and n has an r2 of 0.73, whereas a similar equation proposed earlier for limestone had an r2 of 0.59 [5]. The equation relating UCS and ρb has an r2 of 0.50 and a p value less than 0.001, indicating that the result is statistically highly significant and not random; a similar equation published for limestone had an r2 of 0.82 [8].

Because the r2 values of the univariate analysis vary widely (32%), multivariate regression analysis was performed to correlate each mechanical parameter with two predictor variables. The Adjusted-r2 values range between 0.67 and 0.74, demonstrating that the bivariate regression equations correlate strongly and reliably with the observed results (Table 3). In the bivariate equations for E and TS, both predictors have p values marked **, revealing a significant correlation. In the bivariate equation of UCS with porosity and density, porosity has a p value marked *** whereas density carries no asterisk, indicating that the correlation of UCS with porosity is statistically highly significant but that with density is not statistically significant. Similarly, the equation of UCS with TS and E shows a significant correlation with TS and an insignificant one with E. The proposed multivariate equation for E has an r2 of 0.67 and significant p values for its independent parameters.

Table 3 Results of multivariate linear regression analysis

Residual analysis is essential for validating the predicted model. The residuals of each multivariate equation for UCS, E and TS were examined through Q–Q plots. Most of the data points in each plot fall along a straight line, indicating approximately normally distributed residuals with errors that are small or close to zero (Fig. 5).

Conclusion

The present research establishes correlations for quick and reliable estimation of the mechanical properties used in design (UCS, E and TS) from simple index tests for density and porosity of the sandstone. A total of 30 core samples were tested in the laboratory to determine the uniaxial compressive strength, Brazilian tensile strength, Young’s modulus, porosity and density of the sandstone. The sandstones are found to have moderate to high strength, as the UCS values range between 41.9 and 83.4 MPa. After determining the physico-mechanical properties of the sandstone, univariate linear regression analysis was performed to obtain simple linear equations relating each mechanical parameter to a single physical parameter. The r2 of the simple linear regressions ranges between 0.50 and 0.73, showing a reliable though not very strong correlation. The p value for the simple linear regressions is less than 0.001, indicating that the correlation is statistically highly significant. Multivariate linear analysis was also performed to obtain linear regression models of the mechanical parameters with more than one predictor. The Adjusted-r2 of the multivariate equations is between 0.67 and 0.74, and the significance level (p value) of less than 0.01 demonstrates reliable correlations among the parameters. The scatter in the data may be due to the effect of other parameters (mineral arrangement and weathering) on the mechanical properties.

The predictive linear models are more intuitive for users than complex soft computing models. The significant r2 and p values of the predicted models enhance their reliability and validity. During the design and excavation of underground structures and opencast mines, these equations could be a dependable approach for rapid estimation of mechanical parameters when conditions are not favorable for laboratory testing and quality sampling.