1 Introduction

Dependable prediction of rock mechanical behavior under pressure is one of the most important issues for engineers designing underground structures. Reliable estimation of parameters such as shear strength is therefore essential [1]. The resistance of rock to deformation under shear stress can be described by its shear strength parameters [1]. Generally, two mechanisms of resistance act in rock: the first is characterized by the internal friction angle (ϕ) and the second by cohesion (c). Shear strength parameters can be determined directly from laboratory testing (the triaxial test), but this is time consuming and expensive. Furthermore, specimens of the desired quality are very hard to obtain, particularly in jointed and weak rocks [2]. Consequently, rock index tests have been widely suggested for predicting rock mechanical behavior because they are easy to conduct [1, 3,4,5,6]. They are also cheaper and faster than, for example, uniaxial compressive and triaxial tests [7, 8].

Many researchers have worked on shear strength [9,10,11,12,13,14,15], and some of them used mixtures of rock particles with clay and sand as samples. Most previous investigations showed that shear strength increases with the amount of rock particles in the mixture [16]. For rocks that contain a large number of joints, an improved non-linear Mohr–Coulomb strength criterion performed well. The Mohr–Coulomb strength criterion has two limitations: the first is its linear strength response and the second is that it does not consider the influence of the intermediate principal stress on strength behavior. Both limitations were discussed in the study conducted by Singh and Singh [17], in which the non-linear strength criterion was obtained by applying Barton’s critical state concept [18]. Hajdarwish and Shakoor [19] applied bivariate and multiple regression techniques to 45 samples of different kinds of mudrock to establish correlations between geological and engineering properties and shear strength parameters. To this end, they determined parameters such as clay mineralogy, Atterberg limits, clay content, water content, adsorption, dry density, specific gravity, void ratio, absorption, slake durability, and shear strength parameters. Finally, Hajdarwish and Shakoor [19] reported that C and ϕ of mudrock samples can be estimated from the selected parameters.

Yazdani [21] compared the shear strength parameters of shale obtained with the Mohr–Coulomb and the new Hoek–Brown failure criteria [20]. The outcomes of this research showed that applying the new Hoek–Brown criterion to obtain the failure envelope gives a more appropriate description of the field behavior of shale. In addition, the results indicated that when the classical Mohr–Coulomb criterion is used to predict intact rock behavior, discontinuities in the rock mass are not considered. Another study, conducted by Ghazvinian et al. [22], showed that the inclination of the schistosity planes (β) within the texture of an intact rock specimen produces anisotropic shear behavior under external loading, depending on the orientation of the normal stress. Furthermore, depending on the angle β, the effective shear strength varied between a maximum and a minimum value under the combined influence of confining stress and anisotropy. Islam and Skalle [23] assessed the mechanical properties of shale for variable confining pressures, bedding plane orientations, and drained/undrained test procedures. Their laboratory tests revealed a high degree of heterogeneity and showed that Poisson’s ratio of shale decreased by 40% after drainage. Barton [24] indicated that a non-linear form of the classical Mohr–Coulomb criterion gives a better estimate of shear strength behavior for intact rock, rockfill, rock joints and rock masses.

The literature highlights that artificial intelligence (AI) techniques have impressive capability in geotechnical engineering [25,26,27,28,29], especially in rock mechanics [3, 4, 6, 30,31,32]. Artificial neural networks (ANNs) are one of the most innovative branches of knowledge and can be implemented in various fields of engineering and science. Although ANNs are able to employ all effective parameters in estimation models, they have some restrictions, such as a slow rate of learning and entrapment in local minima [33,34,35]. Powerful optimization algorithms are therefore being used to overcome these limitations. In particular, the genetic algorithm (GA), which can solve both discrete and continuous optimization problems, is useful for adjusting the weights and biases of ANNs to enhance their prediction performance [36,37,38]. As far as the authors know, no study has developed a hybrid GA-ANN model for prediction of rock cohesion. Therefore, in this paper, a hybrid GA-ANN predictive model is constructed and proposed. To do this, a database of 63 datasets was prepared and used in the modelling. In this database, p-wave velocity (\(V_{\text{p}}\)), uniaxial compressive strength (UCS) and Brazilian tensile strength (BTS) were used as model inputs. In the following, after the applied methods and the case study are introduced, the application of all methods in predicting cohesion is discussed. Finally, the selected models are evaluated and introduced for rock cohesion prediction.

2 Methods

2.1 Artificial neural network

An artificial neural network (ANN) is a function approximator that simulates the way the human brain transfers information. ANNs can be used even for very complicated and non-linear relationships between the input variable(s), or predictor(s), and the network output [39, 40]. Many kinds of ANNs have been developed, and the multilayer feed-forward ANN is one of the most popular. It consists of multiple layers of nodes (neurons), including hidden nodes, linked to one another through connection weights [41]. ANNs must be trained with a learning algorithm to achieve useful results. The best-known learning algorithm is back-propagation (BP), which minimizes the network error between the desired and the estimated values [42,43,44]. The output of a hidden node is obtained by applying a transfer function (usually a sigmoidal function) to the net input of that node. The error is calculated by comparing the desired and the estimated results. Training ends when the error, measured by a defined system error such as the root mean square error (RMSE), becomes sufficiently small; otherwise, the connection weights are modified to obtain better results. Figure 1 shows the formation of the BP-ANN algorithm with one hidden layer.

Fig. 1 Formation of BP-ANN algorithm with one hidden layer [45]
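The forward pass and the error-driven weight update described in this section can be sketched as follows. This is only a minimal illustration under stated assumptions, not the network used in this study: the layer sizes, learning rate, plain gradient-descent update and the synthetic data are all chosen here for demonstration, and the function names (`forward`, `bp_step`) are hypothetical.

```python
import numpy as np

def sigmoid(z):
    """Sigmoidal transfer function applied to the net input of a node."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, W2, b2):
    """Feed-forward pass: sigmoid hidden layer followed by a linear output node."""
    hidden = sigmoid(X @ W1 + b1)
    return hidden, hidden @ W2 + b2

def rmse(y_true, y_pred):
    """Root mean square error between desired and estimated values."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def bp_step(X, y, W1, b1, W2, b2, lr=0.1):
    """One back-propagation update (gradient descent on half the mean squared error)."""
    n = X.shape[0]
    hidden, y_pred = forward(X, W1, b1, W2, b2)
    err = y_pred - y                                  # output error
    delta_h = (err @ W2.T) * hidden * (1.0 - hidden)  # error propagated to the hidden layer
    W2 = W2 - lr * hidden.T @ err / n
    b2 = b2 - lr * err.mean(axis=0)
    W1 = W1 - lr * X.T @ delta_h / n
    b1 = b1 - lr * delta_h.mean(axis=0)
    return W1, b1, W2, b2

# Synthetic stand-in data (NOT the measured database): 20 samples, 3 inputs in [0, 1].
rng = np.random.default_rng(0)
X = rng.random((20, 3))
y = (0.5 * X[:, 0] + 0.3 * X[:, 1] + 0.2 * X[:, 2]).reshape(-1, 1)

# Assumed structure for the demo: 3 inputs, 5 hidden nodes, 1 output.
W1, b1 = rng.normal(0, 0.5, (3, 5)), np.zeros(5)
W2, b2 = rng.normal(0, 0.5, (5, 1)), np.zeros(1)

rmse_before = rmse(y, forward(X, W1, b1, W2, b2)[1])
for _ in range(500):
    W1, b1, W2, b2 = bp_step(X, y, W1, b1, W2, b2)
rmse_after = rmse(y, forward(X, W1, b1, W2, b2)[1])
```

In practice the loop would stop once the RMSE falls below a chosen threshold rather than after a fixed number of iterations.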

2.2 Genetic algorithm

Holland [46] developed an optimization tool called the genetic algorithm (GA). The technique was strongly inspired by the evolution of biological species and the mechanism of natural selection. Because GA is a probabilistic method, it proceeds by evaluating the objective function for the decision variables and requires no additional information to guide the search [45]. The individuals in a population are called candidate solutions, and they gradually converge toward the most favorable solutions over time. Each chromosome is encoded as a linear string of 0s and 1s, and this string represents one candidate solution. A generation is the population of solutions produced by the optimization procedure at each iteration. Three basic genetic operators, i.e., reproduction, cross-over and mutation, are used to create the next generation in GA.

The reproduction operator selects the best chromosomes according to their scaled values with respect to the given fitness criterion and transfers them directly to the next generation. The second operator is cross-over, in which specific parts of two individuals (parents) are merged to make new individuals. Recombination can be applied in several ways, such as single-point cross-over and two-point cross-over. In either case, two parents and a random cross-over point are selected. The first offspring is created by combining the genes on the left side of the first parent with those on the right side of the second parent, and the second offspring is created by the reverse process [47]. In the mutation operator, elements of a chromosome are substituted at random. More details on the background of GA can be found in other works [46, 47].
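The three operators can be illustrated on binary chromosomes with the short sketch below. It is a generic illustration, not the implementation used in this study; the function names, the chromosome length and the mutation rate are assumptions made here.

```python
import random

def reproduce(population, fitness, n_keep):
    """Reproduction: carry the n_keep fittest chromosomes over unchanged."""
    return sorted(population, key=fitness, reverse=True)[:n_keep]

def single_point_crossover(parent_a, parent_b, rng):
    """Single-point cross-over: swap the gene tails of two parents at a random point."""
    point = rng.randint(1, len(parent_a) - 1)       # cut strictly inside the string
    child_1 = parent_a[:point] + parent_b[point:]   # left of A + right of B
    child_2 = parent_b[:point] + parent_a[point:]   # the reverse combination
    return child_1, child_2

def mutate(chromosome, rate, rng):
    """Mutation: flip each bit independently with probability `rate`."""
    return [1 - gene if rng.random() < rate else gene for gene in chromosome]

rng = random.Random(42)
parent_a = [0] * 8
parent_b = [1] * 8
child_1, child_2 = single_point_crossover(parent_a, parent_b, rng)
mutated = mutate(child_1, rate=0.1, rng=rng)
```

With an all-zero and an all-one parent, the cut point is visible directly in the offspring: each child is a block of one parent's genes followed by a block of the other's.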

2.3 GA-ANN combination

Several researchers have highlighted the use of GA to improve the performance and generalization ability of ANNs [36, 38, 47]. To increase the prediction ability of an ANN, a stochastic search algorithm such as GA can be employed to adjust its weights and biases [47]. ANNs are more likely to converge to a local minimum, whereas GA is capable of finding the global minimum. Therefore, to increase network performance, the search properties of both algorithms (ANN and GA) are combined in a hybrid GA-ANN model. In this model, the global minimum found in the search space by GA is used in the ANN to achieve the most appropriate outcome. A hybrid GA-ANN algorithm is shown in Fig. 2.

Fig. 2 GA-ANN model [45]

3 Laboratory investigation

Representative rock mass samples were collected and cored in NX size to determine the various physico-mechanical properties. The ends of the core specimens were trimmed as required, and the specimens were cut to standard size as per ISRM [48] standards for the different physico-mechanical tests. After coring, the specimens were further smoothed on a lathe to avoid end effects.

3.1 Determination of P-wave velocity

The p-wave velocity of the rock was determined using a portable ultrasonic non-destructive digital indicating tester (PUNDIT) as per ISRM [49] standards. In this method, a high-voltage electrical pulse of short duration is converted by a piezo-electric transducer into a mechanical pulse in the prepared specimen. The pulses are transmitted from one end of the specimen and received at the other end.

3.2 Determination of uniaxial compressive strength

In the present investigation, UCS was determined on NX size (54 mm diameter) cylindrical rock specimens, loaded axially between the platens of a universal testing machine (UTM) as per the ISRM [50] standard. The stress value at failure is defined as the compressive strength of the specimen. A uniform stress rate of 1.0 MPa/s was applied until failure. The peak value of the load-deformation curve gives the compressive strength of the rock sample.

3.3 Determination of Brazilian tensile strength (BTS)

Brazilian tensile strength (BTS) is determined in the laboratory by the Brazilian test. This test is based on the experimental fact that most rocks in biaxial stress fields fail in tension at their uniaxial tensile strength when one principal stress is tensile and the other finite principal stress is compressive with a magnitude not exceeding three times that of the tensile principal stress [51].
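For reference, the tensile strength in a Brazilian disc test is commonly computed from the failure load with the standard expression below. The source does not state the formula that was used, so the symbols \(P\), \(D\) and \(t\) are introduced here for illustration only:

$$\sigma_{\text{t}}=\frac{2P}{\pi D t}$$

where \(\sigma_{\text{t}}\) is the tensile strength, \(P\) is the load at failure, \(D\) is the diameter of the disc specimen and \(t\) is its thickness. With \(P\) in N and \(D\) and \(t\) in mm, \(\sigma_{\text{t}}\) is obtained in MPa.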

3.4 Determination of cohesion

The cohesion of the rock samples was determined by triaxial compression tests. Each NX size rock sample was held at a constant confining pressure while the axial load was increased until the sample failed. The testing system comprises a hydraulic actuator, load frame, hydraulic pressure unit, controller unit, data acquisition system and various measuring devices. During testing, the measurements were transmitted simultaneously to the controller (data logger) using dedicated testing software. Every conventional triaxial compression test is performed at a selected isotropic confining pressure (\(\sigma_{3}\)) to approximate the stress state in a rock mass subjected to overburden load. For this purpose, a hydraulic fluid (i.e., oil) is used as the confining medium to apply the confining pressure to the core sample inside a triaxial cell. A predetermined confining pressure (also called cell pressure) is kept constant throughout the test. The confined specimen is then compressed progressively by axial loading in a very stiff load frame using the hydraulic actuator. The same process was repeated on a number of similar samples at different confining pressures so that a number of Mohr circles could be drawn. The intercept of the tangent line drawn through these circles was used to determine the cohesion of the rock samples.
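When the failure envelope is linear (Mohr–Coulomb), the tangent to the Mohr circles can also be obtained numerically from the failure stresses, because the envelope \(\tau = c + \sigma \tan\phi\) is equivalent to \(\sigma_{1} = \sigma_{c} + k\,\sigma_{3}\) with \(k = (1+\sin\phi)/(1-\sin\phi)\) and \(c = \sigma_{c}/(2\sqrt{k})\). The sketch below uses this relation with a least-squares line; it is an assumed alternative to the graphical construction described above, the function name is hypothetical, and the stress values are illustrative numbers, not measurements from this study.

```python
import numpy as np

def mohr_coulomb_from_triaxial(sigma3, sigma1):
    """Estimate cohesion and friction angle from triaxial failure stresses.

    Fits sigma1 = sigma_c + k * sigma3 by least squares, then converts the
    slope k and intercept sigma_c to Mohr-Coulomb parameters.
    """
    k, sigma_c = np.polyfit(sigma3, sigma1, 1)        # slope, intercept
    sin_phi = (k - 1.0) / (k + 1.0)
    phi_deg = float(np.degrees(np.arcsin(sin_phi)))   # friction angle in degrees
    cohesion = float(sigma_c / (2.0 * np.sqrt(k)))    # same stress unit as the inputs
    return cohesion, phi_deg

# Illustrative failure stresses (MPa) generated from c = 10 MPa and phi = 30 degrees.
sigma3 = np.array([0.0, 5.0, 10.0, 15.0])
sigma1 = 34.641016 + 3.0 * sigma3
c, phi = mohr_coulomb_from_triaxial(sigma3, sigma1)
```

At least two tests at different confining pressures are needed for the fit; more tests make the estimate less sensitive to scatter in individual specimens.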

3.5 Database

A review of the literature showed that simple rock index tests can be used as inputs to predict the shear strength parameters of rock, because these tests are easy to conduct. As mentioned above, to achieve the goal of this study, a series of rock tests including \(V_{\text{p}}\), UCS, BTS and triaxial compression tests was carried out on limestone samples. In total, a database of 63 datasets was prepared for further analysis, with \(V_{\text{p}}\), UCS and BTS as model inputs and cohesion (C) as the model output. Statistical information on the established database, i.e., maximum, minimum and average values, is presented in Table 1. The following sections describe the procedures of the statistical and AI models used to predict cohesion.

Table 1 Statistical information regarding the established database

4 Developed models

In this section, a series of analyses, i.e., simple and multiple regression models and ANN and GA-ANN predictive models, was conducted to assess the performance capacity of different predictive models in estimating rock cohesion. These models were then compared to identify the best one. The modelling process of each model is given in the following sub-sections.

4.1 Simple and multiple regression

To investigate the relationship between each model input (\(V_{\text{p}}\), UCS and BTS) and the system output (C), simple regression models were applied. Different types of equation, such as linear, power and exponential, were tried to select the best type for estimating cohesion. These equations were evaluated using performance indices (PIs), i.e., the coefficient of determination (\(R^{2}\)), RMSE and variance accounted for (VAF), which have been suggested by many scholars (e.g., [3, 6, 8]). Their formulas can be found in other studies (e.g., [52]). Note that an equation/model with a VAF of 100%, an \(R^{2}\) of one and an RMSE of zero is considered excellent. Table 2 presents the developed equations for cohesion prediction together with their PIs. These equations were selected because their PI results were better than those of the other equation types. As can be seen, \(R^{2}\) values of 0.804, 0.787 and 0.720 were obtained for \(V_{\text{p}}\), UCS and BTS, respectively. Graphs of the developed equations for predicting C are shown in Figs. 3, 4 and 5, respectively. The obtained results are statistically meaningful, but to obtain higher performance in practice, multiple regression (MR) was also employed.
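The three PIs can be computed as in the sketch below. The text refers to [52] for the exact formulas, so the definitions used here are the commonly adopted ones and should be read as an assumption; the sample values are placeholders, not data from this study.

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rmse(y, y_hat):
    """Root mean square error."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def vaf(y, y_hat):
    """Variance accounted for, in percent: (1 - var(y - y_hat) / var(y)) * 100."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float((1.0 - np.var(y - y_hat) / np.var(y)) * 100.0)

# Placeholder measured and predicted values (illustrative only).
measured = [10.0, 12.0, 14.0, 16.0]
predicted = [10.5, 11.5, 14.5, 15.5]
pis = (r_squared(measured, predicted), rmse(measured, predicted), vaf(measured, predicted))
```

A perfect model returns \(R^{2}=1\), RMSE = 0 and VAF = 100, which matches the definition of an excellent model given above.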

Table 2 The developed equations for estimating C together with their PIs
Fig. 3 Prediction of C using \(V_{\text{p}}\)

Fig. 4 Prediction of C using UCS

Fig. 5 Prediction of C using BTS

MR is a statistical technique for developing an equation from two or more inputs. With this technique, a relationship between the model inputs and output(s) can be established. The developed MR equation for estimating C is presented in Eq. 1. The PIs of the developed MR equation, i.e., \(R^{2}\), RMSE and VAF, were obtained as 0.925, 0.369 and 92.483, respectively. Figure 6 displays a graph of measured and predicted cohesion using the MR model. Although the results of the MR equation are better than those of the simple regression equations, a model with a higher level of accuracy is still needed. Therefore, as discussed earlier, AI techniques are used in this study to develop a new model for C prediction. To do this, a hybrid GA-ANN model and a conventional ANN model are proposed to estimate rock cohesion using the same model inputs. The modelling process of these models is described in the following sub-sections. The simple and MR models were developed using the SPSS software package V. 16 [53].

Fig. 6 Prediction of C using MR technique

$$C=0.002275 \times {V_{\text{p}}}+0.065 \times {\text{UCS}}+0.006 \times {\text{BTS}}+2.205$$
(1)

where C is cohesion (MPa), and \(V_{\text{p}}\), UCS and BTS are p-wave velocity (m/s), uniaxial compressive strength (MPa) and Brazilian tensile strength (MPa), respectively.
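Equation 1 can be evaluated directly, as in the short sketch below. The coefficients are those of Eq. 1; the function name and the input values are hypothetical and serve only to show the arithmetic.

```python
def cohesion_mr(vp, ucs, bts):
    """Cohesion C (MPa) from Eq. 1.

    vp  : p-wave velocity (m/s)
    ucs : uniaxial compressive strength (MPa)
    bts : Brazilian tensile strength (MPa)
    """
    return 0.002275 * vp + 0.065 * ucs + 0.006 * bts + 2.205

# Hypothetical sample, not taken from the database:
# 0.002275 * 4000 + 0.065 * 60 + 0.006 * 8 + 2.205 = 9.1 + 3.9 + 0.048 + 2.205
c_example = cohesion_mr(vp=4000.0, ucs=60.0, bts=8.0)  # 15.253 MPa
```

As with any regression equation, it should only be applied within the range of inputs covered by the database in Table 1.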

4.2 ANN

The data used in ANN modelling are the same as those summarized in Table 1. The capabilities of an ANN depend directly on its structure, as stated by Kanellopoulas and Wilkinson [28] and Hush [29]. Therefore, designing an optimal structure is necessary to obtain a desirable ANN model. The structure of an ANN model is defined by the number of hidden layers and the number of hidden neurons. According to several studies (e.g., [30]), a single hidden layer can approximate any non-linear function, so one hidden layer was selected in this paper. In addition, several equations that can be used to calculate the number of neurons in the hidden layer are presented in Table 3. Based on Table 3, with Ni (number of inputs) = 3 and No (number of outputs) = 1, a range of 1–7 hidden neurons should be considered. To find the optimum number of neurons, many ANN models were created. Among the available ANN training algorithms, Levenberg–Marquardt (LM) was selected to train the ANN systems. Table 4 presents the results based on \(R^{2}\); its last column gives the average values of 5 runs for each number of hidden nodes. Model No. 5, with five hidden nodes (\(R^{2}\) values of 0.948 and 0.949), performs better than the others. Hence, \(3 \times 5 \times 1\) was chosen as the ANN structure for approximating rock cohesion. The best ANN model will be selected later. It should be noted that the datasets were normalized before ANN modelling using the following equation:

Table 3 Equations for number of neurons in hidden layer
Table 4 Training and testing results of ANN in predicting rock cohesion
$${X_{{\text{norm}}}}=\left( {X-{X_{\hbox{min} }}} \right)/\left( {{X_{\hbox{max} }} - {X_{\hbox{min} }}} \right)$$
(2)

where \(X\) and \(X_{\text{norm}}\) are the measured and normalized values, respectively, and \(X_{\max}\) and \(X_{\min}\) are the maximum and minimum values of \(X\).
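A minimal sketch of the normalization in Eq. 2 is given below, together with the inverse mapping needed to convert normalized predictions back to physical units. The function names and sample values are assumptions made for illustration.

```python
import numpy as np

def min_max_normalize(x):
    """Scale values to [0, 1] with Eq. 2; also return the extremes for later use."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min), x_min, x_max

def denormalize(x_norm, x_min, x_max):
    """Invert Eq. 2 to recover values in the original units."""
    return np.asarray(x_norm, dtype=float) * (x_max - x_min) + x_min

# Hypothetical measurements of one variable (illustrative values only).
values = [2.0, 4.0, 6.0]
normalized, v_min, v_max = min_max_normalize(values)   # [0.0, 0.5, 1.0]
restored = denormalize(normalized, v_min, v_max)
```

Each input and the output are normalized separately, and the extremes should come from the data used to build the model so that new samples are scaled consistently.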

4.3 GA-ANN

As stated earlier, GA can effectively improve ANN performance (e.g., [61]). Chambers [62] indicated that an objective function can be chosen by GA or a hybrid GA-ANN model. Based on this process, the weights and biases of the ANN can be optimized. First, the most influential GA parameters used to construct the hybrid GA-ANN models have to be determined. In the hybrid GA-ANN, the mutation probability and the percentage of recombination were set to 25 and 9%, respectively, and single-point cross-over with a probability of 70% was used. A series of hybrid models was then created to determine the best population size (over a range of 25–600). The RMSE values of these analyses showed that a population size of 350 performs efficiently. In these analyses, the proposed ANN architecture and 100 generations were used. In the next step, the maximum number of generations (\(G_{\max}\)) was identified through a parametric study of its effect on network performance. A value of 500 generations was assigned as the stopping criterion and the resulting RMSE values were examined. Network performance remained unchanged after 300 generations; therefore, 300 was adopted as the optimum number of generations for designing the GA-ANN models. In the final step, five GA-ANN models were created; their results are discussed later.
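The idea of letting a GA search the weights and biases of a \(3 \times 5 \times 1\) network can be sketched as follows. This is a simplified illustration and not the procedure used in this study: it assumes a real-coded chromosome, tournament selection, elitism and Gaussian mutation, uses small default settings so that it runs quickly, and is trained on synthetic data. All names and default values are hypothetical; the population size and number of generations reported above can be passed in as arguments.

```python
import numpy as np

N_IN, N_HID, N_OUT = 3, 5, 1
N_GENES = N_IN * N_HID + N_HID + N_HID * N_OUT + N_OUT   # 26 weights and biases

def predict(genes, X):
    """Decode a chromosome into the 3x5x1 network and run a forward pass."""
    i = 0
    W1 = genes[i:i + N_IN * N_HID].reshape(N_IN, N_HID); i += N_IN * N_HID
    b1 = genes[i:i + N_HID]; i += N_HID
    W2 = genes[i:i + N_HID * N_OUT].reshape(N_HID, N_OUT); i += N_HID * N_OUT
    b2 = genes[i:i + N_OUT]
    hidden = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))
    return (hidden @ W2 + b2).ravel()

def rmse(genes, X, y):
    """Objective function minimized by the GA."""
    return float(np.sqrt(np.mean((predict(genes, X) - y) ** 2)))

def ga_train(X, y, pop_size=40, generations=60, p_cross=0.7, p_mut=0.05, seed=0):
    """Evolve network weights; returns the best chromosome and the RMSE history."""
    rng = np.random.default_rng(seed)
    pop = rng.normal(0.0, 1.0, size=(pop_size, N_GENES))
    history = []
    for _ in range(generations):
        errors = np.array([rmse(ind, X, y) for ind in pop])
        pop = pop[np.argsort(errors)]              # best individual first
        history.append(float(errors.min()))
        children = [pop[0].copy()]                 # elitism: keep the best unchanged
        while len(children) < pop_size:
            # Tournament of two: the lower index is the fitter individual.
            a = pop[rng.integers(0, pop_size, 2).min()]
            b = pop[rng.integers(0, pop_size, 2).min()]
            child = a.copy()
            if rng.random() < p_cross:             # single-point cross-over
                point = rng.integers(1, N_GENES)
                child[point:] = b[point:]
            mask = rng.random(N_GENES) < p_mut     # Gaussian mutation of some genes
            child[mask] += rng.normal(0.0, 0.3, int(mask.sum()))
            children.append(child)
        pop = np.array(children)
    errors = np.array([rmse(ind, X, y) for ind in pop])
    history.append(float(errors.min()))
    return pop[int(np.argmin(errors))], history

# Synthetic stand-in data (NOT the measured database), already scaled to [0, 1].
data_rng = np.random.default_rng(1)
X = data_rng.random((30, 3))
y = 0.5 * X[:, 0] + 0.3 * X[:, 1] + 0.2 * X[:, 2]
best_genes, history = ga_train(X, y)
```

Because the best individual is always carried over, the recorded RMSE never increases from one generation to the next, which makes it easy to see where further generations stop paying off.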

5 Evaluation of the results

Results of the simple regression analysis showed that the performance of the predictive model could be improved. An MR model was also built to predict rock cohesion; however, to obtain a higher performance capacity, two AI models, namely ANN and GA-ANN, were constructed. Here, all 63 datasets were selected randomly and arranged into five different sets. As suggested by Swingler [63], 20% and 80% of the datasets were randomly assigned to testing and training, respectively. The five ANN models and five GA-ANN models were then evaluated using the PIs \(R^{2}\), VAF and RMSE. The PI values for the training and testing datasets are tabulated in Table 5. From this table alone, it is not easy to identify the best model for rock cohesion evaluation. To solve this problem, the simple ranking method developed by Zorlu et al. [5] was used. More explanation of the ranking method can be found in other works, such as Zorlu et al. [5] and Armaghani et al. [31]. The rankings were calculated separately for the training and testing datasets (see Table 5). The final ratings are provided in Table 6. As shown in Table 6, models No. 3 and 4 give the best performance in predicting rock cohesion for the ANN and GA-ANN methods, respectively. Based on the obtained PIs, the hybrid GA-ANN network provides better performance for prediction of rock cohesion. The \(R^{2}\) results of the chosen models are displayed in Figs. 7 and 8 for ANN and GA-ANN, respectively. The network results (\(R^{2}\) = 0.949 and 0.948 for training and testing of ANN, and \(R^{2}\) = 0.976 and 0.967 for training and testing of GA-ANN) showed that the GA-ANN model is superior to the ANN model for rock cohesion estimation. In fact, by developing the GA-ANN model, prediction performance can be increased from about \(R^{2}\) = 0.8 for simple regression to about \(R^{2}\) = 0.98, which indicates the high capability of the method.
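The random 80/20 split and a simple ranking of models can be sketched as below. The ranking follows the general idea of rating each PI across the models and summing the ratings; the exact rules are given in Zorlu et al. [5], so the details here (best model receives the highest rating, no special handling of ties, rounding of the test size) are assumptions, and the PI values are placeholders rather than entries of Table 5.

```python
import numpy as np

def split_indices(n, test_fraction=0.2, seed=0):
    """Randomly split n dataset indices into training and testing parts."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]

def rank_scores(values, higher_is_better):
    """Rate models on one PI: the worst gets 1, the best gets the number of models."""
    values = np.asarray(values, dtype=float)
    key = values if higher_is_better else -values
    # argsort of argsort gives each model's position in ascending order (ties not handled).
    return np.argsort(np.argsort(key)) + 1

def total_rank(r2, rmse, vaf):
    """Sum of the ratings over the three PIs for each model."""
    return rank_scores(r2, True) + rank_scores(rmse, False) + rank_scores(vaf, True)

train_idx, test_idx = split_indices(63)   # 50 training and 13 testing datasets

# Placeholder PI values for three hypothetical models (illustrative only).
ranks = total_rank(r2=[0.90, 0.95, 0.92], rmse=[0.08, 0.05, 0.06], vaf=[89.0, 94.0, 92.0])
```

In the study, the ratings of the training and testing datasets are computed separately and then added, so the same function would be applied once per dataset and the results summed.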

Table 5 PI values in predicting rock cohesion using ANN and GA-ANN models
Table 6 Values of total rank for rock cohesion prediction
Fig. 7 Train and test of the ANN model for rock cohesion estimation

Fig. 8 Train and test of the GA-ANN model for rock cohesion estimation

6 Sensitivity analysis

Sensitivity analysis was carried out to identify the relative influence of each parameter on the system. For this purpose, all data pairs were used to build a data array X as follows:

$$X=\left\{ {{x_1},{x_2},{x_3}, \ldots ,{x_i}, \ldots ,{x_n}} \right\}$$
(3)

Each variable \(x_{i}\) in the array X is a vector of length m:

$${x_i}=\left\{ {{x_{i1}},{x_{i2}},{x_{i3}}, \ldots ,{x_{im}}} \right\}$$
(4)

The following equation gives the strength of the relation \(\left( {{r_{ij}}} \right)\) between the datasets \({x_i}\) and \({x_j}\).

$${r_{ij}}=\frac{{\sum\nolimits_{{k=1}}^{m} {{x_{ik}}{x_{jk}}} }}{{\sqrt {\sum\nolimits_{{k=1}}^{m} {x_{ik}^{2}} \sum\nolimits_{{k=1}}^{m} {x_{jk}^{2}} } }}$$
(5)

The \(r_{ij}\) values obtained for \(V_{\text{p}}\), UCS and BTS were 0.999, 0.995 and 0.992, respectively. They show that all model inputs influence rock cohesion; however, \(V_{\text{p}}\) has the highest \(r_{ij}\) value among the inputs.
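The strength of relation can be computed with a few lines of code, as sketched below with the denominator taken over both vectors, \(\sum x_{ik}^{2}\) and \(\sum x_{jk}^{2}\). The function name and the sample vectors are hypothetical and do not reproduce the values reported above.

```python
import numpy as np

def strength_of_relation(x_i, x_j):
    """r_ij = sum(x_ik * x_jk) / sqrt(sum(x_ik^2) * sum(x_jk^2))."""
    x_i = np.asarray(x_i, dtype=float)
    x_j = np.asarray(x_j, dtype=float)
    return float(np.sum(x_i * x_j) / np.sqrt(np.sum(x_i ** 2) * np.sum(x_j ** 2)))

# Hypothetical vectors (illustrative only): one input and the output over m = 4 samples.
input_values = [1.0, 2.0, 3.0, 4.0]
output_values = [2.1, 3.9, 6.2, 7.8]
r = strength_of_relation(input_values, output_values)
```

The measure equals one when the two vectors are proportional and zero when they are orthogonal; for strictly positive data it is always positive, which is worth keeping in mind when comparing values that differ only in the third decimal place.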

7 Conclusions

In this research, simple and multiple regression models and AI models were developed to estimate the cohesion of rock material. For this purpose, a database with three inputs, i.e., \(V_{\text{p}}\), BTS and UCS, and one output, i.e., cohesion, was prepared and used to propose predictive models. The simple and multiple regression analyses showed that these models are meaningful and applicable for estimating rock cohesion. Nevertheless, intelligent models may be required to obtain higher prediction performance. Several ANN and GA-ANN models were therefore constructed to estimate rock cohesion. The results of the ANN models revealed that a structure of \(3 \times 5 \times 1\) gave the most accurate values in rock cohesion prediction. Using this structure, many hybrid GA-ANN models were created with different GA parameter values. Finally, after a series of trial-and-error procedures, 5 ANN and 5 GA-ANN models were constructed and the best one was chosen based on the obtained PIs. The GA-ANN model results (VAF = 97.471 and 96.707, \(R^{2}\) = 0.976 and 0.967, and RMSE = 0.033 and 0.037 for training and testing) were better than the ANN model results (VAF = 94.795 and 94.773, \(R^{2}\) = 0.949 and 0.948, and RMSE = 0.058 and 0.047 for training and testing). According to these results, the GA-ANN predictive model is introduced as a new approach to predict rock cohesion. Furthermore, the results of the sensitivity analysis indicated that \(V_{\text{p}}\) has a greater effect on rock cohesion than the other predictors.