1 Introduction

Rock engineering properties strongly influence the design of geotechnical structures. Two of these properties, strength and deformability, are prerequisites for the investigation, planning, and effective use of the Earth’s resources [1, 2]. The elastic response and strength of rock under different stress conditions can affect the design of structures. The main elastic constant is the ratio of axial stress to axial strain, known as Young’s modulus (E) [3, 4]. In most cases, the unconfined compression test (UCT) is used to define the strength and deformability of rock, and the International Society for Rock Mechanics [5] has standardized this test. Direct determination of these characteristics in the laboratory is difficult and time-consuming [6,7,8]. Laboratory strength and elasticity tests are not only tedious but also require precise and costly equipment [9,10,11]. In contrast, other tests such as the Brazilian tensile test, point load strength test, ultrasonic velocity test, Schmidt hammer test and physical property tests are easy and cheap to carry out, and these basic rock index tests are therefore widely used [3]. It is thus reasonable to determine the uniaxial compressive strength (UCS) and Young’s modulus indirectly from rock index tests.

Many correlations for estimating UCS and E have been published [12,13,14,15,16]. Multiple regression (MR) analysis has also been used to estimate UCS and E as an alternative to traditional regression models [4, 6, 17]. Nevertheless, these correlations are not accurate enough, and more accurate prediction models of UCS and E are needed in rock engineering. Moreover, statistical prediction methods become unreliable when new data differ from the original data, in which case the form of the fitted equation has to be revised [4, 18]. Many studies have shown the necessity and practicability of applying artificial intelligence methods such as artificial neural networks (ANNs), fuzzy systems and gene expression programming to estimate UCS and E [19,20,21,22,23].

Meulenkamp and Grima [24] applied a back-propagation neural network to predict the UCS of different rock types. In their research, porosity, rock type, grain size, Equotip hardness reading and density were set as inputs. They showed that ANN could be more accurate than statistical models. In another study, Gokceoglu and Zorlu [6] used a fuzzy model and regression methods to estimate the UCS and Young’s modulus of a problematic rock. Zorlu et al. [19] estimated the UCS of sandstone with two different prediction techniques, multiple regression and ANN, and indicated that the ANN model has a higher prediction ability than the multiple regression model. Singh et al. [3] proposed an adaptive neuro-fuzzy inference system (ANFIS) model to estimate E of various rocks in view of the limitations of ANN and fuzzy logic. They used 85 datasets for training the network and 10 datasets for testing and confirming its rules. Point load, density and water absorption were the three geo-mechanical parameters considered as inputs in their research. They concluded that the outcomes were precise and promising for estimating Young’s modulus. As shown above, the ANN technique has been widely used and proposed for UCS and E prediction.

ANNs are one of the most innovative areas of research in various fields of science and engineering [11, 25,26,27,28,29,30,31,32,33,34,35]. However, they still have some limitations, namely a slow rate of learning and a tendency to get trapped in local minima [36,37,38]. Particle swarm optimization (PSO) and the imperialist competitive algorithm (ICA), which are population-based evolutionary algorithms, help to overcome these shortcomings. The successful use of PSO and ICA in optimizing diverse engineering problems has been reported in several studies [39,40,41,42,43]. In the present study, two hybrid models, PSO–ANN and ICA–ANN, are designed and developed for prediction of E in granitic rock material. For a fair comparison, the results of the hybrid models are compared with those of a simple ANN model.

2 Methods

2.1 Artificial neural network

An ANN is an artificial computational system formed by simulating some organizational principles of the nervous system. An ANN can learn automatically from the provided training patterns to approximate the relationship between inputs and outputs of a mapping problem [44]. As pioneers of neural network modelling, McCulloch and Pitts [45] introduced a binary threshold logic unit (binary decision unit) to model the behavior of an artificial neuron. Each artificial node of the system computes a weighted sum of its incoming signals and passes this sum through an activation function to produce its output. ANNs are highly parallel systems in which interconnected computational units (neurons or nodes) are arranged in successive layers. In addition, the pattern of connections between the neurons influences the behavior of the network and determines the class of network [46].

The output error is computed with the squared error function below:

$$E=\tfrac{1}{2}\sum\limits_{{i=1}}^{p} {{{({t^{(i)}} - {y^{(i)}})}^2}} .$$
(1)

Here, p is the number of training patterns, and t and y are the target value and the actual network output, respectively (E in Eq. 1 denotes the network error, not Young’s modulus). The network is usually trained with a gradient-based procedure called the back-propagation (BP) learning algorithm, which is used for multilayer feed-forward networks [44]. In BP learning, every training epoch consists of two stages, a forward stage and a backward stage. In the forward stage, the input signals propagate through the network and an error signal is produced at each output-layer node. In the backward stage, the resulting errors are propagated back through the network to adjust its weights and biases [47, 48].
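As a minimal illustration of Eq. 1, the sketch below evaluates the squared error for a few target and output values. The function name `squared_error` and all numbers are illustrative assumptions and are not data from this study.

```python
def squared_error(targets, outputs):
    """Half the sum of squared differences between targets and outputs (Eq. 1)."""
    if len(targets) != len(outputs):
        raise ValueError("targets and outputs must have the same length")
    return 0.5 * sum((t - y) ** 2 for t, y in zip(targets, outputs))


# Made-up values for three training patterns (illustration only).
targets = [0.2, 0.5, 0.9]
outputs = [0.1, 0.5, 0.7]
error = squared_error(targets, outputs)
print(f"squared error = {error:.4f}")
```

BP learning would repeatedly adjust the weights and biases so that this quantity decreases over the training patterns.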

According to their architecture, ANNs fall into two functional groups: feed-forward and feedback networks. The multilayer perceptron (MLP) is the most frequently employed multilayer feed-forward network, in which successive layers of processing elements (neurons) receive and transmit information (signals) through weighted connections and activation functions [49, 50]. In general, the neurons of the hidden and output layers apply an activation function to their net input to produce their outputs. The choice of activation function is important and depends on the complexity of the problem to be solved. For nonlinear problems, sigmoid transfer functions such as the log-sigmoid or tangent-sigmoid are useful. Each hidden neuron receives a total net input: every incoming signal from the previous layer is multiplied by an associated adaptive weight, the weighted signals are summed, and finally a bias is added to the sum. This procedure is repeated for every layer until the overall network output is obtained. In mathematical terms, the total net input to every hidden or output neuron is:

$${\text{net}}_{h_{j}}=\sum\limits_{i=1}^{n} w_{ij} \cdot x_{i}+b_{j}.$$
(2)

The total net input is then passed through the activation function (for example, the sigmoid) to give the neuron output. The equation below therefore applies to every hidden neuron as well as every output neuron:

$$y_{j}=\frac{1}{1+\exp \left( -{\text{net}}_{h_{j}} \right)}.$$
(3)
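Equations 2 and 3 can be sketched for a single layer as follows. This is a minimal illustration only: the function names, the small 3 × 2 × 1 network and all weight values are assumptions chosen for the example, not the network trained in this study.

```python
import math


def sigmoid(net):
    """Log-sigmoid activation (Eq. 3)."""
    return 1.0 / (1.0 + math.exp(-net))


def layer_forward(inputs, weights, biases):
    """Apply Eq. 2 and then Eq. 3 to every neuron of one layer.

    weights[i][j] is the weight from input i to neuron j.
    """
    outputs = []
    for j, b_j in enumerate(biases):
        net_j = sum(weights[i][j] * x_i for i, x_i in enumerate(inputs)) + b_j
        outputs.append(sigmoid(net_j))
    return outputs


# Hypothetical network with 3 inputs, 2 hidden neurons and 1 output.
x = [0.5, 0.1, 0.9]
w_hidden = [[0.2, -0.4], [0.7, 0.1], [-0.3, 0.5]]  # 3 x 2 weights
b_hidden = [0.0, 0.1]
w_out = [[0.6], [-0.2]]                            # 2 x 1 weights
b_out = [0.05]

hidden = layer_forward(x, w_hidden, b_hidden)
y = layer_forward(hidden, w_out, b_out)
print(hidden, y)
```

Calling the same function once per layer mirrors the layer-by-layer procedure described in the text.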

2.2 Particle swarm optimization

PSO, first developed by Kennedy and Eberhart [51], is a sub-field of swarm and computational intelligence. The idea was inspired by the food-searching behavior of animals such as fish and birds [20, 52]. PSO has some similarities with the genetic algorithm (GA), the ant colony algorithm (ACO) and the imperialist competitive algorithm (ICA), but it is easier to use. In PSO, particles move partly at random, but each particle also tends to move toward the current global best (p*) and toward its own best location so far \((x_{i}^{*})\). When a particle finds a location better than its previous best, it updates its best location. At each iteration t, there is a current best for each of the n particles. The particles keep searching for the global best among the current bests until a stopping criterion (for example, a given number of iterations) is met. Figure 1 shows the procedure schematically, where \(x_{i}^{*}\) is the current best for particle i, and \({p^*} \approx {\text{min}}\{ f\left( {{x_i}} \right)\} \) for \((i=1,~2,~ \ldots ,~n)\) is the global best.

Fig. 1 Standard flow chart of PSO [51]

After the fitness of the swarm has been evaluated, the velocity of each particle is updated with Eq. 4, which depends on the best position of the particle itself and the best position of the swarm. The new positions of the particles are then obtained with Eq. 5.

$$\overrightarrow {{v_{{\text{new}}}}} =\vec {v}+{C_1} \times \left( {\overrightarrow {{\text{pbest}}} - \vec {p}} \right)+{C_2} \times \left( {\overrightarrow {{\text{gbest}}} - \vec {p}} \right),$$
(4)
$$\overrightarrow {{p_{{\text{new}}}}} =\vec {p}+\overrightarrow {{v_{{\text{new}}}}} ,$$
(5)

where \(\overrightarrow {{v_{{\text{new}}}}} \) and \(\vec {v}\) are the new and current velocities, respectively; C1 and C2 are pre-defined coefficients; \(\overrightarrow {{\text{pbest}}} \) is the personal best position of the particle; and \(\overrightarrow {{\text{gbest}}} \) is the global best position among all particles. In addition, \(\overrightarrow {{p_{{\text{new}}}}} \) and \(\vec {p}\) are the new and current positions of the particle, respectively. Equation 4 was later modified by adding a new parameter, the inertia weight (w), to improve performance (see Eq. 6).

$$\overrightarrow {{v_{{\text{new}}}}} =w\, \cdot \,\vec {v}+{C_1} \times \left( {\overrightarrow {{\text{pbest}}} - \vec {p}} \right)+{C_2} \times \left( {\overrightarrow {{\text{gbest}}} - \vec {p}} \right).$$
(6)
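A single update of one particle according to Eqs. 5 and 6 can be sketched as below. The function name `pso_step` and the numerical values are illustrative assumptions. The sketch follows Eq. 6 exactly as written; many PSO implementations additionally scale the two attraction terms by uniform random numbers in [0, 1], which is not shown here.

```python
def pso_step(position, velocity, pbest, gbest, w, c1, c2):
    """One velocity update (Eq. 6) and position update (Eq. 5) for a particle."""
    new_velocity = [
        w * v + c1 * (pb - p) + c2 * (gb - p)
        for p, v, pb, gb in zip(position, velocity, pbest, gbest)
    ]
    new_position = [p + nv for p, nv in zip(position, new_velocity)]
    return new_position, new_velocity


# Made-up two-dimensional example (illustration only).
new_pos, new_vel = pso_step(
    position=[1.0, 2.0],
    velocity=[0.5, -0.5],
    pbest=[0.0, 2.0],
    gbest=[2.0, 0.0],
    w=0.5,
    c1=1.0,
    c2=1.0,
)
print(new_pos, new_vel)
```

Setting w = 1 in this function reproduces the original update of Eq. 4.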

2.3 Imperialist competitive algorithm

Atashpaz-Gargari and Lucas [53] introduced the imperialist competitive algorithm (ICA), a global search algorithm for solving optimization problems in science and engineering. Like other optimization algorithms (OAs) such as GA and PSO, ICA begins with a random initial population, whose members are called countries. After N countries (Ncountry) have been generated, those with the lowest costs or objective function values, for example, root mean square error (RMSE), are selected as the imperialists (Nimp). The remaining countries are designated as colonies (Ncol). The colonies are distributed among the imperialists according to the initial power of the imperialists, so that more powerful imperialists (those with the lowest RMSE) receive more colonies. ICA has three operators: assimilation, revolution and competition. During assimilation and revolution, a colony may reach a better position than its imperialist and take control of the entire empire [43, 53,54,55].
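The initialization step described above can be sketched as follows. Since the mathematical formulation of ICA is not given in this article, the power measure used here (the distance of an imperialist's cost below the worst imperialist cost) and the random, power-weighted assignment of colonies are assumptions made for illustration; the function name `form_empires` is also hypothetical.

```python
import random


def form_empires(costs, n_imp, rng):
    """Select imperialists and distribute the remaining countries as colonies.

    costs: cost (e.g. RMSE) of every country; lower is better.
    Returns a dict mapping each imperialist index to its colony indices.
    """
    order = sorted(range(len(costs)), key=lambda i: costs[i])
    imperialists, colonies = order[:n_imp], order[n_imp:]

    # Assumed power measure: how far an imperialist's cost lies below the
    # worst imperialist cost. The small constant keeps all weights positive.
    worst = max(costs[i] for i in imperialists)
    power = [worst - costs[i] + 1e-9 for i in imperialists]

    empires = {imp: [] for imp in imperialists}
    for colony in colonies:
        owner = rng.choices(imperialists, weights=power, k=1)[0]
        empires[owner].append(colony)
    return empires


rng = random.Random(0)
costs = [rng.random() for _ in range(20)]  # made-up costs for 20 countries
empires = form_empires(costs, n_imp=3, rng=rng)
for imp, cols in empires.items():
    print(f"imperialist {imp} (cost {costs[imp]:.3f}): {len(cols)} colonies")
```

With this weighting, imperialists with lower cost tend to receive more colonies, which matches the behavior described in the text.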

In the competition operator, imperialists try to acquire more colonies, and every empire tries to take control of the colonies of other empires. Depending on its power, each empire has a chance of taking possession of at least one colony of the weakest empire. The process ends when all empires except the strongest one have collapsed, or when a user-defined stopping criterion (a desired RMSE or a maximum number of decades) is reached. It is worth mentioning that the number of decades, called Ndecade in ICA, is analogous to the number of iterations in PSO. The mathematical formulation of ICA is not presented in this article; more information about ICA and its process can be found in related articles (e.g., [42, 53, 56]). Figure 2 shows the ICA flowchart.

Fig. 2 ICA flowchart [53]

2.4 Hybrid algorithms

Many researchers have used optimization methods such as GA, PSO and ICA to improve the performance of ANNs in engineering problems (e.g., [47, 54, 57,58,59,60,61,62,63,64,65,66,67]). The search procedure of an ANN may fail and return unsatisfactory results because BP is not a global search method [68]. Accordingly, OAs can be used to adjust the weights and biases of the ANN and improve its performance. OAs are able to search for the global minimum, whereas an ANN trained by BP is more likely to converge to a local minimum. Hybrid systems such as PSO–ANN and ICA–ANN therefore combine the search capabilities of both the ANN and the OAs. Combining these algorithms to optimize ANN models has received attention because of their capability in solving problems.
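One common way to couple an OA with an ANN is to store all weights and biases of the network in a single flat vector and to use the prediction error as the cost of that vector. The sketch below illustrates this idea for a network with three inputs, four hidden neurons and one output. The parameter ordering, the sigmoid output neuron, the function names and the sample values are assumptions made for illustration, not the implementation used in this study.

```python
import math

N_IN, N_HIDDEN = 3, 4
# Hidden weights + hidden biases + output weights + output bias = 21 values.
N_PARAMS = N_IN * N_HIDDEN + N_HIDDEN + N_HIDDEN + 1


def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))


def predict(vector, x):
    """Forward pass of the network whose parameters are stored in `vector`."""
    k = N_IN * N_HIDDEN
    hidden = []
    for j in range(N_HIDDEN):
        net = sum(vector[i * N_HIDDEN + j] * x[i] for i in range(N_IN))
        hidden.append(sigmoid(net + vector[k + j]))
    w_out = vector[k + N_HIDDEN:k + 2 * N_HIDDEN]
    b_out = vector[k + 2 * N_HIDDEN]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def fitness(vector, samples):
    """RMSE of the encoded network over (inputs, target) pairs."""
    squared = [(predict(vector, x) - t) ** 2 for x, t in samples]
    return math.sqrt(sum(squared) / len(samples))


# Made-up normalized samples (three inputs -> one target), illustration only.
samples = [([0.2, 0.4, 0.3], 0.25), ([0.8, 0.7, 0.9], 0.80)]
candidate = [0.0] * N_PARAMS  # one particle (PSO) or one country (ICA)
cost = fitness(candidate, samples)
print(N_PARAMS, round(cost, 4))
```

An optimizer then only needs to move such vectors through the 21-dimensional parameter space and compare their costs; it does not need gradient information from the network.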

The next sections briefly describe the experimental framework and then give more detail on the modelling process of the intelligent systems, including PSO–ANN and ICA–ANN. PSO and ICA were selected because of their successful record in optimizing ANNs, a point that has been emphasized by many researchers in the field of computer science.

3 Experimental framework

To achieve the objectives of this study, block samples were collected from a water transfer tunnel in Malaysia. The tunnel transfers water between two states in Malaysia and was excavated in a mountainous area in granite. A maximum overburden of 1400 m was measured for the tunnel. The strength of the representative rock in this tunnel was between 150 and 200 MPa. Three sections were excavated by tunnel boring machine (TBM), with lengths of 11.7, 11.7 and 11.3 km for TBM1, TBM2 and TBM3, respectively.

Many granitic block samples were collected from the tunnel face in the different TBM sections to develop a method for predicting E. After the samples had been moved to the laboratory and cored and cut into specimens, the ends of each specimen were cut flat and perpendicular to its axis. Both ends were then ground smooth, and each specimen was inspected for cracks, breaks and other flaws. Sound specimens were selected for the point load test, Schmidt hammer test, p-wave velocity test and UCS test. After the UCS tests had been carried out, the elastic modulus was calculated from the stress–strain curve of the rock using the tangent method. Two linear variable differential transformers (LVDTs) were used to measure the axial strain of the rock samples. All tests were performed following the procedures recommended by ISRM [5]. In this research, point load index (Is50), p-wave velocity (Vp) and Schmidt hammer rebound number (Rn) were selected as inputs to estimate E. A total of 71 datasets were prepared to design the intelligent systems of this study. Table 1 shows the results of the laboratory tests (inputs) and the output of the system together with their ranges.

Table 1 Some descriptions of the utilized datasets

4 Intelligent systems

4.1 ANN

Liou et al. [68] reported that, to obtain better performance, the datasets should be normalized as the first step of ANN modelling. Normalization simplifies the design process and is carried out with the equation below:

$${X_{\text{norm}}}=\left( {X-{X_{\min }}} \right)/\left( {{X_{\max }} - {X_{\min }}} \right),$$
(7)

where X is the measured value, Xnorm is the normalized form of X, and Xmax and Xmin are the maximum and minimum values of X, respectively.
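Equation 7 can be applied to each input and output variable separately, as in the short sketch below. The function name and the p-wave velocity values are made up for illustration and are not taken from Table 1.

```python
def min_max_normalize(values):
    """Scale a list of measured values to the range [0, 1] with Eq. 7."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        raise ValueError("all values are equal; Eq. 7 is undefined")
    return [(x - x_min) / (x_max - x_min) for x in values]


# Made-up p-wave velocities in m/s (illustration only).
vp = [4200.0, 5100.0, 6000.0, 4650.0]
vp_norm = min_max_normalize(vp)
print(vp_norm)
```

The smallest measured value maps to 0 and the largest to 1, so all variables enter the network on the same scale.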

All datasets were divided into training and testing parts. Nelson and Illingworth [69] recommend using 20–30% of all datasets for testing; in this study, 20% of the 71 datasets were used for testing. According to several researchers [27, 70], an ANN with one hidden layer is able to approximate any continuous function, so one hidden layer was used in this study. Hornik et al. [71] suggested that the number of hidden nodes should be ≤ 2 × Ni + 1, where Ni is the number of input neurons. For the present problem, with Ni = 3, a range of 1–7 hidden nodes can therefore be used. ANN models with one output (E), three inputs (Is50, Vp and Rn) and 1, 2, 3, 4, 5, 6 and 7 hidden nodes were designed. The models were evaluated using the coefficient of determination (R2) and RMSE, and based on their average results, the model with four hidden nodes performed best. For the best ANN model, R2 values of 0.753 and 0.712 and RMSE values of 0.113 and 0.090 were obtained for training and testing, respectively. The optimum architecture selected for predicting E of the rock was therefore 3 × 4 × 1.
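The candidate hidden-node counts and the data split described above can be sketched as follows. The exact number of testing samples and the way the samples were assigned are not stated in the text, so the rounding rule, the random shuffling and the seed used here are assumptions for illustration.

```python
import random


def hidden_node_candidates(n_inputs):
    """Hidden-node counts from 1 up to the bound 2 * Ni + 1."""
    return list(range(1, 2 * n_inputs + 2))


def train_test_split(n_samples, test_fraction, seed):
    """Shuffle sample indices and reserve a fraction of them for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)  # rounding rule is an assumption
    return indices[n_test:], indices[:n_test]


candidates = hidden_node_candidates(3)
train_idx, test_idx = train_test_split(71, 0.20, seed=42)
print(candidates, len(train_idx), len(test_idx))
```

Each candidate count would then define one ANN model to be trained on the training indices and compared on R2 and RMSE.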

4.2 PSO–ANN

This section presents the development of a hybrid intelligent PSO–ANN model to predict E of the rock samples. As mentioned previously, parameters such as the coefficients of the velocity equation, the number of particles, the number of iterations and the inertia weight strongly affect the PSO algorithm. Based on several related studies, such as Kennedy and Eberhart [51], Tonnizam Mohamad et al. [20] and Clerc and Kennedy [72], acceptable results are achieved when the coefficients of the velocity equation are equal to 2 and the inertia weight is 0.25. These values were therefore used in all PSO–ANN models. To choose the optimum number of iterations, models with swarm sizes of 50, 100, 150, 200, 250, 300, 350 and 400 were built and compared based on their RMSE results. One thousand iterations were considered for all models. According to Fig. 3, the RMSE (vertical axis) does not change after 400 iterations for any of the hybrid models. In addition, the minimum error was obtained with a swarm size of 200. Therefore, values of 400 and 200 were set in this study for the maximum number of iterations and the swarm size, respectively. The optimum PSO–ANN model, with swarm size = 200 and iteration number = 400, gave R2 values of 0.943 and 0.949 for the training and testing datasets, respectively. The selected PSO–ANN model is evaluated further in the following part.
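The role of the swarm size and the number of iterations can be seen in a basic PSO loop such as the sketch below. It is a generic illustration under stated assumptions and not the code used in this study: the attraction terms are scaled by uniform random numbers (a common PSO variant that goes beyond Eq. 6), particles start at random positions in [-1, 1] with zero velocity, and a simple sphere function stands in for the network RMSE. A small swarm and few iterations are used to keep the example fast.

```python
import random


def pso_minimize(objective, dim, swarm_size, n_iter, w=0.25, c1=2.0, c2=2.0, seed=0):
    """Minimize `objective` over `dim` real parameters with a basic PSO loop."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(swarm_size)]
    vel = [[0.0] * dim for _ in range(swarm_size)]
    pbest = [p[:] for p in pos]
    pbest_cost = [objective(p) for p in pos]
    g = min(range(swarm_size), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    history = [gbest_cost]  # best cost after each iteration

    for _ in range(n_iter):
        for i in range(swarm_size):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()  # assumed random scaling
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            cost = objective(pos[i])
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = pos[i][:], cost
        history.append(gbest_cost)
    return gbest, gbest_cost, history


def sphere(v):
    """Stand-in objective; in the hybrid model this would be the network RMSE."""
    return sum(x * x for x in v)


best, best_cost, history = pso_minimize(sphere, dim=5, swarm_size=20, n_iter=50)
print(f"best cost after {len(history) - 1} iterations: {best_cost:.3e}")
```

Plotting `history` for several swarm sizes would give curves of the same kind as those used to choose the swarm size and iteration number.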

Fig. 3 RMSE values versus iteration number for different swarm sizes

4.3 ICA–ANN

For the ICA–ANN model, the important parameters must be examined to obtain the most accurate model. The ANN architecture is determined before the ICA parameters are examined; an architecture of 3 × 4 × 1 gave the best output (see the ANN section), so all hybrid ICA–ANN models in this research used this architecture. The parameters Ncountry, Ndecade and Nimp have a great impact on ICA. To determine Nimp, models with Nimp values of 5, 10, 15, 20, 25 and 30 were built, with Ncountry = 300 and Ndecade = 100. The results showed that the best performance was achieved with Nimp = 5. In the next step, models with Ncountry values of 50, 100, 150, 200, 250, 300, 350 and 400 were built to choose the best values of Ndecade and Ncountry. The RMSE results of these analyses are shown in Fig. 4. As shown in Fig. 4, the lowest RMSE was obtained with 200 countries, and the RMSE values remained constant after 500 decades. Therefore, values of 200 and 500 were set in this study for the numbers of countries and decades, respectively. The best ICA–ANN model with these parameters gave R2 values of 0.952 and 0.955 for the training and testing datasets, respectively. Comparing the PSO and ICA parameters, the modelling time of ICA–ANN (with 500 decades) is longer than that of PSO–ANN (with 400 iterations), while the prediction performance of the ICA–ANN model is higher than that of the PSO–ANN model. The selected ICA–ANN and PSO–ANN models are evaluated further in the following part.

Fig. 4 RMSE values versus number of decades for different numbers of countries

5 Results and discussion

The aim of this research is to estimate Young’s modulus of the granitic rock samples. To this end, several index tests were conducted and their results were used to develop hybrid predictive models. In the present paper, PSO and ICA, two powerful OAs, were used to optimize the weights and biases of the ANN. Many combinations of ANN, PSO and ICA were built considering the most effective parameters of PSO and ICA. In addition, a series of ANN analyses was conducted for comparison purposes. The developed models were then evaluated with several performance indices: R2, RMSE and variance accounted for (VAF). The definitions of these indices can be found in other studies [43, 62, 73, 74].
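Because the indices are not defined in this article, the sketch below uses commonly adopted forms, which are assumed here rather than taken from the text: RMSE as the square root of the mean squared error, R2 as one minus the ratio of the residual to the total sum of squares, and VAF as one minus the ratio of the residual variance to the variance of the measured values, expressed in percent. The sample values are made up.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def r_squared(actual, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed form)."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def variance(values):
    mean_v = sum(values) / len(values)
    return sum((v - mean_v) ** 2 for v in values) / len(values)


def vaf(actual, predicted):
    """Variance accounted for, in percent (assumed form)."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    return (1.0 - variance(residuals) / variance(actual)) * 100.0


# Made-up normalized values (illustration only).
actual = [0.2, 0.4, 0.6, 0.8]
predicted = [0.25, 0.35, 0.65, 0.75]
print(rmse(actual, predicted), r_squared(actual, predicted), vaf(actual, predicted))
```

A perfect model would give RMSE = 0, R2 = 1 and VAF = 100, which is why values close to these limits indicate good predictive performance.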

The evaluation showed that the hybrid models performed better in terms of VAF, RMSE and R2 for both the training and testing phases (see Table 2). RMSE values of (0.050, 0.049 and 0.113) and (0.066, 0.035 and 0.090) were obtained for training and testing of the PSO–ANN, ICA–ANN and ANN models, respectively. In addition, VAF values close to 100 were achieved for the newly developed hybrid models: 94.344 and 93.943 for training and testing of PSO–ANN, and 95.182 and 95.143 for training and testing of ICA–ANN. These results demonstrate that the system error can be minimized by developing hybrid models, and that the ICA–ANN model provides slightly higher prediction performance than the PSO–ANN model. Figures 5, 6, 7, 8, 9 and 10 show the predicted E values together with the actual values for the ANN, PSO–ANN and ICA–ANN models, for both the training and testing datasets. As shown, the developed hybrid models have a higher capability in predicting Young’s modulus of the granitic rock samples.

Table 2 The obtained results of intelligent methods

Fig. 5 Training dataset results obtained by ANN model

Fig. 6 Testing dataset results obtained by ANN model

Fig. 7 Training dataset results obtained by PSO–ANN model

Fig. 8 Testing dataset results obtained by PSO–ANN model

Fig. 9 Training dataset results obtained by ICA–ANN model

Fig. 10 Testing dataset results obtained by ICA–ANN model

Moreover, the results of this study are better than those of some related studies, such as Yilmaz and Yuksek [4] with R2 = 0.91, Beiki et al. [75] with R2 = 0.67 and Gokceoglu and Zorlu [6] with R2 = 0.79. Therefore, the developed predictive models could be used under similar conditions in the future.

6 Conclusions

To prepare a suitable database for prediction of Young’s modulus, the results of p-wave velocity, Schmidt hammer and point load tests were set as inputs of the system. Three intelligent models, i.e., ANN, ICA–ANN and PSO–ANN, were then developed for prediction of E. The most important parameters of PSO and ICA were identified from previous related studies and determined in the present study. To estimate E, many ICA–ANN, PSO–ANN and ANN models were built, and the best of each type was selected and presented. All proposed models were carefully evaluated with widely used performance indices. The evaluation showed that the ICA–ANN model gives the best results for both the training and testing datasets. R2 values of (0.952, 0.943 and 0.753) and (0.955, 0.949 and 0.712) were obtained for training and testing of the ICA–ANN, PSO–ANN and ANN models, respectively. In addition, VAF values close to 100 (95.182 and 95.143 for training and testing) were achieved for the developed ICA–ANN hybrid model. These results demonstrate that, although both hybrid models are applicable for E prediction, the ICA–ANN model performs better, with the lowest error among the applied models.