1 Introduction

Ground movement is an important topic in landslide research. Preventing and controlling this phenomenon can reduce the risks to facilities and people [1,2,3,4]. However, landslides are not easily predictable because many parameters affect them. According to several scholars, such as Crosta and Agliardi [5], the most important of these parameters relate to geological and climatic conditions. So far, various models for assessing landslides have been developed based on the mechanisms that govern them [2, 5,6,7,8,9]. In general, these studies can be categorized as statistical, numerical, physical, and non-linear simulation models [10]. Because landslides involve many interacting factors whose relationships are complex, non-linear models can perform better than the other available techniques. Non-linear and simulation techniques provide an indirect assessment for predicting problems that are complicated in nature [11, 12].

More modern and advanced methods have recently been introduced in science and engineering, among them artificial intelligence techniques [13,14,15,16,17,18]. These intelligent computational methods can build models in different fields of engineering and use them to derive appropriate relations and predictions [19]. In civil engineering, artificial intelligence approaches have been proposed for various prediction and optimization purposes [13, 16, 20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45]. Several intelligent approaches have been proposed for solving problems related to landslides [3, 6, 10, 46]. Artificial neural networks (ANNs) can increase the accuracy of predictive models [47,48,49,50,51,52]. However, the choice of artificial intelligence model affects the performance obtained. One of the newer methods, gene expression programming (GEP), has shown an excellent capability for solving problems in the engineering sciences [53,54,55]. This method, which combines the genetic algorithm (GA) and genetic programming (GP), can provide a mathematical equation for prediction, solve complex problems, and increase the accuracy of predictive models [24, 56]. Several researchers have highlighted the successful application of GEP in various fields of civil engineering, such as environmental issues of blasting [39, 57], piling [58], tunneling and rock mechanics [59, 60], concrete technology [61, 62], highway construction [63] and river engineering [64, 65].

The aim of this research is to propose suitable artificial intelligence models for predicting and subsequently optimizing landslide movement. To build the prediction models, data were collected and the parameters affecting ground movement were investigated. Using these data, different GEP models were developed and implemented, and their performance was compared with that of ANN models. Finally, to obtain the minimum risk level, the artificial bee colony (ABC) algorithm was employed and the optimum values were determined.

2 Methodology

2.1 Data collection

In the present research, the data used for determining landslide movement in the study conducted by Neaupane and Achet [66] were collected and considered. The parameters affecting landslide/slope movement include groundwater surface (m), antecedent rainfall (mm), rainfall intensity (mm/h), infiltration coefficient, shear strength (kN/m2), and slope gradient of the monitored area (°). These parameters were selected based on previous research [46, 66]. The data were used to train and test the prediction networks, and landslide movement values were predicted and evaluated from these parameters. Figure 1 shows a map of the geological structure/formation of the studied area. The statistical distribution of the data used in the modelling process is given in Table 1 and presented in Figs. 2, 3, 4, 5, 6, and 7. In the following sections, different models are developed from these data for predicting landslide movement, and their modelling procedures are explained. More details regarding data collection and the study area are available in the original study [66].

Fig. 1 Geology structure of studied area [66]

Table 1 The statistical distribution of the used data

Fig. 2 Statistical distribution of used data (groundwater surface)

Fig. 3 Statistical distribution of used data (antecedent rainfall)

Fig. 4 Statistical distribution of used data (infiltration coefficient)

Fig. 5 Statistical distribution of used data (shear strength)

Fig. 6 Statistical distribution of used data (slope gradient)

Fig. 7 Statistical distribution of used data (slope movement)

2.2 Artificial neural network

The concept of neural networks was first introduced in the 1950s by the well-known psychologist Donald Hebb [67], following the introduction of a simple learning mechanism. He arrived at this method by investigating brain neurons and the effect of learning on them. Since these neurons have no specific instructions for data processing, they learn from the relations they find between input and output data [68,69,70]. A neural network functions like a set of biological neurons. In each biological neuron, dendrites receive information from the previous neuron, and the axon transfers the result to the next neuron after initial processing. Chemical signaling takes place through synapses between the cells. A computational neuron, as used in neural networks, works similarly, with inputs and outputs. An ANN contains two or more layers, and each layer has a series of neurons. The layers are connected through the weights that constitute the network. In each layer, the inputs are multiplied by these weights, and the results are passed on to the next layer through functions known as activation functions (see Fig. 8). Two algorithms, feed-forward multilayer and back-propagation, are used in neural networks. Back-propagation is more common and is recommended by different researchers [71,72,73]. Working backward through each layer, this algorithm adjusts the weights to minimize the error of the system. The training process is repeated until the error reaches the target value or another termination criterion is met (see Fig. 9). In the back-propagation phase, the gradient is calculated for non-linear multilayer networks (the networks used to solve most engineering problems). The sigmoid transfer function receives the input values and maps them into the interval 0–1, regardless of the initial input interval [50, 51, 68, 74].
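As an illustration of the layer computation described above, the following minimal sketch passes an input vector through two sigmoid layers. It is not the network used in this study: the bias terms, the layer sizes, and all numerical values are assumptions chosen only for demonstration, and the function names are hypothetical.

```python
import math

def sigmoid(x):
    """Logistic sigmoid: maps any real input into the interval (0, 1)."""
    # Two algebraically equivalent branches avoid overflow in exp().
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def layer_forward(inputs, weights, biases):
    """Output of one fully connected layer.

    weights[n] holds the incoming weights of neuron n. Each neuron forms
    a weighted sum of the inputs, adds its bias (an assumption; the text
    does not mention biases), and applies the sigmoid.
    """
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
        for neuron_w, b in zip(weights, biases)
    ]

# Hypothetical inputs and weights, for illustration only.
hidden = layer_forward(
    [0.5, -1.2, 3.0],
    weights=[[0.1, 0.4, -0.2], [-0.3, 0.8, 0.05]],
    biases=[0.0, 0.1],
)
output = layer_forward(hidden, weights=[[1.5, -2.0]], biases=[0.2])
```

Whatever the scale of the inputs, every value in `hidden` and `output` lies between 0 and 1, which is the property of the sigmoid noted in the text.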

Fig. 8 The structure of different coefficients in the ANN

Fig. 9 The multilayer structure of the ANN

2.3 Gene expression programming

Gene expression programming (GEP) is a relatively new artificial intelligence method that extends the genetic algorithm (GA) and genetic programming (GP). GEP, which provides suitable solutions for various problems, consists of several parts [75]. It relies on two main entities, chromosomes and expression trees (ETs), which together remove the limitations of the two older algorithms. In GEP, solutions are encoded as strings written in the Karva language, and each string can be expressed as an ET. One attractive feature of GEP is that it presents its models as mathematical equations relating the independent parameters. Models that provide explicit equations are very helpful and practical in engineering, so these methods can be used in place of ANN models. This has led researchers to further develop and extend such methods. In the GP method, mathematical functions such as −, +, × and sin are applied to the variables so that a combination of them forms a mathematical expression for the problem under examination. In multigene chromosomes, each gene expresses a sub-ET and consists of a head and a tail. These parts are the regions to which genetic operators are applied to create new solutions. According to Fig. 10, like other evolutionary algorithms (EAs), the GEP modeling process begins with the random creation of a specified number of chromosomes in the Karva language (a symbolic language for representing chromosomes). These symbolic chromosomes are then expressed as trees of different sizes and shapes (expression trees). The trees are evaluated by fitness functions, which measure the models and their adaptability. Different fitness functions can be defined according to different criteria; examples are root mean square error (RMSE), mean absolute error (MAE), and root relative squared error (RRSE). Next, if the termination criterion (maximum number of iterations or an appropriate fitness value) has not been met, the best chromosomes, selected through the roulette wheel method, pass to the next generation. Afterward, the main genetic operators, consisting of mutation, transposition (RIS, IS, and gene transposition), and recombination (one-point, two-point, and gene recombination), are applied to the chromosomes at their specified rates, which can be set from the literature and expert knowledge of the method. In this way, new chromosomes replace the old ones, and the process continues until the termination criteria are met [76,77,78]. More information and details about the GEP method and its initial implementation can be found in previous studies [79,80,81].
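To make the relation between a Karva string and its expression tree concrete, the sketch below decodes a K-expression breadth-first and evaluates the resulting tree. It is a simplified illustration rather than the implementation used in this study: the symbol set, the use of `Q` for the square root, and the helper names `decode_karva` and `evaluate` are assumptions.

```python
import math
import operator

# Function symbols mapped to (arity, implementation). 'Q' denotes the
# square root. Any symbol not listed here is treated as a terminal.
FUNCTIONS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
    "/": (2, operator.truediv),
    "Q": (1, math.sqrt),
}

def decode_karva(gene):
    """Build an expression tree from a K-expression, read breadth-first.

    Each node is a list [symbol, child, child, ...]. Symbols left over
    after the tree is complete (the non-coding region) are ignored.
    """
    root = [gene[0]]
    queue = [root]
    pos = 1
    while queue:
        node = queue.pop(0)
        arity = FUNCTIONS[node[0]][0] if node[0] in FUNCTIONS else 0
        for _ in range(arity):
            child = [gene[pos]]
            pos += 1
            node.append(child)
            queue.append(child)
    return root

def evaluate(node, variables):
    """Recursively evaluate a tree for the given terminal values."""
    symbol, children = node[0], node[1:]
    if symbol in FUNCTIONS:
        func = FUNCTIONS[symbol][1]
        return func(*(evaluate(child, variables) for child in children))
    return variables[symbol]

# "Q*+-abcd" encodes sqrt((a + b) * (c - d)).
tree = decode_karva("Q*+-abcd")
value = evaluate(tree, {"a": 3.0, "b": 1.0, "c": 6.0, "d": 2.0})
```

Because decoding stops as soon as every function node has its arguments, a gene may carry unused symbols at its end. This is why changes made by the genetic operators to the head and tail always yield a valid tree.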

Fig. 10 A view of GEP system

2.4 Artificial bee colony

One of the newer optimization methods, developed based on the group life of bees, is the artificial bee colony (ABC) algorithm. This algorithm was first introduced and implemented by Karaboga [82] to optimize complicated science and engineering problems. It involves three groups of bees: employed bees, onlookers, and scouts [83, 84]. In the first stage, searching for food sources is done by two scouts. During these searches, a large number of bees are assumed to be onlookers. The bees communicate through a movement called the waggle dance, by which the scouts inform the employed bees of the quality of the food sources (problem solutions). The bees can then use this information to select the sources the hive requires. The quality of a solution is evaluated by the amount of nectar available at the food source.

Several parameters affect the ABC algorithm, including the number of scout bees (N), the amount of food source (M), the number of elected food sources, the number of bees dispatched to the elected food sources (Nre), the number of bees dispatched to other food sources (Nsp), the radius of the search area (Ngh), and the number of iterations (Imax). With these settings, the initial solutions (locations of food sources) are generated within the defined problem domain:

$$X_{ij} = X_{j}^{\min} + \mathrm{rand}(0,1)\left( X_{j}^{\max} - X_{j}^{\min} \right),$$
(1)

where i = 1,…, N and j = 1,…, D. Parameters N and D are the number of food sources and the number of variables, respectively. Next, for every existing solution, the ABC algorithm creates a new solution Vjk in the neighbourhood of Xjk:

$$V_{jk} (t + 1) = X_{jk} (t) + \phi_{jk} (t)\left( {X_{jk} (t) - X_{wk} (t)} \right),$$
(2)
$$K = \text{int} (rand \times N) + 1,$$
(3)

where \(\phi_{jk}\) is a uniformly distributed random number and Xjk is the kth parameter of the jth solution in the solution set. The value of \(\phi_{jk}\) and the parameter k are randomly selected from the ranges [−1, 1] and [1, N], respectively. If the new solution solves the problem better (that is, it is fitter), it replaces the previous one. After that, the scout bee selects a solution for each bee using Eq. 34 and the calculated probabilities. These solutions are passed to the onlooker bees, and from among them, the one giving the most appropriate result is selected. Figure 11 presents a flowchart of the ABC algorithm.
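The steps above can be sketched as a compact routine. This is a simplified, generic ABC variant written for illustration, not the code used in this study. The function and parameter names (`abc_minimize`, `n_sources`, `limit`), the onlooker fitness 1/(1 + cost), the clipping to the bounds, and the sphere test function are all assumptions.

```python
import random

def abc_minimize(cost, bounds, n_sources=20, max_iter=200, limit=30, seed=0):
    """Minimize cost(x) inside box bounds with a basic ABC loop."""
    rng = random.Random(seed)
    dim = len(bounds)

    def random_source():
        # Eq. (1): uniform draw inside [x_min, x_max] for each variable.
        return [lo + rng.random() * (hi - lo) for lo, hi in bounds]

    sources = [random_source() for _ in range(n_sources)]
    costs = [cost(s) for s in sources]
    trials = [0] * n_sources  # consecutive failed improvements per source

    def try_neighbour(j):
        # Eq. (2): perturb one random variable k of source j relative to
        # another randomly chosen source w, with phi drawn from [-1, 1].
        k = rng.randrange(dim)
        w = rng.choice([s for s in range(n_sources) if s != j])
        phi = rng.uniform(-1.0, 1.0)
        candidate = sources[j][:]
        candidate[k] += phi * (sources[j][k] - sources[w][k])
        lo, hi = bounds[k]
        candidate[k] = min(max(candidate[k], lo), hi)  # stay inside bounds
        c = cost(candidate)
        if c < costs[j]:  # greedy selection: keep the better solution
            sources[j], costs[j], trials[j] = candidate, c, 0
        else:
            trials[j] += 1

    best_cost = min(costs)
    best = sources[costs.index(best_cost)][:]
    for _ in range(max_iter):
        for j in range(n_sources):  # employed-bee phase
            try_neighbour(j)
        # Onlooker phase: better sources are visited more often.
        fitness = [1.0 / (1.0 + c) if c >= 0 else 1.0 + abs(c) for c in costs]
        for j in rng.choices(range(n_sources), weights=fitness, k=n_sources):
            try_neighbour(j)
        for j in range(n_sources):  # scout phase: abandon stale sources
            if trials[j] > limit:
                sources[j] = random_source()
                costs[j] = cost(sources[j])
                trials[j] = 0
        current = min(costs)
        if current < best_cost:
            best_cost = current
            best = sources[costs.index(current)][:]
    return best, best_cost

def sphere(x):
    """Stand-in test function with a known minimum of 0 at the origin."""
    return sum(v * v for v in x)

best, best_cost = abc_minimize(sphere, [(-5.0, 5.0)] * 2)
```

The best solution is stored separately because the scout phase may discard the source that currently holds it.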

Fig. 11 The presented flowchart for ABC algorithm [51]

3 Prediction results

3.1 ANN modeling

As explained in the previous sections, neural networks can solve linear and non-linear engineering problems by presenting appropriate solutions [47, 48, 52, 83]. In this section, neural network models are presented so that their results can be compared with the new GEP models implemented in the following section. To design the networks, 80 percent of the data were assigned to training and 20 percent to testing. With this division, the performance of the models in predicting landslide movement can be assessed.
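A minimal sketch of such an 80/20 division is given below. The random shuffling, the fixed seed, and the function name `train_test_split` are assumptions, since the text does not state how the records were assigned to each set. The integer records are placeholders, not the study data.

```python
import random

def train_test_split(records, train_fraction=0.8, seed=42):
    """Shuffle a copy of the records and split it into two parts."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# Placeholder records standing in for the measured data sets.
data = list(range(50))
train, test = train_test_split(data)
```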

One of the important criteria used for ANNs is the root mean square error (RMSE), which serves as the initial termination criterion of the network training process. RMSE is computed from the differences between the network outputs and the measured values. The best possible value of RMSE is zero.

$$\Delta = t - E_{\text{st}} ,$$
(4)
$${\text{RMSE}} = \sqrt {{\text{Average}}\left[ {\sum\limits_{k = 1}^{\text{NT}} {\Delta_{k}^{2} } } \right]} ,$$
(5)

where t, Est, and Δ are the predicted values, measured values, and error, respectively, and k indexes the network outputs (NT in total). In addition to this criterion, the regression value is also used, which measures the correlation between the predicted and measured values. This criterion is at its best when its value equals 1, and the closer it gets to zero, the lower the prediction ability of the model. These two criteria are employed to assess the prediction models developed in this research. The ANN results, presented for comparison with the new method, are given in the following. A variety of ANN models were designed so that their solutions could be used to predict landslide movement. The performance of these models is shown in Figs. 12 and 13. As can be seen, the best performance was reached with 400 iterations and 10 neurons. The main model developed in this research is discussed later.
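The two criteria can be computed as in the sketch below. The regression value is implemented here as the coefficient of determination, which is an assumption: the text does not give its formula, and a squared correlation coefficient would be another possible reading. The function names are hypothetical.

```python
import math

def rmse(measured, predicted):
    """Root mean square error, following Eqs. (4) and (5)."""
    errors = [m - p for m, p in zip(measured, predicted)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def r_squared(measured, predicted):
    """Coefficient of determination (assumed form of the regression value)."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot
```

A perfect model gives `rmse(...) == 0` and `r_squared(...) == 1`, matching the best values stated in the text.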

Fig. 12 Performance of ANN prediction model for training

Fig. 13 Performance of ANN prediction model for testing

3.2 GEP modeling

After the ANN results were obtained, GEP prediction models were implemented. The data used are the same as in the previous stage, and the purpose is again to estimate landslide movement values. The implementation, from the parameter settings through to the GEP results and the resulting relations, is expressed in mathematical form. The process used in this research for implementing GEP is as follows:

  1. In the first step, the fitness function is selected as the criterion for each chromosome's merit. RMSE is the fitness function commonly used in the GEP modeling process. However, depending on the conditions of the problem, other forms can be used to assess the models' performance more accurately. Hence, each chromosome's fitness is determined as follows:

    $${\text{RMSE}}^{\prime } = \frac{1}{{1 + {\text{RMSE}}}} \times 1000.$$
    (6)
  2. The second step is to assign two important sets, the terminals (T) and the functions (F), to the chromosome structure, which is built from a mixture of them. The independent variables (parameters of Table 1) form the terminal set, and the function set is usually defined according to the nature of the problem. In the current study, trigonometric and other mathematical functions have been used as follows:

    $$F = \left\{ { + , - , \times ,/,{\text{Sin}},{\text{Cos}},{\text{ArcTan}},\tanh ,{\text{sqrt}}} \right\}.$$
    (7)
  3. In the third step, the structural parameters of GEP (i.e., head size, number of genes, and number of chromosomes) are introduced and applied to the system. The number of genes sets the number of sub-ETs in each chromosome. According to Ferreira's investigations [78, 80, 81] and some other studies, the best way to obtain proper values for the structural parameters of GEP is trial and error. In other words, the analysis proceeds with increasing values of the abovementioned parameters, and the prediction performance of the GEP models is checked in both the training and testing phases. In this way, several GEP models were designed and implemented with different parameters for predicting landslide movement. Finally, after these processes were executed several times, the number of chromosomes, head size, and number of genes were found to be 40, 5, and 3, respectively, for this section.

  4. The fourth step is to select the rates of the genetic operators. In this step, starting from the values proposed by previous researchers [78, 80, 81], some further GEP models are created using the trial and error method. The obtained values of the GEP parameters are presented in Table 2.

    Table 2 GEP model parameters
  5. In the final step, the linking function that connects the created genes is defined. Various linking functions are available, such as subtraction (−), addition (+), division (÷), and multiplication (×). In the present study, addition has been used to connect the sub-ETs because it provides a better connection than the other functions.
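The fitness transform of Eq. (6) and the addition linking of the final step can be sketched as follows. The three sub-expressions are hypothetical stand-ins for evolved sub-ETs and are not the genes obtained in this study. The names `gep_fitness` and `linked_prediction` are likewise assumptions.

```python
import math

def gep_fitness(rmse):
    """Eq. (6): fitness = 1000 / (1 + RMSE), so a perfect model scores 1000."""
    return 1000.0 / (1.0 + rmse)

def linked_prediction(sub_expressions, inputs):
    """Addition linking: the model output is the sum of all sub-ET outputs."""
    return sum(expr(inputs) for expr in sub_expressions)

# Hypothetical sub-ETs built from members of the function set F.
genes = [
    lambda d: math.sin(d[0]),
    lambda d: d[1] * d[2],
    lambda d: math.sqrt(abs(d[3])),
]
prediction = linked_prediction(genes, [0.0, 2.0, 3.0, 4.0])
```

The transform turns an error to be minimized into a fitness to be maximized, which is the form that roulette wheel selection requires.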

To evaluate the prediction performance of the GEP models, R2 was used together with RMSE. These criteria were selected because different researchers have used them for artificial networks and found them appropriate. Several parameters of the GEP model were examined in this section to determine their impact on model performance. One of the most important is the number of generations; Figure 14 shows its effect on the prediction of landslide movement. The effects of the number of genes and the head size on the performance of the GEP model are shown in Figs. 15 and 16.

Fig. 14 The effect of the number of generations in predicting landslide movements

Fig. 15 The effects of number of genes in predicting landslide movements

Fig. 16 The effects of size of head in predicting landslide movements

According to the results presented in these figures, the number of generations, number of genes, and head size were set to 3500, 5 and 5, respectively. After this implementation, the results of five different models are presented in Table 3. Two scoring techniques were used to select the superior model. The first is based on the sum of the scores for the training and testing sections: a higher R2 receives a higher score, and a lower RMSE receives a higher score. The second technique uses the same two criteria but marks the results with color: the more suitable a value is, the more intense the (red) color assigned to it. In the end, model number 4 was selected on the basis of both scoring techniques.
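The first scoring technique can be sketched as a simple ranking scheme. The three models and their R2 and RMSE values below are invented for illustration and are not the values of Table 3. The function name `rank_scores` is hypothetical, and ties are not treated specially.

```python
def rank_scores(values, higher_is_better):
    """Give the worst value a score of 1 and the best a score of len(values)."""
    order = sorted(
        range(len(values)),
        key=lambda i: values[i],
        reverse=not higher_is_better,
    )
    scores = [0] * len(values)
    for score, index in enumerate(order, start=1):
        scores[index] = score
    return scores

# Hypothetical results: (R2 train, RMSE train, R2 test, RMSE test).
models = {
    "A": (0.80, 0.30, 0.78, 0.35),
    "B": (0.86, 0.25, 0.85, 0.27),
    "C": (0.83, 0.28, 0.70, 0.40),
}
names = list(models)
totals = dict.fromkeys(names, 0)
for column, higher_is_better in enumerate([True, False, True, False]):
    column_values = [models[name][column] for name in names]
    for name, score in zip(names, rank_scores(column_values, higher_is_better)):
        totals[name] += score
best_model = max(totals, key=totals.get)
```

Summing the scores over both the training and testing sections favors models that perform well on unseen data as well as on the data used to build them.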

Table 3 The presented models for the prediction of landslide movement using GEP model

Figure 17 presents the expression tree of each gene of model 4, in which d(0) = groundwater surface, d(1) = antecedent rainfall, d(2) = infiltration coefficient, d(3) = shear strength and d(4) = slope gradient. In addition to the variables, several constant values are obtained, as shown in Table 4. All functions and terminals are shown in the circles. To extract the mathematical equations, the circles should be read from left to right and top to bottom. After the equation of each gene is extracted, the final GEP prediction model is obtained by adding the equations of all genes of model 4. Figures 18 and 19 show the results of model 4 for the training and testing sections, respectively. As presented, the GEP model provides a high level of accuracy in predicting landslide movement.

Fig. 17 The tree expression of model 4

Table 4 Values and parameters of the tree expression for the selected model

Fig. 18 The GEP result of model 4 for training section

Fig. 19 The GEP result of model 4 for testing section

4 Optimization process

To verify the ABC implementation used in this research for minimizing landslide movement, two test functions were employed. Figures 20 and 21 show three-dimensional graphs of these functions over the specified intervals, together with the results obtained through the ABC algorithm. The minimum values of the functions in these intervals are 0 and − 5, respectively. As can be seen, the code written for this algorithm identifies the minima well. The code can therefore be applied to the conditions of this research obtained in the previous section.

Fig. 20 The three-dimensional graph of sample 1

Fig. 21 The three-dimensional graph of sample 2

To optimize the movement values, the prediction models of the previous section were used. As mentioned, the best model of the previous section is GEP model 4, which is taken as the objective function. Different ABC models were designed, each executed with different settings of the optimization algorithm's parameters.

After a set of analyses was carried out, the most appropriate parameters of the ABC algorithm were obtained. The parameters that gave the best performance of the ABC algorithm for this problem are listed in Table 5.

Table 5 The effective parameter for optimization of the problem

Using the results of the best model, the optimum parameters that minimize movement were determined. The best cost function for this problem is presented in Fig. 22, and the proposed parameters are given in Table 6. It should be noted that these parameters were allowed to vary only within the ranges considered for modeling (Table 1). As can be seen, the optimization reached an appropriate minimum value. As an example, optimum values of − 10.5, 400.1, 89.8, 59.65 and 24.95 for groundwater surface, antecedent rainfall, infiltration coefficient, shear strength and slope gradient, respectively, will cause no movement (or movement of zero) in the landslide. Different design patterns can thus be applied under various conditions to reach the best performance, and in this way the risks of landslides can be controlled.

Fig. 22 The best cost of ABC for optimizing the problem

Table 6 The proposed values for optimizing the problem

5 Conclusions

Assessing and controlling the risks caused by landslides is one of the most important issues in this field. For this reason, this research used new intelligent methods to propose different models for predicting landslide movement. The data used in this research were collected from several real-case studies. These data included the groundwater surface, antecedent rainfall, infiltration coefficient, shear strength, and slope gradient. Neural networks and a new model (GEP) were used for prediction. The GEP model was developed and implemented under different conditions to predict landslide movement, and each model resulted in an equation. To assess the performance of this new model, ANN models were also developed and implemented. The models were compared with each other, and the best model was selected for optimization. The best model (with R2 = 0.8623 and 0.8594 for the training and testing sections), which was developed through the GEP method, was combined with the ABC optimization algorithm to determine the optimum conditions for landslide movement. The optimum parameters allow engineers to reach the best performance in decreasing the movement of landslides. Finally, the results showed that the ABC algorithm can control the risk level of landslide movements according to their effective parameters.