1 Introduction

Landslides and slope instability are major geohazards that need to be managed, predicted, and mapped worldwide [1,2,3]. Doing so is essential for controlling the natural hazards they induce. Many scholars have succeeded in predicting and mapping landslides using data science [4,5,6,7]. However, the shear strength parameters of soils and rocks, and their effectiveness, have not been studied for all natural materials or for all locations [8,9,10]. Meanwhile, the shear strength parameters of natural materials (e.g., friction angle and cohesion) are essential factors in assessing the deformation and stability of geotechnical structures such as slopes, landslides, foundations, dams, and retaining walls [11,12,13,14]. Of these, cohesion is regarded as the most important factor in evaluating slope stability for cemented rocks and soils, and the friction angle for uncemented rocks and soils [15,16,17]. To determine the values of cohesion and friction angle, experiments are usually carried out in laboratories based on standard methods, such as the Mohr–Coulomb theory and the Bishop method [18,19,20]. Nevertheless, these methods are recommended only within a limited range [21,22,23]. Furthermore, laboratory experiments are often complex and costly because of sample preparation and experimental conditions [24]. In addition, the experimental results often do not reflect all the characteristics of rocks and soils in reality [25, 26]. Therefore, developing a new method capable of reliably predicting the shear strength parameters of rocks and soils in practice remains a challenge for engineers and scientists.

Significant efforts have been made in the literature to predict the shear strength parameters of rocks and soils [27, 28]. Garven and Vanapalli [29] evaluated the performance of nineteen empirical methods for forecasting the shear strength of unsaturated soils (SSoUS), and highlighted the suitability of these equations for predicting SSoUS. However, such empirical equations yield sub-optimal accuracy [30, 31].

To overcome the drawbacks of empirical equations and laboratory tests, artificial intelligence (AI) has been introduced in recent years as a robust technique. Many AI techniques have been successfully applied in practical engineering [32,33,34,35,36,37,38,39,40,41]. For predicting friction angle, Das and Basudhar [42] developed an artificial neural network (ANN), with clay as the material studied. They concluded that the ANN could explain the physical effect of clay characteristics on friction angle. In another study, Das et al. [43] developed different AI techniques (e.g., ANN-based and SVM-based) for predicting the friction angle of clay, and confirmed the SVM model as the best of those techniques. Khan et al. [44] also predicted the friction angle of clay using a functional network (FN) on the same dataset as Das et al. [43]. Comparing their results with those of Das et al. [43], they found that the FN model was a good candidate for predicting the friction angle of clay. Based on the stochastic approach of the Monte Carlo algorithm, Casagrande et al. [45] successfully predicted the shear strength of rock discontinuities and reported results with high reliability. Pham et al. [46] developed four AI models for predicting the shear strength of soft soil: ANN, SVM, and ANFIS optimized by particle swarm optimization (PANFIS) and by genetic algorithms (GANFIS). They identified PANFIS as the best model for their purposes. Matos et al. [47] predicted the shear strength of unfilled rock joints with a novel AI approach, the first-order Takagi–Sugeno fuzzy (FOTSF) model, and presented it as a useful tool for that task.

A systematic review of the literature shows that state-of-the-art AI techniques have been applied efficiently and successfully to the prediction of shear strength parameters of rocks and soils, and their effectiveness is undeniable. However, they have not been studied for all natural materials or for all locations. Moreover, novel AI models with improved accuracy are a constant goal of researchers. Hence, this work proposes a novel paradigm based on deep learning and an optimization algorithm for predicting the friction angle of clay. Deep learning techniques were combined with the Harris Hawks optimization (HHO) algorithm to train and develop a multilayer perceptron (MLP) neural network for this purpose, called the HHO–DMLP model. An MLP neural network (without optimization), an SVM, and a random forest (RF) model were then developed and compared with the proposed HHO–DMLP model to put its results in context.

2 Methodology

The focus of the present study is to propose a novel hybrid paradigm, i.e., HHO–DMLP, for predicting the friction angle of clay. Therefore, this section covers only the details of the MLP neural network and the HHO algorithm, together with the proposed framework of the HHO–DMLP model. Details of the SVM and RF models can be found in [48,49,50,51,52,53].

2.1 Deep neural network (deep learning for MLP neural network)

As one of the most common types of ANN used in real-life applications, the MLP is well known as a flexible neural network whose structure consists of multiple layers [54, 55]. In each layer, neurons are the main components, and they connect to form a network capable of transmitting information [56]. The transfer of information between layers and neurons is performed by training algorithms, such as feed-forward, back-propagation, and Levenberg–Marquardt [57,58,59]. In MLP neural networks, the weights carry the main information and are used to assess the quality of the network. They have a significant effect on the training performance and accuracy of the network. An MLP neural network with multiple hidden layers, which can improve the transfer of information between neurons, is called a deep neural network (DNN) [60].

The concept of the DNN has been introduced in recent years for complex problems that require a high degree of accuracy. However, training a DNN to achieve the desired performance is not easy. Therefore, deep learning has been proposed as a way to train DNNs and achieve better results [61]. In MLP neural networks, deep learning can be used to obtain an optimal network structure, higher accuracy, faster computation, and more stable predictions [62, 63].

The literature shows that DNNs have been successfully applied in many fields, especially in mining and geotechnical engineering [64,65,66,67,68,69]. In this study, deep learning was used to train a deep MLP neural network for predicting the friction angle of clay, and to determine the optimal structure and loss function of the network. The activation functions between layers also play an important role in the connections and the performance of the network [70, 71]. The rectified linear unit (ReLU) activation function has been used in many works [72,73,74,75,76]. With the ReLU activation function, an MLP can mitigate the vanishing gradient problem. Furthermore, ReLU allows the MLP model to train faster and perform better. Thus, it is one of the most widely used activation functions in deep learning models and works well in most applications. Herein, we used the ReLU activation function for the neurons in the hidden layers, and a linear activation function for the output layer of the network, which produces the predicted friction angle. The structure and flowchart of the MLP neural network, as well as the activation functions used for predicting the friction angle of clay, are shown in Fig. 1.

Fig. 1 Structure and flowchart of the MLP neural network for predicting friction angle
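The forward pass described in this subsection (ReLU in the hidden layers, a linear output neuron) can be sketched in a few lines of plain Python. This is only an illustrative sketch: the layer sizes, the random initial weights, and the sample input are hypothetical and are not the values used in this study.

```python
import random


def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer; weights[j][i] links input i to neuron j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Hypothetical initialization: small uniform weights, zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x, layers):
    """ReLU on every hidden layer, linear activation on the output layer."""
    h = x
    for weights, biases in layers[:-1]:
        h = relu(dense(h, weights, biases))
    weights, biases = layers[-1]
    return dense(h, weights, biases)[0]


rng = random.Random(0)
sizes = [4, 8, 8, 1]  # hypothetical: 4 inputs, two hidden layers, 1 output
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
sample = [0.4, 0.6, 0.3, 0.5]  # hypothetical scaled input vector
prediction = forward(sample, layers)
print(prediction)
```

Because the output layer is linear, the network can return any real value, which suits a regression target such as friction angle.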

2.2 Harris Hawks optimization (HHO) algorithm

HHO is a swarm-based algorithm developed by Heidari et al. [77]. Inspired by the predatory behavior of Harris hawks, the algorithm searches for an optimum in two main steps: exploration and exploitation (Fig. 2). In the exploration step, the hawks perch at random locations among the other hawks to search for prey. They can then apply soft or hard besiege strategies to attack the prey (exploitation step) (Fig. 3). In practice, the prey may detect the attack and escape before being caught. Therefore, the HHO algorithm also applies soft or hard besieges with progressive rapid dives to reduce the prey's ability to escape. Figure 2 presents the HHO algorithm step by step for optimization problems. Note that Eqs. (1–6) in the optimization sequence of the HHO algorithm [77] are presented in the supplementary materials. More details of the HHO algorithm are described in Heidari et al. [77].

Fig. 2 Mechanism, phases, and the optimization sequence of the HHO algorithm (modified after Heidari et al. [77]). a Strategies of Harris Hawks and b the optimization sequence of the HHO algorithm

Fig. 3 The strategies of the HHO algorithm [77]

The literature shows that the HHO algorithm has been successfully applied to many problems [78,79,80,81,82]. Herein, the HHO algorithm was used as a nature-inspired optimization algorithm to optimize and improve a deep MLP neural network (i.e., HHO–DMLP) for predicting the friction angle of clay.
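The switch between exploration and the besiege strategies can be illustrated with the escaping energy of the prey. The sketch below follows the commonly cited formulation of HHO [77], in which the energy decays linearly with the iteration count; the thresholds of 1 and 0.5 and the function names are assumptions that should be checked against the equations in the supplementary materials.

```python
def escaping_energy(e0, t, max_iter):
    """Escaping energy E = 2 * E0 * (1 - t / T), with E0 drawn from (-1, 1)."""
    return 2.0 * e0 * (1.0 - t / max_iter)


def select_phase(energy, r):
    """Pick the HHO strategy from the energy and a random number r in [0, 1).

    Assumed rules: |E| >= 1 triggers exploration; otherwise |E| >= 0.5 gives
    a soft besiege and |E| < 0.5 a hard besiege, and r < 0.5 adds the
    progressive rapid dives.
    """
    if abs(energy) >= 1.0:
        return "exploration"
    phase = "soft besiege" if abs(energy) >= 0.5 else "hard besiege"
    if r < 0.5:
        phase += " with progressive rapid dives"
    return phase


# Energy shrinks as the iterations advance, so late iterations exploit more.
for t in (0, 500, 750, 950):
    e = escaping_energy(1.0, t, 1000)
    print(t, e, select_phase(e, r=0.9))
```

Since the magnitude of the energy can only shrink over time, the search gradually shifts from exploring the space to refining the best solution found so far.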

2.3 Proposing the framework of HHO–DMLP

The framework of the HHO–DMLP model combines three components: deep learning techniques, the MLP neural network, and the HHO algorithm. The MLP neural network was selected as the key model for predicting the friction angle of clay in this study. Deep learning was then applied to find the optimal structure of the MLP neural network (e.g., the number of hidden layers and neurons). The learning rate and batch size of the MLP neural network were also optimized by deep learning. Finally, the HHO algorithm was applied as a metaheuristic optimizer to tune the weight values of the MLP neural network. Herein, the mean squared error (MSE) was selected as the objective function for both deep learning and the HHO optimization, and the lowest MSE was taken as the best performance of the HHO–DMLP model. The proposed framework of the HHO–DMLP model for forecasting the friction angle of clay is shown in Fig. 4.

Fig. 4 Flowchart of the proposed HHO–DMLP model for forecasting the friction angle of clay in this study
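To make the role of the MSE objective concrete, the sketch below shows one way a population-based optimizer such as HHO could score a candidate solution: each hawk position is treated as a flat vector of network weights and biases, unpacked into layers, and evaluated by the MSE of the resulting predictions. The flat-vector layout, the function names, and the toy data are assumptions for illustration and are not taken from this study.

```python
def unpack(vector, sizes):
    """Split a flat parameter vector into per-layer (weights, biases)."""
    layers, pos = [], 0
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [vector[pos + j * n_in: pos + (j + 1) * n_in]
                   for j in range(n_out)]
        pos += n_in * n_out
        biases = vector[pos: pos + n_out]
        pos += n_out
        layers.append((weights, biases))
    return layers


def predict(x, layers):
    """Forward pass: ReLU on hidden layers, linear output layer."""
    h = x
    for k, (weights, biases) in enumerate(layers):
        h = [sum(w * v for w, v in zip(row, h)) + b
             for row, b in zip(weights, biases)]
        if k < len(layers) - 1:
            h = [max(0.0, v) for v in h]
    return h[0]


def mse_objective(vector, sizes, inputs, targets):
    """Objective to minimize: mean squared error of the network output."""
    layers = unpack(vector, sizes)
    errors = [(predict(x, layers) - y) ** 2 for x, y in zip(inputs, targets)]
    return sum(errors) / len(errors)


# Toy check with a single linear neuron: 1*1 + 2*1 + 0.5 = 3.5
candidate = [1.0, 2.0, 0.5]  # hypothetical hawk position (2 weights, 1 bias)
print(mse_objective(candidate, [2, 1], [[1.0, 1.0]], [2.5]))
```

Framing the objective this way keeps the optimizer independent of the network: it only needs a function that maps a vector to a scalar error.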

3 Data acquisition and preparation

The focus of the present study is to propose a novel hybrid paradigm that can predict, and be representative of, the friction angle of clays from different areas/locations. Therefore, a database containing 162 observations from different areas was collected from previous studies [30, 83,84,85,86]. Four input variables were used to predict the friction angle (ϕr): clay fraction (CF), liquid limit (LL), plasticity index (PI), and deviation from the A-line in Casagrande’s classification chart (∆PI). The ranges and statistical properties of the collected dataset are listed in Table 1.

Table 1 Statistical indices of the dataset used

Before developing the predictive models, some data analyses are necessary to ensure the accuracy and stability of the models. The properties of the dataset in Table 1 show that the ranges of the inputs and the output vary widely, and this variability can carry over into the predictions. Therefore, feature scaling is necessary to bring the variables into a common range (e.g., [0, 1] or [−1, 1]). The correlation between inputs and output should also be checked carefully to evaluate the effect of the inputs on the output, as well as the overlap among the inputs in the dataset. The correlation matrix in Table 2 addresses these points.
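The scaling step mentioned above can be illustrated with a minimal min-max transform. In this sketch, the minimum and maximum are assumed to be fitted on the training data only and then reused for new data; the function names and the sample liquid limit values are hypothetical.

```python
def fit_min_max(column):
    """Return the (min, max) of a training column."""
    return min(column), max(column)


def min_max_scale(column, low, high, a=0.0, b=1.0):
    """Map values from [low, high] to [a, b]; constant columns map to a."""
    span = high - low
    if span == 0:
        return [a for _ in column]
    return [a + (v - low) * (b - a) / span for v in column]


liquid_limit = [10.0, 20.0, 30.0]  # hypothetical LL values
low, high = fit_min_max(liquid_limit)
print(min_max_scale(liquid_limit, low, high))              # range [0, 1]
print(min_max_scale(liquid_limit, low, high, -1.0, 1.0))   # range [-1, 1]
```

Reusing the training minimum and maximum for unseen data avoids leaking information from the evaluation set into the scaler.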

Table 2 Correlation matrix of the friction angle database used

In data mining, the correlation between variables is a crucial indicator of dataset quality and helps in planning model development. The correlation between variables should be in the range of −0.8 to 0.8 [32]. As shown in Table 2, the variables have acceptable correlations. The highest correlation is between LL and PI (0.782). This is still acceptable, since their correlations with the other variables are low and the value of 0.782 is not high enough to justify removing either of them from the dataset. Therefore, we used all four input variables (i.e., LL, PI, ∆PI, and CF) to predict the friction angle (ϕr) in this study.
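The screening rule above can be sketched as a pairwise Pearson correlation check that flags any pair of variables whose absolute correlation exceeds 0.8. The column names and values below are hypothetical placeholders, not the data of Table 2.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def flag_collinear(columns, threshold=0.8):
    """List variable pairs whose |r| exceeds the threshold."""
    names = list(columns)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(columns[first], columns[second])
            if abs(r) > threshold:
                flagged.append((first, second, r))
    return flagged


columns = {  # hypothetical values for illustration only
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.5],
    "C": [1.0, -1.0, 1.0, -1.0],
}
print(flag_collinear(columns))
```

Under this rule, a pair with a correlation of 0.782, such as LL and PI in this study, would not be flagged.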

To train the prediction models and evaluate their performance for practical engineering, the dataset was split: 70% of the observations were used for training, and the remaining 30% were used for evaluation. The details of the two datasets are listed in Tables 3 and 4.

Table 3 Statistical indices of the training dataset used
Table 4 Statistical indices of the testing dataset used
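A 70/30 split of the 162 observations can be sketched as follows. The random shuffling, the seed, and the rounding of the training size are assumptions; the exact procedure and subset sizes used in this study are those reflected in Tables 3 and 4.

```python
import random


def split_indices(n_samples, train_fraction=0.7, seed=42):
    """Shuffle the row indices and cut them into training and testing parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)  # rounding is an assumption
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(162)
print(len(train_idx), len(test_idx))
```

Fixing the seed makes the split reproducible, so that all models are trained and tested on exactly the same observations.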

4 Results

Once the dataset was prepared, the friction angle predictive models were developed. In this study, two data scaling methods, MinMax and BoxCox, were used to normalize the dataset, aiming to avoid over-fitting. The MinMax scaling method was applied when developing the DMLP and HHO–DMLP models, whereas the BoxCox scaling method was applied for the SVM and RF models. The HHO–DMLP model was developed following the framework in Fig. 4. Note that some deep learning techniques were applied to develop the initial MLP neural network model. In particular, a deep learning procedure was employed to select the optimal number of hidden layers in the MLP neural network based on the MSE values, using 500 epochs for this task. As shown in Fig. 5, three hidden layers gave the best structure for the MLP model in this work.

Fig. 5 The error of the MLP paradigm with a different number of hidden layers and epochs

Once the optimal number of hidden layers was determined, the optimal number of neurons was determined with similar deep learning techniques, searching in the range of 8–30. The optimal numbers of neurons were 18, 16, and 8 for the first, second, and third hidden layers, respectively (Fig. 6a). The optimal structure of the MLP neural network was therefore defined as DMLP 4-18-16-8-1, and its performance is illustrated in Fig. 6b.

Fig. 6 a The error of the MLP neural network with different neurons. b The error and performance of the selected deep MLP neural network (DMLP)
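The size of the selected DMLP 4-18-16-8-1 structure gives a sense of how many values an optimizer must tune when it adjusts the network weights. The short sketch below counts them; whether the bias terms are included among the optimized parameters is an assumption, so both counts are shown.

```python
def count_parameters(sizes, with_biases=True):
    """Count the trainable values of a fully connected network."""
    total = 0
    for n_in, n_out in zip(sizes, sizes[1:]):
        total += n_in * n_out      # connection weights between two layers
        if with_biases:
            total += n_out         # one bias per neuron in the next layer
    return total


dmlp = [4, 18, 16, 8, 1]
print(count_parameters(dmlp, with_biases=False))  # 496 connection weights
print(count_parameters(dmlp))                     # 539 with biases
```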

Once the DMLP model was defined, the HHO algorithm was applied as a robust optimizer to improve the performance of the DMLP model by adjusting its weights. The number of Harris Hawks (population size) was set to 50, 100, 150, 200, 250, 300, 350, 400, 450, and 500 to examine the performance of the HHO algorithm in optimizing the DMLP model. For each population size, 1000 iterations were run to find the result with the lowest MSE. Figure 7 shows the performance of the HHO algorithm for the different population sizes. The DMLP model achieved its optimal value with a population size of 200 at iteration 799 (MSE = 13.493).

Fig. 7 Performance of the HHO algorithm in training the DMLP neural network

To assess the performance of the HHO–DMLP model, the DMLP without HHO optimization was also developed with the same structure and datasets. Furthermore, two conventional models, SVM and RF, were developed and evaluated in terms of modeling and accuracy to provide a comprehensive assessment of the DMLP and HHO–DMLP models. All the predictive models were developed on the same training dataset. A radial basis function (RBF) kernel and a grid search of the parameters were applied for the SVM model, whereas 2000 trees and 4 randomly selected predictors were applied for the RF model to ensure its robustness. Feature scaling and tenfold cross-validation were applied in developing the SVM and RF models to improve their accuracy. The results of the four AI models are reported in Table 5 for both the training and testing datasets, so that the accuracy and stability of the models can be evaluated.

Table 5 Performance of the friction angle predictive models
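The performance indices used in this comparison (MSE, RMSE, R2, MAPE, and VAF) can be computed as in the sketch below. The definitions shown are the common textbook ones and are assumptions here: R2 is taken as the coefficient of determination, MAPE is returned as a fraction rather than multiplied by 100, and VAF is the variance accounted for in percent. The sample values are hypothetical.

```python
from math import sqrt


def variance(values):
    """Population variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def regression_metrics(actual, predicted):
    """Return MSE, RMSE, R2, MAPE (fraction), and VAF (percent)."""
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    ss_res = sum(r * r for r in residuals)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    mse = ss_res / n
    return {
        "MSE": mse,
        "RMSE": sqrt(mse),
        "R2": 1.0 - ss_res / ss_tot,
        "MAPE": sum(abs(r / a) for r, a in zip(residuals, actual)) / n,
        "VAF": (1.0 - variance(residuals) / variance(actual)) * 100.0,
    }


actual = [10.0, 20.0, 30.0]      # hypothetical measured friction angles
predicted = [12.0, 18.0, 30.0]   # hypothetical model outputs
metrics = regression_metrics(actual, predicted)
print(metrics)
```

Because RMSE is the square root of MSE, the two indices can be cross-checked: the MSE of 13.493 reported for the HHO optimization corresponds to an RMSE of about 3.673.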

5 Discussion

On the training dataset, the HHO–DMLP and DMLP models perform much better than the SVM and RF models, and the HHO–DMLP model performs best. Remarkably, the R2 values of the SVM and RF models are low (0.554 for the SVM model and 0.377 for the RF model), which suggests that the dataset is poorly suited to these models. The dataset was collected from different areas/locations, and the properties of the clays are dissimilar. With an RMSE of 3.673, R2 of 0.777, and MAPE of 0.195%, the HHO–DMLP model can represent the properties and friction angle of clay from the different areas in this study. However, this conclusion needs to be verified on the testing dataset, which stands in for data met in practical engineering. Note that the testing dataset was not used for training or developing the friction angle predictive models. In other words, it can be considered unseen data.

On the testing dataset, all the predictive models perform well. The RF model provided the highest accuracy, with an MSE of 2.786, RMSE of 1.669, R2 of 0.961, MAPE of 0.069%, and VAF of 95.125. However, compared with its performance on the training dataset, it is clear that the RF model was over-fitted on the testing dataset, although several techniques were applied to prevent over-fitting. Therefore, it is not reliable for predicting the friction angle of clay from different areas/locations in practice. The three remaining models (i.e., HHO–DMLP, DMLP, and SVM) performed well and reliably on the testing dataset. Of these, the HHO–DMLP model again gave a positive result. With an RMSE of 3.470, R2 of 0.796, and MAPE of 0.182% on the testing dataset, it can be concluded that the HHO–DMLP model performs well and is stable in practice. Figures 8 and 9 reflect the reliability and correlation of the prediction models.

Fig. 8 Distribution of the friction angle on AI models developed (training phase)

Fig. 9 Distribution of the friction angle on AI models developed (testing phase)

Figure 8 shows that the HHO–DMLP model fits the dataset better than the other models. However, friction angle values in the range of 6 to 12 are not well fitted by the HHO–DMLP model, and predictions in this range should be treated with care in practice. For the SVM and RF models, the fit is poor: most of the observations do not fall on the regression line or within the 80% confidence level. Similar observations hold for the models in Fig. 9. Notably, the RF model is over-fitted and should be excluded when predicting the friction angle of clay. Based on the distribution of the dataset (Figs. 8 and 9), the 80% confidence level of the proposed HHO–DMLP model can represent the friction angle of clay from different areas/locations. A comparison of the accuracy of the HHO–DMLP, DMLP (without optimization), and SVM models in predicting the friction angle of clay is illustrated in Fig. 10, and a further evaluation through the Taylor diagram is shown in Fig. 11. Since the RF model was over-fitted in this study, it is not included in these figures.

Fig. 10 Comparison of the HHO–DMLP, DMLP (without optimization), and SVM models in predicting friction angle of clay

Fig. 11 Taylor diagram for proving the accuracy of the proposed HHO–DMLP model

In Fig. 10, the orange points (i.e., the HHO–DMLP model) are closer to the blue points (i.e., the actual values) than the points of the other models. This indicates that the accuracy of the proposed HHO–DMLP model is higher than that of the remaining models. The Taylor diagram confirms the accuracy and performance of the HHO–DMLP model described above. The standard deviation of the actual values is high, which shows that the friction angle of clays varies strongly around its average value and that finding a general model capable of representing the friction angle of clays from different areas/locations is not easy. On the Taylor diagram, the HHO–DMLP model also shows a high standard deviation together with the highest correlation, and it lies closer to the actual values than the other models.

6 Conclusion

The friction angle of clays is an essential parameter for evaluating the stability of slopes and landslides. Clay properties, and the friction angle in particular, differ between areas and have a significant influence on the stability of slopes and landslides. Therefore, a generalized model capable of reliably predicting the friction angle of clays from different areas/locations is ideal for assessing slope stability and landslides. This study proposed a novel generalized artificial intelligence model for estimating the friction angle of clays from different areas based on a deep MLP neural network and the HHO algorithm (i.e., HHO–DMLP). The robustness and consistency of the model’s predictions were checked by testing with various datasets having different geological and geomorphological setups. The results showed that the proposed HHO–DMLP model can predict the friction angle of clays from different areas/locations with high reliability. It can be used in practice instead of laboratory tests to save time and costs.

Although the results obtained in this study are highly reliable, future work should extend the database with observations from other areas/locations. Larger databases will help improve the predictive models. Such models will contribute to the current knowledge in this field and could be applied in any geographical territory.