Introduction

Hydraulic conductivity is an essential soil property that describes the capacity of a soil to transmit water. In hydrogeology, soil science, and civil engineering, hydraulic conductivity is a crucial quantity because it aids in understanding and predicting groundwater flow, drainage, and soil–water interactions. The type of soil, porosity, particle size distribution, compaction, and other factors all affect hydraulic conductivity. Sands and gravels, which contain larger particles, often have higher hydraulic conductivity, whereas clays, which contain smaller particles, have lower hydraulic conductivity and obstruct water flow. Hydraulic conductivity can be assessed in the lab or in the field, and it is frequently used in groundwater modeling, investigations of soil permeability, the design of drainage systems, and groundwater remediation strategies. The value of saturated hydraulic conductivity (K) for a specific soil is critical in assessing its suitability for various engineering and environmental applications. The experimental procedures for determining the hydraulic conductivity of fine-grained soil are arduous. Therefore, several investigators have developed and employed different empirical and advanced computational approaches.

Teng et al. (2023) implemented the Kozeny-Carman (KCn) equation to assess the K of coarse-grained soil during frozen conditions. Tan et al. (2023) established an artificial neural network (ANN), gradient boosting decision tree (GB_DT), multiple linear regression (MLR), and random forest (RF) utilizing 329 soil sample results to predict K. The researchers predicted K with 93% accuracy using plasticity index (PI), degree of saturation (DS), specific gravity (SG), fine contents (FG), and void ratio (e) as input variables. Zeitfogel et al. (2023) used a Feedforward neural network (FNN) and extreme gradient boosting (XGBoost) using organic matter content, clay, silt, and sand content. The authors noted that XGBoost predicts K more efficiently than the FNN model with a test RMSE of 12.0%. Azarhoosh and Koohmishi (2023) employed RF, ANN, and adaptive neuro-fuzzy inference system (ANFIS) computational models to assess K of coarse-grained soil. The investigators reported that air void content is the most influential variable in assessing K. Peters et al. (2023a) assessed KUST using a water retention curve. The researchers proposed a function for unsaturated K (KUST) by replacing the soil-specific saturated K. The mean error between real and predicted KUST was less than half an order of magnitude. Mufti and Das (2023) implemented a hybrid approach using pore-network and discrete element methods to estimate the KUST of granular soils. Wang et al. (2023) proposed a model capable of assessing K with RMSE of 0.78 cm/day. Zhang and Wang (2023) assessed saturated hydraulic conductivity (K) soil (having bulk densities of 1.25, 1.3, 1.35, 1.4, 1.45, 1.5, 1.55, 1.6, 1.65, 1.7, 1.75, and 1.8 g/cm3) using CT scanning technology for mining areas. The investigators found that the number of macro-porosity and macro-pores decreases due to increased bulk density. 
Using CT scan technology for soil samples prepared at different bulk densities, the model predicted K with a determination coefficient (R2) of 0.84. Li et al. (2023) predicted K of clay and sand using a modified KCn equation based on Poiseuille's law. Peters et al. (2023b) employed Mualem (Mual), Childs and Collis-George (CGG), Alexander and Skaggs (AS), and Burdine (Br) models to measure the K of soil. It was noted that the Mualem model measured K better than other models. Zhang et al. (2023) used the K results of 329 soil samples to develop a random forest model. The proposed model predicted K with high precision (i.e., 92%) using compaction parameters, hydraulic characteristics, and soil physical properties. Kim et al. (2023) used a regional database of 68 soil samples with geotechnical, geological, and hydrological parameters to assess K. The authors noted that K is directly affected by the infiltration of rainfall into the soil. Also, the authors mapped the best correlation between groundwater level and the moving precipitation average. Chandel et al. (2023) constructed a feedforward neural network (FFNN) to assess the K of porous media. The researchers predicted K with root mean square error (RMSE) of 0.016, mean bias error (MBE) of 0.006, and determination coefficient (R2) of 0.94 using the FFNN model. The performance of the FFNN model was compared with MLR and KSOM models, and it was found that the FFNN model is robust in predicting K. Khaja et al. (2023) conducted research to identify the relationship among K, field dry density, porosity, and gradational parameters of sandy soil. For this aim, the authors used the results of 60 soil samples. It was concluded that (i) particle size at 50% finer (D50) has a strong relationship with K, and (ii) coefficient of uniformity (CU) and curvature (CC) poorly correlate with K. In addition, a significant relationship has been mapped between K, field dry density, and porosity. 
The regression model attained an RMSE of 0.67 and a mean square error (MSE) of 0.45. Piri et al. (2023) compared RF, Chi-Squared Automatic Interaction Detection (CHAID), and Geo-statistics models to assess the K using 130 soil sample results. It was noted that the RF model outperformed the CHAID and other models with the least residuals of 0.0019. Singh et al. (2023) employed support vector machine (SVM), RF, GPR, gene expression programming (GEP), and multivariate adaptive regression splines (MARS) to assess the KUST. The investigators used the results of 240 soil samples to complete the published work. The investigators concluded that Pearson VII universal kernel (PUK) based SVM models are highly capable of assessing the KUST of soil. The published research used moisture content, bulk density, silt, clay, and sand as input variables. Tseng et al. (2023) constructed GPR and Bayesian models to compute the K in a watershed. The authors found that the model's accuracy depends on the high- and low-fidelity data and location distribution. Báťková et al. (2023) predicted K for agricultural soil using the pedo-transfer function. The researchers developed ten models using 56 data points, including results of organic matter/ organic carbon content, dry bulk density, clay, silt, and sand particles. Singh and Sharma (2023) implemented Zamarin, NAVFACDM7, Sauerbrei, Kruger, Slitcher, Hazen, Terzaghi, and KCn equations for assessing K of soil using surface NMR porosity and particle size distribution. The researchers noted that the modified KCn equation assessed K with R2 of 0.904 and RMSE of 6.36. Emberga et al. (2023) predicted K for aquifer based on the grain-size database using the MLR technique. The authors reported that Slitcher, ANN, and MLR models estimated K with RMSE of 5.14, 2.57, and 1.00, respectively. Veloso et al. (2022) estimated K using MARS, RF, SVR, and k-nearest neighbors (kNN). 
The researchers prepared different combinations of input variables, i.e., sand, silt, clay, bulk density, particle density, total porosity, microporosity, microporosity, soil moisture at the permanent wilting point, and soil moisture at field capacity. The investigators noted that the RF and SVR models predicted K with higher R2 and the least residuals using all input variables. Chandel et al. (2022) derived seven empirical equations using grain size parameters for predicting K. Using the Hazen equation, the authors noted a good agreement between real and predicted K. Khalili-Maleki et al. (2022) used Hybrid Wavelet-ANN (WANN), Least square support vector machine (LSSVM), and Larsen Fuzzy Logic (LFL) models to predict the K using grain size database. Model WANN was identified as a more accurate model than LSSVM and LFL models in predicting K. Albalasmeh et al. (2022) employed an optimized ANN model to compute the K for arid and semi-arid regions. The investigators implemented a generalized regression neural network (GRNN) model using depth, texture, organic matter, pH, bulk density, and electric conductivity as input variables of 165 soil samples. The investigators concluded that the GRNN model gives a reliable prediction of K with a limited database. Ruan and Fu (2022) assessed the K of compacted bentonite in confined conditions using a modified KCn equation. Hedayati-Azar and Sadeghi (2022) developed a semi-empirical model to assess the K of clayey soil. The authors reported that the estimation of K becomes erroneous if solute concentrations of permeating fluid are ignored. Shan et al. (2022) employed Weibull distribution models to estimate the relative K. Faloye et al. (2022) constructed MLR, ANN, and ANFIS models using biochar levels and soil moisture content. The models ANFIS, ANN, and MLR attained R2 of 0.95, 0.98, and 0.92 in the validation phase. 
Therefore, the authors concluded ANN models are the most potent tool for predicting KUST of biochar-amended soil. Singh et al. (2022) developed a genetic algorithm-optimized ANN and SVM models to estimate soil K. Furthermore, the pedo-transfer function (PTF) was implemented with developed models. The performance comparison demonstrated that the SVM_GA PTF model is more capable of predicting K than the ANN models. Using empirical relationships, Chandel and Shankar (2022) predicted K for borehole soil samples. The authors found that the KCn equation has better agreements between predicted and real K values than other equations, i.e., Alyamani & Sen, Hazen, and Beyer. ur Rehman et al. (2022) compared the multi-expression programming, GEP, and ANN in assessing K using a large database. The researchers concluded that GEP predicted K with high accuracy. In addition, it was noted that particle size at 10% finer (D10) is the most influencing input variable in assessing K. Granata et al. (2022) used ANN, RF, and SVM approaches to compute the K of soil. Based on the performance comparison, the authors found that SVM and RF models are more accurate than ANN. Pham and Won (2022) reported that the extreme gradient boosting (XGB) approach based on PTF is highly capable of predicting soil K. It was also noted that clay content is the most significant variable in assessing the K. Hosseini et al. (2022) used soil texture to predict the K of soil by applying genetic and neural network approaches. The authors recorded a residual of 1.22 and regression coefficient of 0.997 for the neural network model in predicting K of soil, comparatively better than geo-statistics and genetic models. Thakur et al. (2022) assessed the hydraulic conductivity of porous media using ANFIS, triangular, GPR, and SVM models. In this published work, model ANFIS outperformed the triangular, GPR, and SVM models with an RMSE of 0.0010. Tan et al. 
(2022) predicted K of geosynthetic clay liners with a validation performance of 85%. More et al. (2022) applied extreme learning machine (ELM), SVM, and ANFIS approach to estimate saturated hydraulic conductivity for tropical semi-arid zones. The researchers reported that model ELM achieved Nash–Sutcliffe efficiency (NSE) of 0.90, better than the other two approaches. Tao et al. (2022) mapped the relationship between particle size and the KUST of soil. Gupta et al. (2021) used RF to assess soil K. The RF model predicted the K of soil with an accuracy of 79% and RMSE of 0.72. Williams and Ojuri (2021) compared ANN and MLR models in predicting the hydraulic conductivity of soil. The authors reported that model ANN has gained an accuracy of 95.5%, higher than the MLR model. Mujtaba et al. (2021) mapped a relationship between hydraulic conductivity and gradational parameters of sandy soil. The researchers noted that D10 has a healthy relationship with K. Peters et al. (2021) estimated the hydraulic conductivity of medium to dry soil using a water retention curve. Yan et al. (2021) predicted the effect of biochar on the saturated K of natural and artificial media. The researchers noted that the hydraulic conductivity decreases because of an increase in inter-porosity due to bio-char and a decrease in mean pore radii. Rout and Singh (2021) introduced empirical models using hydraulic conductivity and basic soil properties. The proposed empirical model predicted K with ± 20% intervals.

Chen and Zhang (2020) estimated the K of frozen soil. A discontinuous noncircular capillary bundle model was introduced for this aim using modified Hagen-Poiseuille, Kelvin, and Campbell equations. Kashani et al. (2020) implemented MARS, M5 tree, SVM, ELM, and ANN approaches to assess the hydraulic conductivity of soil using electrical conductivity, pH, bulk density, organic matter, clay, and silt parameters as input variables. Based on the performance metrics, model ANN achieved the highest performance compared to other models, i.e., NSE = 0.939 in training and NSE = 0.917 in testing. Arshad et al. (2020) derived empirical models and reported that void ratio and grain size characteristics are significant parameters in predicting the hydraulic conductivity of sandy soils. Trejo-Alonso et al. (2020) introduced a pedo-transfer function using 900 data points to assess the K of the soil. The proposed models assessed K with over 99% accuracy. Babaoglu and Simms (2020) improved K estimation for soft clayey soil. The authors reported that (i) the K – a high-void ratio can improve the void ratio, and (ii) the compressibility curve can be a predictor. Sihag et al. (2020) employed ANN, GPR, GEP, and GRNN approaches to predict the infiltration process using 155 data points. Ming et al. (2020) assessed the K of frozen soil from the soil freezing characteristics curve. Sihag et al. (2019a) employed ANFIS, firefly (FFA), and particle swarm (PSO) algorithm-optimized ANFIS models to assess the hydraulic conductivity. These models were trained and tested with 170 and 70 data points, respectively. The ANFIS-PSO model outperformed the ANFIS-FFA and traditional ANFIS models with a correlation coefficient of 0.9816 in the testing phase. In addition, Sihag et al. (2019b) compared RF, M5P, and regression models in estimating the KUST field. The RF model attained the highest performance, i.e., 0.819 in the testing phase, compared with the other models. Sihag et al. 
(2019c) mapped a comparison between regression analysis, ANN, and ANFIS and found that the regression model MLR (RMSE = 4.5578) is better than other models. Naganna and Deka (2019) compared SVM, ANN, and ANFIS models to introduce the best prediction approach. The comparison of performance metrics shows that the SVM model is the best approach for predicting streambed hydraulic conductivity. Al-Dosary et al., (2019) implemented GPR, linear regression (LR), and multilayer perceptron (MLP) approaches to assess the KUST of sandy loam soil. In the published work, the GPR model outperformed the LR and MLP models. Sihag (2018) estimated the KUST of soil by implementing fuzzy logic-FL (based on triangular and Gaussian) and ANN models. The researcher concluded that the fuzzy logic model based on Gaussian attained R of 0.9270 and RMSE of 7.4393, better than ANN and fuzzy logic (based on triangular) models. More and Deka (2018) employed fuzzy neural networks (FNN), ANN, FL, and MLR using 175 data points to measure the K for murum soils. The authors concluded that the FNN model attained an accuracy of over 85%, higher than the accuracy of ANN, FL, and MLR models. Nematolahi et al. (2018) employed GA and PSO-optimized fuzzy inference system (FIS) models to assess the K. The PSO-optimized FIS model attained an accuracy of over 70%, higher than conventional FIS and GA-optimized FIS models.

Also, Mady and Shein (2018), Qaderi et al. (2018), Fatoba et al. (2018), and Shi and Yin (2018) reported that the SVM, nonlinear regression, GMDH, harmony search-optimized GMDH, and ANN models can predict the K of soil. Table 1 summarizes the published research on the assessment of hydraulic conductivity of soil.

Table 1 Summary of ML models used in the literature

The published research reveals that most researchers employed MLR, GPR, GEP, MEP, SVM, DT, MARS, ANFIS, LSSVM, GMDH, ANN, and hybrid (WANN, SVM_GA, ANFIS_FFA, ANFIS_PSO, and HS_GMDH) approaches to predict the K of soil. These researchers also concluded that the ANN approach gives the most promising results of soil hydraulic conductivity. Still, the effect of structural multicollinearity on the performance of ANN models in predicting hydraulic conductivity has not been studied and analyzed. In addition, the backpropagation algorithms of neural networks have not been compared for designing the optimal performance ANN model. Also, the effect of multicollinearity levels on the ANN model has not been studied and analyzed. Based on the gap identified in the published work, the present research has the following novelty:

  • This research illustrates the effect of structural multicollinearity on artificial neural network models, built with one to five hidden layers of 5, 10, or 15 neurons each, in predicting the hydraulic conductivity of clayey soil.

  • This research compares Gradient Descent with Adaptive Learning (GDA), Gradient Descent (GD), Gradient Descent with Momentum (GDM), Scaled Conjugate Gradient (SCG), Broyden–Fletcher–Goldfarb–Shanno (BFGS), and Levenberg–Marquardt (LM) backpropagation algorithms to design an optimal performance ANN model.

  • The effect of multicollinearity levels is studied and analyzed for each artificial neural network in predicting the hydraulic conductivity of clayey soil.

  • This research introduces an optimal performance ANN model with the best hyperparameters for predicting the hydraulic conductivity of clayey soil.

The hydraulic conductivity of clayey soil is determined by performing the falling head test, which is time-consuming. Therefore, several investigators applied traditional and advanced methods to assess the hydraulic conductivity of soil. The advanced methods are based on machine learning (ML). The artificial neural network is an ML technique that can predict accurately, but its accuracy depends strongly on the selected number of hidden layers and neurons. The present research helps engineers choose the number of hidden layers and neurons for artificial neural networks to assess the hydraulic conductivity of compacted clayey soil. This research will also reduce the laboratory efforts of geotechnical engineers in assessing hydraulic conductivity, and it introduces the best backpropagation algorithm for developing neural network models.

Research methodology

This research introduces an optimal-performance artificial neural network model for predicting the hydraulic conductivity of clayey soil. In addition, this research compares the predictive capabilities of Gradient Descent with Adaptive Learning (GDA), Gradient Descent (GD), Gradient Descent with Momentum (GDM), Scaled Conjugate Gradient (SCG), Broyden–Fletcher–Goldfarb–Shanno (BFGS), and Levenberg–Marquardt (LM) backpropagation algorithms to find the best backpropagation algorithm. For this aim, a database with results of soil texture, consistency limits, compaction parameters, and hydraulic conductivity of 104 soil specimens has been compiled from the published articles by Benson et al. (1994) and Benson and Trast (1995). The multicollinearity analysis has been performed to determine the collinearity levels for input variables. In addition, ANOVA and Z tests have been performed to test the hypothesis for the present research. A cosine amplitude sensitivity analysis has been performed to determine the significant input variables in predicting the hydraulic conductivity of soil. The training, validation, and testing databases have been created by arbitrarily selecting 80, 12, and 12 data points (soil samples). One to five hidden layers, each with 5, 10, or 15 neurons, have been selected for developing ANN models. Thus, fifteen ANN models have been developed for each backpropagation algorithm. Fourteen performance metrics, RSR, LMI, MBE, WI, NMBE, BF, PI, NS, WMAPE, VAF, MAPE, R, MAE, and RMSE, have been used to measure the performance and accuracy of the trained ANN models. In addition, three novel performance metrics, a20-index, index of scatter, and index of agreement, have been implemented for measuring performance. One best architectural model is identified from each backpropagation algorithm by comparing the performance metrics. 
Thus, the six best architectural ANN models have been obtained and further analyzed by REC curve, rank, uncertainty, and Wilcoxon analysis. The research hypothesis (HR) for the normality of predicted hydraulic conductivity of clayey soil has been checked by performing the Anderson–Darling (AD) test. Finally, one optimal performance ANN model has been identified for predicting the hydraulic conductivity of clayey soil. The accuracy of the optimal performance ANN model has been validated by published models. The robustness of the optimal performance ANN model has been determined by cross-validation (cost computation) and external validation (generalizability). The logic behind this methodology is to select the hyperparameters to design the optimal performance ANN model to predict the K of soil without the hit and trial method. Figure 1 depicts the flow chart for the execution of the work.

Fig. 1 Flow chart of the work
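The model grid described in the methodology can be enumerated with a short sketch. It assumes that every hidden layer within a model has the same number of neurons, which is one reading of "one to five hidden layers interconnected with 5, 10, and 15 neurons"; the dictionary keys and the function name are hypothetical and are not taken from the original study.

```python
from itertools import product

# Abbreviations follow the text; the data layout below is a hypothetical choice.
ALGORITHMS = ["GDA", "GD", "GDM", "SCG", "BFGS", "LM"]
HIDDEN_LAYERS = [1, 2, 3, 4, 5]
NEURONS_PER_LAYER = [5, 10, 15]


def candidate_models():
    """Return one configuration per (algorithm, depth, width) combination."""
    configs = []
    for algorithm, depth, width in product(ALGORITHMS, HIDDEN_LAYERS, NEURONS_PER_LAYER):
        configs.append({
            "algorithm": algorithm,
            "hidden_layers": depth,
            "neurons": width,
            # Assumption: every hidden layer in a model has the same width.
            "layout": (width,) * depth,
        })
    return configs


models = candidate_models()
per_algorithm = sum(1 for m in models if m["algorithm"] == "LM")
print(len(models), per_algorithm)  # 90 15
```

Five depths times three widths give the fifteen architectures per backpropagation algorithm stated in the text, and six algorithms give ninety candidate models in total.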

Data collection and analysis

A raw database from the published research by Benson et al. (1994) and Benson and Trast (1995) has been compiled to execute this research (refer to Appendix, Table D). The database consists of soil texture (S, M, C), consistency limits (LL, PI), compaction parameters (OMC, MDD), and hydraulic conductivity (K) results of clayey soil. Most researchers implemented ML models using plastic limits in their published work to assess the geotechnical properties of fine-grained soil. Still, the plasticity index has not been used to predict the hydraulic conductivity of soil. It is known that a high PI indicates low hydraulic conductivity of soil. Therefore, this research uses the plasticity index (PI) as an input variable for ML models. After removing outliers and missing data points, one hundred and four data points have been retained and used in this research. Three databases, training, validation, and testing, have been constructed by arbitrarily selecting 80, 12, and 12 data points, respectively. The descriptive statistics of the complete (104), training (80), validation (12), and testing (12) databases are summarized in Table 2, and the frequency plot of data points is shown in Fig. 2.
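The arbitrary 80/12/12 selection can be sketched as a seeded shuffle of row indices. The seed value and the function name are assumptions made only for reproducibility of the illustration; the text does not state how the original selection was drawn.

```python
import random


def split_indices(n_total=104, n_train=80, n_val=12, n_test=12, seed=0):
    """Shuffle row indices and cut them into training, validation, and testing sets."""
    if n_train + n_val + n_test != n_total:
        raise ValueError("subset sizes must add up to n_total")
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)  # seed is an assumed, arbitrary value
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices()
print(len(train_idx), len(val_idx), len(test_idx))  # 80 12 12
```

Splitting indices rather than rows keeps the three subsets disjoint by construction and makes the partition easy to store alongside the database.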

Table 2 Descriptive statistics of databases
Fig. 2 Distribution of variables

Figure 2 depicts the frequency distribution of variables using the Lorenz curve. This curve helps to understand the distribution of the data variables. Gamma represents the anticipated change in Delta, with a maximum value of 1. The Gini coefficient (ɣ) varies from 0 to 1, representing no inequality to complete inequality. Moreover, the complete database of 104 data points has been classified as per IS 1498: 1970, as illustrated in Fig. 3.
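As an illustration of how a Lorenz curve and its Gini coefficient can be obtained for one variable, the following minimal sketch uses the trapezoidal area under the curve. It assumes non-negative values with a positive sum; the function names and the sample values are hypothetical and are not data from the study.

```python
def lorenz_curve(values):
    """Return the Lorenz curve as (cumulative share of samples, cumulative share of total)."""
    ordered = sorted(values)
    total = sum(ordered)
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, v in enumerate(ordered, start=1):
        running += v
        points.append((i / n, running / total))
    return points


def gini(values):
    """Gini coefficient = 1 - 2 * (trapezoidal area under the Lorenz curve)."""
    pts = lorenz_curve(values)
    area = sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))
    return 1.0 - 2.0 * area


print(gini([5, 5, 5, 5]))   # equal values: no inequality
print(gini([0, 0, 0, 10]))  # one sample holds everything: strong inequality
```

With a finite sample of n values, this estimate reaches at most (n - 1)/n, so it approaches 1 only for large databases.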

Fig. 3 Classification of database

Figure 3 shows that the database consists of results of inorganic silts with none to low plasticity (ML), inorganic clays of low plasticity (CL), organic silts of low plasticity (OL), inorganic silts of medium plasticity (MI), inorganic clays of medium plasticity (CI), organic silts of medium plasticity (OI), inorganic silts of high compressibility (MH), inorganic clays of high plasticity (CH), and organic clays of medium to high plasticity (OH). Because of the number of different soils available in the database, Pearson's product-moment correlation coefficient has been calculated for each variable and presented in Fig. 4.

Fig. 4 Correlation coefficient for variables available in the complete database

Figure 4 illustrates the relationship between the variables in terms of the correlation coefficient. A correlation of ± 1.0 to ± 0.81, ± 0.80 to ± 0.61, ± 0.60 to ± 0.41, ± 0.40 to ± 0.21, and ± 0.20 to ± 0.00 represents a very strong, strong, moderate, weak, and no relationship between the variables, respectively (Hair et al. 2017). Figure 4 shows that (a) S content very strongly (= -0.9323) correlates with F content, (b) LL (= 0.7537), PI (= 0.7089), OMC (= 0.6612), and MDD (= -0.6663) strongly correlate with F content, (c) S content also strongly correlates with LL (= -0.7668), PI (= -0.7207), OMC (= -0.7315), and MDD (= 0.7167), (d) LL (= -0.2851) and PI (= -0.2516) weakly correlate with specific gravity (SG), (e) OMC (= -0.1474) and MDD (= 0.1675) have no relationship with SG, (f) LL (= 0.9190) very strongly correlates with PI, (g) OMC (= 0.7532) and MDD (= -0.6962) strongly correlate with PI, (h) OMC and MDD very strongly (= -0.8703) correlate with each other, and (i) F content (= -0.4122), LL (= -0.4972), PI (= -0.5620), OMC (= -0.4354), and MDD (= 0.5295) moderately correlate with the hydraulic conductivity of clayey soil. The pairwise scatterplot, correlation coefficient matrix, variance inflation factor (VIF), and eigenvalue methods are used to determine the multicollinearity levels of a database (Shrestha 2020). The correlation coefficient values indicate multicollinearity among the variables. Therefore, another method, the variance inflation factor (VIF), has been used to determine the multicollinearity levels for the independent variables.
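The correlation bands quoted above can be applied programmatically. The sketch below computes Pearson's product-moment coefficient and maps it to the five bands; treating the band edges as "greater than 0.80, 0.60, 0.40, 0.20" is an assumption, since the text leaves values such as 0.805 unassigned, and the function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def relationship_strength(r):
    """Map |r| to the bands quoted from Hair et al. (2017); edge handling is assumed."""
    magnitude = abs(r)
    if magnitude > 0.80:
        return "very strong"
    if magnitude > 0.60:
        return "strong"
    if magnitude > 0.40:
        return "moderate"
    if magnitude > 0.20:
        return "weak"
    return "no relationship"


# Coefficients reported in the text for S-F, LL-F, PI-K, and MDD-SG
for r in (-0.9323, 0.7537, -0.5620, 0.1675):
    print(r, relationship_strength(r))
```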

Multicollinearity analysis

Multicollinearity, or collinearity, occurs between the variables during regression analysis. An extensive database is used for artificial intelligence techniques, which increases the chances of multicollinearity (Chan et al. 2022). Multicollinearity is reported to arise for the following reasons: (a) variables are not significantly correlated, (b) multiple regression analysis is performed, and (c) variables are highly correlated. For determining the multicollinearity levels, the variance inflation factor (VIF = \(1/(1-{R}^{2})\)) method has been used, where \({R}^{2}\) is the coefficient of determination obtained by regressing one input variable on the remaining input variables. Gareth et al. (2013) and Vittinghoff et al. (2006) regarded multicollinearity as problematic if the VIF value is more than 10. Menard (2002) suggested a considerable multicollinearity level based on the VIF value. Khatti and Grover (2023a) introduced five multicollinearity levels based on VIF values using the published statements, i.e., problematic multicollinearity (VIF > 10), moderate multicollinearity (5 < VIF ≤ 10), considerable multicollinearity (2.5 < VIF ≤ 5), weak multicollinearity (0 < VIF ≤ 2.5), and no multicollinearity (VIF = 0). Table 3 presents the multicollinearity levels for F, S, SG, LL, PI, OMC, and MDD variables in predicting the hydraulic conductivity of clayey soil.
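A minimal sketch of the VIF computation and of the five levels attributed to Khatti and Grover (2023a) is given below. It assumes NumPy is available and that each input variable is regressed on the remaining ones with an intercept by ordinary least squares; the function names and the small example matrix are hypothetical and are not data from the study.

```python
import numpy as np


def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j on the other columns."""
    X = np.asarray(X, dtype=float)
    n_rows, n_cols = X.shape
    vifs = []
    for j in range(n_cols):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n_rows), others])  # intercept + other inputs
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residual = target - design @ coef
        ss_res = float(residual @ residual)
        ss_tot = float(((target - target.mean()) ** 2).sum())
        r_squared = 1.0 - ss_res / ss_tot
        vifs.append(float("inf") if r_squared >= 1.0 else 1.0 / (1.0 - r_squared))
    return vifs


def multicollinearity_level(vif):
    """Levels as listed in the text for Khatti and Grover (2023a)."""
    if vif > 10:
        return "problematic"
    if vif > 5:
        return "moderate"
    if vif > 2.5:
        return "considerable"
    if vif > 0:
        return "weak"
    return "none"


# Hypothetical example: two orthogonal, zero-mean columns
example = [[1, 1], [-1, 1], [1, -1], [-1, -1]]
for value in variance_inflation_factors(example):
    print(round(value, 3), multicollinearity_level(value))
```

Because \(0 \le R^2 < 1\) for a regression with an intercept, the computed VIF is never below 1, so the "none" branch is kept only to mirror the levels as listed.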

Table 3 Multicollinearity levels for variables

Table 3 reveals that F (%), S (%), PI (%), and OMC (%) variables have moderate multicollinearity. Conversely, specific gravity and MDD (g/cc) have weak and considerable multicollinearity levels, respectively. The liquid limit of clayey soil shows the problematic multicollinearity in predicting the hydraulic conductivity of the soil.

Hypothesis analysis

The hypothesis analysis is performed for decision-making, inference, quality control, decision evaluation, risk assessment, and statistical inference. So, hypothesis testing is necessary to make informed judgments, draw inferences from data, and ensure that outcomes do not result from chance (Khatti and Grover 2023d, 2023e, 2023f). It does all of these things systematically and thoroughly. The following statements have been mapped for the present research for selecting the research hypothesis:

  • The soil textures, i.e., F and S contents, are the significant variables in assessing the hydraulic conductivity of clayey soil.

  • The liquid limit of soil increases due to an increase in fine content and a decrease in sand content.

For this purpose, ANOVA and Z tests have been performed in this research. The statistical test, Analysis of Variance (ANOVA), examines the variations in group means in a sample (Khatti et al. 2023). When comparing more than two groups, it is beneficial. ANOVA evaluates whether there is a statistically significant difference between these groups' means. The results of the ANOVA test are summarized in Table 4.
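The F statistic of a single-factor ANOVA can be sketched in a few lines. The example below is a generic implementation under the assumption that each test compares groups of observations by their between-group and within-group variance; the grouping used in the study is not restated here, the function name is hypothetical, and the sample numbers are illustrative only. The critical value is not computed and would be taken from an F table for the returned degrees of freedom.

```python
def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a single-factor ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: spread of group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group sum of squares: spread of values around their own group mean
    ss_within = sum(sum((v - m) ** 2 for v in g)
                    for g, m in zip(groups, group_means))
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_b, df_w = one_way_anova_f([1, 2, 3], [2, 3, 4])
print(f_stat, df_b, df_w)  # 1.5 1 4
```

The hypothesis condition quoted in the text, F > F crit, is then a comparison of `f_stat` against the tabulated critical value at the chosen significance level.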

Table 4 ANOVA test results

Table 4 demonstrates that each input variable, i.e., F (3302.25 > 3.89), S (241.84 > 3.89), SG (88.20 > 3.89), LL (2329.37 > 3.89), PI (1258.38 > 3.89), OMC (1957.13 > 3.89), and MDD (4.55 > 3.89), satisfies the research hypothesis (HR) condition (F > F crit). Hence, the ANOVA test accepts the HR for the present work. Moreover, another statistical hypothesis test, the Z test, has been performed. The Z test determines whether the sample mean is significantly different from a known population mean when the population standard deviation is known (Hosseini et al. 2023). The Z test results are summarized in Table 5.

Table 5 Z test results

Table 5 shows that each input variable satisfies the research hypothesis condition, i.e., Z > Z critical two-tail > Z critical one-tail, with both the one-tail and two-tail P values below 0.05. Hence, the Z test confirms the rejection of the null hypothesis for the present research.
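A one-sample Z test, as described in the text (sample mean against a known population mean with a known population standard deviation), can be sketched as follows. The function names and the sample values are hypothetical; the critical values 1.645 and 1.960 are the standard normal quantiles for a 0.05 significance level and are stated here as an assumption about the level used.

```python
from math import erf, sqrt

Z_CRITICAL_ONE_TAIL = 1.645  # standard normal, alpha = 0.05 (assumed level)
Z_CRITICAL_TWO_TAIL = 1.960


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def one_sample_z_test(sample, population_mean, population_sd):
    """Return (z, one-tail P, two-tail P) for H0: sample mean equals population mean."""
    n = len(sample)
    sample_mean = sum(sample) / n
    z = (sample_mean - population_mean) / (population_sd / sqrt(n))
    p_one_tail = 1.0 - normal_cdf(abs(z))
    p_two_tail = 2.0 * p_one_tail
    return z, p_one_tail, p_two_tail


z, p1, p2 = one_sample_z_test([1, 3, 5, 7], population_mean=3.0, population_sd=2.0)
reject_null = abs(z) > Z_CRITICAL_TWO_TAIL and p2 < 0.05
print(z, reject_null)  # 1.0 False
```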

Cosine amplitude sensitivity analysis

This analysis reveals the most significant input variables in predicting the hydraulic conductivity of clayey soil. The nonlinear cosine amplitude method (CAM) has been used for this aim. The sections mentioned earlier show that this work uses F, S, LL, PI, OMC, and MDD as input variables to assess the hydraulic conductivity of clayey soil. The sensitivity of input variables is determined by applying the following equation (Hasanzadehshooiili et al. 2012):

$$x= \left\{{x}_{1},{x}_{2},{x}_{3}\dots {x}_{m},\dots ,{x}_{n}\right\}$$
(1)

Each variable \({x}_{i}\) in the array x is a vector of length m:

$${x}_{i}= \left\{{x}_{i1},{x}_{i2},{x}_{i3}\dots {x}_{im}\right\}$$
(2)

The relationship between CAM (strength of the relation) and database of (\({x}_{i}\)) and (\({x}_{j}\)) is presented by the following equation (Ghorbani et al. 2020):

$$CAM=\frac{\sum_{k=1}^{m}{x}_{ik}{x}_{jk}}{\sqrt{\sum_{k=1}^{m}{x}_{ik}^{2}\sum_{k=1}^{m}{x}_{jk}^{2}}}$$
(3)

A CAM value close to one indicates that the specific input variable is highly significant in the prediction, whereas a CAM value close to zero indicates the least significance. Figure 5 depicts the sensitivity of input variables F, S, LL, PI, OMC, and MDD in predicting the hydraulic conductivity of clayey soil. Figure 5 shows that sand (= 0.7187), specific gravity (= 0.7150), and MDD (= 0.7343) are significant input variables in predicting the hydraulic conductivity of clayey soil. It can be seen that the fine content (= 0.6548) also influences the hydraulic conductivity of clayey soil, followed by OMC (= 0.6423), LL (= 0.6324), and PI (= 0.5986).
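Equation (3) translates directly into code. The sketch below evaluates the cosine amplitude between one input vector and the output vector; the function name and the sample vectors are hypothetical, and whether the study scaled the variables before applying the method is not stated, so no scaling is applied here.

```python
from math import sqrt


def cosine_amplitude(x_i, x_j):
    """CAM strength between two equal-length vectors, following Eq. (3)."""
    numerator = sum(a * b for a, b in zip(x_i, x_j))
    # Raises ZeroDivisionError if either vector is all zeros
    denominator = sqrt(sum(a * a for a in x_i) * sum(b * b for b in x_j))
    return numerator / denominator


print(cosine_amplitude([1, 2, 3], [2, 4, 6]))  # proportional vectors
print(cosine_amplitude([1, 0], [0, 1]))        # orthogonal vectors
```

Proportional vectors give a value of one and orthogonal vectors give zero, which matches the interpretation of CAM values given in the text.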

Fig. 5 Depiction of sensitivity analysis

Performance metrics

Performance metrics, which are diverse statistical factors, are used to assess the efficiency of soft computing models. Both linear and nonlinear indicators of performance are used. Sixteen performance metrics have been used in this study to evaluate the performance of machine learning models and check the reliability of the best architectural model. The performance metrics are expressed mathematically as follows (Kumar and Samui 2020; Asteris et al. 2021a, 2021b; Khatti and Grover 2021, 2023b, 2023c):

$$RMSE= \sqrt{\frac{1}{n}\sum_{i=1}^{n}{\left(\alpha -\omega \right)}^{2}}$$
(4)
$$MAE= \frac{1}{n}\sum_{i=1}^{n}\left|\left(\omega -\alpha \right)\right|$$
(5)
$${R}^{2}=\frac{\sum_{i=1}^{n}{\left(\alpha -\beta \right)}^{2}-\sum_{i=1}^{n}{\left(\alpha -\omega \right)}^{2}}{\sum_{i=1}^{n}{\left(\alpha -\beta \right)}^{2}}$$
(6)
$$R= \frac{\sum \left({\alpha }_{i}-\beta \right)\left({\omega }_{i}-\overline{\omega }\right)}{\sqrt{\sum {\left({\alpha }_{i}-\beta \right)}^{2}\sum {\left({\omega }_{i}-\overline{\omega }\right)}^{2}}}$$
(7)
$$MAPE= \frac{1}{n}{\sum }_{i=1}^{n}\left|\frac{\alpha -\omega }{\alpha }\right|*100$$
(8)
$$WMAPE=\frac{{\sum }_{i=1}^{n}\left|\frac{\alpha -\omega }{\alpha }\right|*\alpha }{{\sum }_{i=1}^{n}\alpha }$$
(9)
$$VAF= \left(1-\frac{var\left(\alpha -\omega \right)}{var\left(\alpha \right)}\right)*100$$
(10)
$$PI= {R}^{2}+\left({~}^{VAF}\!\left/ \!{~}_{100}\right.\right)-RMSE$$
(11)
$$RSR=\frac{RMSE}{\sqrt{\frac{1}{N}} {\sum }_{i=1}^{N}{\left(\alpha -\beta \right)}^{2}}$$
(12)
$$NS=1\frac{{\sum }_{i=1}^{n}{\left(\alpha -\omega \right)}^{2}}{{\sum }_{i=1}^{n}{\left(\alpha -\beta \right)}^{2}}$$
(13)
$$BF=\frac{1}{n}{\sum }_{i=1}^{n}\frac{\omega }{\alpha }$$
(14)
$$a20 index=\frac{m20}{H}$$
(15)
$$IOA=1-\frac{{\sum }_{i=1}^{n}\left(\omega -\alpha \right)}{2{\sum }_{i=1}^{n}\left(\alpha -\beta \right)}$$
(16)
$$IOS=\frac{RMSE}{Avg.\;of\;Actual \;Values}$$
(17)
$$LMI=1-\left[\frac{{\sum }_{i=1}^{n}\left|\alpha -\omega \right|}{{\sum }_{i=1}^{n}\left|\alpha -\upbeta \right|}\right]$$
(18)
$$NMBE=\frac{\frac{1}{N}{\sum }_{i=1}^{n}{\left(\omega -\alpha \right)}^{2}}{\frac{1}{N}{\sum }_{i=1}^{n}\alpha }$$
(19)
$$MBE=\frac{1}{N}{\sum }_{i=1}^{n}\left(\omega -\alpha \right)$$
(20)

where α and \(\omega\) are the ith real and assessed values, n is the total number of data, β is the mean of the real values, \(\overline{\omega }\) is the mean of the assessed values, \(k\) is the number of independent variables, m20 is the number of samples for which the ratio of real to assessed value lies between 0.8 and 1.2, and H is the total number of data samples. The index of agreement (IOA) is bounded by -1.0 and 1.0 (Willmott et al. 2012). Moreover, a scatter index (IOS) value close to zero indicates an excellent prediction and accuracy (Mentaschi et al. 2013). A computational model is considered reliable and accurate if it achieves \({R}^{2}\) over 0.95. Also, a weak, good, and strong relationship between actual and computed data is indicated if R is less than 0.2, between 0.2 and 0.8, and more than 0.8, respectively (Smith 1986). A perfect predictive model has performance indicator values equal to the ideal values given in Table 6 (Bahmed et al. 2024; Daniel et al. 2024).
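
Several of the metrics above can be sketched in a few lines of Python. This is a minimal illustration, not the study's MATLAB implementation: the function name `metrics` is hypothetical, `actual` and `predicted` stand for α and ω, population variance is assumed for VAF, PI is computed with the \({R}^{2}\) of Eq. (6), and a20 is returned as a fraction rather than a percentage.

```python
import math
from statistics import mean, pvariance

def metrics(actual, predicted):
    """Return a subset of the performance metrics of Eqs. (4)-(20)."""
    n = len(actual)
    beta = mean(actual)                                  # mean of real values
    err = [a - w for a, w in zip(actual, predicted)]     # alpha - omega
    sse = sum(e * e for e in err)                        # sum of squared errors
    sst = sum((a - beta) ** 2 for a in actual)           # squared deviation from mean
    sae = sum(abs(e) for e in err)                       # sum of absolute errors
    sad = sum(abs(a - beta) for a in actual)             # absolute deviation from mean
    rmse = math.sqrt(sse / n)
    r2 = (sst - sse) / sst
    vaf = (1 - pvariance(err) / pvariance(actual)) * 100
    # m20: samples whose real-to-assessed ratio lies between 0.8 and 1.2
    m20 = sum(1 for a, w in zip(actual, predicted) if 0.8 <= a / w <= 1.2)
    return {
        "RMSE": rmse,
        "MAE": sae / n,
        "R2": r2,
        "VAF": vaf,
        "PI": r2 + vaf / 100 - rmse,
        "RSR": rmse / math.sqrt(sst / n),
        "NS": 1 - sse / sst,
        "a20": m20 / n,
        "IOA": 1 - sae / (2 * sad),
        "IOS": rmse / beta,
        "LMI": 1 - sae / sad,
        "MBE": sum(w - a for a, w in zip(actual, predicted)) / n,
    }

# Hypothetical values for illustration only.
result = metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

For a perfect prediction the sketch returns RMSE = 0, \({R}^{2}\) = 1, VAF = 100, and PI = 2, which follows directly from the definitions above.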

Table 6 Ideal value of the different performance indicators

Computational approach

An artificial neural network (ANN), also known simply as a neural network, is a computational model patterned after the structure and operation of biological neural networks, such as the human brain. Neural networks are among the artificial intelligence (AI) and machine learning techniques utilized for tasks such as classification, regression, and pattern recognition. The fundamental components of an ANN are the neurons, layers, weights and biases, activation functions, feedforward process, backpropagation algorithm, and loss and cost functions. The backpropagation algorithm (BA) must be selected carefully because it distributes the prediction error back through the network by updating the neurons' weights. The GDM, GD, GDA, SCG, BFG, and LM algorithms have been compared to find the best BA. Ninety ANN models (fifteen for each BA) have been developed, trained, and analyzed. Each ANN model has been configured as a feedforward backpropagation multilayer perceptron with a min-gradient of 10e-7, max fail of 6, momentum of 0.001, 1000 epochs, a train: valid: test split of 76%: 12%: 12%, a sigmoid activation function at the hidden layers, a linear activation function at the output layer, a log function for output normalization, a min–max function for input normalization, and one to five hidden layers (HL), each with 5, 10, or 15 neurons (N). The reasons for selecting the sigmoid function in this research are (a) most of the published work used the sigmoid function, (b) the smoothness of the sigmoid allows for stable and continuous updates to model parameters, and (c) it is easy to set a threshold (e.g., 0.5) for decision-making. Table 7 summarizes the designations of the ANN models for the LM, BFG, SCG, GDM, GD, and GDA algorithms.

Table 7 Designation of ANN models
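
The model grid described above (six algorithms, one to five hidden layers, and 5, 10, or 15 neurons) can be enumerated with a short Python sketch. The K-numbering used in `designation` is an assumption inferred from the model descriptions in the text (e.g., LM_K15 with five hidden layers and 15 neurons); Table 7 remains the authoritative designation. The dictionary keys and the `min_max` helper are likewise hypothetical names.

```python
from itertools import product

ALGORITHMS = ["LM", "BFG", "SCG", "GDM", "GD", "GDA"]
HIDDEN_LAYERS = [1, 2, 3, 4, 5]
NEURONS = [5, 10, 15]

def designation(algorithm, hidden_layers, neurons):
    """Assumed numbering: K1..K15 ordered by hidden layers, then neurons."""
    index = len(NEURONS) * (hidden_layers - 1) + NEURONS.index(neurons) + 1
    return f"{algorithm}_K{index}"

# One entry per model; the shared settings follow the configuration in the text.
configs = [
    {
        "name": designation(a, h, n),
        "algorithm": a,
        "hidden_layers": h,
        "neurons": n,
        "epochs": 1000,
        "max_fail": 6,
        "hidden_activation": "sigmoid",
        "output_activation": "linear",
    }
    for a, h, n in product(ALGORITHMS, HIDDEN_LAYERS, NEURONS)
]

def min_max(values):
    """Min-max normalization of one input column to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

print(len(configs))  # number of ANN models in the grid
```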

Results and discussion

Simulation of results

The training (TRG), validation (VDN), and testing (TSG) performance results for the developed, trained, and analyzed ANN models are summarized in Appendix – I (Tables A, B, and C). Table A shows that model LM_K15 has an excellent TRG performance, i.e., R = 0.9924, NS = 0.9846, PI = 1.8038, BF = 0.9923, WI = 0.9758, MBE = 0.0175, LMI = 0.1390. Model LM_K15 assesses the hydraulic conductivity with a minor prediction error, i.e., NMBE = 0.0204, WMAPE = 0.0978, MAPE = 0.1984, MAE = 0.1321, and RMSE = 0.1658. The TRG performance comparison reveals that model LM_K15 achieves VAF of 98.48, RSR of 0.1241, IOA of 0.9305, IOS of 0.1227, and a20 of 68.75. The hydraulic conductivity of twelve soil samples is computed to validate model LM_K15. The VDN performance comparison reveals that model LM_K15 attains R of 0.9719, IOA of 0.8834, IOS of 0.3258, a20 of 72.00, RSR of 0.2601, LMI of 0.2332, MBE of 0.0227, WI of 0.9791, BF of 1.1231, PI of 1.3859, NS of 0.9668, and VAF of 96.89, which are better than those of the other LM_ANN models and close to the ideal values. Model LM_K15 predicts the hydraulic conductivity with the least validation residuals, i.e., NMBE = 0.1202, RMSE = 0.4928, WMAPE = 0.2168, MAE = 0.3212, and MAPE = 0.4536. The TSG phase reveals that model LM_K15 estimates the hydraulic conductivity with R of 0.9959, a20 of 100, IOA of 0.9364, IOS of 0.0306, VAF of 99.16, NS of 0.9882, PI of 1.9348, RSR of 0.1084, LMI of 0.1272, MBE of 0.0260, WI of 0.9339, and BF of 1.0163. The remaining TSG performance metrics confirm the prediction capabilities of model LM_K15, i.e., WMAPE = 0.0264, MAPE = 0.0279, NMBE = 0.0015, RMSE = 0.0487, and MAE = 0.0419. A thorough analysis demonstrates that model LM_K15, configured with five hidden layers and 15 neurons, attains the highest TRG, VDN, and TSG performance. The overall performance analysis for the LM_ANN models reveals that the predictive capabilities increase with an increasing number of neurons (up to 15) and hidden layers (up to 5). Figure 6 (a) presents the statistical relationship between experimental and predicted hydraulic conductivity using the LM_K15 model in the TRG, VDN, and TSG phases.

The performance comparison for the BFG_ANN models reveals that model BFG_K9, configured with three hidden layers interconnected by 15 neurons, attains the highest performance (PI = 1.2661, NS = 0.8670, BF = 1.3353, WI = 0.9390, a20 = 43.75, R = 0.9394, RMSE = 0.4233, NMBE = 0.1203, MAPE = 0.6616, MAE = 0.3947, and WMAPE = 0.2921) in the TRG phase. Similarly, model BFG_K9 has VDN (BF = 1.3080, NMBE = 0.3505, WI = 0.9642, MBE = 0.1959, LMI = 0.4289, RSR = 0.3617, a20 = 56.67, IOA = 0.7955, IOS = 0.4719, PI = 1.0595, NS = 0.8692, WMAPE = 0.3295, VAF = 89.83, MAPE = 0.7589, R = 0.9613, MAE = 0.5213, RMSE = 0.7428) and TSG (R = 0.9508, PI = 1.6514, a20 = 100, RMSE = 0.1466, VAF = 89.39, NMBE = 0.0135, MAE = 0.1057) performances higher than those of the other BFG_ANN models and close to the ideal values. Figure 6 (b) depicts the relationship between actual and predicted hydraulic conductivity using model BFG_K9. The TRG, VDN, and TSG comparisons show that the performance of the BFG_ANN models increases with an increasing number of hidden layers and neurons, up to three hidden layers interconnected with 15 neurons. Beyond this configuration, the performance of the BFG_ANN models decreases as more hidden layers and neurons are added.

The performance comparison for SCG_ANN models shows that model SCG_K10 achieves a good predictive performance in the TRG (R = 0.9556, VAF = 90.52, NS = 0.8923, PI = 1.3797, BF = 1.1227, a20 = 46.25, WI = 0.9386, RMSE = 0.4386, IOA = 0.8079, WMAPE = 0.5843, MAE = 0.3651, IOS = 0.3246, MAPE = 0.5843), VDN (RMSE = 0.8138, VAF = 88.91, PI = 0.8838, a20 = 36.67, MAE = 0.6906, IOS = 0.5181, NS = 0.8320), and TSG (a20 = 100, IOS = 0.0797, IOA = 0.8444, R = 0.9728, MAPE = 0.0663, MAE = 0.1026, WMAPE = 0.0646, NS = 0.9203, PI = 1.7593, RMSE = 0.1267, RSR = 0.2822) phases. Figure 6 (c) illustrates the regression relationship between actual and assessed hydraulic conductivity of clayey soil using model SCG_K10, configured with four hidden layers interconnected with five neurons.

In the case of the GDM_ANN models, the TRG performance comparison shows that model GDM_K12 assesses the hydraulic conductivity with a minimum residual (NMBE = 0.0576, WMAPE = 0.1683, MAE = 0.2274, RMSE = 0.2790, MAPE = 0.3413) and high performance (RSR = 0.2088, MBE = 0.0145, WI = 0.9605, NS = 0.9564, PI = 1.6372, R = 0.9797). In the VDN phase, model GDM_K12 attains RMSE of 0.5344, R of 0.9704, VAF of 94.08, PI of 1.3482, RSR of 0.2602, IOA of 0.8627, and MAE of 0.3487, comparatively better than the other GDM_ANN models. Model GDM_K12 also attains outstanding performance (RMSE = 0.1706, MAE = 0.1039, MAPE = 0.0683, WMAPE = 0.0688, NMBE = 0.0179, R = 0.9511, VAF = 86.22, PI = 1.5908, RSR = 0.3192) compared to the other GDM_ANN models in the TSG phase. Model GDM_K12 is configured with three hidden layers interconnected with fifteen neurons. The statistical relationship between experimental and predicted hydraulic conductivity is shown in Fig. 6 (d).

Furthermore, model GD_K7 outperforms the other GD_ANN models with an acceptable TRG (IOS = 0.3937, RSR = 0.3980, BF = 1.3788, VAF = 85.02, R = 0.9240, NMBE = 0.2094, LMI = 0.4525, IOA = 0.7737, RMSE = 0.5319, PI = 1.1720, WMAPE = 0.3183), VDN (NMBE = 0.2765, WI = 0.9703, LMI = 0.4085, RSR = 0.3212, VAF = 90.73, R = 0.9711, MAE = 0.5186, RMSE = 0.6597, a20 = 35.00, and IOS = 0.4191), and TSG (NS = 0.7913, PI = 1.5055, BF = 0.9718, NMBE = 0.0264, WI = 0.8006, LMI = 0.4991, RSR = 0.4568, a20 = 100, MAE = 0.1645, RMSE = 0.2050) performance, close to the ideal values. Figure 6 (e) depicts the relationship between actual and predicted hydraulic conductivity using the GD_K7 model.

Moreover, the performance comparison for the GDA_ANN models reveals that model GDA_K12 predicts hydraulic conductivity with a good TRG (RMSE = 0.3088, MAE = 0.2515, R = 0.9730, MAPE = 0.3451, VAF = 94.66, WMAPE = 0.1862, NS = 0.9466, PI = 1.5846, BF = 1.0411, NMBE = 0.0706, WI = 0.9530, MBE = -0.0035, LMI = 0.2647, RSR = 0.2310, a20 = 48.75, IOS = 0.2285, and IOA = 0.8677), VDN (R = 0.9650, VAF = 89.62, a20 = 66.67, RMSE = 0.6639, PI = 1.1634, IOS = 0.4218, MAE = 0.5150, IOA = 0.7972, NS = 0.8955, WMAPE = 0.3271), and TSG (R = 0.9685, VAF = 90.29, NS = 0.9029, PI = 1.7009, a20 = 91.67, RMSE = 0.1399, MAPE = 0.0551, RSR = 0.3117) performance. The overall performance analysis for the GDA_ANN models reveals that the predictive capabilities increase with an increasing number of neurons (up to 15) and hidden layers (up to 4). Figure 6 (f) presents the statistical relationship between experimental and predicted hydraulic conductivity using the GDA_K12 model in the TRG, VDN, and TSG phases. Finally, the six best architectural ANN models (one from each backpropagation algorithm) have been identified to predict the hydraulic conductivity of clayey soil.

Fig. 6 Presentation of statistical relationship between experimental and predicted hydraulic conductivity of clayey soil using model (a) LM_K15, (b) BFG_K9, (c) SCG_K10, (d) GDM_K12, (e) GD_K7, and (f) GDA_K12

Following the simulation of results, the results are also interpreted visually. The regression error characteristics (REC) curve is analyzed and discussed first, followed by rank, uncertainty, and Wilcoxon analyses. The Anderson–Darling test has been performed to assess the normality of the predicted hydraulic conductivity in comparison with the actual hydraulic conductivity of clayey soil.

REC curve

Regression Error Characteristic (REC) curves show how well a regression model performs. The REC graph plots the percentage of samples predicted within a tolerance interval against the absolute deviation tolerance. The resulting curve estimates the cumulative distribution function of the error. The area over the REC curve (AOC), obtained by subtracting the area under the REC curve from one, provides a biased estimate of the expected error. The coefficient of determination (\({R}^{2}\)) can also be computed from the AOC (Tahmassebi et al. 2018; Bi and Bennett 2003). The MATLAB R2020a framework has been used to plot the REC curve for the best architectural models, i.e., LM_K15, BFG_K9, SCG_K10, GDM_K12, GD_K7, and GDA_K12. Figure 7 (a-c) illustrates the REC plots of the best architectural models in the TRG, VDN, and TSG phases, and Table 8 gives the AOC values for each best architectural model. The AOC values given in Table 8 demonstrate that model LM_K15 has predicted the hydraulic conductivity of clayey soil with the least AOC (TRG = 4.09E-04, VDN = 3.38E-03, and TSG = 2.94E-05) and is recognized as the optimal performance neural network model.

Table 8 AOC details for the best architectural models
Fig. 7 Representation of REC curve for the best architectural models in (a) training, (b) validation, and (c) testing phase
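
The construction of a REC curve and its AOC can be sketched in Python as follows. This is a minimal illustration that assumes the absolute deviation as the error measure and integrates the curve as a step function up to the largest error; the study's MATLAB implementation may use a different loss or normalization, so the numbers are not comparable with Table 8. The function names and sample values are hypothetical.

```python
def rec_curve(actual, predicted):
    """Return REC points (tolerance, accuracy) using absolute deviation."""
    errors = sorted(abs(a - p) for a, p in zip(actual, predicted))
    n = len(errors)
    points = []
    for i, e in enumerate(errors, start=1):
        # Emit one point per distinct error value: the fraction of
        # samples predicted within tolerance e.
        if i == n or errors[i] > e:
            points.append((e, i / n))
    return points

def area_over_curve(points):
    """Step-wise area between the REC curve and accuracy = 1."""
    aoc, prev_tol, prev_acc = 0.0, 0.0, 0.0
    for tol, acc in points:
        aoc += (tol - prev_tol) * (1.0 - prev_acc)
        prev_tol, prev_acc = tol, acc
    return aoc

# Hypothetical values for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 2.2, 2.8, 4.5]
points = rec_curve(actual, predicted)
aoc = area_over_curve(points)
```

With this step-wise integration, the AOC equals the mean absolute error of the sample, which illustrates why the AOC serves as an estimate of the expected error.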

Rank analysis

Another easy method for comparing model performance is rank analysis (Khatti et al. 2024). In this technique, the model with the best value for each performance parameter is given a score of n (in this study, n = 6, the number of computational models considered in the analysis), and the model with the worst value for the same performance parameter is given a score of 1, separately for the training and testing results. The scores of each model are then added up, and the final score of a model is calculated from the combined scores of the individual phases (Asteris et al. 2021c). Table 9 presents the details of the rank analysis for the best architectural models in the TRG, VDN, and TSG phases. Table 9 demonstrates that model LM_K15 has obtained scores of 101, 97, and 93 in the training, validation, and testing phases, higher than the other best architectural models. Model GDM_K12 has secured second rank with scores of 83, 78, and 43 in the TRG, VDN, and TSG phases, followed by models GDA_K12, SCG_K10, BFG_K9, and GD_K7. Figure 8 illustrates the overall rank of the best architectural models, and it is noted that model LM_K15 has the highest total score in predicting the hydraulic conductivity of clayey soil, i.e., 291. Model GD_K7 has the lowest total score, i.e., 103.

Fig. 8 Rank analysis for the best architectural models

Table 9 Score of the best architecture model for rank analysis
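
The scoring rule of the rank analysis can be sketched in Python as follows. The sketch ranks models by their distance from an ideal value and uses only the testing-phase RMSE and PI values reported in the text, so its totals differ from those in Table 9. The function name `rank_scores`, the ideal values (0 for RMSE, 2 for PI), and the absence of tie handling are assumptions made for illustration.

```python
def rank_scores(table, ideal):
    """table: {metric: {model: value}}; ideal: {metric: ideal value}.

    For each metric the model closest to the ideal value scores n and
    the farthest scores 1; the scores are summed per model.
    """
    totals = {}
    for metric, values in table.items():
        # Order models from worst (farthest from ideal) to best (closest).
        ordered = sorted(
            values, key=lambda m: abs(values[m] - ideal[metric]), reverse=True
        )
        for score, model in enumerate(ordered, start=1):
            totals[model] = totals.get(model, 0) + score
    return totals

# Testing-phase RMSE and PI of the six best models, as reported in the text.
tsg = {
    "RMSE": {"LM_K15": 0.0487, "BFG_K9": 0.1466, "SCG_K10": 0.1267,
             "GDM_K12": 0.1706, "GD_K7": 0.2050, "GDA_K12": 0.1399},
    "PI": {"LM_K15": 1.9348, "BFG_K9": 1.6514, "SCG_K10": 1.7593,
           "GDM_K12": 1.5908, "GD_K7": 1.5055, "GDA_K12": 1.7009},
}
ideal = {"RMSE": 0.0, "PI": 2.0}  # assumed ideal values
totals = rank_scores(tsg, ideal)
best = max(totals, key=totals.get)
```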

Uncertainty analysis

Any predictive model's credibility must be evaluated to estimate predictive outputs with accuracy and reliability. Uncertainty analysis (UA) has been used in the current work to quantify the error of the best architectural models in predicting the hydraulic conductivity of soils. The training, validation, and testing datasets, containing 80, 12, and 12 experimental data points related to clayey soils, have been subjected to UA. Comparing the prediction results with these experimental datasets is important in determining how reliable the constructed models are, and UA is well suited for this task (Bardhan et al. 2021). The results of the UA are summarized in Table 10.

Table 10 Details of parameters of UA

Table 10 presents the UA results of the six best architectural models, i.e., LM_K15, BFG_K9, SCG_K10, GDM_K12, GD_K7, and GDA_K12. It is noted that model LM_K15 has the lowest width of confidence band (WCB) in the TRG, VDN, and TSG phases. The other parameters, i.e., margin error (ME), standard deviation (SD), mean of error (MOE), square error (SE), upper bound (UB), and lower bound (LB), are also lower for model LM_K15 than for the BFG_K9, SCG_K10, GDM_K12, GD_K7, and GDA_K12 models. Based on these parameters, model LM_K15 has gained first rank and shows superiority in predicting the hydraulic conductivity of clayey soil. Hence, model LM_K15 is recognized as the best architectural model in this work.
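
The UA parameters can be illustrated with the following Python sketch. The formulas are not stated in this section, so the definitions below are assumptions based on common practice: the error is actual minus predicted, SE is taken as the standard error SD/√n, ME uses a 95% normal quantile (z = 1.96), and WCB = UB − LB. The function name `uncertainty` is hypothetical.

```python
import math
from statistics import mean, stdev

def uncertainty(actual, predicted, z=1.96):
    """Assumed UA parameters for one model and one dataset."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    moe = mean(errors)            # mean of error
    sd = stdev(errors)            # sample standard deviation of the error
    se = sd / math.sqrt(n)        # assumed: standard error of the mean
    me = z * se                   # margin of error at about 95% confidence
    lb, ub = moe - me, moe + me   # lower and upper bounds
    return {"MOE": moe, "SD": sd, "SE": se, "ME": me,
            "LB": lb, "UB": ub, "WCB": ub - lb}

# Hypothetical values for illustration only.
ua = uncertainty([1.0, 2.0, 3.0, 4.0], [0.0, 3.0, 2.0, 5.0])
```

Under these assumptions, a narrower WCB indicates that the mean prediction error of a model is known more precisely.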

Wilcoxon analysis

The Wilcoxon test, commonly called the signed-rank test, is a non-parametric statistical test that compares two related or paired samples. It is frequently applied when the data are paired or when the data deviate from the assumption of normality. In this research, the Wilcoxon analysis has been performed for models LM_K15, BFG_K9, SCG_K10, GDM_K12, GD_K7, and GDA_K12 to find the optimal performance model in predicting the hydraulic conductivity of clayey soil. Table 11 summarizes the results of the Wilcoxon analysis. The comparison of results reveals that model LM_K15 has an excellent confidence level, is close to the actual values, and shows superiority in predicting the hydraulic conductivity of clayey soil.

Table 11 Results of the Wilcoxon analysis
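
A minimal Python sketch of the signed-rank statistic for paired actual and predicted values is given below. It is a simplified illustration, not the procedure used to produce Table 11: zero differences are dropped, tied absolute differences receive average ranks, and the two-sided p-value uses a normal approximation without tie or continuity correction, which is rough for small samples. The function name and sample values are hypothetical.

```python
import math

def wilcoxon_signed_rank(x, y):
    """Return (W, two-sided p-value) for paired samples x and y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]   # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied absolute differences and assign the average rank.
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mu = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mu) / sigma
    p_value = math.erfc(abs(z) / math.sqrt(2))        # normal approximation
    return w, p_value

# Hypothetical values for illustration only.
actual = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
predicted = [0.9, 2.2, 2.7, 4.4, 4.5, 6.6]
w, p_value = wilcoxon_signed_rank(actual, predicted)
```

A large p-value means the test finds no systematic shift between predicted and actual values, which is the desirable outcome for a predictive model.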

Anderson–Darling test

In addition, the Anderson–Darling (AD) test has been run to give a more in-depth understanding of the divergence of the results. The AD test is a statistical procedure used to determine whether a sample of data originated from a population with a particular distribution. In the current study, the AD test is used to determine whether the actual and predicted values fit a normal distribution. The Minitab Statistical Tool has been used to perform the AD test for the best architectural models. Figure 9 depicts the AD test results for models LM_K15, BFG_K9, SCG_K10, GDM_K12, GD_K7, and GDA_K12.

Fig. 9 Comparison of AD test results

Figure 9 demonstrates that the best architectural models assessed the hydraulic conductivity of clayey soil with a p-value of 0.005, which is less than the significance level (p = 0.05). Still, model LM_K15 has an AD value of 6.517, close to the AD value of the actual data, i.e., 6.405. Hence, although the null hypothesis of normality is rejected for model LM_K15, the closeness of its AD value to that of the actual data shows its superiority over the other ANN models.
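
The AD statistic for normality can be sketched in Python as follows. This is an illustration of the textbook statistic with the mean and standard deviation estimated from the sample; it does not reproduce Minitab's p-value calculation, and the function name `anderson_darling_normal` and the sample data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def anderson_darling_normal(sample):
    """Anderson-Darling statistic A^2 against a fitted normal distribution."""
    n = len(sample)
    fitted = NormalDist(mean(sample), stdev(sample))
    cdf = [fitted.cdf(x) for x in sorted(sample)]
    total = sum(
        (2 * i - 1) * (math.log(cdf[i - 1]) + math.log(1 - cdf[n - i]))
        for i in range(1, n + 1)
    )
    return -n - total / n

# Hypothetical data: evenly spaced normal quantiles and a skewed transform.
near_normal = [NormalDist().inv_cdf((i - 0.5) / 20) for i in range(1, 21)]
skewed = [math.exp(v) for v in near_normal]
a2_normal = anderson_darling_normal(near_normal)
a2_skewed = anderson_darling_normal(skewed)
```

A larger statistic indicates a stronger departure from normality, so comparing the AD value of a model's predictions with that of the actual data shows how similarly the two samples depart from a normal distribution.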

Analysis of results

This research uses artificial neural network models to predict the hydraulic conductivity of clayey soil. Ninety ANN models have been developed using one to five hidden layers, each with 5, 10, or 15 neurons. Six backpropagation algorithms have been implemented and compared to find the best algorithm. This section analyses the performance of the ANN models with respect to the number of hidden layers and neurons. For that purpose, a statistical linear relationship has been drawn for a varying number of hidden layers and a constant number of neurons. Figure 10 (a-r) shows the relationship for the LM_ANN, BFG_ANN, SCG_ANN, GDM_ANN, GD_ANN, and GDA_ANN models in the TRG, VDN, and TSG phases. Figure 10 (a-c) shows that the performance of the LM_ANN models increases with the number of hidden layers. It has also been observed that the LM_ANN models based on five and fifteen neurons give the most promising predictions. Figure 10 (d) reveals that the performance of the BFG_ANN models with five neurons increases with the number of hidden layers. Figure 10 (g-i) illustrates that the performance of the SCG_ANN models increases with hidden layers and neurons. Figure 10 (j-l) demonstrates that the GDM_ANN models give the most promising results with five and fifteen neurons, and that their performance increases with the number of hidden layers. Still, the GDM_ANN models based on 15 neurons achieve higher performance than the GDM_ANN models based on five neurons. Figure 10 (m-o) does not show a significant trend in the performance of the GD_ANN models with respect to hidden layers and neurons. Therefore, increasing the epochs to analyze the effect of hidden layers and neurons may be suggested. Figure 10 (p-r) also shows insignificant results for the GDA_ANN models. Figure 10 (r) indicates that the GDA_ANN model requires many neurons to attain high performance in predicting the hydraulic conductivity of clayey soil. The epochs may also be increased for the GDA_ANN models to analyze the impact of hidden layers and neurons. The overall analysis shows that the LM_ANN models achieve excellent performance with a large number of hidden layers and neurons.

Fig. 10 Illustration of the relationship between hidden layers and performance for models (a-c) LM_ANN, (d-f) BFG_ANN, (g-i) SCG_ANN, (j-l) GDM_ANN, (m-o) GD_ANN, and (p-r) GDA_ANN. Note: Fig. 10 (a, d, g, j, m, p), (b, e, h, k, n, q), and (c, f, i, l, o, r) show the performance of ANN models configured with 5, 10, and 15 neurons

Furthermore, a simulated database has been created from the database used in this research to validate the model's capabilities. For this aim, one input variable is varied to create the simulated database, while the other variables remain constant. For example, to check the trend between fine content and hydraulic conductivity, only the value of fine content is varied linearly, and the other input variables are kept constant. However, a soil sample consists of 100% soil particles. These particles are gravel, sand, silt, and clay, and their summation cannot exceed 100%. This research uses the term fine content for the summation of silt and clay content. In this study, the simulated hydraulic conductivity of clayey soil has been created using F (%), S (%), SG, LL (%), PI (%), OMC (%), and MDD (g/cc). Table 12 presents the details of the simulated database.
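
The one-variable-at-a-time construction of the simulated database can be sketched in Python as follows. The baseline values, the range of fine content, and the function name `one_at_a_time` are hypothetical and are not taken from Table 12; the final filter reflects the constraint that the particle fractions cannot sum to more than 100%.

```python
def one_at_a_time(baseline, name, low, high, steps):
    """Rows in which `name` varies linearly and all other inputs stay fixed."""
    rows = []
    for i in range(steps):
        row = dict(baseline)
        row[name] = low + (high - low) * i / (steps - 1)
        rows.append(row)
    return rows

# Hypothetical baseline values for illustration only.
baseline = {"F": 70.0, "S": 10.0, "SG": 2.65, "LL": 45.0,
            "PI": 20.0, "OMC": 18.0, "MDD": 1.70}
rows = one_at_a_time(baseline, "F", 50.0, 90.0, 5)

# Keep only physically admissible rows: fines plus sand cannot exceed 100%.
valid_rows = [r for r in rows if r["F"] + r["S"] <= 100.0]
```

Each row of `valid_rows` would then be passed to the trained model to obtain the simulated hydraulic conductivity for that value of the varied input.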

Table 12 Details of simulated database
Fig. 11 Illustration of simulated hydraulic conductivity of clayey soil by varying (a) fine content, (b) sand content, (c) specific gravity, (d) liquid limit, (e) plasticity index, (f) optimum moisture content, and (g) maximum dry density

Figure 11 presents the relationship between the simulated values of F, S, SG, LL, PI, OMC, and MDD and the hydraulic conductivity of clayey soil. Figure 11 (a1-a4) shows that the hydraulic conductivity of clayey soil is inversely proportional to fine content. Figure 11 (b1-b4) illustrates that the hydraulic conductivity decreases with increased sand content. Figure 11 (c1-c4) shows that the hydraulic conductivity decreases as the specific gravity increases. It can be seen that the specific gravity starts increasing with a decrease in fine content and an increase in sand content. Figure 11 (d1-d4) demonstrates that the liquid limit significantly affects the hydraulic conductivity of clayey soils. Figure 11 (e1, e2, e4) illustrates that hydraulic conductivity increases with the plasticity of clayey soil. Figure 11 (f1-f4) reveals that the hydraulic conductivity of clayey soil decreases with an increase in optimum moisture content. Figure 11 (g1-g4) shows a significant impact of maximum dry density on the hydraulic conductivity of clayey soil. The sensitivity analysis has also indicated that the MDD of clayey soil influences the prediction of hydraulic conductivity.

Validation of optimal performance model

Literature validation

The literature presents many computational models for predicting the hydraulic conductivity of soil. These published studies show that researchers used soil texture alone, soil texture combined with PI, and soil texture combined with dg, Sg, Db, and WCs. For the first time, soil texture, LL, PI, and compaction parameters have been used to predict the hydraulic conductivity of clayey soil, and the performance of the model has been compared with published models, as shown in Table 13.

Table 13 Comparison of model LM_K15 with published models

Cross validation

This research introduces the six best architectural ANN models, one from each backpropagation algorithm, for predicting the hydraulic conductivity of clayey soil. These ANN models, i.e., LM_K15, BFG_K9, SCG_K10, GDM_K12, GD_K7, and GDA_K12, have been configured with five-fold cross-validation. The same models have also been configured with ten-fold cross-validation, and the computational costs have been computed and compared with those of the five-fold-based ANN models. The comparison of computational cost is presented in Table 14.

Table 14 Comparison of computational cost for cross-validation

Table 14 shows that each ten-fold-based model incurs a higher computational cost than the corresponding five-fold-based model. The comparison of computational cost (program run in MATLAB R2020a on an i3-2350M @ 2.3 GHz with 4 GB RAM) reveals that model LM_K15 achieves the desired prediction at significantly less cost. Hence, model LM_K15 is identified as the optimal performance model in predicting the hydraulic conductivity of clayey soil.
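
The comparison of five-fold and ten-fold cross-validation cost can be illustrated with a short Python sketch. It only shows how the folds are formed and how the run time is measured; the callback `fit_and_score` is a hypothetical placeholder for training and scoring an ANN, and the sample count of 104 is used for illustration as the sum of the 80, 12, and 12 data points mentioned in the uncertainty analysis.

```python
import time

def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size

def timed_cross_validation(n_samples, k, fit_and_score):
    """Run fit_and_score on every fold and return (scores, elapsed seconds)."""
    t0 = time.perf_counter()
    scores = [fit_and_score(train, test)
              for train, test in k_fold_indices(n_samples, k)]
    return scores, time.perf_counter() - t0

# Placeholder callback: returns the test-fold size instead of training a model.
scores_5, cost_5 = timed_cross_validation(104, 5, lambda tr, te: len(te))
scores_10, cost_10 = timed_cross_validation(104, 10, lambda tr, te: len(te))
```

Because a k-fold scheme trains the model k times, a ten-fold configuration is expected to cost more than a five-fold one for the same model, as observed in Table 14.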

External validation

External validation is carried out to evaluate a model's generalizability and to make sure the model is not simply overfitting the training set. The findings of external validation make it easier to identify the most accurate model for predicting hydraulic conductivity. Accuracy here refers to the capacity of the model to reproduce the measured hydraulic conductivity. A model is said to overfit when it is too tightly fitted to the training data and does not generalize effectively to new data. By contrasting the model's performance on the training data with that on the test data, external validation helps to spot overfitting and to confirm that the models are reliable. The criteria proposed by Golbraikh and Tropsha (2002) have been adopted for external validation in this investigation. Table 15 summarizes the mathematical expressions and conditions of these criteria.

Table 15 Mathematical expression and condition of validation factors

where \({d}_{i}\) denotes the experimental hydraulic conductivity and \({y}_{i}\) denotes the predicted hydraulic conductivity, and \(k\) and \(k{\prime}\) represent the slopes of the regression lines through the origin of the predicted versus actual hydraulic conductivity and the actual versus predicted hydraulic conductivity, respectively. \({R}_{o}^{2}\) and \({R{\prime}}_{o}^{2}\) denote the corresponding coefficients of determination of the predicted versus actual hydraulic conductivity and the actual versus predicted hydraulic conductivity. \(m\) and \(n\) represent the factors for estimating the predictive power of the proposed models. The external validation results are presented in Table 16 for all proposed models in the training, validation, and testing phases. Table 16 demonstrates that model LM_K15 has attained excellent generalizability, showing superiority over all ANN models employed in this work.

Table 16 Result obtained from external validation
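
The external validation factors can be sketched in Python as follows. Because Table 15 is the authoritative source for the expressions, the definitions below are assumptions based on ordinary regression through the origin: \(k\) and \(k{\prime}\) are least-squares slopes through the origin, \({R}_{o}^{2}\) and \({R{\prime}}_{o}^{2}\) are the corresponding coefficients of determination, and \(m\) and \(n\) are their relative differences from the squared correlation coefficient. The function name `external_validation` is hypothetical.

```python
from statistics import mean

def external_validation(d, y):
    """d: experimental values, y: predicted values (assumed definitions)."""
    s_dy = sum(a * b for a, b in zip(d, y))
    k = s_dy / sum(a * a for a in d)          # predicted vs actual, through origin
    k_prime = s_dy / sum(b * b for b in y)    # actual vs predicted, through origin
    d_bar, y_bar = mean(d), mean(y)
    cov = sum((a - d_bar) * (b - y_bar) for a, b in zip(d, y))
    ss_d = sum((a - d_bar) ** 2 for a in d)
    ss_y = sum((b - y_bar) ** 2 for b in y)
    r2 = cov * cov / (ss_d * ss_y)            # squared correlation coefficient
    ro2 = 1 - sum((b - k * a) ** 2 for a, b in zip(d, y)) / ss_y
    ro2_prime = 1 - sum((a - k_prime * b) ** 2 for a, b in zip(d, y)) / ss_d
    return {"k": k, "k_prime": k_prime, "R2": r2,
            "Ro2": ro2, "Ro2_prime": ro2_prime,
            "m": (r2 - ro2) / r2, "n": (r2 - ro2_prime) / r2}

# Hypothetical values for illustration only.
factors = external_validation([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

Slopes close to one and values of \(m\) and \(n\) close to zero indicate that the predictions follow the experimental values without systematic scaling.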

Conclusions and summary

The hydraulic conductivity of clayey soil, an essential parameter for any civil engineering project, must be determined experimentally. The experimental procedure for determining the hydraulic conductivity of clayey soil is arduous and time-consuming. Therefore, the present work is motivated to replace the tedious laboratory procedures with computational models for predicting the hydraulic conductivity of soil. It is important to note that estimating hydraulic conductivity accurately and reliably can avoid the need for costly and time-consuming laboratory testing. For this aim, the hydraulic conductivity database of clayey soil has been compiled from published research. The database consists of the hydraulic conductivity of CH, CI, CL, OH, OL, OI, MH, MI, and ML soils and is utilized to develop artificial neural network models. The following conclusions are drawn based on the novelty statements.

  • The performance analysis reveals that the prediction performance and accuracy increase with the number of neurons and hidden layers. It is also noted that increasing the number of hidden layers improves the prediction accuracy more than increasing the number of neurons. Even in the presence of moderate to problematic multicollinearity, the ANN model achieves a performance of over 85% when the number of hidden layers is increased.

  • It is noted that the Levenberg–Marquardt (LM) algorithm-based ANN model has achieved the highest performance in predicting the hydraulic conductivity of clayey soil. LM employs the notion of the neural neighborhood to enhance the behaviour of both memory and time limitations. Hence, this research suggests configuring the ANN models with the LM algorithm to solve geotechnical issues.

  • Based on the VIF values, it is noted that the input variables, excluding specific gravity, have moderate to problematic multicollinearity. The impact of such multicollinearity has been observed on the accuracy of SCG, GDM, and GD backpropagation algorithm-based ANN models.

  • The overall analysis of the ninety ANN models reveals that model LM_K15 is an optimal performance model in predicting the hydraulic conductivity of clayey soil. Model LM_K15 has achieved the highest testing performances, i.e., RMSE = 0.0487, a20 = 100, VAF = 99.16, R = 0.9959, IOA = 0.9364, and PI = 1.9348.

To conclude, this research introduces a neural network model that is highly capable of predicting the hydraulic conductivity of clayey soil. The concept developed in the present study may be implemented to assess the hydraulic conductivity of unsaturated soil. Also, the same configured neural network model may be used to solve other geotechnical issues. This research is limited to determining the effect of hidden layers and neurons. It may be extended by drawing a relationship between epochs and the number of hidden layers. Also, the impact of the activation function may be analyzed in predicting the hydraulic conductivity of clayey soil. A comparative study of non-optimized and optimized ANN models may be carried out in predicting the hydraulic conductivity of fine-grained soil. The present study may also be extended by implementing tanh, ReLU, leaky ReLU, PReLU, and ELU activation functions and comparing the results. For the first time, the effect of hidden layers and neurons on ANN models has been studied in predicting the hydraulic conductivity of soil.