Introduction

Accurate measurement or estimation of soil hydraulic parameters is needed to determine the availability of soil water to plants, and to model water flow and solute transport in the vadose zone. The most important soil hydraulic properties are hydraulic conductivity and soil water retention characteristics (SWRC). Unfortunately, direct measurement of these properties is time consuming and expensive, especially for research studies conducted at regional scales (Vereecken 1995). Despite the availability of many empirical relations (e.g., Brooks and Corey 1964; Campbell 1974; van Genuchten 1980; Russo 1988) to model the SWRC, accurate estimation of the parameters required for these models is cumbersome and time consuming. During the last decades, several efforts have been made to develop low-cost, accurate, and quick methods to estimate soil hydraulic parameters from easily measured and available soil data (Nielsen et al. 1986). Mathematical models, called pedotransfer functions (PTFs), are employed to relate soil water retention properties to easily measured soil properties such as sand, silt, and clay contents, organic matter, and bulk density (Bouma 1989). PTFs can be categorized into two main groups: point and parametric PTFs (Tietje and Tapkenhinrichs 1993). Point PTFs predict soil water retention at specific water potentials (e.g., − 10, − 33, and − 1500 kPa). Parametric PTFs predict the parameters of a soil water retention model such as that of van Genuchten (1980). The two most common methods used to develop PTFs are multiple linear regression (MLR) and artificial neural networks (ANN). MLR methods (Gupta and Larson 1979; Rawls et al. 1982; Lin et al. 1999; Mayr and Jarvis 1999; Tomasella et al. 2000) determine the most fundamental input data related to the soil water retention using stepwise regression techniques. On the other hand, ANN methods (Vereecken et al. 1989; Tamari et al. 1996; Schaap and Leij 1998; Schaap et al. 1998, 2001; Haghverdi et al. 2014) use an iterative calibration procedure to obtain the optimal and possibly nonlinear relations between input data (soil properties) and output data (hydraulic parameters), without the need for an a priori regression model (Schaap and Leij 1998).

The validity of the developed PTFs is largely dependent on the data set that was used to generate the functions (Wösten et al. 1999), which in turn represents the geographic origin and the basic soil properties in the sampling area. Most of the available PTFs were developed for temperate regions, and consequently are expected to provide poor estimation of the hydraulic soil parameters when used in regions of different edaphoclimatic characteristics (Hodnett and Tomasella 2002). In this regard, Tietje and Tapkenhinrichs (1993) tested the accuracy of 13 PTFs in estimating the water retention parameters of a wide range of soils and found large variability in the performance of the tested PTFs, which ranged from good to poor. The lowest accuracy was observed in the estimation of the residual water content (θr) and the shape parameter (α) of the van Genuchten equation. Cornelis et al. (2001) compared nine PTFs to estimate the SWRC using a data set of 298 undisturbed soil samples collected from 69 different soils covering a wide range of textures. They found that most PTFs provided better prediction accuracy for the water content near saturation and at the dry end of the SWRC, especially when the PTFs were developed and calibrated based on data sets with average soil properties similar to those of the collected soil samples. They concluded that caution should be exercised when PTFs are used in a geographical context different from the one where the data set used to develop the PTFs was collected. Khodaverdiloo et al. (2011) estimated the SWRC of calcareous soils using a number of PTFs that were developed with and without CaCO3 as one of the predictors. They reported no significant differences in the accuracy of the estimation of the SWRC when CaCO3 was included or eliminated from the developed PTFs. The developed PTFs were compared with the Rosetta PTFs of Schaap et al. (2001) and were found to provide more accurate estimates of the SWRC, as indicated by a lower average RMSE (Khodaverdiloo et al. 2011). Abbasi et al. (2011) evaluated the accuracy of 13 point and parametric PTFs in the estimation of SWRC and total available water in saline and saline-alkali soils, and found that the PTFs of Saxton et al. (1986) and Campbell (1985) provided the best estimates of SWRC and total available water, respectively.

Most of the studies that evaluate PTFs focus on the assessment of the estimation error in comparison with some measured values of the SWRC at specific water potentials and limited soil locations. However, the main sources of the estimation error remain unclear as it can be attributed to the differences between data sets used to derive the PTFs, or differences in the algorithms and the predictors used to develop the PTFs (Nemes et al. 2003). The determination of the SWRC is not a goal itself, rather it is needed as a basic input data for the assessment of plant available water, and simulation models in agricultural, hydrological, and environmental research (Minasny and Hartemink 2011). Therefore, any errors generated during the estimation of the parameters of the SWRC will propagate when implementing the values into simulation models, leading to lower accuracy of the output of the simulation process (Botula et al. 2012; Wang et al. 2015). Regardless of the size and variability of the data set used to develop and calibrate the PTFs, it is important to evaluate the applicability and accuracy of the PTFs when applied outside the range of data used to develop them. For accurate evaluation of the PTF performance, it is preferable to use a small data set of relevant (local) soil properties rather than a large data set of general soil properties (Nemes et al. 2002). Therefore, the objective of this study was to develop two site-specific PTFs for the Jazan region southwest of Saudi Arabia, and to compare the performance of the developed PTFs with that of other existing PTFs. 
The specific objectives were to (a) compare the performance of two developed site-specific PTFs with that of four existing PTFs to determine the SWRC, and (b) evaluate the applicability of using point and parametric PTFs to predict soil water content at field capacity (FC) (− 10 and − 33 kPa) and permanent wilting point (PWP) (− 1500 kPa), and estimate available water content (AWC) in soils of Jazan region, southwestern Saudi Arabia.

Materials and methods

Soil sampling and analysis

The study was carried out in Jazan region (latitude 16° 54′ 34.8″ N and longitude 42° 33′ 5.4″ E) located in southwestern Saudi Arabia (Fig. 1). The total area of the region is 8764 km2. The region is characterized by hot dry climate with average temperatures of 40 and 21 °C during summer and winter time, respectively. The average annual rainfall is 232 mm, mostly during Nov–Jan and some seasonal rainfall during April–May. The data set utilized in this research was collected from a previous study (Al-Turki et al. 2008), and consisted of 219 surface (0–30 cm) soil samples that were collected from the region (Fig. 1). Particle size distribution (PSD) (sand (Sa), 2–0.05 mm; silt (Si), 0.05–0.002 mm; clay (C), < 0.002 mm) and bulk density (BD) were measured on the soil samples according to the principles of Dane and Topp (2002). Total calcium carbonate (CaCO3) was determined according to the method of Loeppert and Suarez (1996). Saturation percent (SP) was measured gravimetrically in saturated soil paste, and the electrical conductivity (EC) was determined in the saturated soil paste extract (Rhoades 1996). Organic carbon (OC) was determined according to Nelson and Sommers (1982), and a conversion factor of 1.724 was used to estimate the percentage of organic matter (OM) in the soil samples.

Fig. 1 Location map of the Jazan region and the sampling locations. Both red and green dots represent the large data set, and the red dots only represent the small data set

A small data set (43 soil samples) representative of the large data set was used to determine the water retention data at matric potentials of − 10, − 33, − 61, − 102, − 306, − 510, − 714, − 1020, and − 1500 kPa using pressure plate extractor (Soilmoisture Equipment, Santa Barbara, CA). The SWRC was determined by fitting the retention data to the equation of van Genuchten (1980):

$$ \theta (h)={\theta}_r+\frac{\theta_s-{\theta}_r}{{\left(1+{\left|\alpha h\right|}^n\right)}^{\left(1-1/n\right)}} $$
(1)

where θ(h) is the volumetric water content (cm3 cm−3) at a specified matric potential h (cm); θr and θs are the residual and saturated water contents (cm3 cm−3), respectively; and α (cm−1) and n (−) are shape parameters of the SWRC. The nonlinear least-squares fitting procedure in the RETC program (van Genuchten et al. 1991) was used to estimate the unknown parameters (θr, θs, α, and n) from measured soil water retention data. For each soil sample, the water content at FC was calculated as the water content at a matric potential of − 10 kPa for light (sand; loamy sand) soil textures and − 33 kPa for medium (sandy loam; sandy clay loam; loam; silt loam; clay loam) soil textures. The PWP was calculated as the water content at a matric potential of − 1500 kPa for all soil samples. The AWC was calculated as the difference between FC and PWP.
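The calculation of FC, PWP, and AWC from Eq. 1 can be illustrated with a minimal Python sketch. The parameter values, the function names, and the conversion factor of about 10.2 cm of water per kPa are assumptions made for illustration only; they are not fitted values from this study and the sketch does not reproduce the RETC fitting procedure.

```python
import math

# Assumed conversion: 1 kPa of suction is roughly 10.2 cm of water head.
KPA_TO_CM = 10.2

# Light textures take FC at -10 kPa; all other classes take FC at -33 kPa.
LIGHT_TEXTURES = {"Sa", "LSa"}


def vg_theta(h_cm, theta_r, theta_s, alpha, n):
    """Volumetric water content from Eq. 1 at matric head h_cm (cm)."""
    m = 1.0 - 1.0 / n
    return theta_r + (theta_s - theta_r) / (1.0 + abs(alpha * h_cm) ** n) ** m


def water_contents(texture, theta_r, theta_s, alpha, n):
    """Return (FC, PWP, AWC) in cm3 cm-3 for one soil sample."""
    fc_kpa = 10.0 if texture in LIGHT_TEXTURES else 33.0
    fc = vg_theta(fc_kpa * KPA_TO_CM, theta_r, theta_s, alpha, n)
    pwp = vg_theta(1500.0 * KPA_TO_CM, theta_r, theta_s, alpha, n)
    return fc, pwp, fc - pwp


# Hypothetical van Genuchten parameters, for illustration only.
params = dict(theta_r=0.03, theta_s=0.40, alpha=0.04, n=1.6)

fc, pwp, awc = water_contents("LSa", **params)
print(f"FC={fc:.3f}  PWP={pwp:.3f}  AWC={awc:.3f}")
```

With the same parameters, a light-textured sample yields a larger FC than a medium-textured one simply because FC is read at − 10 kPa instead of − 33 kPa, whereas the PWP is unaffected by the texture rule.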

Pedotransfer functions

Site-specific PTFs

The MLR method was used to develop two site-specific PTFs based on measured basic soil properties (Sa; Si; C; BD; SP; EC; CaCO3; OM). Logarithmic and reciprocal values of these soil properties were also included in the MLR, and the number of variables was reduced using a backward stepwise method to ensure that the final equation includes the most significant variables that explain a large portion of the parameter variability. Selection and elimination criteria were set to be P < 0.05, and P > 0.1, respectively. The first site-specific PTF that was developed is a point PTF (MLRP) that was used to estimate the soil water content at − 10, − 33, and − 1500 kPa. The second site-specific parametric PTF (MLRF) was developed to estimate the hydraulic parameters (θr, θs, α, and n) of the van Genuchten equation using the parameters that were obtained from the fitting of the measured soil water retention data.
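The construction of the candidate predictor set can be sketched as follows. This is a minimal illustration, assuming that each of the eight soil properties enters in raw, logarithmic, and reciprocal form (which gives 24 candidates) and that the natural logarithm was used; the sample values, intercept, and coefficients are invented placeholders, not the regression results of this study, and the stepwise selection itself is not reproduced here.

```python
import math

PROPERTIES = ["Sa", "Si", "C", "BD", "SP", "EC", "CaCO3", "OM"]


def candidate_predictors(sample):
    """Raw, log, and reciprocal forms of each property (values must be > 0)."""
    out = {}
    for name in PROPERTIES:
        value = sample[name]
        out[name] = value
        out[f"ln_{name}"] = math.log(value)  # natural log is an assumption
        out[f"inv_{name}"] = 1.0 / value
    return out


def point_ptf(predictors, intercept, coefficients):
    """Evaluate a linear PTF: intercept + sum(coefficient * predictor)."""
    return intercept + sum(b * predictors[k] for k, b in coefficients.items())


# Hypothetical soil sample (percentages, g cm-3 for BD, dS m-1 for EC).
sample = {"Sa": 60.0, "Si": 25.0, "C": 15.0, "BD": 1.5,
          "SP": 30.0, "EC": 2.0, "CaCO3": 5.0, "OM": 0.6}

preds = candidate_predictors(sample)

# Placeholder coefficients for a two-predictor point PTF (not fitted values).
theta_10 = point_ptf(preds, intercept=0.30, coefficients={"SP": 0.004, "BD": -0.10})
print(len(preds), round(theta_10, 3))
```

Backward elimination would then start from all candidates and repeatedly drop the least significant one until every remaining predictor meets the stated P thresholds.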

Selected point and parametric PTFs

Four well-documented PTFs were also included in this study (Gupta and Larson 1979; Rawls et al. 1982; Vereecken et al. 1989; and Schaap et al. 2001), referred to hereafter as Gupta and Larson, Rawls, Vereecken, and Schaap PTFs. The first three PTFs use MLR to link water content at specific matric potential (Gupta and Larson; Rawls) to basic soil properties, or to estimate the hydraulic parameters (Vereecken) of the SWRC. The Schaap PTF uses ANN to estimate the hydraulic parameters of the van Genuchten equation using five hierarchical approaches based on available soil properties. The five hierarchical approaches range from texture class; sand, silt, and clay percentages; bulk density; WC at − 10 kPa; and WC at − 1500 kPa. The Schaap PTF is implemented in a computer program (Rosetta) that combines ANN analyses with the bootstrap method to assess the uncertainty in the predicted estimates of the hydraulic parameters. Out of the five hierarchical approaches available in the Rosetta program, three were used in this study: H1 (texture class); H2 (Sa, Si, C); H3 (Sa, Si, C, BD, θ−10kPa, θ−1500kPa). Model inputs of basic soil properties, along with the number and origin of soils used in the development of the selected PTFs, are listed in Table 1.

Table 1 List of the selected pedotransfer functions (PTFs) and their inputs

Evaluation criteria

Several statistical indices were used to evaluate the performance of the selected and developed PTFs, which include the root mean square error (RMSE), the geometric mean error (GMER), the mean relative error (MRE), the D-index, and the Akaike information criterion (AIC) (Akaike 1974; Loague and Green 1991) defined as:

$$ \mathrm{RMSE}={\left\{\frac{1}{n}\sum \limits_{i=1}^n{\left({y}_i-{\hat{y}}_i\right)}^2\right\}}^{1/2} $$
(2)
$$ \mathrm{GMER}=\exp \left\{\frac{1}{n}\sum \limits_{i=1}^n\ln \left(\frac{{\hat{y}}_i}{y_i}\right)\right\} $$
(3)
$$ \mathrm{MRE}=\left\{\frac{\sum \limits_{i=1}^n\left({y}_i-{\hat{y}}_i\right)}{n}\right\}\times 100 $$
(4)
$$ \mathrm{D}\hbox{-} \mathrm{index}=1-\frac{\sum \limits_{i=1}^n{\left({\hat{y}}_i-{y}_i\right)}^2}{\sum \limits_{i=1}^n{\left\{\left|{\hat{y}}_i-{\overline{y}}_i\right|+\left|{y}_i-{\overline{y}}_i\right|\right\}}^2} $$
(5)
$$ \mathrm{AIC}=n\ln \left\{\frac{1}{n}\sum \limits_{i=1}^n{\left({y}_i-{\hat{y}}_i\right)}^2\right\}+2k $$
(6)

where yi and \( {\hat{y}}_i \) are the measured and predicted values, respectively; \( {\overline{y}}_i \) is the average measured value; n is the total number of observations; and k is the number of parameters in the PTF. The accuracy of the estimates of soil water content at the different matric potentials (− 10, − 33, and − 1500 kPa) also was assessed using the Nash-Sutcliffe coefficient of efficiency (NSCE) described as (Nash and Sutcliffe 1970):

$$ \mathrm{NSCE}=\frac{\sum \limits_{i=1}^n{\left({y}_i-{\overline{y}}_i\right)}^2-\sum \limits_{i=1}^n{\left({y}_i-{\hat{y}}_i\right)}^2}{\sum \limits_{i=1}^n{\left({y}_i-{\overline{y}}_i\right)}^2} $$
(7)
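Equations 2 to 7 can be written as a short Python sketch. The function names and the example values are illustrative assumptions. The MRE function follows Eq. 4 as printed (measured minus predicted); if positive values are meant to indicate overestimation, the difference would have to be taken as predicted minus measured, so the sign convention should be checked before comparing with tabulated values.

```python
import math


def rmse(y, yhat):
    """Eq. 2: root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y))


def gmer(y, yhat):
    """Eq. 3: geometric mean error ratio (predicted / measured)."""
    return math.exp(sum(math.log(b / a) for a, b in zip(y, yhat)) / len(y))


def mre(y, yhat):
    """Eq. 4 as printed: mean of (measured - predicted), times 100."""
    return sum(a - b for a, b in zip(y, yhat)) / len(y) * 100.0


def d_index(y, yhat):
    """Eq. 5: index of agreement, using the mean of the measured values."""
    ybar = sum(y) / len(y)
    num = sum((b - a) ** 2 for a, b in zip(y, yhat))
    den = sum((abs(b - ybar) + abs(a - ybar)) ** 2 for a, b in zip(y, yhat))
    return 1.0 - num / den


def aic(y, yhat, k):
    """Eq. 6: n * ln(mean squared error) + 2k, with k model parameters."""
    n = len(y)
    mse = sum((a - b) ** 2 for a, b in zip(y, yhat)) / n
    return n * math.log(mse) + 2 * k


def nsce(y, yhat):
    """Eq. 7: Nash-Sutcliffe coefficient of efficiency."""
    ybar = sum(y) / len(y)
    sst = sum((a - ybar) ** 2 for a in y)
    sse = sum((a - b) ** 2 for a, b in zip(y, yhat))
    return (sst - sse) / sst


# Invented water contents (cm3 cm-3), for illustration only.
measured = [0.10, 0.20, 0.30]
predicted = [0.12, 0.22, 0.32]

print(rmse(measured, predicted), gmer(measured, predicted), nsce(measured, predicted))
```

Replacing every prediction by the mean of the measured values gives an NSCE of 0, which is the reference against which negative values are read.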

The RMSE is a measure of the accuracy of the PTFs, and the estimation improves as the RMSE approaches 0. Both the GMER and the MRE measure the bias in the predicted values: a GMER below 1 or a negative MRE indicates underestimation, whereas a GMER above 1 or a positive MRE indicates overestimation. A D-index equal to 1 indicates a perfect match between measured and predicted values. The AIC is a useful statistical measure for comparing the performance of different models because it takes into account the number of adjustable parameters in each model; the model with the lowest AIC value is considered the best. The PTFs were ranked based on the value of each statistical index. All indices were given equal weight, and a final ranking was calculated as the mean of the rankings of all indices. For the NSCE, a value of 1 indicates perfect model performance, while 0 or negative values indicate lower accuracy (i.e., the mean of the measured data represents the property as well as or better than the model predictions).
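The equal-weight ranking can be sketched in Python as follows. How the two bias indices were ordered is not stated, so ranking the GMER by its distance from 1 and the MRE by its absolute value is an assumption of this sketch, as are the function names; ties are not handled. The index values are those reported in the results for the MLRP and Schaap (H3) PTFs at − 10 kPa.

```python
def rank(badness):
    """Rank names by a 'smaller is better' score; 1 = best. Ties not handled."""
    ordered = sorted(badness, key=badness.get)
    return {name: position + 1 for position, name in enumerate(ordered)}


# Convert each index to a 'smaller is better' score (assumed conventions).
TO_BADNESS = {
    "RMSE": lambda v: v,
    "GMER": lambda v: abs(v - 1.0),  # assumption: closest to 1 ranks first
    "MRE": lambda v: abs(v),         # assumption: closest to 0 ranks first
    "D": lambda v: -v,               # larger D-index is better
    "AIC": lambda v: v,
}


def final_ranking(table):
    """Mean of the per-index ranks for each PTF, all indices weighted equally."""
    ranks = {ptf: [] for ptf in table}
    for index, to_badness in TO_BADNESS.items():
        per_index = rank({ptf: to_badness(vals[index]) for ptf, vals in table.items()})
        for ptf in table:
            ranks[ptf].append(per_index[ptf])
    return {ptf: sum(r) / len(r) for ptf, r in ranks.items()}


# Index values at -10 kPa as reported in the text for two of the PTFs.
table = {
    "MLRP": {"RMSE": 0.028, "GMER": 1.02, "MRE": 0.01, "D": 0.898, "AIC": -303.7},
    "Schaap (H3)": {"RMSE": 0.053, "GMER": 1.16, "MRE": 2.42, "D": 0.802, "AIC": -240.8},
}

result = final_ranking(table)
print(result)
```

With only these two PTFs the MLRP ranks first on every index; the full comparison would include all eight PTFs in the same table.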

Results and discussion

Soil properties and measured water retention

Soil property ranges of the large data set are presented in Table 2. Results showed that the collected data set contained ample variations in the measured soil properties. For example, sand content varied from 9.6 to 91.1%, silt from 3.9 to 59.3%, clay from 5.1 to 31.2%, OM from 0.25 to 0.95%, and BD from 1.31 to 1.69 g cm−3 (Table 2). The existence of large variability in the soil properties of a data set collected for the development of PTFs is favorable to ensure better accuracy in the estimates of the developed PTFs (Pachepsky and Rawls 1999). Table 2 shows that all texture classes that were found in the large data set were also represented in the small data set. The only exception was the SiCL texture class. All basic soil properties of the small data set, which was used to obtain water retention data, fit well within the ranges of the soil properties of the large data set. Table 3 shows the water retention parameters and soil water content at − 10, − 33, and − 1500 kPa per soil textural class for the small data set of the Jazan region. The van Genuchten equation was used to fit the measured water retention data. R2 values of the fitting were in the range 0.95–0.99, indicating that the van Genuchten equation adequately described the water retention characteristics of the different soils in the small data set. Mean values of θr and θs were larger in medium (SaL, SaCL, L, SiL, and CL) as compared to light (Sa and LSa) textured soils. The strong retention of the medium-textured soils is related to its larger total porosity, which increases the ability of the soil to hold water. Values of α increased with the increase in the sand percentage, or in the presence of well-structured soils. Maximum α values reached 0.044, 0.041, and 0.035 cm−1 for the Sa, LSa, and CL soils, respectively. 
Larger α values indicate the occurrence of sudden change in the water content as pores are emptying faster in response to small changes in the matric potential (Hodnett and Tomasella 2002). Values of n were highly dependent on the texture of the soil, and showed a gradual increase as the texture changed from CL to Sa as a result of the increase in the size of the soil particles.

Table 2 Basic soil properties of the large data set and the small data set used to generate the soil water retention data
Table 3 Water retention parameters of the van Genuchten equation and measured water content at − 10, − 33, and − 1500 kPa

The development of site-specific PTFs

The number of basic soil properties that was retained in the final MLR equations varied between 2 and 9 out of the 24 predictors used in the MLR analyses. The accuracy of the generated site-specific PTFs was assessed by the value of R2, which ranged between 0.61–0.71 and 0.37–0.67 for the MLRP and MLRF, respectively (Table 4). For the MLRP, the equation for the water content at − 10 kPa (θ−10kPa) contained only SP and BD as predictors, and explained 68% of the variance in the water content. The equations for the water content at − 33 kPa (θ−33kPa) and − 1500 kPa (θ−1500kPa) contained 3 (SP, Si, C) and 4 (Si, OM, CaCO3, BD) soil properties as predictors, and explained 71 and 61% of the variance in the water content, respectively. Similar findings were reported by Obalum and Obi (2013). For the MLRF, more basic soil properties were retained in the regression equations (Table 4). However, the percentage of the variance explained was, in general, smaller than that for the MLRP, especially for the equation to determine the n parameter, which explained only 37% of the variance. Pearson correlations analysis showed consistent results and indicated that water content at the three matric potentials was always negatively correlated with Sa, and positively correlated with Si and C contents (Table 5). In addition, BD and SP showed consistent negative and positive correlations, respectively, with the water content at the three matric potentials. The maximum negative (− 0.81) and positive (0.62) correlations with the BD and SP, respectively, were obtained with the water content at − 10 kPa, indicating that the water content at this high potential is largely influenced by the soil structure. Correlations between basic soil properties and the hydraulic parameters of the van Genuchten equation revealed that the residual water content (θr) was positively correlated with Si (0.41), C (0.61), and SP (0.40), and negatively correlated with Sa (− 0.57) and BD (− 0.58), respectively. 
The shape parameter α was found to be positively correlated with Sa (0.52) and BD (0.50), and negatively correlated with Si (− 0.49) and SP (− 0.34). No significant correlations were found between the saturated water content (θs) or the shape parameter n and the measured soil properties (Table 5).

Table 4 Regression parameters and coefficients of the site-specific point multiple linear regression (MLRP) and parametric multiple linear regression (MLRF) PTFs
Table 5 Pearson correlations between basic soil properties and the hydraulic parameters of the van Genuchten equation and the measured water content at − 10, − 33, and − 1500 kPa

Evaluation of the performance of the PTFs

The accuracy of the site-specific and selected PTFs to estimate soil water content at − 10, − 33, and − 1500 kPa was evaluated using several statistical indices (Table 6). At − 10 kPa, the lowest RMSE between measured and predicted water contents was observed with the site-specific MLRP PTF (0.028 cm3 cm−3). The second and third best PTFs were the Schaap (H3) and the site-specific MLRF PTFs, which showed intermediate RMSE (0.053 and 0.095 cm3 cm−3, respectively) as compared to the other evaluated PTFs. All PTFs overestimated the soil water content at − 10 kPa, as indicated by GMER values > 1.0 and positive MRE values. The MLRP and the Schaap (H3) PTFs showed the minimum overestimation in soil water content (GMER of 1.02 and 1.16, and MRE of 0.01 and 2.42, respectively). The D-index and AIC values also showed that the MLRP and the Schaap (H3) were the best PTFs to estimate the soil water content at − 10 kPa (D-index of 0.898 and 0.802, and AIC of − 303.7 and − 240.8, respectively). The final ranking listed in Table 6 showed that the MLRP PTF performed the best followed by the Schaap (H3) PTF. The performance of the Gupta and Larson and Rawls PTFs was close to that of the MLRF PTF with an average RMSE that ranged between 0.122 and 0.136 cm3 cm−3, and average AIC that ranged between − 174.6 and − 161.4. The Vereecken PTF showed the worst performance as indicated by the largest RMSE and AIC (0.194 cm3 cm−3 and − 137.1, respectively). The Vereecken PTF ranked eighth among all the tested PTFs, which can be mainly attributed to the fact that the properties of the soils used in this study are not in the range of those that were used to develop this PTF. These results are consistent with Liao et al. (2011) who reported that the Vereecken PTF produced the worst estimate (largest RMSE and absolute mean error) of the soil water retention of 107 soil samples characterized by mean clay content and organic matter of 17.1 and 1.4%, respectively.
In addition, Mermoud and Xu (2006) evaluated the Vereecken et al. (1989) and three other PTFs and found that all of the evaluated PTFs failed to accurately estimate water retention in light-textured soils characterized by low clay and organic matter contents. Most of the soils in the Jazan region are characterized by low clay and organic matter contents (Table 2), which could explain the larger error in the estimation of soil water content by the Vereecken PTF. Inclusion of OM as input to PTFs greatly improves the estimates of soil water retention (Rawls et al. 2003). OM greatly affects the pore size distribution through the enhancement of the soil structure. Therefore, soils containing larger contents of OM tend to retain more water than soils with lesser OM at the same matric potential (Nemes et al. 2003). Most of the soils in the Jazan region are characterized by low organic matter. The mean value for the organic matter content ranged between 0.25 and 0.95% (Table 2). Nevertheless, the Gupta and Larson, Rawls, and Vereecken PTFs were developed based on soils that contain a much larger organic matter content that ranged between 0 and 23%, 0.1 and 12.5%, and 0.02 and 11.4%, respectively. This could explain the general trend of these PTFs to overestimate soil water content in the soils of the Jazan region.

Table 6 Statistical evaluation of the measured and predicted water contents (cm3 cm−3) at − 10, − 33, and − 1500 kPa

At − 33 and − 1500 kPa, the same trend was observed and the MLRP PTF provided the lowest RMSE (0.033 and 0.023 cm3 cm−3, respectively) values, followed by the Schaap (H3) PTF (0.039 and 0.026 cm3 cm−3, respectively). The overall trend to overestimate the soil water content also continued, except with the MLRP PTF at − 33 kPa and the Schaap (H3) PTF at − 1500 kPa where slight underestimation was observed. Previous studies have reported the tendency of the Gupta and Larson, Rawls, and Vereecken PTFs to overestimate soil water content at both FC and PWP (Kern 1995; Nguyen et al. 2015). The shape of the SWRC in the saturated range is highly dependent on soil structure, texture, and clay mineralogy, whereas in the dry range soil texture mostly defines the shape of the SWRC (Bruand 2004). The Gupta and Larson, Rawls, and Vereecken PTFs were developed for temperate soils characterized by medium textures, high organic carbon, and well-aggregated structure. Therefore, it is expected that the application of these PTFs to predict water content in soils of dry regions will yield higher estimation errors, especially in the saturated range at matric potentials > − 33 kPa. The D-index (0.903, 0.862 and 0.893, 0.875), and the AIC (− 284.6, − 313.4, and − 268, − 300.5) values also confirmed that the MLRP and the Schaap (H3) PTFs provided the best estimate of soil water content among all tested PTFs at − 33 and − 1500 kPa, respectively (Table 6). The final ranking of the accuracy of the evaluated PTFs to estimate soil water content at − 33 and − 1500 kPa was in agreement with the performance of the PTFs at − 10 kPa, and showed that the MLRP and the Schaap (H3) PTFs gave better predictions. The ranking of the five statistical indices was quite consistent for most of the evaluated PTFs; only slight modifications in the final ranking order of the Rawls and MLRF PTFs were observed (Table 6).

The site-specific point PTF (MLRP) provided more accurate estimation of the soil water content at − 10, − 33, and − 1500 kPa than the parametric PTF (MLRF). The MLRF PTF contained more input variables, which possibly could increase the estimation error due to the larger spatial variability of the data collected from a relatively large area. In addition, the relationship between the parameters of the van Genuchten equation and soil properties is not linear. Therefore, regression analysis could produce larger errors in the estimation of the van Genuchten parameters (Minasny et al. 1999). Both PTFs provided better estimates of the soil water content close to saturation (− 10 kPa) and at the dry end of the SWRC (− 1500 kPa), whereas the largest error was observed close to FC (− 33 kPa). The RMSE of the MLRP PTF reached 0.028, 0.033, and 0.023 cm3 cm−3 for soil water contents at − 10, − 33, and − 1500 kPa, respectively (Table 6). The same trend was also observed with the MLRF PTF, which had the largest RMSE (0.139 cm3 cm−3) at FC, and a much smaller RMSE at − 10 kPa (0.095 cm3 cm−3) and at − 1500 kPa (0.038 cm3 cm−3). Similar results were reported by Cornelis et al. (2001) who evaluated 9 PTFs and found that the largest errors in predicting SWRC were obtained at matric potentials between − 10 and − 31 kPa (i.e., close to FC).

The NSCE values of the accuracy of the different PTFs in estimating soil water content (Fig. 2) showed that only the MLRP and the Schaap (H3) PTFs gave positive values that ranged between 0.41 and 0.72 at all three matric potentials (− 10, − 33, and − 1500 kPa). All other PTFs had negative NSCE values, with the exception of the Schaap (H1) and (H2) PTFs at − 1500 kPa. Consistent with previous results, the Vereecken PTF showed the lowest accuracy in predicting the soil water content at all matric potentials, with NSCE values in the range of − 14.4 to − 23.4 (Fig. 2). Assuming the average soil water content, based on measurement at discrete locations, at any matric potential is known, this average will produce an NSCE value of 0 when used to represent the soil water content at any location other than the measurement locations. This indicates that only the MLRP and the Schaap (H3) PTFs can reliably be used to predict the soil water content at FC (− 10 and − 33 kPa) and at PWP (− 1500 kPa) in the soils of the Jazan region. Based on the assessment at all three matric potentials, the final ranking of the evaluated PTFs, ordered from highest to lowest accuracy, was MLRP > Schaap (H3) > Schaap (H1) > Schaap (H2) > MLRF > Rawls > Gupta and Larson > Vereecken PTFs.

Fig. 2 The Nash-Sutcliffe coefficient of efficiency (NSCE) of the soil water content at − 10, − 33, and − 1500 kPa as estimated by the selected different PTFs

Prediction of AWC

The estimates of the soil water content that were generated by the different PTFs at FC (− 10 and − 33 kPa) and at PWP (− 1500 kPa) were used to predict the AWC in the surface layer of the soils of the Jazan region. Predicted and measured AWC were compared and the accuracy of the prediction was evaluated using the statistical performance criteria defined earlier. Concerning the overall accuracy of the evaluated PTFs in predicting the AWC, Table 7 presents the statistical indices used to compare between measured and predicted AWC. The best agreement between measured and predicted AWC was obtained with the Schaap (H3) PTF as indicated by the lower RMSE (0.014 cm3 cm−3), and maximum D-index and AIC values of 0.934 and − 359.9, respectively. The MLRP PTF performance was the second best with RMSE of 0.027 cm3 cm−3, and D-index and AIC values of 0.826 and − 302.1, respectively (Table 7). The lowest accuracy in predicting the AWC was observed with the Vereecken, Gupta and Larson, and the MLRF PTFs, with larger RMSE (0.069–0.133 cm3 cm−3), AIC (− 220.4 to − 163.8), and smaller D-index (0.641–0.341) values. The prediction bias as measured by the GMER and the MRE values was mostly positive indicating an overestimation in the prediction of AWC, and ranged between 1.05–2.39 and 0.43–11.39, respectively. The only exception was with the MLRP and the Vereecken PTFs, which showed a slight (GMER = 0.91, MRE = − 0.81) and moderate (GMER = 0.61, MRE = − 1.62) underestimation in the prediction of AWC, respectively. Figure 3 presents the predicted vs. measured AWC as estimated by the different PTFs for all the 43 soil samples collected in the small data set of the Jazan soils. For the Schaap (H3) and the MLRP PTFs, most of the points are aligned along the 1:1 line, which confirms the good agreement between measured and predicted AWC, and subsequently the reliability of the application of these PTFs to predict AWC in the investigated soils. 
For all other PTFs, the values are more scattered indicating that these PTFs failed to predict the AWC in the study area with acceptable accuracy. In addition, most of the points are located above the 1:1 line, which confirms the overestimation in the prediction of AWC. From Fig. 3, it can be clearly seen that the Gupta and Larson and the MLRF PTFs showed the maximum overestimation in the prediction of the AWC. This is also confirmed by the larger GMER and MRE of (2.34, 2.39) and (8.69, 11.39) that were obtained for the prediction of AWC with the Gupta and Larson and the MLRF PTFs, respectively.

Table 7 Statistical evaluation of the measured and predicted available water contents (cm3 cm−3)
Fig. 3 Measured vs. predicted available water content (cm3 cm−3) as estimated by the different PTFs

The Schaap (H3) PTF was used to predict the AWC for all 219 soils in the large data set of the Jazan region. Based on the coordinates of the sampling locations, the predicted AWC values were georeferenced to a base map of the Jazan region in ArcGIS 10.3. The geostatistical analyst function within ArcGIS was used to interpolate the AWC values, and to produce a general map showing the AWC distribution in the Jazan region (Fig. 4a). The absolute error in the prediction of AWC was also calculated and presented as a distribution map for the Jazan region (Fig. 4b). AWC in the soils of the Jazan region ranged between 0.043 and 0.178 cm3 cm−3. Larger AWC values (0.125–0.178 cm3 cm−3) were observed in the mountain soils located in the eastern part of the Jazan region. These soils are characterized by medium texture and larger amounts of silt, and therefore are expected to retain more water, especially at FC. The lowest AWC values (0.043–0.08 cm3 cm−3) in the Jazan region were mostly observed in the western coastal area, where soils are predominantly sandy in texture with low levels of organic carbon (Fig. 4a). Soils in approximately 90% of the areal extent of the Jazan region had an average absolute error in the prediction of AWC that ranged between 4.8 and 14.4%. The maximum calculated absolute error reached 33.6–38.3%, and was observed in soils characterized by larger amounts of clay and high salinity levels (Fig. 4b).

Fig. 4 Predicted available water content (cm3 cm−3) (a), and absolute error of the prediction (%) (b) in the Jazan region as predicted by the Schaap (H3) parametric PTF

Summary and conclusions

In this study, MLR was used to develop two site-specific PTFs, one point (MLRP) and one parametric (MLRF), using basic soil properties of 219 soil samples that were collected from the Jazan region in the southwest of Saudi Arabia. We evaluated the accuracy of the two developed PTFs and four existing PTFs, two point (Gupta and Larson; Rawls) and two parametric (Vereecken; Schaap), to determine the SWRC, predict soil water content at − 10, − 33, and − 1500 kPa, and estimate AWC in soils of the Jazan region. The accuracy of the generated site-specific PTFs was determined by the R2 value, which ranged from 0.61 to 0.68 for the MLRP and from 0.37 to 0.67 for the MLRF. Clay and silt contents were significantly and positively correlated with the soil water content at FC (− 33 kPa) and PWP (− 1500 kPa). On the other hand, sand content and BD were significantly and negatively correlated with the soil water content at FC and PWP. No significant correlations were observed between EC, CaCO3, or OM and the soil water content at FC and PWP.

Correlations between basic soil properties and the hydraulic parameters of the van Genuchten equation revealed that the residual water content (θr) was positively correlated with Si (0.41), C (0.61), and SP (0.40), and negatively correlated with Sa (− 0.57) and BD (− 0.58). The shape parameter α was found to be positively correlated with Sa (0.52) and BD (0.50), and negatively correlated with Si (− 0.49) and SP (− 0.34). No significant correlations were found between the saturated water content (θs) or the shape parameter n and the measured soil properties. The Schaap (H3) and the site-specific MLRF PTFs were able to accurately fit the measurement data to the van Genuchten equation, and provided an adequate estimate of the SWRC over the entire range from high (saturation) to low (dry) water potentials. The Schaap (H1) and (H2) PTFs failed to estimate water retention at potentials higher than − 60 kPa, whereas the Vereecken PTF failed to accurately estimate the SWRC and overestimated water content at all water potentials.
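To make the role of θr, θs, α, and n concrete, the following minimal sketch evaluates the van Genuchten (1980) retention function with the usual constraint m = 1 − 1/n and derives AWC from the resulting curve. The parameter values are hypothetical and are not fitted to the Jazan data; α is assumed to be expressed in kPa−1 so that it is consistent with potentials given in kPa, and AWC is assumed to be the difference between the water contents at FC and PWP.

```python
def van_genuchten_theta(h_kpa, theta_r, theta_s, alpha, n):
    """Water content (cm3 cm-3) at matric potential h_kpa (kPa).

    theta(h) = theta_r + (theta_s - theta_r) / (1 + (alpha*|h|)^n)^m,
    with m = 1 - 1/n. alpha is assumed to be in 1/kPa.
    """
    m = 1.0 - 1.0 / n
    return theta_r + (theta_s - theta_r) / (1.0 + (alpha * abs(h_kpa)) ** n) ** m

# Hypothetical parameters for illustration only (not values from this study).
params = dict(theta_r=0.05, theta_s=0.40, alpha=0.3, n=1.5)

theta_fc = van_genuchten_theta(-33.0, **params)     # field capacity
theta_pwp = van_genuchten_theta(-1500.0, **params)  # permanent wilting point
awc = theta_fc - theta_pwp                          # available water content

print(round(theta_fc, 3), round(theta_pwp, 3), round(awc, 3))
```

Because a parametric PTF predicts the four parameters rather than the water contents themselves, an error in any one of them propagates to every point on the curve, including the two points used to compute AWC.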

The MLRP and the Schaap (H3) PTFs produced the best estimates of soil water content in the Jazan region, with smaller RMSE and larger D-index and AIC values. The performance of the Schaap (H1), (H2), and MLRF PTFs was intermediate, whereas the Rawls, Gupta and Larson, and Vereecken PTFs had the largest RMSE and always overestimated soil water content at all matric potentials. In general, the largest prediction errors in the estimation of soil water content were observed at matric potentials close to FC (− 33 kPa). The better performance of the Schaap (H3) PTF might be related to the fact that it was developed and calibrated on a large database of soils from North America and Europe covering a wide range of soil variability. Therefore, it is highly possible that most of the variability found in the soils of the Jazan region was represented to a certain degree in this database. In addition, the Schaap (H3) PTF is the only one of the evaluated PTFs that takes water content at two water potentials (− 33 and − 1500 kPa) as inputs, which significantly enhances the accuracy of the prediction. This assessment of the accuracy of the evaluated PTFs was confirmed by the predictions of the AWC, which revealed that the Schaap (H3) and the MLRP PTFs provided the best estimates of the AWC in the soils of the Jazan region.

The site-specific point PTF (MLRP) proved to be more efficient than all other tested PTFs in the prediction of soil water content in the Jazan region at both FC and PWP. In contrast, the parametric site-specific PTF (MLRF) showed acceptable performance at saturation and at the dry end of the SWRC, but failed to estimate water content at FC. Despite the relatively large error in the estimation of SWRC based on parametric PTFs, the predicted hydraulic parameters are usually accurate enough for the simulation of water flow and transport processes in the environment, especially if direct measurements of the hydraulic parameters are not available. The pedoclimatic context of the data sets used to derive the PTFs greatly influences the performance of the PTFs. Therefore, caution should be exercised when the PTFs developed in this study are applied to soils whose basic properties fall outside the range of soil properties that were used to develop and validate the PTFs.