Introduction

Investigating the spatial variability of soil properties in cultivated fields is essential for the proper management of agricultural lands. Spatial variability arises from a complex interaction between geology, topography, climate and land use, and it may also result from the interaction of soil use and management strategies. These variables are important in precision agriculture and in large-scale agricultural management and planning (Dharumarajan et al. 2017; Ebrahimzadeh et al. 2021).

Soil variables such as pH, salinity, available potassium, CaCO3, and soil texture can directly and indirectly affect soil productivity and the subsequent management of agricultural lands (Shit et al. 2016; Yusuf et al. 2020). Such variables vary strongly in time and space, which affects crop production (Bogunovic et al. 2017). The soil texture fractions (sand, silt, and clay) are also highly variable in the field and influence the spatial distribution of other soil variables. Understanding spatial variability and precisely mapping soil variables are therefore particularly important for more efficient management of agricultural fields. This may result in more precise field fertilization, food production and environmental pollution control, which are essential components of sustainable agriculture (Zhou et al. 2020; Li et al. 2021).

Spatial interpolation is an effective method for quantifying the spatial distribution of soil properties. Such analyses are required to reveal the patterns and processes of soil spatial variation, which is the combined effect of soil chemical and physical processes operating at various spatiotemporal scales and of anthropogenic activities (Chen et al. 2017). Different interpolation methods such as radial basis function (RBF) and ordinary kriging (OK) have been used to predict the soil properties of agricultural fields affected by spatial variability. Research has shown that the RBF interpolation technique can predict the spatial variability of soil parameters including soil organic carbon (Danesh et al. 2022). Kriging methods have also been reported to be less sensitive to data variation than inverse distance weighting (IDW) (Bhunia et al. 2018; Xie et al. 2020).

Kriging and co-kriging methods predicted pH, salinity, organic matter, and plant cover more accurately than IDW. The OK method was the most appropriate method for the interpolation of soil organic matter in central Vietnam (Pham et al. 2018; Danesh et al. 2022). The OK approach also gave the best results for mapping the soil nitrogen, phosphorus and potassium contents of Moso bamboo forests. It produced a smaller range of predicted N, P and K contents than IDW, indicating the necessity of applying various approaches when studying the spatial variation of soil attributes (Guan et al. 2017). Liu et al. (2015) found significant effects of land use on the spatial variability of soil parameters. They accordingly indicated that human activities including plowing, fertilization, cropping and harvesting can significantly affect the spatial variability of soil parameters (Baligh et al. 2022; Shokuhifar et al. 2021).

Given the above, and the need for better management of soil fertilization in Khuzestan Province, Iran, and in similar regions of the world, the present research was conducted. The objective was to determine the applicability of geostatistical methods for predicting soil spatial variability, using the RBF and OK interpolation methods, in the cultivated lands east of the Karun River in south Ahvaz, Khuzestan Province, Iran.

Materials and methods

Site description

The study area, covering 23,804 ha, is located in Khuzestan Province, south-west Iran (Fig. 1). It has a semi-arid climate, with hyperthermic and ustic soil temperature and moisture regimes, respectively. The area lies between northern latitudes 31° 2′ 56″ and 31° 15′ and eastern longitudes 48° 34′ 12″ and 48° 45′ 2″, and the provincial precipitation is ~ 220 mm per year. The major water resource in the study area is the Karun River.

Fig. 1 Distribution of soil sampling points in the study area

Sampling and analyses

A total of 250 soil samples (0–30 cm), with locations recorded by a global positioning system (GPS), were randomly collected from the experimental fields of southern Ahvaz (Fig. 1). The samples were mixed thoroughly to obtain a one-kilogram bulked sample, representative of the area, which was kept at 4 °C for further analysis. The samples were air-dried for approximately seven days and then ground to pass a 200 µm sieve. Soil properties were determined using standard methods (Miransari et al. 2008). Soil pH and electrical conductivity (EC) were measured using 0.1 and 0.01 M KCl solution, respectively. Available potassium (AK), extracted by 1 M NH4OAc, was determined using flame photometry. The CaCO3 content was determined by volumetric analysis of the CO2 released on reaction with a hydrochloric acid solution. Soil texture was determined by the hydrometer method (Hyun et al. 2000).

Data analysis

The descriptive statistics of the soil parameters, including mean, standard deviation (SD), minimum (Min.), maximum (Max.), skewness, kurtosis, and coefficient of variation (C.V.), were determined using SPSS 25 software. C.V. values were used to measure the degree of variation in the region, and skewness and kurtosis were used to describe the asymmetry and flatness of the data distributions. The RBF and OK methods were applied under the assumption that the data are normally distributed; otherwise, the variance, sill and nugget fluctuate. Thus, the Kolmogorov–Smirnov and Shapiro–Wilk tests were applied to test the normality of the data.
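The statistics listed above can be illustrated with a minimal sketch. It is not the SPSS procedure used in the study: the function name `describe` and the pH readings are hypothetical, and skewness and kurtosis are computed here from simple population moments, whereas SPSS applies bias-corrected formulas that give slightly different values for small samples.

```python
import math

def describe(values):
    """Return basic descriptive statistics for one soil variable."""
    n = len(values)
    mean = sum(values) / n
    # Sample standard deviation (n - 1 in the denominator).
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    # Central moments (population form) for skewness and excess kurtosis.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "sd": sd,
        "cv_percent": 100.0 * sd / mean,  # coefficient of variation
        "min": min(values),
        "max": max(values),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,   # excess kurtosis (0 for a normal law)
    }

# Hypothetical pH readings, for illustration only (not data from this study).
ph = [7.4, 7.6, 7.5, 7.8, 7.3, 7.7, 7.5, 7.6]
stats = describe(ph)
```

A formal normality test (Kolmogorov–Smirnov or Shapiro–Wilk) would normally be taken from a statistics package rather than written by hand.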

Spatial interpolation approaches

Ordinary kriging (OK)

The OK technique relies on the statistical properties of the measured data. The range is the distance at which spatial dependence disappears, and the sill corresponds to the maximum variance reached in the absence of spatial correlation. The value \(z\left({x}_{0}\right)\) estimated by the OK method and the estimation error variance \({\sigma }_{k}^{2}\left({x}_{0}\right)\) at any location \({x}_{0}\) were, respectively, computed using the following formulas:

$$z\left({x}_{0}\right)=\sum_{i=1}^{n}{\lambda }_{i}z\left({x}_{i}\right),$$
(1)
$${\sigma }_{k}^{2}\left({x}_{0}\right)=\mu +\sum_{i=1}^{n}{\lambda }_{i}\gamma \left({x}_{0}-{x}_{i}\right),$$
(2)

where \({\lambda }_{i}\) are the weights, \(\mu\) is a constant (the Lagrange multiplier), and \(\gamma \left({x}_{0}-{x}_{i}\right)\) is the semi-variogram value corresponding to the distance between \({x}_{0}\) and \({x}_{i}\) (Bhunia et al. 2018).
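For context, the weights in Eq. 1 are usually obtained from the standard ordinary kriging system. The form below is the textbook formulation, written with \(\gamma\) for the semi-variogram; it is given here as an assumption about what the software solves internally, not as a step reported in this study:

$$\sum_{j=1}^{n}{\lambda }_{j}\gamma \left({x}_{i}-{x}_{j}\right)+\mu =\gamma \left({x}_{i}-{x}_{0}\right),\quad i=1,\dots ,n,\qquad \sum_{j=1}^{n}{\lambda }_{j}=1.$$

The constraint that the weights sum to one makes the estimator unbiased, and \(\mu\) is the Lagrange multiplier associated with that constraint, which is why it reappears in the estimation variance.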

A semi-variogram is a useful tool for describing and examining the spatial structure of a parameter. It is a function of the lag that expresses the degree of spatial dependence between measured samples (Wang and Shao 2013) and is stated as follows (Eq. 3).

$$\gamma \left(h\right)=\frac{1}{2N\left(h\right)}\sum_{i=1}^{N\left(h\right)}{\left[z\left({x}_{i}\right)-z\left({x}_{i}+h\right)\right]}^{2},$$
(3)

where \(\gamma \left(h\right)\) is the semi-variance, \(z\) is the soil variable, \(z\left({x}_{i}\right)\) and \(z\left({x}_{i}+h\right)\) are the values of \(z\) at locations \({x}_{i}\) and \({x}_{i}+h\), \(h\) is the lag, and \(N\left(h\right)\) is the number of pairs of sample locations separated by the lag \(h\).
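The experimental semi-variogram of Eq. 3 can be sketched as follows. This is an illustrative implementation, not the Geostatistical Analyst routine: the function name, the lag-binning rule (pairs are assigned to the nearest multiple of `lag_width`) and the transect data are all assumptions made for the example.

```python
import math

def empirical_semivariogram(coords, values, lag_width, n_lags):
    """Return (lag distance, semi-variance, pair count) for each non-empty lag."""
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            # Assign the pair to the nearest lag class k * lag_width (k >= 1).
            k = int(d / lag_width + 0.5)
            if 1 <= k <= n_lags:
                sums[k - 1] += (values[i] - values[j]) ** 2
                counts[k - 1] += 1
    result = []
    for k in range(1, n_lags + 1):
        if counts[k - 1] > 0:
            # Eq. 3: sum of squared differences divided by 2 N(h).
            result.append((k * lag_width, sums[k - 1] / (2.0 * counts[k - 1]), counts[k - 1]))
    return result

# Hypothetical transect (coordinates in metres), for illustration only.
coords = [(0.0, 0.0), (100.0, 0.0), (200.0, 0.0), (300.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]
gamma = empirical_semivariogram(coords, values, lag_width=100.0, n_lags=3)
```

A theoretical model (for example spherical or exponential) would then be fitted to these points to obtain the nugget, sill and range.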

Radial basis function (RBF)

The RBF technique is an exact interpolator: the surface passes through each measured sample value, and predicted values can rise above the maximum or fall below the minimum of the measured sample values (Liu et al. 2015). The RBF technique is applied to predict the soil attributes at unmeasured points.
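The exactness property can be illustrated with a small sketch. It assumes a multiquadric kernel with a hypothetical shape parameter `c` and no polynomial trend term; the kernel actually selected in the software is not stated in this section, and the sample coordinates and values are invented for the example.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def multiquadric(r, c=1.0):
    """Multiquadric kernel (an assumed choice for this sketch)."""
    return math.sqrt(r * r + c * c)

def rbf_fit(coords, values, c=1.0):
    """Find kernel weights so that the surface reproduces every sample value."""
    phi = [[multiquadric(math.dist(p, q), c) for q in coords] for p in coords]
    return solve(phi, values)

def rbf_predict(coords, weights, point, c=1.0):
    """Evaluate the fitted surface at an arbitrary point."""
    return sum(w * multiquadric(math.dist(point, p), c) for w, p in zip(weights, coords))

# Hypothetical sample locations and values, for illustration only.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
values = [10.0, 12.0, 11.0, 15.0, 30.0]
weights = rbf_fit(coords, values)
at_samples = [rbf_predict(coords, weights, p) for p in coords]
```

Evaluating the surface at the sample locations returns the measured values up to rounding error, which is what distinguishes an exact interpolator from a smoothing one.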

The accuracy of interpolation methods

The interpolation methods were implemented with the Geostatistical Analyst extension of ArcGIS software (ESRI 2001). The accuracy of the interpolation methods was assessed by cross-validation, using the root mean squared error (RMSE), mean bias error (MBE), and mean absolute error (MAE). To validate the precision of these predictive models, cross-validation successively excludes one data location, estimates its value from the remaining observations, and compares the predicted value with the measured one (Muller and Pierce 2003). Cross-validation is generally applied to select the most suitable variogram model among the candidate models for kriging, as well as to select the best parameters from those tested for RBF. It quantifies the difference between the predicted and the measured data. The error statistics were computed, respectively, as follows:

$$\mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}{\left({z}^{*}\left({x}_{i}\right)-z\left({x}_{i}\right)\right)}^{2}},$$
(4)
$$\mathrm{MBE}=\frac{1}{n}\sum_{i=1}^{n}\left({z}^{*}\left({x}_{i}\right)-z\left({x}_{i}\right)\right),$$
(5)
$$\mathrm{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left|{z}^{*}\left({x}_{i}\right)-z\left({x}_{i}\right)\right|,$$
(6)

where \({z}^{*}\left({x}_{i}\right)\) and \(z\left({x}_{i}\right)\) are the predicted and observed values at location \({x}_{i}\), respectively, and \(n\) is the number of observations.

RMSE indicates the precision of the interpolation techniques; accordingly, an interpolator with a small RMSE predicts unknown values more accurately. MBE is a measure of the estimator bias, and for an unbiased interpolator the MBE should be close to zero. MAE is a measure of the average magnitude of the residuals, and small MAE values denote less error.
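The cross-validation procedure and the three error statistics (Eqs. 4 to 6) can be sketched as below. To keep the example short and self-contained, inverse distance weighting is used as a stand-in predictor; it is not one of the two methods compared in this study, and the function names and sample data are hypothetical.

```python
import math

def error_metrics(predicted, observed):
    """Return (RMSE, MBE, MAE) for paired predicted and observed values."""
    n = len(observed)
    residuals = [p - o for p, o in zip(predicted, observed)]
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    mbe = sum(residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    return rmse, mbe, mae

def idw(coords, values, point, power=2.0):
    """Inverse distance weighting, used here only as a simple stand-in predictor."""
    num = den = 0.0
    for p, v in zip(coords, values):
        d = math.dist(p, point)
        if d == 0.0:
            return v  # the point coincides with a sample location
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den

def leave_one_out(coords, values, predictor):
    """Predict each sample from all the others (leave-one-out cross-validation)."""
    preds = []
    for i in range(len(values)):
        rest_c = coords[:i] + coords[i + 1:]
        rest_v = values[:i] + values[i + 1:]
        preds.append(predictor(rest_c, rest_v, coords[i]))
    return preds

# Hypothetical samples, for illustration only.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [2.0, 3.0, 3.0, 5.0]
predicted = leave_one_out(coords, values, idw)
rmse, mbe, mae = error_metrics(predicted, values)
```

Any interpolator with the same `(coords, values, point)` signature could be passed to `leave_one_out`, so the same loop serves to compare candidate methods or variogram models on equal terms.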

Results

Descriptive statistics of soil variables

The statistical results, including range, Min., Max., SD, C.V., skewness and kurtosis, for the soil variables of the 250 samples are presented in Table 1. The analysis showed that there are no outliers for these variables. Among these statistics, C.V. is a useful indicator of overall variability. In this study area, the descriptive statistics showed considerable variability of soil properties. C.V. values have been classified as weakly (< 15%), moderately (15–35%), and highly variable (> 35%) (Moharana et al. 2020). The C.V. values indicated the highest variability for EC (49.4%), AK (39.6%) and sand (56%). However, pH had little variability, with a C.V. of 3.7%, and clay (24.7%) and silt (21%) had moderate variability. The shape of the distributions was assessed from the skewness, and skewness values between − 1 and 1 indicated the normal distribution of the data (Table 1).
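The classification can be written as a short rule, taking the lowest class to be a C.V. below 15%. The sketch below applies it to the C.V. values reported in this paragraph; the function name and the handling of values that fall exactly on a threshold are assumptions.

```python
def classify_cv(cv_percent):
    """Classify variability from the coefficient of variation (%)."""
    if cv_percent < 15.0:
        return "weak"
    if cv_percent <= 35.0:
        return "moderate"
    return "high"

# C.V. values (%) as reported in the text above.
reported_cv = {"EC": 49.4, "AK": 39.6, "sand": 56.0, "pH": 3.7, "clay": 24.7, "silt": 21.0}
classes = {name: classify_cv(cv) for name, cv in reported_cv.items()}
```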

Table 1 Descriptive statistical analyses of the soil properties

Test of normality for the soil variables

Data normality was examined using histograms and normal quantile–quantile (Q–Q) plots, which help to identify the shape of the distribution and visible outliers. The normal Q–Q plots for pH, EC, AK, CaCO3, sand, silt, and clay indicated the normal distribution of the soil variables, because the points followed a straight diagonal line (Figs. 2 and 3). The Kolmogorov–Smirnov test also indicated the normality of the dataset.

Fig. 2 Histograms: (a) EC, (b) pH, (c) AK, (d) CaCO3, (e) sand, (f) silt, (g) clay

Fig. 3 Normal Q–Q plots of soil properties

Geostatistical analysis

The geostatistical analysis showed various spatial variability models and spatial dependence levels for the soil variables. The ranges of spatial dependence varied widely among the variables (from 252 m for clay to 611 m for sand) (Table 2). A small nugget-to-sill ratio (less than 0.25) indicates that a large part of the variance is spatially structured, denoting a strong spatial correlation (S) of the parameter; a ratio between 0.25 and 0.75 denotes a moderate spatial correlation (M); and a ratio above 0.75 denotes a weak spatial correlation (W). The nugget-to-sill ratios of the soil variables were between 0.25 and 0.75 (Table 2), indicating a medium spatial dependence for those soil variables.
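The spatial dependence classes described above reduce to a simple rule on the nugget-to-sill ratio. In the sketch below, the function name, the treatment of ratios exactly equal to 0.25 or 0.75, and the example nugget and sill values are assumptions made for illustration; they are not values from Table 2.

```python
def spatial_dependence(nugget, sill):
    """Classify spatial correlation from the nugget-to-sill ratio."""
    ratio = nugget / sill
    if ratio < 0.25:
        return "S"  # strong spatial correlation
    if ratio <= 0.75:
        return "M"  # moderate spatial correlation
    return "W"      # weak spatial correlation

# Hypothetical nugget and sill values, for illustration only.
examples = [(0.1, 1.0), (0.5, 1.0), (0.9, 1.0)]
labels = [spatial_dependence(n, s) for n, s in examples]
```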

Table 2 Geostatistical variables of the best-fitted semivariograms models for the soil properties

For each soil variable, the experimental variogram was computed. The best-fitting model for each experimental variogram was selected from the linear, Gaussian, exponential and spherical models on the basis of the lowest RMSE, MAE and MBE (Table 3). Accordingly, a suitable theoretical model was chosen on the basis of the fitting results (Fig. 4).

Table 3 Summary of RMSE, MAE and MBE on the basis of the cross-validation

Fig. 4 Semivariogram models of soil variables

The semivariograms of soil pH and sand were fitted to an exponential model, the AK semivariogram was fitted to a spherical model, and the EC, CaCO3, silt, and clay semivariograms were fitted to a linear model (Fig. 4). According to Table 3, the RBF method resulted in slightly higher RMSE, MAE, and MBE values for AK, pH, and clay than the OK method.

Ordinary kriging maps

The spatial distribution map of clay content displayed higher values in the west to east, which generally decreased toward the middle parts of the study area. The spatial distribution map indicated that pH values in the west, southwestern and eastern parts of the study area were alkaline; however, in the east, center and south, they were neutral. According to the spatial variation of AK, higher values were obtained in the north and central parts (Fig. 5).

Fig. 5 Spatial distribution maps of pH, AK, and clay derived from ordinary kriging

RBF maps

The spatial variability pattern of EC showed an increase from the north-west to the north-east as well as from the south-west to the south-east, which is attributed to the semi-arid conditions. The spatial distribution map of CaCO3 content showed an increase from the east to the south and in the north-west corner of the study area. The sand content of the soil generally increased from the center to the south-west as well as from the east to the west. The silt content was high in most parts of the study area (Fig. 6).

Fig. 6 Spatial distribution maps of EC, CaCO3, sand and silt derived from RBF method

Discussion

Geostatistical methods are suitable for investigating and predicting the spatial variability of soil parameters, and the results can be used for the proper management of different practices including fertilization (Blanchet et al. 2017). The present research investigated the spatial variability of soil parameters in southern Ahvaz, Khuzestan Province, Iran, using interpolation methods. There was high variation among the various soil parameters, as documented in previous studies (Hu et al. 2021; Jing et al. 2022). Because of their high variation, salinity, available potassium and sand particles must be considered the most important factors when recommending fertilization practices. Both interpolation methods, OK and RBF, were suitable for predicting the variability of different parameters. The obtained results can be used for the proper design of fertilization practices in the region.

In the arid and semi-arid areas of the world, soils are saline and highly variable because of the high rate of evaporation and low rainfall. The results of the present research indicated a high variation of soil salinity (EC) in the investigated area, which may be due to the use of water containing soluble salts and to the climatic conditions (Jiang et al. 2019). The spatial variation of sand content results from the sediments carried by the Karun River and mixed with fluvial deposits. The silt content was high in most parts of the study area, mainly because of aeolian erosion and fluvial deposits. The spatial distribution map of clay content displayed higher values in the west to east, which generally decreased toward the middle parts of the study area. This type of variation may be due to the accumulation of fluvial deposits. Soil texture is a function of different factors such as weathering (water presence) and alluvial deposits. The high variability may also be related to the presence of different types of soil texture in the study area, formed through pedogenic processes influenced by microtopography (Li et al. 2020).

Soil AK was also highly variable, which is due to the high variation of soil texture, especially sand, and of soil organic matter. According to the spatial variation of AK, higher values were obtained in the north and central parts, a pattern that is usually influenced by historical practices and fertilizer applications. The low variability of soil pH is due to the moisture deficit and the high presence of cations (Ca2+ and Mg2+) and anions (HCO3− and CO32−), which can increase soil pH to higher than 7. Given this low variation, the soil pH map would be useful when recommending fertilization practices. Other researchers have also documented a medium spatial correlation of soil variables (Liu et al. 2015).

The histograms and normal Q–Q plots (Figs. 2 and 3) were used to assess the normality of the dataset, because asymmetry in the data distribution affects the geostatistical methods. The normality test applied to the raw data (Webster and Oliver 2007) gave skewness values < 0.5 for the soil variables, indicating the normal distribution of the data (Rosemary et al. 2017). The moderate variation may be related to the stronger influence of extrinsic factors on these soil variables, including different planting patterns and fertilization practices as well as the salinity and drought conditions of the study area (John et al. 2021). All the scattered fields south of Ahvaz have been affected by intensive cropping.

Knowledge of the range of influence of different soil variables can be used to obtain independent datasets and to conduct classical statistical analysis (Li et al. 2020). It also helps in specifying where to resample, if required, and in planning the design of future field experiments to avoid spatial dependency. The various ranges of spatial correlation for nutrients are probably associated with ion mobility in the soil. A large range shows that the observed values of a soil property are affected by other values of the same property over greater distances than is the case for properties with a smaller range. Additionally, the sill represents the variance of the parameter and the nugget indicates the measurement error; the nugget-to-sill ratio expresses what percentage of the total variance occurs at distances less than the smallest lag interval, and shows how much of the variance is accounted for by the model. The nugget-to-sill ratio reflects the fact that soil parameter variability results from the combined action of structural and random factors. Random factors including fertilizer application, cultivation measures, planting patterns and human activities weaken the spatial dependence of the soil variables (Bhunia et al. 2018).

This suggests that the fitted models appropriately reflect the spatial features of the soil variables. The different function curves of the seven soil parameters indicate that various ecological processes influence the variation in the soil parameters. Different interpolation methods have been used to map soil variables, which can improve the efficiency of agricultural soil management. The root mean squared error (RMSE), mean absolute error (MAE), and mean bias error (MBE) can be applied to investigate the precision of the interpolation techniques (Xie et al. 2020).

For CaCO3, sand and silt, the RMSE, MAE, and MBE values of the RBF method were lower than those of the OK method. The results are similar to previous research, which indicated that the OK method provides more precise results for the spatial interpolation of soil variables (Chen et al. 2017; Moharana et al. 2020). The observed and predicted values of the OK and RBF techniques were used to find the most suitable technique for predicting the spatial variability of soil parameters, which is of economic and environmental significance.

Conclusion

The production of soil variable maps is the first and most significant stage in precision agriculture. These maps describe the spatial distribution of soil parameters and provide the basis for managing them. The aim of this study was to optimize fertilization practices by comparing two interpolation methods, ordinary kriging (OK) and radial basis function (RBF), which can predict the variability of soil factors. The range of spatial correlation was found to vary among soil variables. The high variability (C.V.) of salinity, available potassium and sand particles indicates that these soil parameters are among the most important ones to consider when recommending fertilization practices. The comparison of the two interpolation techniques indicated that OK was the more suitable technique for predicting and mapping the spatial variability of AK, pH and clay, whereas the RBF technique performed better in interpolating EC, CaCO3, sand and silt. The majority of soil variables showed medium spatial correlation. The spatial distribution maps of soil variables are useful for planning land use and crop fertilization practices.