Introduction

Site-specific nutrient management has received considerable attention for increasing nutrient input efficiency, improving plant productivity and reducing environmental risks [44]. Soil nitrogen (N), phosphorus (P) and potassium (K) are important nutrients for plant growth and productivity, and they play an important role in terrestrial functions by influencing soil properties, plant growth and soil activities [18]. Soil N, P and K can individually or jointly affect terrestrial productivity [19]. However, soils are characterized by high spatial variability due to climate, parent materials, topography, vegetation types, land use and management [16, 24]. As a consequence, soils exhibit marked spatial variability at both the macro- and micro-scale [2, 35]. Hence, understanding the spatial variability of nutrients in soils is essential for devising site-specific nutrient management strategies aimed at better farm economy and increased sustainability in crop production [1].

Geostatistical methods predict a soil variable at unsampled locations from a property measured at a given place and time [44]. On this basis, many techniques have been developed in the last several decades to predict the spatial variability of soil properties, such as ordinary kriging (OK), inverse distance weighting (IDW), artificial neural networks and pedo-transfer functions [28, 30, 33, 40, 45]. In recent years, OK has been widely used to prepare spatial variability maps of soil chemical properties [6, 7, 25, 26, 32, 38] and physical properties [31] in different soils of India.

The Brahmaputra plain of Assam is a part of the vast Indo-Gangetic plain and covers an area of about 56,578 km² [11]. The total length of the valley is 722 km, and its average width is 80 km. Based on rainfall pattern, terrain and soil characteristics, the Brahmaputra plain has been delineated into upper, central and lower Brahmaputra plains [11]. In general, the rate of fertilizer application is low in the Brahmaputra plain under rainfed conditions due to uncertain water availability. Deficiencies of major nutrients are considered important, but little research effort has been made to identify their spatial extent, except in some districts of the lower Brahmaputra plains [27, 29]. Therefore, diagnosis of nutrient-related limitations and their management assumes greater significance for sustaining or improving crop productivity. Assessment of the spatial variability of available soil nutrients is a viable option to identify and delineate critical nutrient deficiency zones. This will enable farm managers to plan site-specific nutrient management (SSNM) based on soil and crop requirements. Therefore, the study was carried out in Tinsukia district of the upper Brahmaputra plains with the objectives: (1) to assess the status of soil pH, organic carbon (OC), available N (AN), available P (AP) and available K (AK) and (2) to study the spatial variability of soil fertility parameters.

Materials and Methods

Description of the Study Area

The area under investigation is the Tinsukia district of Assam (27°07′–27°58′N latitude and 95°02′–96°40′E longitude), covering an area of 3790 km² (Fig. 1) in the upper Brahmaputra plain, India. The topography of the district is mostly plain land and is subdivided into moderately sloping side slope, undulating upland, gently sloping to undulating upland, gently sloping plain, very gently sloping flood plain and level to nearly level active flood plain. The maximum temperature is 39 °C during July and August, and the minimum temperature falls to 9 °C in January. Annual rainfall is 2000–2500 mm, and about 75% of the rainfall comes from the South West monsoon. There are five broad soil subgroups in the district according to Soil Taxonomy (USDA), namely Typic Kanhapludults, Umbric Dystrochrepts, Typic Dystrochrepts, Aeric Fluvaquents and Typic Udifluvents [22].

Fig. 1 Location and grid map of Tinsukia district, Assam

Soil Sampling and Analysis

A total of 3062 surface soil samples were collected from a depth of 0–25 cm (plough layer) on a 1 km × 1 km grid (Fig. 1) over the entire Tinsukia district of Assam, with sampling points located using a handheld global positioning system (GPS). Soil samples were air-dried and ground to pass through a 2-mm sieve. Soil pH was determined with a pH meter in a 1:2.5 soil/water suspension, available N by the Subbiah and Asija [36] method and OC by the Walkley and Black [42] method. Available K was extracted with 1 M NH4OAc and then estimated by flame photometry [17]. Bray-1 P was determined [8] with a colorimetric spectrophotometer.

Data Analysis

Descriptive statistics (minimum, maximum, mean, standard deviation, coefficient of variation (CV), skewness and kurtosis) were calculated for each soil property. Pearson correlation coefficients were estimated for all possible paired combinations of the response variables to generate a correlation coefficient matrix. Normality of the data was checked with the Kolmogorov–Smirnov (K–S) test. The results indicated that the pH, OC and K data passed the K–S normality test at a significance level of 0.05 after logarithmic transformation. These statistical parameters were calculated with EXCEL® 2007 and SPSS 15.0.
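As an illustration of the summary statistics named above, the following minimal sketch computes them with the Python standard library. It is not the EXCEL/SPSS workflow used in the study; the function name `describe` and the available-K values are hypothetical, moment-based formulas are assumed for skewness and kurtosis (SPSS applies small-sample corrections), and the K–S test is omitted because the standard library does not provide it.

```python
import math

def describe(values):
    """Return summary statistics for a list of positive numbers."""
    n = len(values)
    mean = sum(values) / n
    # Sample standard deviation (n - 1 in the denominator).
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    # Central moments (n in the denominator) for skewness and kurtosis.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "sd": sd,
        "cv_percent": 100.0 * sd / mean,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,  # excess kurtosis
    }

# Hypothetical available-K values (mg/kg); not data from the study.
ak = [40.0, 55.0, 60.0, 72.0, 85.0, 90.0, 110.0, 150.0, 240.0, 390.0]
raw = describe(ak)
logged = describe([math.log10(v) for v in ak])
print(round(raw["skewness"], 2), round(logged["skewness"], 2))
```

For these made-up values the raw data are strongly right-skewed, and the logarithmic transformation pulls the skewness below 1, which is the behaviour the transformation is meant to achieve.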

Geostatistical Analysis Based on GIS

Spatial interpolation and GIS mapping techniques were employed to map the spatial variability of soil properties using ArcGIS v.10.1 (ESRI Co, Redlands, USA). Semivariogram analysis was carried out before ordinary kriging interpolation, because the semivariogram model determines the interpolation function [15]. The semivariogram is defined as:

$$\gamma \left( h \right) = \frac{1}{2N\left( h \right)}\mathop \sum \limits_{i = 1}^{N\left( h \right)} \left[ {z\left( {x_{i} } \right) - z\left( {x_{i} + h} \right)} \right]^{2}$$
(1)

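where \(\gamma \left( h \right)\) is the semivariance at lag distance \(h\), \(N\left( h \right)\) is the number of data pairs separated by \(h\), and \(z\left( {x_{i} } \right)\) and \(z\left( {x_{i} + h} \right)\) are the values of the soil variable at locations \(x_{i}\) and \(x_{i} + h\).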
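Equation (1) can be sketched in a few lines of Python. This is a minimal, isotropic illustration and not the ArcGIS routine used in the study; the function name `empirical_semivariogram`, the binning rule (a pair with separation h goes into bin floor(h / lag_size)) and the five-point transect are assumptions made for the example.

```python
import math

def empirical_semivariogram(coords, values, lag_size, n_lags):
    """Isotropic empirical semivariogram following Eq. (1).

    Returns (mean distance, semivariance, pair count) for each non-empty bin.
    """
    sq_diff = [0.0] * n_lags
    dist_sum = [0.0] * n_lags
    pairs = [0] * n_lags
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            h = math.hypot(coords[i][0] - coords[j][0],
                           coords[i][1] - coords[j][1])
            k = int(h // lag_size)  # index of the lag bin for this pair
            if k < n_lags:
                sq_diff[k] += (values[i] - values[j]) ** 2
                dist_sum[k] += h
                pairs[k] += 1
    return [
        (dist_sum[k] / pairs[k], sq_diff[k] / (2.0 * pairs[k]), pairs[k])
        for k in range(n_lags)
        if pairs[k] > 0
    ]

# Hypothetical transect: five points 1000 m apart with a linear trend.
coords = [(i * 1000.0, 0.0) for i in range(5)]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
result = empirical_semivariogram(coords, values, lag_size=1000.0, n_lags=4)
for mean_h, gamma, n_pairs in result:
    print(mean_h, gamma, n_pairs)
```

Because the example values increase linearly along the transect, the semivariance grows with the square of the lag distance and never levels off, which is the signature of a trend rather than of a bounded spatial structure.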

Different semivariogram settings (e.g. lag size, number of lags, trend and anisotropy) were tested for each soil property. The directional semivariograms did not show any differences in spatial dependence with direction, so isotropic semivariograms were chosen. Circular, spherical, exponential and Gaussian models were fitted to the empirical semivariograms. For each soil property, the model with the minimum root-mean-square error (RMSE) was selected as the best fit:

$${\text{RMSE}} = \sqrt {\frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \left[ {z\left( {x_{i} } \right) - \hat{z}\left( {x_{i} } \right)} \right]^{2} }$$
(2)

Expressions for different semivariogram models best fitted to the soil properties are given below [12].

The exponential model can be depicted as follows:

$$\gamma \left( h \right) = C_{o} + C\left[ {1 - \exp \left\{ { - \frac{h}{A}} \right\}} \right]\;{\text{for}}\;h \ge 0.$$
(3)

The Gaussian model can be depicted as follows:

$$\gamma \left( h \right) = C_{o} + C\left[ {1 - \exp \left\{ {\frac{{ - h^{2} }}{{A^{2} }}} \right\}} \right]\;{\text{for}}\;h \ge 0.$$
(4)

where \(h\) = lag interval, \(C_{o}\) = nugget variance ≥ 0, \(C\) = structural variance \(\ge C_{o}\), and \(A\) = range parameter.
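The two model expressions in Eqs. (3) and (4) translate directly into code. The sketch below is only an illustration: the function names and the parameter values (nugget 0.1, structural variance 0.3, range parameter 1500 m) are hypothetical and are not the fitted values from the study.

```python
import math

def exponential_model(h, nugget, structural_variance, a):
    """Exponential semivariogram model, Eq. (3), for h >= 0."""
    return nugget + structural_variance * (1.0 - math.exp(-h / a))

def gaussian_model(h, nugget, structural_variance, a):
    """Gaussian semivariogram model, Eq. (4), for h >= 0."""
    return nugget + structural_variance * (1.0 - math.exp(-(h ** 2) / (a ** 2)))

# Hypothetical parameters: C_o = 0.1, C = 0.3, A = 1500 m.
for h in (0.0, 500.0, 1500.0, 5000.0):
    print(h,
          round(exponential_model(h, 0.1, 0.3, 1500.0), 4),
          round(gaussian_model(h, 0.1, 0.3, 1500.0), 4))
```

Both models start at the nugget and approach the sill (nugget plus structural variance) as the lag increases; the Gaussian model rises more slowly near the origin and more steeply around the range parameter.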

Three major parameters derived from the fitted models describe the spatial structure of soil variables at a given scale: the nugget \(\left( {C_{o} } \right)\), the sill \(\left( {C + C_{o} } \right)\) and the range \(\left( A \right)\). They provide information about the spatial structure and serve as input parameters for the kriging interpolation. The nugget represents the experimental error and the field variation within the minimum sampling spacing. The sill represents the total spatial variation, and the nugget/sill ratio, i.e. \(\left( {C_{o} } \right)/\left( {C + C_{o} } \right)\), is used as a criterion to classify the spatial dependence of soil variables. Ratios less than or equal to 0.25 were considered to indicate strong spatial dependence, ratios between 0.25 and 0.75 moderate dependence, and ratios greater than 0.75 weak spatial dependence [9]. The range represents the separation distance beyond which the measured data are no longer spatially dependent.
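The classification rule above can be written as a small helper. This is a sketch under the thresholds stated in the text; the function name `spatial_dependence` and the nugget and structural-variance values in the usage lines are hypothetical.

```python
def spatial_dependence(nugget, structural_variance):
    """Classify spatial dependence from the nugget/sill ratio.

    Thresholds follow the text: <= 0.25 strong, 0.25-0.75 moderate,
    > 0.75 weak.
    """
    sill = nugget + structural_variance
    ratio = nugget / sill
    if ratio <= 0.25:
        label = "strong"
    elif ratio <= 0.75:
        label = "moderate"
    else:
        label = "weak"
    return ratio, label

# Hypothetical nugget and structural-variance pairs.
print(spatial_dependence(0.05, 0.45))  # small nugget relative to sill
print(spatial_dependence(0.20, 0.20))  # nugget is half of the sill
print(spatial_dependence(0.45, 0.05))  # nugget dominates the sill
```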

Ordinary Kriging

Maps of surface soil properties were prepared using semivariogram parameters through ordinary kriging (OK). OK is by far the most common type of kriging in practice and provides an estimate for the whole area around a measured sample [21]. The OK estimator is expressed as:

$$z^{*} \left( u \right) = \mathop \sum \limits_{\alpha = 1}^{N} \lambda_{\alpha } z\left( {u_{\alpha } } \right)$$
(5)

$$\mathop \sum \limits_{\alpha = 1}^{N} \lambda_{\alpha } = 1$$
(6)

where \(z^{*} \left( u \right)\) is the estimated value of \(z\) at location \(u\), \(\lambda_{\alpha }\) is the weight assigned to the measured value \(z\left( {u_{\alpha } } \right)\) at location \(u_{\alpha }\), and \(N\) is the number of measured values in the neighbourhood of \(u\) used in the estimation. The weights are determined so that the estimation error variance is minimized, subject to the constraint in Eq. (6) that they sum to 1.
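To make the ordinary kriging estimator concrete, the sketch below solves the standard ordinary kriging system for the weights (semivariances between samples, bordered by a row and column of ones for the sum-to-one constraint) and then forms the weighted sum. It is a minimal pure-Python illustration and not the ArcGIS implementation used in the study: the helper names, the exponential model parameters, the four sample locations and their pH values are all hypothetical, and the semivariance at zero distance is taken as 0 by convention.

```python
import math

def exp_gamma(h, nugget, structural_variance, a):
    """Exponential semivariogram; gamma(0) is taken as 0 by convention."""
    if h == 0.0:
        return 0.0
    return nugget + structural_variance * (1.0 - math.exp(-h / a))

def solve(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x

def ordinary_kriging(coords, values, target, gamma):
    """Return the OK estimate at `target` and the weights of the samples."""
    n = len(values)

    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    # Semivariances between samples, bordered by the sum-to-one constraint.
    system = [[gamma(dist(coords[i], coords[j])) for j in range(n)] + [1.0]
              for i in range(n)]
    system.append([1.0] * n + [0.0])
    rhs = [gamma(dist(c, target)) for c in coords] + [1.0]
    solution = solve(system, rhs)  # last entry is the Lagrange multiplier
    weights = solution[:n]
    estimate = sum(w * z for w, z in zip(weights, values))
    return estimate, weights

# Hypothetical pH values at the corners of one 1 km grid cell.
coords = [(0.0, 0.0), (1000.0, 0.0), (1000.0, 1000.0), (0.0, 1000.0)]
values = [4.2, 4.6, 5.0, 4.4]
gamma = lambda h: exp_gamma(h, 0.1, 0.3, 1500.0)
estimate, weights = ordinary_kriging(coords, values, (500.0, 500.0), gamma)
print(round(estimate, 3), [round(w, 3) for w in weights])
```

At the centre of the cell the four samples are equidistant from the target, so the weights are equal and the estimate reduces to the arithmetic mean of the four values. Moving the target towards one corner shifts weight to that sample.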

Accuracy Assessment

The accuracy of the spatial variability maps was evaluated through a cross-validation approach [10, 33]. Of the three evaluation indices used in this study, mean absolute error (MAE) and mean-square error (MSE) measure the accuracy of prediction, whereas goodness of prediction (G) measures the effectiveness of prediction [39]. MAE measures the average magnitude of the residuals (i.e. the differences between observed and predicted values) [41]:

$${\text{MAE}} = \frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \left| {z\left( {x_{i} } \right) - \hat{z}\left( {x_{i} } \right)} \right|$$
(7)

where \(z\left( {x_{i} } \right)\) is the observed value and \(\hat{z}\left( {x_{i} } \right)\) is the predicted value at location \(i\). Small MAE values indicate less error. The MAE measure, however, does not reveal the magnitude of error that might occur at any point, and hence MSE was calculated:

$${\text{MSE}} = \frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \left[ {z\left( {x_{i} } \right) - \hat{z}\left( {x_{i} } \right)} \right]^{2} .$$
(8)

Squaring the difference at any point gives an indication of the magnitude of the error, i.e. small MSE values indicate more accurate estimation, point by point. The G measure indicates how effective a prediction is relative to one that could have been derived from the sample mean alone [34]:

$$G = \left[ {1 - \frac{{\mathop \sum \nolimits_{i = 1}^{N} \left[ {z\left( {x_{i} } \right) - \hat{z}\left( {x_{i} } \right)} \right]^{2} }}{{\mathop \sum \nolimits_{i = 1}^{N} \left[ {z\left( {x_{i} } \right) - \bar{z}} \right]^{2} }}} \right] \times 100$$
(9)

where \(\bar{z}\) is the sample mean. G is one of the measures used to assess the accuracy of interpolated maps [37], and it was used here to check the accuracy of the interpolated maps of the studied soil properties. According to Parfitt et al. [23], positive G values indicate that the map obtained by interpolating data from the samples is more accurate than one based on the average. Negative and close-to-zero G values indicate that the average predicts the values at unsampled locations as accurately as, or even better than, the interpolated estimates.
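The three indices can be computed together from paired observed and cross-validated values. The sketch below assumes that MAE uses the absolute value of each residual, as its name implies; the function name `cross_validation_indices` and the pH values are hypothetical and are not results from the study.

```python
def cross_validation_indices(observed, predicted):
    """Return MAE, MSE and goodness of prediction G (in %)."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    mean_obs = sum(observed) / n
    # Squared deviations from the sample mean: the "mean-only" predictor.
    ss_mean = sum((o - mean_obs) ** 2 for o in observed)
    g = (1.0 - sum(r * r for r in residuals) / ss_mean) * 100.0
    return mae, mse, g

# Hypothetical observed and cross-validated pH values.
observed = [4.0, 5.0, 6.0, 7.0]
predicted = [4.5, 5.0, 5.5, 7.0]
mae, mse, g = cross_validation_indices(observed, predicted)
print(mae, mse, round(g, 1))

# Predicting the sample mean everywhere gives G = 0 by construction.
mean_only = [sum(observed) / len(observed)] * len(observed)
print(round(cross_validation_indices(observed, mean_only)[2], 1))
```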

Results and Discussion

Descriptive Statistics

The median of each soil property was lower than the mean, which indicates that the effect of abnormal data on the sample values was not significant (Table 1). Soil pH ranged from 3.4 to 8.2 and was mostly in the acidic range. OC ranged from 0.2 to 43.4 g/kg. The wide ranges of soil pH and OC were caused by extreme test values in 20 and 6 soil samples, respectively, which could be considered outliers. Consistent with the present study, Baruah et al. [5] also reported high soil pH values in the char soils and high OC values in the forest soils of Tinsukia district. Such extreme values may not always be outliers; they may instead reflect natural or management-induced variation in these soils of Assam. However, the presence of outliers in the dataset might change the structure of the semivariograms and their properties. Outliers can cause distortion that violates geostatistical theory [4] and make the variogram erratic [3]. Hence, the outlier values of soil pH and OC were replaced by maximum values to avoid their negative influence on the semivariograms, so that the characteristics of the majority of the data could be captured. How to deal with outliers can be controversial: if they are not estimation errors, they should be included where possible [14], but their influence should be limited. Thus, it can be argued that the difficulty of accommodating outliers in spatial variability mapping is one of the limitations of the geostatistical method. Available N, P and K varied from 5.4 to 222.7 mg/kg, 1.1 to 37.3 mg/kg and 12.5 to 392.8 mg/kg, respectively. The CV differed among the soil properties. The largest variation was observed in K (55%), whereas the smallest was in pH (14%). Other researchers also documented a smaller variation of soil pH compared to other soil properties [29]. This may be attributed to the fact that pH is a logarithmic measure of the proton concentration in the soil solution, and there would be much greater variability if soil acidity were expressed directly in terms of proton concentration. Skewness indicates departure of the data from normality, and a value of less than 1 was taken to denote an approximately normal distribution. A logarithmic transformation is considered where the coefficient of skewness is greater than 1 [43]. Therefore, a logarithmic transformation was performed for pH, OC and K, as their skewness was greater than 1.
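The point about the logarithmic pH scale can be checked with a short calculation. The sketch below compares the CV of a set of pH values with the CV of the corresponding proton concentrations, taking the concentration as 10 to the power of minus pH; the helper name `cv_percent` and the five pH values are hypothetical and are not data from the study.

```python
import math

def cv_percent(values):
    """Coefficient of variation (%) using the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return 100.0 * sd / mean

# Hypothetical pH values spanning two pH units.
ph = [4.0, 4.5, 5.0, 5.5, 6.0]
# Proton concentration (mol/L) implied by each pH value.
protons = [10.0 ** (-p) for p in ph]

cv_ph = cv_percent(ph)
cv_protons = cv_percent(protons)
print(round(cv_ph, 1), round(cv_protons, 1))
```

For these made-up values the CV of pH stays below 20%, whereas the CV of the proton concentration exceeds 100%, because a span of two pH units corresponds to a hundredfold range in concentration.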

Table 1 Descriptive statistics of soil properties (n = 3062)

Semivariogram Analysis of Soil Properties

Among the four models, the Gaussian model gave the best fit for K, with the lowest RMSE of 51.35 (Table 2). Similarly, the exponential model gave the best fit for pH, OC, N and P, with the lowest RMSE of 0.455, 4.959, 29.07 and 4.318, respectively. Other researchers also used a similar cross-validation methodology to select the best model for kriging interpolation [13, 21]. The range for all soil properties varied from 1119 to 3663 m; thus, the length of spatial autocorrelation is longer than the sampling interval of 1000 m. Therefore, the current sampling design is appropriate for this study, and a good spatial structure is expected on the interpolated maps [15]. All soil properties showed a positive nugget, which can be explained by sampling error, short-range variability, and random and inherent variability. The ratio of nugget to sill is used to classify the spatial dependence of soil properties [9]. In the present study, the nugget/sill ratio showed that pH, OC, N and P were moderately spatially dependent (33–58%), which could be attributed to internal factors such as soil-forming processes and external factors such as the variable rate of fertilizer application by farmers within the district. Other researchers have also documented moderate spatial dependence of soil properties [20]. K exhibited weak spatial dependence (82%), which indicates that the spatial pattern of this soil property was mainly influenced by extrinsic factors such as fertilization and canopy-induced rainfall redistribution [20].

Table 2 Geostatistical parameters of the fitted semivariogram models for soil properties

Spatial Distribution of Soil Properties and Cross-Validation

Spatial maps of pH and OC (Fig. 2) and of N, P and K (Fig. 3) were prepared through kriging. Soil pH in the study area is acidic in nature and varies from 3.4 to 8.2. Soil in the majority of the study area has a pH of 4.0–4.5. This may be due to the crop management strategies adopted and the topography of the area. Soil pH in the range of 4.5–6.0 was recorded along the Brahmaputra river in the northern part of the study area. P had an inverse distribution, which may be due to fixation of phosphorus by exchangeable Al and Fe at low pH. OC and N had similar spatial variability; both decreased in the northern part of the study area and increased in the central and southeast quadrants. This may be due to the close association of carbon and nitrogen in the soil matrix. The distribution pattern of K showed high K content in the central part and southern quadrant of the study area, which may be due to the landscape.

Fig. 2 Spatial distribution maps of (a) pH and (b) organic carbon (OC) (g/kg) of Tinsukia district, Assam

Fig. 3 Spatial distribution maps of (a) available nitrogen (AN) (mg/kg), (b) available phosphorus (AP) (mg/kg) and (c) available potassium (AK) (mg/kg) of Tinsukia district, Assam

The evaluation indices from cross-validation of the spatial maps showed that pH had low MAE and MSE, whereas relatively large MAE and MSE were observed for OC, N, P and K (Table 3). These results are in close conformity with the findings of Reza et al. [29] in the lower Brahmaputra plain. For all the soil properties, the G value was greater than 0, which indicates that spatial prediction using semivariogram parameters is better than assuming the mean of the observed values as the property value at any unsampled location. This also shows that the semivariogram parameters obtained by fitting the experimental semivariogram values were reasonable for describing the spatial variation of pH, OC, N, P and K. However, the RMSE value for K was especially large and the prediction of K was especially poor, suggesting that kriging with the Gaussian model was unreliable for this parameter.

Table 3 Evaluation performance of ordinary kriged map of soil properties through cross-validation

Conclusions

The summary statistics showed that the CV differed among the soil properties. The raw datasets of pH, OC and K were strongly positively skewed, and log-transformation was effective in normalizing the data. Semivariogram models were fitted for all soil properties, and the best variogram model for each property was identified using a cross-validation approach. Exponential and Gaussian models performed well in describing the spatial variability of pH, OC, and available N, P and K contents. A moderate spatial dependence of soil properties was observed, indicating that soil properties were controlled by both internal factors such as soil-forming processes and external factors such as the variable rate of fertilizer application by farmers within the district. Cross-validation of the variogram models through OK showed that spatial prediction of soil properties is better than assuming the mean of the observed values at any unsampled location. Finally, spatial distribution maps of soil properties were developed using the best-fitted semivariogram models and OK. The generated maps can serve as an effective tool in site-specific nutrient management, which is a prerequisite in farming systems for optimizing the cost of cultivation and addressing nutrient deficiency. The study also helped to identify and delineate critical nutrient deficiency zones.