Introduction

As society continues to develop, water quality has become a topic of increasing environmental concern worldwide (Diaz and Rosenberg 2008; Chen et al. 2009; Giri et al. 2012; Chen et al. 2012a). Generally, two types of pollution sources are defined: point source (PS) and nonpoint source (NPS) (Ritter and Shirmohammadi 2001; Liu et al. 2009). Contributors to NPS pollution include forestry, urban runoff, mining and construction, and agriculture, of which agriculture is the largest. Recently, the loss of nitrogen (N) and phosphorus (P) from agricultural land via runoff has increased rapidly in comparison to that from industrial and residential lands (De Wit et al. 2000; Reungsang et al. 2005; Chen et al. 2012b). In the US, as much as 60 % of river pollution results from agriculture (Environmental Protection 2008). Pollution prevention requires a clear understanding of the impact of nutrient sources on water quality at the watershed level (Lam et al. 2010; Liu et al. 2014b).

To reduce the negative impacts of agricultural NPS pollution, Best Management Practices (BMPs) have been developed in the USA since the 1960s (Logan 1993). Many studies have shown that BMPs effectively reduce the NPS pollutant loads from agricultural areas (Maringanti et al. 2011; Panagopoulos et al. 2011; Liu et al. 2013). BMPs include structural practices, such as rain barrels or shoreline buffers, and nonstructural management practices, such as changes in land use and fertilizer application management (Lam et al. 2011; Liu et al. 2013). Unlike structural BMPs, most nonstructural BMPs are cost-effective and flexible. Therefore, land management can be applied as an effective strategy for controlling agricultural NPS pollution in many regions (Cook et al. 1996; Monaghan et al. 2007; Lee et al. 2010; McDowell et al. 2011; Thorburn and Wilkinson 2013).

Soil nutrient management is one land management strategy (Liu et al. 2013). Scientific approaches to soil nutrient management are based on the spatial variability of soil nutrients, which is obtained by analyzing soil nutrient samples. Scientific information on the spatial variability and distribution of soil properties is critical for understanding ecosystem processes and making sustainable soil, crop, and environmental management decisions (Fu et al. 2010; Tesfahunegn et al. 2011). In recent years, a large body of work has been conducted on arable soils. These studies have focused on the spatial variability of soil properties in the context of water quality protection and food security (Castrignanò et al. 2000; Tavares et al. 2008; Chaplot et al. 2010; Kerry et al. 2012).

Characterizing the spatial variability and distribution of soil properties requires information about those properties at unsampled sites (Lark and Ferguson 2004). In practice, ordinary kriging is commonly used to estimate soil variables at unsampled sites from data at adjacent sample points. However, the spatial prediction of soil nutrients typically involves uncertainties, which must be considered when decisions regarding future management are made (Goovaerts 2001). Such management decisions are often based on the threshold values of a soil property. When a land manager interprets a kriged soil property map with respect to one or more critical threshold values, the uncertainty of these estimations becomes important (Lark and Ferguson 2004). Indicator kriging (IK) estimates the probability that values fall within specific class intervals, thereby accounting for the uncertainty of the variable at unsampled locations (Meirvenne and Goovaerts 2001; Triantafilis et al. 2004; Lee et al. 2007; Arslan 2012).

Correlation refers to the strength of a relationship between two variables. A large correlation coefficient between variables may imply that these variables come from the same pollution source (Tukura et al. 2011). However, most multivariate correlation analysis studies have focused on conventional correlation, which does not consider spatial variability (Allen et al. 2009). The results of such analyses may lead to mismanagement when the correlation varies between areas. A method is therefore needed to analyze multivariate spatial relationships.

Based on the sampled soil nutrient data of available nitrogen (AN), available phosphorus (AP), and available potassium (AK) in a crop field near Baiyangdian Lake in 2010, the key objectives of this work were to (1) develop a method of spatial correlation analysis (SCA), (2) calculate the spatial correlation coefficient using SCA, and (3) analyze the probability of abundance using geostatistics.

Material and methods

Study region

The study region covers an area of 15.33 km² near Baiyangdian Lake, which is located in the middle of the North China Plain (Fig. 1). The center coordinates of the study region are 115.85° E, 38.88° N. The region's climate is characterized by continental monsoons, and the average annual precipitation is 556 mm. The annual rainfall pattern is distinctly seasonal, with approximately 80 % (445 mm) occurring from June to September. The mean annual air temperature is approximately 7.5 °C. The soil is calcareous cinnamon soil formed by alluvial flooding. The typical cropping system in the region is a rotation of winter wheat and summer maize (Liu et al. 2014a).

Fig. 1 Study area and sample sites

In this area, long-term human activities have led to changes in the quality and soil fertility status of cultivated land. In recent decades, the aquatic environment has changed drastically, and one of the major pollution sources is NPS (Chen et al. 2008; Zhao et al. 2010). Most studies have focused on water eutrophication, heavy metals, and organic contaminants (Dou and Zhao 1998; Zhao et al. 2011; Gao et al. 2013). Few studies have examined the impacts of soil nutrients on eutrophication.

Sample collection and analysis

A total of 105 samples were taken from the topsoil (0–200 mm) of the study region in May 2010 on a systematic grid (Fig. 1), with approximately 500 m between adjacent grid points. A 10 m × 10 m square plot was laid out at each grid point. A single 5-cm-diameter core was collected at each of the four corners and at the center of each plot, and the five cores were mixed to form the composite sample for that grid point. The geographic position (latitude and longitude) of each plot center was determined using GPS equipment (Garmin GPSMAP 78s, precision: 3 m).

All soil samples were air-dried and ground to pass through a 2-mm sieve, as required for the available nitrogen (AN), available phosphorus (AP), and available potassium (AK) analyses. The AN was measured using the Kjeldahl method. The AP (Olsen-P) was extracted using 0.5 mol L⁻¹ NaHCO₃ (pH = 8.5) and was determined using the molybdenum blue method. The AK was extracted using 1.0 mol L⁻¹ NH₄OAc (pH = 7) and was measured using flame emission spectrometry. Sampling and chemical analyses were conducted based on standard methodologies (Lu 2000). Strict quality control was maintained throughout the experiment: quality assurance and quality control were assessed using duplicates, method blanks, and standard reference materials.

Methods

Distribution maps of the nutrient concentrations were produced by ordinary kriging (OK) interpolation. OK provides an estimate of a soil variable, z, at an unobserved location based on the weighted average of adjacent observed sites within a given area (Webster and Oliver 2001; Triantafilis et al. 2004). OK has been widely used in the fields of mining, ecology, and environmental science (Matheron 1965; Lloyd and Atkinson 2001; Rufino et al. 2005; Daya 2012).

The OK estimate is a weighted linear combination of the observed values:

$$ Z^{*}\left(x_0\right)=\sum_{i=1}^{n}\lambda_i Z\left(x_i\right) $$
(1)

where Z*(x_0) is the estimated value of Z at the point x_0, Z(x_i) is the sampled value at the point x_i, and λ_i is the weight placed on Z(x_i).
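As an illustration of Eq. 1, the sketch below obtains the weights λ_i by solving the standard ordinary kriging system, in which a Lagrange multiplier forces the weights to sum to 1. It is a minimal example and not the ArcGIS procedure used in this study: the function names, the spherical model parameters (nugget, sill, range), and the sample values are all hypothetical.

```python
import math

def spherical(h, nugget, sill, rng):
    """Spherical semivariance model (parameter values are illustrative)."""
    if h == 0:
        return 0.0
    if h >= rng:
        return sill
    r = h / rng
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ok_predict(points, values, target, gamma):
    """Return (Z*(x0), weights) with Z*(x0) = sum(lambda_i * Z(x_i))."""
    n = len(points)
    # Semivariances between samples, plus the Lagrange row and column
    # that constrain the weights to sum to 1 (unbiasedness).
    a = [[gamma(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [gamma(math.dist(p, target)) for p in points] + [1.0]
    weights = solve(a, b)[:n]
    estimate = sum(w * z for w, z in zip(weights, values))
    return estimate, weights

# Hypothetical 500 m grid cell with made-up concentrations (mg/kg).
pts = [(0, 0), (500, 0), (0, 500), (500, 500)]
vals = [70.0, 80.0, 75.0, 85.0]
model = lambda h: spherical(h, 10.0, 100.0, 1000.0)
estimate, weights = ok_predict(pts, vals, (250, 250), model)
print(estimate, weights)
```

Because the target in this example is equidistant from the four samples, the weights are equal and the estimate reduces to the arithmetic mean; moving the target toward one sample shifts weight to that sample.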

The weights of OK are derived from kriging equations using a semivariance function. An unbiased estimator of the semivariance function is equal to half the average squared difference between paired data values:

$$ \gamma(h)=\frac{1}{2N(h)}\sum_{i=1}^{N(h)}{\left[z\left(x_i\right)-z\left(x_i+h\right)\right]}^2 $$
(2)

where γ(h) is the semivariance value at distance interval h; N(h) is the number of sample pairs within the distance interval h; and z(x_i + h) and z(x_i) are sample values at two points separated by the distance interval h.
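Equation 2 can be sketched in a few lines. The example below groups all sample pairs into distance classes and returns, for each class, the mean separation distance, the semivariance, and the number of pairs. The function name, the lag width, and the three sample values are assumptions made for illustration; the binning rule (half-open classes of equal width) is one common choice and may differ from the one used by the software in this study.

```python
import math

def empirical_semivariogram(points, values, lag_width, n_lags):
    """Return a list of (mean_distance, gamma, n_pairs) per distance class."""
    sq_sum = [0.0] * n_lags
    dist_sum = [0.0] * n_lags
    counts = [0] * n_lags
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            k = int(d // lag_width)  # class k covers [k*w, (k+1)*w)
            if k < n_lags:
                sq_sum[k] += (values[i] - values[j]) ** 2
                dist_sum[k] += d
                counts[k] += 1
    result = []
    for k in range(n_lags):
        if counts[k]:  # skip empty distance classes
            result.append((dist_sum[k] / counts[k],
                           sq_sum[k] / (2 * counts[k]),
                           counts[k]))
    return result

# Three hypothetical samples on a line, 500 m apart.
pts = [(0, 0), (500, 0), (1000, 0)]
vals = [1.0, 3.0, 7.0]
for lag, gamma, pairs in empirical_semivariogram(pts, vals, 600.0, 2):
    print(lag, gamma, pairs)
```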

The spatial correlation analysis (SCA) method is similar to conventional correlation analysis, but the variables of SCA are raster data rather than sample data. If X and Y are raster data, the spatial correlation coefficient is calculated by:

$$ r=\frac{\mathrm{Cov}\left(X,Y\right)}{\sqrt{D(X)\times D(Y)}} $$
(3)

where r is the correlation coefficient raster; D(X) and D(Y) are the variances of data X and Y; and Cov(X, Y) is the covariance of data X and Y. A 3 × 3 neighborhood was used in the calculations. X and Y are the OK interpolation results in this study.
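One way to read the SCA calculation is as a Pearson correlation (covariance divided by the square root of the product of the variances) evaluated inside a 3 × 3 window that moves across the two rasters. The sketch below follows that reading. It is an assumption-laden illustration: the function name is hypothetical, and the handling of edge cells and of windows with zero variance (both returned as None) is a choice made here, not something stated in the text.

```python
import math

def local_correlation(x, y, half=1):
    """Pearson r of two equally sized rasters in a moving window.

    The window is (2*half+1) cells on each side, so half=1 gives 3 x 3.
    Cells whose window leaves the raster, or where either raster is
    constant inside the window, are returned as None.
    """
    rows, cols = len(x), len(x[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            cells = [(i, j)
                     for i in range(r - half, r + half + 1)
                     for j in range(c - half, c + half + 1)]
            xs = [x[i][j] for i, j in cells]
            ys = [y[i][j] for i, j in cells]
            mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
            cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
            vx = sum((a - mx) ** 2 for a in xs)
            vy = sum((b - my) ** 2 for b in ys)
            if vx > 0 and vy > 0:
                out[r][c] = cov / math.sqrt(vx * vy)
    return out

# Two small made-up rasters; y is a decreasing linear function of x.
x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
y = [[-2 * v + 1 for v in row] for row in x]
print(local_correlation(x, y)[1][1])
```

The output is itself a raster, which is what allows the sign of the relationship to be mapped and to differ between parts of the study area.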

Indicator kriging (IK) was used to identify areas where nutrient concentrations were higher than a threshold value, Z_K. IK is a kriging analysis performed on a binary-transformed sample population (Marinoni 2003), which involves an initial reclassification of each variable, Z(u), to produce binary variables (Goovaerts 1997), as follows:

$$ I\left(u;Z_K\right)=\left\{\begin{array}{ll}1, & \mathrm{if}\ Z(u)\le Z_K,\ K=1,2,\dots,m\\ 0, & \mathrm{otherwise}\end{array}\right. $$
(4)

At an unsampled location, u_0, the indicator kriging estimator is written as:

$$ I^{*}\left(u_0;Z_K\right)=\sum_{j=1}^{n}\lambda_j I\left(u_j;Z_K\right) $$
(5)

where I(u_j; Z_K) represents the values of the indicator at sampled locations, u_j, j = 1, 2, 3, …, n, and λ_j is the weight assigned to I(u_j; Z_K) in the estimation of I*(u_0; Z_K).
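Equations 4 and 5 can be illustrated with a short sketch. With the indicator defined as 1 when Z(u) ≤ Z_K, the IK estimate approximates the probability of not exceeding the threshold, so the probability of exceeding it is one minus that estimate. In the example below the weights are simply given; in practice they would come from a kriging system built on the indicator semivariogram. The function names, weights, sample values, and threshold are hypothetical, and clipping the estimate to the interval [0, 1] is a common correction assumed here rather than a step described in the text.

```python
def indicator(z, threshold):
    """Indicator coding of Eq. 4: 1 if z <= threshold, else 0."""
    return 1 if z <= threshold else 0

def ik_estimate(values, weights, threshold):
    """Eq. 5: weighted sum of the indicators, clipped to [0, 1]."""
    est = sum(w * indicator(z, threshold) for w, z in zip(weights, values))
    return min(1.0, max(0.0, est))

def exceedance_probability(values, weights, threshold):
    """Estimated probability that the value at u0 is above the threshold."""
    return 1.0 - ik_estimate(values, weights, threshold)

# Four hypothetical neighbours (mg/kg) with made-up kriging weights.
neighbours = [60.0, 90.0, 120.0, 150.0]
lam = [0.4, 0.3, 0.2, 0.1]
print(exceedance_probability(neighbours, lam, 100.0))
```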

Nutrition criteria

Crops, like all other living things, require food for growth and development. Proper nutrition is essential for satisfactory crop growth and production. Many agricultural experts have researched the impacts of different nutrient levels on the growth and production of crops, to evaluate the richness or scarcity of nutrients in relation to crop needs. Some studies have led to the establishment of nutrition criteria (Silva and Uchida 2000; Gulser 2005). Based on previous studies (Yang and Sun 2008; Zhang 2011) near our study region, the nutrition criteria of different nutrient levels in the region are listed in Table 1.

Table 1 Evaluation criteria of soil nutrients (mg/kg)

Results and discussion

Basic statistical analysis

The average AN and AP contents were 75.36 and 10.45 mg/kg, respectively, indicating that, when spatial variability is not considered, AN and AP were sufficient for crop needs (Table 2). The average AK content was 208.75 mg/kg, which exceeds crop needs in the study area, again without accounting for spatial variability. The skewness of each nutrient was less than 1; therefore, all the nutrients could be regarded as approximately normally distributed.
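The skewness check can be reproduced with a short function. The sketch below uses the moment coefficient of skewness, g1 = m3 / m2^1.5, where m2 and m3 are the second and third central moments. The text does not say which skewness estimator was used, so this choice is an assumption, and the data are made up to show one symmetric and one right-skewed case.

```python
def skewness(data):
    """Moment coefficient of skewness, g1 = m3 / m2**1.5."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((v - mean) ** 2 for v in data) / n
    m3 = sum((v - mean) ** 3 for v in data) / n
    return m3 / m2 ** 1.5

# Made-up values: the first set is symmetric, the second has a long right tail.
print(skewness([1, 2, 3, 4, 5]))
print(skewness([1, 1, 1, 1, 10]))
```

A value near zero indicates a roughly symmetric distribution; an absolute value below 1 is the rule of thumb applied above.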

Table 2 Descriptive statistics of soil nutrients

The basic correlation analysis results showed that there was a clear relationship between these three nutrients based on the sample data (Table 3).

Table 3 Conventional correlation coefficients of soil nutrients

Spatial variability and correlation analysis

Exploratory spatial data analysis (ESDA), a prerequisite for the spatial analysis of the nutrients, was carried out using the Geostatistical Analyst extension of ArcGIS. The empirical semivariograms showed spatial dependence (Fig. 2). The experimental variograms of the three nutrients were then fitted with spherical models, and the prediction results and prediction errors were calculated. In general, the spatial distribution of soil nutrients was heterogeneous. The prediction errors were very low across the whole region. Comparing the prediction results with the prediction errors showed that higher predicted concentrations were always accompanied by higher prediction errors.

Fig. 2 (a) Experimental variograms, (b) prediction maps, and (c) prediction errors of soil nutrients

To spatially specify whether the soil nutrients were sufficient for crop growth and production in the study region, the nutrient levels were evaluated based on the nutrition criteria and OK interpolation results (Fig. 3).

Fig. 3 Spatial distribution and evaluation patterns of soil nutrients: available nitrogen (AN), available phosphorus (AP), and available potassium (AK)

In general, the AN concentration was sufficient for crop growth and production. Level 3 AN covered more than 60.00 % of the area and was mainly distributed in the eastern and western portions of the study region. The majority of the area was categorized as level 2 AP, which was also mainly distributed in the eastern and western portions of the study region, with a proportion of 54.24 %. The majority of the region was categorized as level 5 AK, which encompassed 89.62 % of the area.
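The area proportions reported above amount to classifying each cell of the interpolated raster against the class limits and counting cells per level. The sketch below shows that bookkeeping. The class limits and raster values are placeholders, not the criteria of Table 1, and the sketch assumes that a higher level number means a higher concentration, that each level includes its lower limit, and that all cells have equal area.

```python
from bisect import bisect_right
from collections import Counter

def classify(value, upper_bounds):
    """Return a 1-based level; level k covers [bound[k-2], bound[k-1])."""
    return bisect_right(upper_bounds, value) + 1

def area_share(raster, upper_bounds):
    """Percentage of cells per level, assuming equal-area cells."""
    levels = [classify(v, upper_bounds) for row in raster for v in row]
    counts = Counter(levels)
    return {lvl: 100.0 * n / len(levels) for lvl, n in sorted(counts.items())}

# Placeholder limits (mg/kg) separating five levels, and a made-up raster.
limits = [50.0, 100.0, 150.0, 200.0]
raster = [[40.0, 70.0], [95.0, 130.0]]
print(area_share(raster, limits))
```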

The high-concentration patterns of these three soil nutrients were mainly found near Anzhou. As the distance from the town increased, the soil nutrient concentrations gradually decreased. The high soil nutrient concentrations near Anzhou may be related to the higher incomes of farmers who live in the town, compared to farmers who live in villages. Expecting to obtain higher crop output, the higher-income farmers are willing to pay more money to buy additional fertilizer. The high concentration patterns of AN in the eastern study area may be due to the fertilizing habits of nearby farmers. Other researchers have also found that tillage and fertilizer conditions, cropping systems, and soil conservation practices can act as partial sources of spatial soil nutrient variability (Tesfahunegn et al. 2011).

Based on the OK results, the spatial correlation results of these three soil nutrients showed that the relationship between AN and AK was still positive. However, the relationship between AN and AP was negative, and the relationship between AP and AK was not clear when the spatial characteristics were considered (Table 4). The results differed from the basic results in Table 3 because the basic correlation coefficient cannot reflect spatial variations.

Table 4 Statistical results of spatial correlation coefficients

In addition, the spatial distributions of the correlation coefficients varied (Fig. 4). The positive relationships were mainly distributed in the eastern and western portions of the study area. A positive relationship suggests that the main fertilizer used in that region may be compound fertilizer, which contains multiple nutrients in each individual granule. These results would be helpful for site-specific soil nutrient management.

Fig. 4 Spatial distribution of correlation coefficients among available nitrogen (AN), available phosphorus (AP), and available potassium (AK)

Probability analysis of abundance or deficiency

The spatial distributions of the conditional probability that soil nutrient concentrations exceeded the upper threshold resembled the OK results (Fig. 5). With the upper limits of level 3 taken as thresholds and compared against the frequency distributions of the measured nutrient concentrations, there were few areas with an overabundance of AN and AP. However, almost the entire study region exhibited an overabundance of AK.

Fig. 5 The abundance probability of available nitrogen (AN), available phosphorus (AP), and available potassium (AK)

Site-specific soil management

The soil nutrient distribution results show that the spatial variability of the soil nutrients differed throughout the study region. High-concentration patterns of these three soil nutrients were mainly found near Anzhou. The spatial variability of the soil nutrients involves uncertainties, which must be considered when decisions for future management are made.

Fertilizer management is the key to soil management. Most fertilizer in the study region is added to the soil every year to supply plant nutrients. Conservative estimates report that 30 to 50 % of crop yields are attributable to natural or synthetic commercial fertilizers (Aulakh and Pasricha 1998; Gallichand et al. 2003; Stewart et al. 2005; Bandyopadhyay et al. 2010; Ma et al. 2012). Therefore, fertilizer management deserves careful attention in the study region.

In regional geochemistry, correlation analysis is generally used to identify different sources based on the correlation coefficient of sampled data (Dou and Zhao 1998). In this study, the positive relationship in the eastern and western parts of the study area suggests that the soil nutrients in these regions might come from the same fertilizer, which contains multiple nutrients in each individual granule. Identifying the fertilizer is a prerequisite for site-specific soil nutrient management. However, conventional correlation analysis was not appropriate here because it does not consider the spatial distribution.

Furthermore, existing fertilizer application in the study region is identical in all fields and follows management systems that do not consider the spatial variability of soil properties. This has resulted in under-application in areas with low nutrient levels and over-application in areas with high nutrient levels. Thus, site-specific soil nutrient management, based on spatial variability, is considered to be the most viable approach to addressing this problem and to achieving sustainable agriculture (Fu et al. 2010; Jiang et al. 2011; Tesfahunegn et al. 2011).

Conclusion

The average contents of AN, AP, and AK were 75.36, 10.45, and 208.75 mg/kg, respectively. When spatial variability is not considered, AN and AP were sufficient for the needs of crops and AK exceeded that need.

In general, the spatial distribution of soil nutrients was heterogeneous, with high concentrations mainly found near Anzhou. This heterogeneity produced corresponding spatial variation in the nutrient-level evaluation and in the conditional probability that soil nutrient concentrations exceeded the upper threshold. The prediction errors were very low across the whole region, although higher predicted concentrations were always accompanied by higher prediction errors. The spatial variability of the soil nutrients involves uncertainties, which must be considered when decisions for future management are made.

There was a clear positive relationship between these three nutrients based on the sample data. However, when the spatial characteristics were considered, the spatial relationship between AN and AP was negative and the spatial relationship between AP and AK was not clear. The spatial correlation analysis could be helpful for site-specific soil nutrient management, whereas conventional correlation analysis was not appropriate here because it does not consider the spatial distribution.