Introduction

Heavy metals are natural components of the soil, and their background concentrations primarily depend on the parent material composition and pedogenesis (Baize and Sterckeman 2001; De Temmerman et al. 2003; Rawlins et al. 2003; Rodríguez Martín et al. 2008). However, many studies have reported that in the last decade, anthropogenic influence has contributed to the rapid increase of heavy metals in soils (Facchinelli et al. 2001; Franco-Uria et al. 2009; Qishlaqi et al. 2009; Rodríguez Martín et al. 2013; Horta et al. 2015). As the most frequent soil contaminants, heavy metals are toxic, bio-accumulative, and resistant to biochemical degradation and can be readily transferred into the human body (Cui et al. 2005; Lim et al. 2008; Wei and Yang 2010; Wu et al. 2015). In recent years, soil contamination by heavy metals has become a serious problem in many parts of the world, especially in developing countries like China (Saby et al. 2006; Fabietti et al. 2010; Xia et al. 2011; Li et al. 2014; Chen et al. 2015). In China, about 10 million ha of arable land has been polluted, and about 12 million tons of grain is contaminated each year by heavy metals in soils (Teng et al. 2010). As soil is a fundamental resource for food, feed, and fiber that provides a livelihood for the majority of people in developing countries, it is essential to understand and control the spatial distribution of soil heavy metals to ensure food safety.

Heavy metals in soils often come from multiple sources (Romic and Romic 2003; Zarcinas et al. 2004; Saby et al. 2006; Micó et al. 2006; Schneider et al. 2016), which are scale dependent (Lin et al. 2002; Zhao et al. 2010; Nanos and Rodríguez Martín 2012; Lv et al. 2013). The complexity of the multi-scale variations in heavy metals makes it difficult to fit a spatial model that is universally valid (Saby et al. 2006; Rodríguez Martín et al. 2008; Marchant et al. 2010). It is necessary to separate the variations in soil heavy metals into different scales and identify their potential sources and influencing factors.

To study the scale-specific variations in soil heavy metal concentrations, several techniques have been explored, including the multi-scale kriging nested model (Huo et al. 2009; Tóth et al. 2016) and factorial kriging (Nanos and Rodríguez Martín 2012; Lv et al. 2014). The multi-scale kriging nested model reveals the spatial structure of a given variable effectively and improves the estimation accuracy compared with a single-scale model, but it assumes stationarity. Factorial kriging, a multivariate geostatistical approach, uses a nested combination of two or more individual auto-variograms (co-regionalization analysis) to decompose a set of variables into components of spatial variability related to different scales, which can be mapped separately. However, it can only estimate and map the variables at some preselected scale intervals (Bocchi et al. 2000; Castrignanó et al. 2000; Alary and Demougeot-Renard 2010).

Empirical mode decomposition (EMD) is a highly adaptive decomposition method that is known to deal with different types of spatial series including those that are nonstationary and nonlinear. It decomposes the original data into a series of modes without prior information (Huang et al. 1998, 2003a). In recent years, it has been widely used in soil science to separate the spatial variability at multiple scales (Biswas et al. 2009; Biswas and Si 2011; Hu and Si 2013; Zhou et al. 2016). EMD separates data into a number of intrinsic mode functions (IMFs) with different spatial scales and a residue (Huang et al. 1998, 2003b; Huang and Wu 2008; Rao and Hsu 2008). Each IMF represents the realizations of underlying soil processes and the controlling factors at similar scales. Therefore, the overall variability in soil properties can be visualized at different spatial resolutions.

While the spatial variation of heavy metals in agricultural soils has been studied extensively at various scales (Cheng et al. 2007; Marchant et al. 2010; Rodríguez Martín et al. 2013; Duan et al. 2015; Horta et al. 2015; Tóth et al. 2016), to our knowledge, few studies have considered the spatial multi-scale variations in heavy metals. In this study, a typical modern agricultural zone in southeastern China was chosen as the study area. The main objectives of this study are (1) to explore the spatial multi-scale variations of soil cadmium (Cd) and mercury (Hg) contents using a two-dimensional EMD algorithm, (2) to calculate the scale-specific correlations between the two metal contents and selected soil properties, and (3) to identify the potential influencing factors of Cd and Hg in the study area.

Materials and methods

Study area

The study area is located in the Hang-Jia-Hu Plain in the northeastern region of Zhejiang Province, China. It is bounded by longitude 120° 17′–120° 39′ E and latitude 30° 28′–30° 47′ N, with a total area of about 727 km². The area lies in the northern subtropical monsoon zone and has a temperate, humid climate with four distinct seasons. The average annual temperature is 16.0 °C, and the mean annual precipitation is approximately 1233.9 mm. The dominant wind direction is southeast in summer and northwest in winter. Gleysols and Gleyic Cambisols are the two major soil types in the study area, accounting for about 81.99 and 12.01%, respectively. Paddy field is the main land use of arable land, accounting for about 81%.

Sampling design and soil analysis

A total of 260 surface soil samples (0–15 cm) were collected from arable land in November 2005, taking land use uniformity and soil types into account so that all samples were located in arable land and each soil subtype had at least one observation (Fig. 1). At each site, soil from the top layer of 6–8 points within an area of about 0.1–0.2 ha was collected, composited, and divided into subsamples of 1–2 kg each for laboratory analyses. All sample locations were georeferenced using a handheld Global Positioning System (GPS). All samples were air-dried at room temperature (20–25 °C), stones and other debris were removed, and the samples were then sieved through a 2-mm polyethylene sieve. Portions of the soil samples (about 100 g) were ground in an agate grinder and passed through a 0.149-mm sieve. The prepared soil samples were then stored in polyethylene bottles for analysis.

Fig. 1 Locations of sampling sites and study area

Soil pH was measured by a pH meter (Sartorius Basic pH Meter PB-10) with a soil-to-water ratio of 1:2.5. In this study, soil organic carbon content (SOC) was determined by wet oxidation at 180 °C with a mixture of potassium dichromate and sulfuric acid (H2SO4), and total sulfur content (TS) was determined with an X-ray fluorescence spectrometer after being extracted by a mixture of phosphate and acetic acid (Agricultural Chemistry Committee of China 1983). Total Cd content (Cdtotal) was measured by inductively coupled plasma mass spectrometry (ICP-MS) after soil samples had been digested with a mixture of nitric acid (HNO3) and perchloric acid (HClO4) (Agricultural Chemistry Committee of China 1983), and total Hg content (Hgtotal) was determined by cold vapor atomic fluorescence spectrometry (CV-AFS) after soil samples had been digested with a mixture of H2SO4, HNO3, and potassium permanganate (KMnO4) (Wang et al. 2003).

The validity of the whole analytical procedure was checked using the certified reference materials (CRMs) GSS1, GSS2, GSS3, and GSS8. CRMs, replicate samples, and blanks were analyzed after every ten samples and were carried through the entire sample preparation and analytical process. The precision of the measurements, estimated by carrying out 26 replicates, was in the range of 3.2–5.9% relative standard deviation (RSD). The average recovery rates of certified elements were between 83 and 112%. The method detection limit (MDL) was 50 μg g−1 for TS, 20 ng g−1 for Cdtotal, and 2 ng g−1 for Hgtotal.

Two-dimensional empirical mode decomposition analysis

EMD was originally developed for analysis of one-dimensional nonstationary and nonlinear signals (Huang et al. 1998). In this study, we introduced the two-dimensional EMD (2D-EMD) which can be used to analyze two-dimensional spatial data. We used a “spemd” package (Roudier 2016) implemented in R (R Core Team 2014). The main decomposition process of 2D-EMD is expressed as follows:

$$ z\left( x, y\right)=\sum_{i=1}^N{c}_i\left( x, y\right)+ r\left( x, y\right) $$

where z(x, y) is the original two-dimensional dataset, \( c_i(x, y) \) (i = 1, …, N) are the IMFs, and r(x, y) is the spatial residue. The algorithm of the two-dimensional EMD analysis can be summarized as follows:

1. Determine the neighborhood of each observation point (z(x, y)) using its k-nearest neighbors (here, we chose k = 4);

2. Find the local extremum (minimum (min1) and maximum (max1)) points of the input dataset. A local extremum is defined as a point whose value is either smaller or larger than those of all of its neighbors;

3. Interpolate the minimum and maximum points for each location in the input dataset using multi-level B-splines as implemented in the MBA package for R (Finley and Banerjee 2014);

4. Calculate the envelope (e), defined as the mean of the interpolated minimum and maximum values;

5. Extract the details, d(x, y) = z(x, y) − e;

6. Replace z(x, y) with d(x, y) and repeat steps 2–5 until d(x, y) satisfies the IMF criteria; then set \( c_i(x, y) = d(x, y) \);

7. Replace z(x, y) with \( z\left( x, y\right)-\sum_{i=1}^N{c}_i\left( x, y\right) \) and go to step 2 until a monotonic residue is obtained.
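As a rough illustration of steps 1–5, the following Python sketch performs a single sifting pass on scattered points. It is not the spemd implementation: inverse-distance weighting is swapped in for the multi-level B-spline interpolation, the nearest-neighbor search is brute force, and all function names (`knn_indices`, `local_extrema`, `idw`, `sift_once`) are hypothetical.

```python
import math

def knn_indices(points, k):
    """Step 1: indices of the k nearest neighbours of each point (brute force)."""
    result = []
    for i, (xi, yi) in enumerate(points):
        dists = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(points) if j != i
        )
        result.append([j for _, j in dists[:k]])
    return result

def local_extrema(values, neighbours):
    """Step 2: indices of strict local minima and maxima."""
    minima, maxima = [], []
    for i, nbrs in enumerate(neighbours):
        if all(values[i] < values[j] for j in nbrs):
            minima.append(i)
        elif all(values[i] > values[j] for j in nbrs):
            maxima.append(i)
    return minima, maxima

def idw(points, values, support, target, power=2.0):
    """Step 3 stand-in: inverse-distance interpolation from the support points.

    Assumption: IDW replaces the multi-level B-splines named in the text.
    """
    num = den = 0.0
    for j in support:
        d = math.hypot(target[0] - points[j][0], target[1] - points[j][1])
        if d == 0.0:
            return values[j]  # target coincides with a support point
        w = d ** -power
        num += w * values[j]
        den += w
    return num / den

def sift_once(points, values, k=4):
    """Steps 1-5: return the details d = z - e, or None if no extrema exist."""
    nbrs = knn_indices(points, k)
    minima, maxima = local_extrema(values, nbrs)
    if not minima or not maxima:
        return None
    details = []
    for p, z in zip(points, values):
        lower = idw(points, values, minima, p)   # interpolated minima surface
        upper = idw(points, values, maxima, p)   # interpolated maxima surface
        details.append(z - 0.5 * (lower + upper))  # subtract the mean envelope
    return details
```

Repeating `sift_once` on its own output until an IMF criterion is met, and then subtracting the resulting IMF from the data, would correspond to steps 6 and 7.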

In this study, the original data of the two metals and three soil properties were decomposed into three IMFs with their corresponding spatial residues. The maximum number of IMFs was selected empirically based on the observation that an increase in this value did not produce additional IMFs. Afterwards, the percentage contribution of each component to the overall variance was calculated.
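The percentage contributions can be illustrated with a short Python sketch. It assumes, since the text does not spell out the formula, that each component's share is its variance divided by the sum of the variances of all components; the function name and the toy values are hypothetical.

```python
from statistics import pvariance

def variance_contributions(components):
    """Percentage share of each EMD component in the summed component variance.

    Assumption: shares are normalized by the sum of component variances,
    not by the variance of the original data.
    """
    variances = {name: pvariance(vals) for name, vals in components.items()}
    total = sum(variances.values())
    return {name: 100.0 * v / total for name, v in variances.items()}

# Toy example with made-up component values at four locations
toy = {"IMF1": [1.0, -1.0, 1.0, -1.0], "residue": [2.0, 2.0, 3.0, 3.0]}
shares = variance_contributions(toy)
```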

Scale analysis using geostatistics

Geostatistics have been successfully used to investigate the spatial variation in soil metal contents and soil properties (Saby et al. 2006; Li et al. 2008; Sun et al. 2012; De Souza et al. 2015). The variogram is an effective tool for evaluating spatial variability and structure (Guo et al. 2001; Iqbal et al. 2005) and could provide a clear description of the spatial structure of variables and some insight into possible processes affecting variable distribution (Wang and Tao 1998; Paz González et al. 2001).

To explore the spatial multi-scale variations in soil metal contents across the study area, the variogram was calculated for each EMD component of measured soil properties. Anisotropy of variogram was not observed in the data. All variograms in isotropic form were fitted using the spherical, exponential, Gaussian, or linear models, and the best-fit model was applied to kriging interpolation. Ordinary kriging was chosen to create the spatial distribution maps of the heavy metals and three soil properties, using the nearest 16 observations and a maximum search distance equal to the range of the variogram of the variable. The software GS+ Geostatistics for the Environmental Sciences, version 10.0 (Gamma Design Software, Plainwell, MI, USA) was used to fit the variograms and perform ordinary kriging. More technical descriptions of kriging and the semivariogram are available in the literature (see Webster and Oliver 2001).
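For readers unfamiliar with the variogram, the following Python sketch shows the classical (Matheron) estimator of the experimental semivariogram and the spherical model. It is only an illustration of the quantities involved, not the GS+ procedure; the function names, the lag binning, and the parameterization (total sill rather than partial sill) are assumptions.

```python
import math

def empirical_semivariogram(points, values, lag_width, n_lags):
    """Matheron estimator: gamma(h) = sum (z_i - z_j)^2 / (2 N(h)) per lag bin.

    Bin b collects pairs with separation in [b * lag_width, (b + 1) * lag_width).
    Empty bins are returned as None.
    """
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.hypot(points[i][0] - points[j][0],
                           points[i][1] - points[j][1])
            b = int(h // lag_width)
            if b < n_lags:
                sums[b] += (values[i] - values[j]) ** 2
                counts[b] += 1
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]

def spherical(h, nugget, sill, a):
    """Spherical model: rises from the nugget to the total sill at range a."""
    if h == 0:
        return 0.0
    if h >= a:
        return sill
    r = h / a
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)

# Toy transect with made-up values
gamma = empirical_semivariogram(
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)], [0.0, 1.0, 2.0], 1.0, 3
)
```

In practice the model parameters (nugget, sill, range) would be fitted to the experimental values, for example by weighted least squares.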

Multivariate analysis

Multivariate statistical solutions are mathematical hypotheses, and their interpretation requires environmental knowledge. These techniques have been widely used to assist the interpretation of environmental data and to distinguish between natural and anthropogenic inputs of heavy metals (e.g., Yu et al. 2000; Boruvka et al. 2005; Lucho-Constantino et al. 2005; Zhou et al. 2008; Schneider et al. 2016). Inter-element relationships can provide valuable information on the sources and pathways of the soil heavy metals (Manta et al. 2002). However, classical correlation analysis does not reveal the real relationships among variables since it averages out distinct changes in the correlation structures occurring at different spatial scales (Liu et al. 2013). To study the scale-specific correlations between the two metal contents and three soil properties, correlation analysis was applied between the EMD components of the two metals and those of the three soil properties (soil pH, SOC, and TS). Correlation analysis was also performed between the EMD components of the two soil heavy metal (HM) contents to check the relationships between them. The Spearman correlation coefficient was used because most of the variables in this study were positively skewed.
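The Spearman coefficient is the Pearson correlation of the ranks of the two variables, which makes it insensitive to skewed distributions. A minimal Python sketch, with hypothetical function names and average ranks assigned to ties, is:

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for idx in order[pos:end + 1]:
            ranks[idx] = mean_rank
        pos = end + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Applied scale by scale, `spearman` would take one EMD component of a metal (for example, IMF1 of Cdtotal) and the corresponding component of a soil property as its two arguments.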

Principal component analysis (PCA) was used to interpret EMD transformed components (IMFs and residues) of Cdtotal, Hgtotal, pH, SOC, and TS with approximately the same or similar source. In order to facilitate the interpretation of results, varimax rotation was applied because orthogonal rotation minimized the number of variables with a high loading (Gallego et al. 2002). Multivariate analysis was performed using SPSS Statistics 22.0 (SPSS Inc., Chicago, IL, USA).

Results and discussion

Soil heavy metal concentration and soil properties

The descriptive statistics of Cd and Hg concentrations and three soil properties are shown in Table 1. The results showed that both Cdtotal and Hgtotal had a wide data range, suggesting that extrinsic factors affect Cdtotal and Hgtotal in soils of the study area. Previous studies showed that the background concentrations in Zhejiang at a provincial scale for Cd and Hg were 170 and 150 μg kg−1, respectively (Zhejiang Soil Survey Office 1994; Cheng et al. 2006). There were 70 (26.9%) soil samples for Cdtotal and 123 (47.3%) for Hgtotal that exceeded the local background concentration. This indicated that Cd and Hg were enriched in some areas. The soil pH in the area was in the range of 3.66–7.96 with a mean of 6.42, the SOC was in the range of 0.19–3.27% with a mean of 1.20%, and the TS was in the range of 97–682 mg kg−1 with a mean of 295.7 mg kg−1. The coefficients of variation for Cdtotal, Hgtotal, pH, SOC, and TS were 41.6, 88.4, 11.7, 50.0, and 43.1%, respectively. This showed that Cdtotal, Hgtotal, SOC, and TS in the study area were highly variable compared to soil pH.
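For reference, the exceedance percentages follow directly from the sample counts (N = 260), and the coefficient of variation (CV) is assumed here to follow the usual definition, the standard deviation \( s \) divided by the mean \( \bar{x} \):

$$ \frac{70}{260}\approx 26.9\%,\qquad \frac{123}{260}\approx 47.3\%,\qquad \mathrm{CV}=\frac{s}{\bar{x}}\times 100\% $$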

Table 1 Descriptive statistics of the soil properties and their empirical mode decomposition components (i.e., IMFs and residues) in the study area (N = 260)

Characteristics of IMFs and residues

In this study, all the original data of heavy metal concentrations and soil properties were decomposed into three IMFs and a spatial residue. The descriptive statistics of the EMD components for the metals and soil properties are listed in Table 1. The results showed that all the IMFs had mean values close to 0 and approximately followed a normal distribution, as is expected from EMD analysis. The variance percentage was largest in IMF1 and decreased at subsequent levels. For Cd and Hg concentrations, IMF1 contributed about half of the total variance, while the residue contributed only about 25%. The variance contribution of IMF3 was almost negligible compared to the other IMFs. The results also showed that the residues had mean values almost equal to those of the corresponding original data, although the residues had a relatively narrow range. Figure 2 shows that the residue only contained a regional trend. Note that all EMD analyses were performed on the actual sampling points and are presented here as a grid (1.5 km × 1.5 km) for visualization purposes. We conducted the EMD analysis on the point-based datasets rather than on interpolated grids to avoid the smoothing effects generated during interpolation, which may eliminate the small-scale variations in the soil properties.

Fig. 2 Spatial distributions of the two heavy metal contents, the three soil properties (a1, b1, c1, d1, e1), and their residues (a2, b2, c2, d2, e2) in the study area

Spatial structure of soil HM contents and soil properties

Variograms and the fitted models for Cd and Hg concentration, TS, pH, SOC, and their EMD components are presented in Fig. 3. The parameters of the variograms are summarized in Table 2. All EMD components of all soil properties, except IMF1, were well fitted by the models, with high coefficients of determination (R² = 0.874–0.997). This indicated that IMF1 presented a local-scale variation, while the other components and the residues presented a regional trend which varied smoothly in the area and was fitted well by the models.

Fig. 3 Experimental semivariograms of the two soil heavy metal contents, three soil properties, and their empirical mode decomposition components with fitted models

Table 2 Best-fitted variogram models of the two heavy metal contents, three soil properties, and their empirical mode decomposition components

The range values of IMF1–IMF3 for Cd increased from 1.69 to 3.64 to 5.27 km. A spherical model fitted IMF1 well, while the other IMFs were better modeled using a Gaussian model. This implied that IMF1 mainly characterized a local variation, while IMF2, IMF3, and the spatial residue represented trends that vary smoothly over larger scales.

Meanwhile, the range values of IMF1–IMF3 of Hg were around 4 km. From Fig. 3, it was found that the residues had a relatively stronger spatial dependence than those of the corresponding original data. This may be due to the fact that residues of the EMD analysis eliminated the local oscillations (variations). Thus, the residues alleviated the dominant nugget effect (nugget variance) caused by the fine-scale spatial variability that cannot be characterized by the sampling scheme (Webster and Oliver 2001). In terms of the residues, it was noted that Cdtotal had the largest range (~39.0 km), followed by Hgtotal (~17.0 km), pH (~15.9 km), TS (~15.8 km), and SOC (~13.2 km). Based on these results, it was expected that variations in the two soil metal contents and the three soil properties may be found at different spatial scales.

Scale- and location-specific variations in soil HM contents

Figure 4 shows the spatial distribution of the IMFs and residues for Cd and Hg. For IMF1 of Cdtotal, high values occurred only in the north and southwest parts of the area. This pattern was also found in IMF2, IMF3, and the residue. All three IMFs had most of their variation at small scales (1.7–5.3 km), whereas the residue of Cdtotal only contained variations at the large scale (39.0 km). The residue characterized areas which have naturally high Cd concentrations.

Fig. 4 Spatial distributions of the empirical mode decomposition components of the two soil heavy metal contents in the study area (a1–a4, b1–b4)

Figure 4(b1–b4) shows the spatial distribution of IMFs and the residue of Hgtotal. Most of the variations in the IMFs of Hgtotal were located in the northwest area at small scales (3.2–4.6 km), whereas most of the variations in the residue of Hgtotal were in the northwest area at intermediate scales. The residue of Hgtotal showed a spatial trend with high concentrations in the northwest area and low concentrations in the southeast area. The figure also shows that the residue of Hgtotal had an obvious hot spot in the northwest area, which may be due to a point pollution source of Hg.

Scale-specific correlations between the two soil HM contents and three soil properties

SOC was significantly correlated with Cdtotal and Hgtotal (rS = 0.47–0.79), possibly as a consequence of external sources. This was consistent with well-established findings in the literature that SOC plays a fundamental role in the control of metal sorption by soils (Séguin et al. 2004; Rattan et al. 2005; Kashem et al. 2007). Significant correlations were also found between Cdtotal, Hgtotal, and TS, indicating that the main sources of these elements were similar or the same (Table 3). However, soil pH had poor correlations with Cdtotal and Hgtotal.

Table 3 The nonparametric correlations between the two heavy metal contents and three soil properties in the study area (N = 260)

From Table 4, we found that each component of Cd and Hg was significantly correlated with several different components of the three soil properties, indicating that each EMD component of the two metals had different influencing factors. IMF1 of Cdtotal, representing a more local variation, was significantly correlated with IMF1 of SOC and IMF1 of TS with moderate correlation coefficients (rS = 0.60–0.62). Meanwhile, the residue of Cdtotal was significantly correlated with the residue of Hgtotal and the residues of SOC and TS with moderate to high correlation coefficients (rS = 0.53–0.84). Moreover, moderate correlations were also found between the residue of Hgtotal and the residues of the three soil properties. The significant correlations indicated that these EMD components had the same or similar influencing factors.

Table 4 The nonparametric correlations between the empirical mode decomposition (EMD) components (IMF1, IMF2, IMF3, and residues) of the two soil heavy metal contents and the EMD components of the three soil properties in the study area (N = 260)

Potential factors affecting Cd and Hg in agricultural soils

To reduce the high dimensionality of the variable space and better understand the relationships among the EMD components of the metals and soil properties, PCA was applied to the data (Fig. 5). According to the eigenvalues listed in Table 5, eight factors accounted for over 74% of the total variation of the 20 EMD components. The largest contributors to the first principal component (PC1), which accounted for about 14% of the total variance, were the residues of Cdtotal, Hgtotal, SOC, and TS (only loadings with absolute values greater than 0.5 were considered). This indicated that the residues of Cdtotal, Hgtotal, SOC, and TS in the study area had similar influencing factors. It is well known that the spatial residues represent the overall trends of data, and soil properties exhibit variability as a result of dynamic interactions between parent material, climate, and geological history on the regional scale (Wang et al. 2001; Liu et al. 2006). Therefore, the residues of Cdtotal and Hgtotal, to some extent, reflected their natural concentration. This was in agreement with previous studies showing that the natural concentration of heavy metals in agricultural soils depended primarily on the geological parent material composition and pedogenesis (De Temmerman et al. 2003; Rodríguez Martín et al. 2006).

Fig. 5 Summary plots of principal components on correlations of the empirical mode decomposition components (IMFs and residues) of the two soil heavy metal contents and the three soil properties

Table 5 Principal component (PC) analysis of the empirical mode decomposition components (IMFs and residues) of the two soil heavy metal contents and the three soil properties

The third principal component (PC3) was responsible for 10.85% of the total variance and was dominated by IMF1 of Cdtotal, IMF1 of SOC, and IMF1 of TS. IMF1 of Cdtotal had a moderate positive association with PC3, and IMF1 of SOC and IMF1 of TS had strong positive associations with PC3, suggesting that the three components had the same or similar influencing factors. IMF1 of SOC and IMF1 of TS each contributed the majority of the total variance of the respective variable. They also had short-range spatial correlations (0.65 and 0.55 km), suggesting that the two IMF1 components represented the majority of local variations, i.e., field-scale variations in SOC and TS. It is well known that the field-scale variations in soil nutrient contents in Chinese agricultural soils are mainly attributed to the fertilizer history of individual farmers and the varieties used in relatively small-scale field management (Jin and Jiang 2002). Based on the above results, it was reasonable to conclude that a large proportion of the variation in Cdtotal was mainly due to agricultural activities such as fertilization. This result was consistent with previous studies showing that agricultural activity was one of the main sources of Cd entering agricultural soils (Mann et al. 2002; Huang et al. 2007; Atafar et al. 2010).
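As a rough check, and assuming the PCA was run on the correlation matrix of the 20 standardized EMD components so that the total variance equals 20, the variance share reported for PC3 corresponds to a (rotated) sum of squared loadings \( \lambda_3 \) of about

$$ \lambda_3 \approx 0.1085 \times 20 = 2.17 $$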

Both IMF2 and IMF3 of Cdtotal were positively associated with the fifth principal component (PC5), indicating that the two IMFs had similar influencing factors. The two IMFs of Cdtotal had short-range spatial correlations (3.64–5.27 km), and they had a similar spatial pattern that was more variable in the north and the southwest areas. Previous studies showed that sewage irrigation and atmospheric deposition were two of the main sources of Cd entering agricultural soils (Nicholson et al. 2003; Liu et al. 2005; Luo et al. 2009; Wu et al. 2011). To our knowledge, a large number of rural enterprises are distributed in the study area, and water pollution from small rural industries is often a serious problem throughout China (Wang et al. 2008). Based on the above discussion, we infer that IMF2 and IMF3 of Cdtotal might represent the influence of industrial activities.

IMF1 and IMF2 of Hgtotal loaded on the sixth principal component (PC6), and IMF3 of Hgtotal and IMF3 of soil pH loaded on the seventh principal component (PC7). PC6 and PC7 accounted for about 6.99 and 6.67% of the total variance, respectively. All three IMFs of Hgtotal had short-range spatial correlations (3.2–4.6 km), suggesting that the three IMFs represented the majority of the local variation of Hgtotal in the study area. Previous studies showed that the anthropogenic Hg in agricultural soils in China may mainly originate from pesticides, fertilizers, atmospheric deposition, sewage irrigation, and so on (Wang et al. 2003; Zheng et al. 2008; Wei and Yang 2010; Wu and Cao 2010). It is well known that coal combustion is a source of both Hg and acid gas emissions and therefore leads to Hg enrichment and soil acidification at the same time (Xu et al. 2004). Based on the above discussion, we infer that all three IMFs of Hgtotal might represent the influence of industrial activities such as coal combustion, although they might reflect different industrial activities or different pathways of the same industrial activity. Short-distance atmospheric deposition arising from the small coal-fired boilers of small rural industries could explain the short-range spatial correlation of IMF3 of Hgtotal.

Conclusions

Cd and Hg were enriched in parts of the study area. Cdtotal in agricultural soils had an overall trend with a high concentration in the north and southwest areas and low concentration in the middle part. Meanwhile, Hgtotal had an overall trend of high concentration in the northwest and low concentration in the southeast, although they exhibited a complex spatial variability at different scales.

2D-EMD was used to decompose the observations into three IMFs and a residue. Among the four EMD components, IMF1 had the highest contribution to the total variance, indicating that IMF1 was the dominant scale in terms of the explained variance of the five variables and might represent the variation caused by the major influencing factors, whereas the residue, representing the overall trend, only accounted for a small proportion of the total variance. Anthropogenic activities were the dominant contributors to the variation in Cdtotal and Hgtotal in the study area. IMF1 of both SOC and TS had short-range spatial correlations, and they were strongly correlated with IMF1 of Cdtotal, indicating that IMF1 of Cdtotal might represent the influence of agricultural activities. Both IMF2 and IMF3 of Cdtotal were positively associated with the same principal component, and they also had short-range spatial correlations, indicating that the two IMFs of Cdtotal may represent the influence of industrial activities. IMF1 and IMF2 of Hgtotal loaded on one principal component, and IMF3 of Hgtotal and IMF3 of soil pH loaded on another, indicating that IMF1 and IMF2 of Hgtotal and IMF3 of Hgtotal might represent different industrial activities or different pathways of the same industrial activities. Agricultural activities were the dominant contributors to the overall variation in Cdtotal, while industrial activities with concentrated sources may explain the variation of Hgtotal in the study area. 2D-EMD has great potential for identifying the potential factors behind spatial multi-scale variations.