1 Introduction

Polycyclic aromatic hydrocarbons (PAHs) include a number of organic compounds that contain two to seven fused aromatic rings made up of carbon and hydrogen atoms (Wang et al. 2015), which are widespread throughout the environment, persist for a long time, and undergo long-range transportation (Sun et al. 2009). PAHs are of concern because of their carcinogenic and mutagenic potential, and their subsequent adverse effects on ecosystem and human health, which can lead to a variety of diseases (Zhang et al. 2009; Ren et al. 2011; Duan et al. 2015). PAHs are pollutants emitted from both natural (forest fires and volcanic activity) and anthropogenic sources (incomplete combustion of fossil fuels, coke production, and many industrial processes) (Boström et al. 2002; Wang et al. 2010; Shen et al. 2011). Specifically, 16 PAHs have been placed on a priority control list by the United States Environmental Protection Agency (USEPA), seven of which are confirmed to be carcinogens (Zhao et al. 2014).

Although total PAH emissions have been declining since the 1970s or 1980s in developed countries (Li et al. 2001; Rose and Rippey 2002), emissions in developing countries such as China and India have been increasing continuously (Zhang et al. 2009). Based on energy combustion statistics and corresponding emission factors, the annual PAH emission in China is 25,300 t (Xu et al. 2006), accounting for over 20 % of the global total (Zhang and Tao 2009). PAHs are ubiquitous in the environment, including water, air, soil, and sediment, although soil is regarded as the most important sink (Agarwal et al. 2009). Soil PAH contamination attracts much public attention because it not only has severe environmental impacts but also represents a serious human health hazard (Wilcke 2007; Cai et al. 2008). Therefore, much research has been conducted on contamination levels, source identification, and risk assessment in different environmental media at different regional scales (Liu et al. 2010; Yang et al. 2012; Bortey-Sam et al. 2014; Kwon and Choi 2014). However, there are few studies of the spatial patterns and hotspots of soil pollution by PAHs, which can provide a scientific basis for better environmental management (Zhang et al. 2008).

Changzhi city is located in the southeast of Shanxi Province, which has four major industries: coal, chemicals, electricity generation, and metallurgy. Changzhi is the most important heavy industry and coal production region of Shanxi province, which has experienced serious environmental contamination during the last few decades of its industrial and economic development, largely due to coal production and energy consumption. Most of the agricultural soils in the region are used to grow crops, and the contamination could therefore have severe impacts on the environment and human health. However, there is limited information available regarding the contamination levels, occurrence, sources, and health risks, especially with regard to hotspots of PAHs in agricultural soils in the region, which have an important impact on food safety and human health. Therefore, the main objectives of this study were (1) to investigate the concentration level and profiles of soil PAHs; (2) to identify the primary sources of PAHs and their respective overall contribution to total PAH contamination in the study region; and (3) to identify spatial patterns and hotspots of soil PAHs. This study will provide valuable information about PAH levels in the region and support a scientific basis for better environmental management and further controls of PAH contamination.

2 Materials and methods

2.1 Study area and soil sampling

Shanxi province is one of the largest provinces in China and is a large producer of coal, with an annual production of over 300 million t (Xu et al. 2009). Changzhi is one of the largest coal production areas and industrial cities in Shanxi province, and coal production and a complex industrial system are the main pollution sources in the city. Because of its industrial structure, economic development, and energy consumption, Changzhi is considered one of the most polluted areas in both Shanxi province and all of China. This contamination not only has severe environmental impacts but also represents a serious human health hazard.

A total of 203 agricultural soil samples (0–20 cm) were collected from twelve counties of Changzhi in July 2013. All sampling sites were geo-referenced using a global positioning device. At each site, five subsamples taken within an area of 100 m2 were mixed and sealed in glass bottles. After transport to the laboratory, residual roots and other debris were removed, and the samples were air-dried at room temperature for 1 week and then passed through a 60-mesh sieve. The samples were homogenized and stored in clean glass bottles at 4 °C until analysis.

2.2 Chemical analyses

For the extraction of PAHs, a 10 g soil sample was treated with acetone/dichloromethane (1/1, v/v) using an ASE-300 accelerated solvent extraction system (Dionex, Beijing, China). The extracted PAHs were concentrated by organomation, eluted with approximately 30 ml dichloromethane/n-hexane (2/1, v/v), and concentrated to 1 ml for analysis (Grimalt et al. 2004). PAHs in the extracts of all samples were analyzed by gas chromatography–mass spectrometry [6890 N GC, 5975B mass spectrometric detector (MSD), Agilent, Santa Clara, CA, USA] equipped with an HP-5MS capillary column (30 m × 0.25 mm inner diameter × 0.25 μm film thickness, Agilent). The 16 priority PAHs were identified by gas chromatography–mass spectrometry and quantified from peak areas against an external 16-PAH standard (Supelco Co, Sigma-Aldrich Corporation, St Louis, MO, USA) containing naphthalene (Nap), acenaphthylene (Acy), acenaphthene (Ace), fluorene (Fle), phenanthrene (Phe), anthracene (Ant), fluoranthene (Fla), pyrene (Pyr), benz[a]anthracene (Baa), chrysene (Chr), benzo[b]fluoranthene and benzo[k]fluoranthene (Bbf and Bkf), benzo[a]pyrene (Bap), dibenzo[ah]anthracene (Daa), benzo[ghi]perylene (Bgp), and indeno[1,2,3-cd]pyrene (Inp).

All analytical procedures were subject to strict quality control and quality assurance. Spiked blanks, method blanks, procedural blanks, a matrix spiked sample, and a matrix spiked sample duplicate were analyzed under the quality control procedures. A matrix spike, a method blank, and two sample duplicates were analyzed together with the soil samples for each batch of five samples. The analytical procedure was evaluated against quality control acceptance criteria (USEPA 2007b), and a linear calibration equation was obtained with r2 > 0.99 (USEPA 2007b). The method detection limits ranged from 10 to 15 μg kg−1, the recoveries were 95 to 105 %, and the relative standard deviation was lower than 11 % (Liu et al. 2013).

2.3 Data analysis

  (1) Geostatistical analysis

    Geostatistical analysis can provide an unbiased estimator of variables at unmeasured locations under the stationarity assumption (Goovaerts et al. 2008), and has been extensively used in soil pollution mapping. Geostatistical prediction includes several types of kriging model, of which ordinary kriging is the most commonly used. The semivariogram provides the weights for ordinary kriging and quantifies the assumption that nearby observations tend to be more similar than those farther apart. In this study, ordinary kriging was used to obtain the spatial distribution of soil PAHs. Before the ordinary kriging interpolation, the concentration data were normalized and semivariograms were constructed to obtain the optimal parameters. More detailed model principles and the computational process underlying ordinary kriging are available in the literature (Goovaerts 1997).

  (2) Local Moran’s I index

    Spatial autocorrelation analysis is a useful approach for identifying hotspots and spatial patterns of pollution. Many methods of spatial autocorrelation analysis are available (e.g., Moran’s I, Geary’s C, Getis’ G, and Join Count analysis), but Moran’s I index is the most widely used across research fields, including environmental management (Zhang et al. 2008). We used Local Moran’s I indices to quantify the relationship between each sample and its neighbors and thereby identify hotspots and spatial patterns of soil PAHs. The Local Moran’s I statistic is given in the literature (Cliff and Ord 1981).

  (3) Positive matrix factorization model

    To obtain quantitative information regarding the contributions of soil PAH sources, the positive matrix factorization (PMF) model, a receptor modeling tool that uses non-negativity constraints to obtain physically meaningful results, was used in this study. In brief, the PMF model requires concentration and uncertainty data as input and was run in the default robust mode. The number of PMF factors was determined by comparing the Q true and Q robust values. A detailed description of the PMF model can be found in Paatero and Tapper (1994), who developed the model.

Detailed information about the model and calculation was provided in the Electronic Supplementary Material.

3 Results and discussion

3.1 Overview of PAH levels in agricultural soils

The concentrations of the 16 individual PAHs in agricultural soils in Changzhi are given in Table 1. The ∑PAH16 concentrations varied from 9 to 10514 ng g−1, with a mean of 917 ng g−1. The concentrations of the seven carcinogenic PAHs (∑7CarPAHs) ranged from N.D. to 3527 ng g−1, with a mean of 363 ng g−1, and accounted for about 33.6 % of the total concentration of the 16 priority PAHs. The ∑PAH16 data had a coefficient of variation (CV) of 173 %, and the individual CVs of the 16 PAHs all exceeded 100 %, which indicates that PAH concentrations in the soils varied greatly. This variation could have many causes, such as the distribution of pollution sources, transport, and soil properties. Outliers with high concentrations were found in the central-eastern part of the region. For example, the maximum concentration of Bap was 103 times higher than the minimum value. The central-eastern area includes industrial enterprises such as a chemical plant, a steel plant, and a machinery factory, so the dataset included hotspots with high peak values, resulting in spatial variability in the original data. The large sampling area of about 13,000 km2, which contains different soil types and complex geological and biological variability, may also have contributed to the variability.

Table 1 Concentrations of 16 PAHs in agricultural soils from Changzhi, China (n = 203, dw)
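As a minimal sketch of how a coefficient of variation such as the one reported above can be computed, the following Python snippet divides the standard deviation by the mean. The concentrations are hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the text does not state which estimator was used.

```python
import numpy as np


def coefficient_of_variation(values):
    """Return the CV in percent: sample standard deviation / mean * 100."""
    x = np.asarray(values, dtype=float)
    return float(np.std(x, ddof=1) / np.mean(x) * 100.0)


# Hypothetical concentrations (ng g-1); one hotspot sample skews the data.
example = [50.0, 80.0, 120.0, 150.0, 2600.0]
cv = coefficient_of_variation(example)
print(f"CV = {cv:.0f} %")
```

A single extreme value is enough to push the CV above 100 %, which is the pattern the text attributes to hotspot samples.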

It is informative to compare the PAH concentrations in these agricultural soils with values reported in previous studies. Average ∑PAH16 concentrations in Changzhi were comparable to or higher than those in agricultural soils from Hong Kong (138 ng g−1, Chung et al. 2007), Shantou (318 ng g−1, Hao et al. 2007), Beijing/Tianjin (336 ng g−1, Wang et al. 2010), the Huanghuai Plain (130 ng g−1, Yang et al. 2012), and Xinzhou (202 ng g−1, Zhao et al. 2014) in China, and from overseas locations including Korea (236 ng g−1, Nam et al. 2008), Japan (320 ng g−1, Honda et al. 2007), the Czech Republic (847 ng g−1, Holoubek et al. 2009), and Poland (616 ng g−1, Maliszewska-Kordybach et al. 2008), where, based on the latest global emission inventory, emission densities of PAHs were relatively high (Shen et al. 2013). However, the ∑PAH16 concentrations recorded in this study were lower than those reported previously in Delhi, India (1906 ng g−1, Agarwal et al. 2009), Chengdu (3234 ng g−1, Xing et al. 2011), and the Yangtze River Delta in China (1503 ng g−1, Cai et al. 2007). These higher levels are thought to be associated with specific pollution sources. According to this comparison, the agricultural soils of Changzhi were moderately contaminated by PAHs relative to other areas in China and other countries, which indicates that anthropogenic activities have contributed to the PAH levels observed in these soils. According to the contamination classification method developed by Maliszewska-Kordybach (1996), 42 soil samples were heavily contaminated (>1000 ng g−1), 21 were contaminated (600–1000 ng g−1), 65 were weakly contaminated (200–600 ng g−1), and 75 were not contaminated (<200 ng g−1). Specifically, the levels of PAHs in agricultural soils from Changzhi were higher than those in agricultural soils of Xinzhou, another city in Shanxi Province. This may be because the sampling sites in Xinzhou were located in agricultural fields far from any apparent pollution sources (Zhao et al. 2014), whereas in this study some soil samples were taken from the vicinity of an industrial site.
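The four-class scheme above can be sketched as a simple threshold function. The thresholds are those quoted in the text; how a value falling exactly on a boundary (200, 600, or 1000 ng g−1) is assigned is an assumption, and the sample values are hypothetical.

```python
from collections import Counter


def classify_contamination(total_pah_ng_g):
    """Classify a sum-of-16-PAHs concentration (ng g-1) into four classes."""
    if total_pah_ng_g < 200:
        return "not contaminated"
    if total_pah_ng_g < 600:
        return "weakly contaminated"
    if total_pah_ng_g <= 1000:      # boundary handling is an assumption
        return "contaminated"
    return "heavily contaminated"


# Hypothetical sum-of-16-PAHs values (ng g-1) for six samples.
samples = [9.0, 150.0, 420.0, 917.0, 2140.0, 10514.0]
counts = Counter(classify_contamination(value) for value in samples)
for label, count in counts.items():
    print(f"{label}: {count}")
```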

3.2 Composition profiles

The PAH profiles in all soil samples are shown in Fig. 1a. Fla was the most abundant compound, contributing about 14.7 % of the total concentration of the 16 PAHs, followed by Acy (12.2 %), Bap (11.3 %), Nap (8.7 %), and Ant (8.3 %). The proportions of PAHs with different ring numbers are shown in Fig. 1b. The 3-, 4-, and 5-ring PAHs were dominant, accounting for 29, 31, and 24 % of all PAHs, respectively, followed by 2-ring PAHs (9 %); the 6-ring PAHs contributed the least (7 %). Generally, medium and high molecular weight PAHs were the dominant compounds in the soil. High molecular weight (HMW, four to six ring) PAHs accounted for about 62 % of the total PAHs, and low molecular weight (LMW, two to three ring) PAHs accounted for 38 %. Unlike the volatile LMW PAHs, the HMW PAHs were largely present in the particle phase and tended to accumulate and remain in the soil.

Fig. 1 Contribution of the different PAHs to the total PAHs (a) and the contribution of PAHs with different ring numbers to the total PAHs (b)
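The ring-number and LMW/HMW shares discussed in this section can be reproduced with a short aggregation. The sketch below assumes the conventional ring-number assignment for the 16 priority PAHs (the text does not list it explicitly) and uses hypothetical concentrations for the first example; the second example uses the rounded ring shares reported in the text.

```python
# Conventional ring numbers for the 16 priority PAHs (an assumption here).
RING_COUNT = {
    "Nap": 2,
    "Acy": 3, "Ace": 3, "Fle": 3, "Phe": 3, "Ant": 3,
    "Fla": 4, "Pyr": 4, "Baa": 4, "Chr": 4,
    "Bbf": 5, "Bkf": 5, "Bap": 5, "Daa": 5,
    "Inp": 6, "Bgp": 6,
}


def ring_shares(concentrations):
    """Return the percent of the total contributed by each ring number."""
    total = sum(concentrations.values())
    shares = {}
    for compound, value in concentrations.items():
        rings = RING_COUNT[compound]
        shares[rings] = shares.get(rings, 0.0) + 100.0 * value / total
    return shares


def lmw_hmw_split(shares):
    """Sum ring shares into LMW (2 to 3 rings) and HMW (4 to 6 rings)."""
    lmw = sum(v for rings, v in shares.items() if rings <= 3)
    hmw = sum(v for rings, v in shares.items() if rings >= 4)
    return lmw, hmw


# Hypothetical concentrations (ng g-1) for three compounds.
print(ring_shares({"Nap": 10.0, "Phe": 30.0, "Fla": 60.0}))

# Ring shares (%) as reported in the text.
reported = {2: 9, 3: 29, 4: 31, 5: 24, 6: 7}
print(lmw_hmw_split(reported))
```

With the reported shares, the split works out to 38 % LMW and 62 % HMW, matching the figures given in the text.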

3.3 Spatial characteristics of PAHs in soils

An understanding of the spatial distribution of pollutants in soils is critical for risk control and environmental management. In this study, ∑PAH16 was chosen as the measure for analyzing the spatial characteristics of PAHs in soils, and spatial distribution maps were produced from the available concentrations using a kriging model. Prior to the interpolation, we calculated semivariograms from the concentration data of all soil samples and obtained models that fitted the data by optimizing the parameters. Figure 2 displays the spatial distribution of the contaminants. According to the maps, soils with high concentrations of ∑PAH16 were mainly distributed in the middle-upper and southeast portions of the region, including Jiao and Cheng districts and Wuxiang, Lucheng, Xiangyuan, and Pingshun counties. This can be explained by the large industrial complexes, heavy traffic, and the high number of coking plants, coal-fired power plants, and chemical plants in these counties. Air emission was another pathway for PAH pollution. The wind direction is NW in winter and SW in summer, so airborne contaminants were carried toward the middle-upper and southeast portions of the region; the spatial distribution is therefore consistent with air emission of the contaminants. Within this area, Jiao district had the highest PAH contamination, with an average concentration of 2140 ng g−1, followed by Lucheng county, Cheng district, and Tunliu, Licheng, and Zhangzhi counties, with average concentrations of 803, 519, 311, 225, 217, and 218 ng g−1, respectively. PAH concentrations in other counties were relatively low, with average concentrations not exceeding 200 ng g−1. This may be because there are few industrial sites located in these mountainous areas. The spatial characteristics and distribution of the PAHs in the soils investigated in this study were mainly affected by the distribution of pollution sources, the topography, climatic factors, and edaphic variables.

Fig. 2 Spatial distribution maps, created using ordinary kriging, for ∑PAH16 concentrations (ng g−1) in soils from Changzhi
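To make the interpolation step concrete, the sketch below solves the ordinary kriging system for a single target location with NumPy. It is an illustration under stated assumptions, not the procedure used to produce Fig. 2: the exponential variogram model, its parameters (`nugget`, `sill`, `practical_range`), the coordinates, and the log-transformed concentrations are all hypothetical, and no variogram fitting is performed.

```python
import numpy as np


def exponential_variogram(h, nugget, sill, practical_range):
    """Semivariance at lag h; zero at h = 0 and rising toward the sill."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * h / practical_range))
    return np.where(h > 0, gamma, 0.0)


def ordinary_kriging(coords, values, target, nugget=0.0, sill=1.0,
                     practical_range=10.0):
    """Return (prediction, weights) at `target` from samples at `coords`."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(values)
    # Sample-to-sample and sample-to-target distances.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    d0 = np.linalg.norm(coords - target, axis=1)
    # Kriging system; the extra row and column hold the Lagrange
    # multiplier that forces the weights to sum to one (unbiasedness).
    a = np.ones((n + 1, n + 1))
    a[:n, :n] = exponential_variogram(d, nugget, sill, practical_range)
    a[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = exponential_variogram(d0, nugget, sill, practical_range)
    weights = np.linalg.solve(a, b)[:n]
    return float(weights @ values), weights


# Hypothetical sample sites (km) and log10-transformed concentrations.
coords = [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]]
log_conc = [2.1, 2.9, 2.4, 3.3]
center, w_center = ordinary_kriging(coords, log_conc, [5.0, 5.0],
                                    sill=0.3, practical_range=15.0)
at_sample, _ = ordinary_kriging(coords, log_conc, [10.0, 0.0],
                                sill=0.3, practical_range=15.0)
print("prediction at center:", center)
print("prediction at a sample site:", at_sample)
```

Two properties are worth checking in such a sketch: the weights sum to one, and with a zero nugget the prediction at a sampled location reproduces the measured value.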

3.4 Identification of hotspots

Pollution hotspots, that is, locations with high levels of pollution relative to the surrounding area, need to be identified in agricultural soils to improve environmental management and risk control. In this study, the Local Moran’s I index was used to identify hotspots of soil PAH concentrations and their statistical characteristics. The results are shown in Fig. 3. High-high values indicate hotspots and low-low values indicate cool spots. For ∑PAH16, there were 17 high-high values, clustered in the middle of the region, and 11 low-low spatial clusters. Five high-low and seven low-high outliers were also identified in the east and west of the region. Samples outside these four cluster types showed no significant hotspot characteristics. Pollution hotspots were visible in the middle of the region, where samples with high PAH concentrations were surrounded by samples with similarly high concentrations. Generally, the hotspot distribution was consistent with the contamination that had occurred in the region. The hotspot samples were mainly located in Jiao and Cheng districts and in Lucheng, Xiangyuan, and Pingshun counties. These locations have a high number of coking plants, coal-fired power plants, and chemical plants and were expected to be seriously contaminated. Comparing the spatial distribution of the hotspots with that of ∑PAH16 in Changzhi (Fig. 2) showed that the hotspots were located in seriously contaminated areas. The locations of the hotspots thus corresponded to the factors influencing PAH pollution in these areas. The results showed that the Local Moran’s I index was a useful and reliable tool for identifying hotspots and classifying spatial clusters of soil sample concentrations. The results provide important information on PAH pollution patterns that can be used in the environmental risk management of agricultural soils.

Fig. 3 Spatial distribution of significant spatial cluster characteristics for ∑PAH16 in agricultural soils
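The cluster types in Fig. 3 follow from the sign of each sample's deviation from the mean and the sign of its neighbors' average deviation. The sketch below computes the Local Moran’s I statistic and these four labels with NumPy. The row-standardized k-nearest-neighbor weights, the choice `k=3`, and the sample data are assumptions made for illustration; the significance test (for example, by permutation) that separates significant clusters from the rest is omitted.

```python
import numpy as np


def local_morans_i(values, coords, k=3):
    """Return (local I, cluster label) per site using k-nearest-neighbor weights."""
    x = np.asarray(values, dtype=float)
    xy = np.asarray(coords, dtype=float)
    n = len(x)
    z = x - x.mean()                        # deviations from the mean
    m2 = np.sum(z ** 2) / n                 # second moment of the deviations
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)             # a site is not its own neighbor
    nearest = np.argsort(d, axis=1)[:, :k]
    w = np.zeros((n, n))
    w[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0 / k  # rows sum to 1
    lag = w @ z                             # mean deviation of the neighbors
    local_i = z * lag / m2
    labels = np.where(z > 0,
                      np.where(lag > 0, "high-high", "high-low"),
                      np.where(lag > 0, "low-high", "low-low"))
    return local_i, labels


# Hypothetical data: a high cluster with one low outlier, and a low cluster.
coords = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.1], [1.2, 1.3], [0.5, 0.5],
          [10.0, 10.0], [11.0, 10.0], [10.0, 11.1], [11.2, 11.3]]
values = [100.0, 100.0, 100.0, 100.0, 10.0, 10.0, 10.0, 10.0, 10.0]
local_i, labels = local_morans_i(values, coords, k=3)
for value, stat, label in zip(values, local_i, labels):
    print(f"{value:6.1f}  I = {stat:6.2f}  {label}")
```

In this example the low value placed inside the high cluster is labeled low-high and receives a negative local statistic, which is how spatial outliers are distinguished from hotspots and cool spots.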

3.5 Source identification

To quantitatively assess the contribution of various pollution sources to PAH contamination, the PMF method was used to model the soil PAH data from Changzhi. The dataset was a 203 × 16 matrix (203 soil samples and 16 individual PAHs). The PMF model used the input concentration and uncertainty data and was run in the default robust mode. Q true and Q robust were the two key indicators used to determine the number of PMF factors. Q true is the goodness-of-fit parameter calculated from all data points in the input dataset, whereas Q robust is calculated with outliers excluded (Yang et al. 2013). There was a strong correlation (r2 = 0.87, seed mode number = 100) between Q true and Q robust, and four factors were deduced. The source profiles of the 16 PAHs in each of the four factors are depicted in Fig. 4. Unlike in principal component analysis, a high value of a variable for a source factor does not necessarily mean that the variable is highly correlated with the source (Sofowote et al. 2008).

Fig. 4 Source profiles obtained from the PMF model
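The idea behind PMF, factoring a samples × species matrix into non-negative source contributions and source profiles while weighting each residual by its uncertainty, can be illustrated with a small weighted non-negative matrix factorization. The sketch below is a stand-in that uses multiplicative updates; it is not the algorithm implemented in the PMF software and has no robust mode. All data, the uncertainty model, and the function names are hypothetical.

```python
import numpy as np


def q_value(x, sigma, g, f):
    """Sum of squared, uncertainty-scaled residuals (the quantity minimized)."""
    return float(np.sum(((x - g @ f) / sigma) ** 2))


def weighted_nmf(x, sigma, n_factors, n_iter=500, seed=0):
    """Factor x into non-negative g (contributions) and f (profiles)."""
    rng = np.random.default_rng(seed)
    w = 1.0 / sigma ** 2                    # weight of each data point
    g = rng.uniform(0.1, 1.0, size=(x.shape[0], n_factors))
    f = rng.uniform(0.1, 1.0, size=(n_factors, x.shape[1]))
    eps = 1e-12                             # guards against division by zero
    for _ in range(n_iter):
        # Multiplicative updates keep g and f non-negative.
        g *= ((w * x) @ f.T) / ((w * (g @ f)) @ f.T + eps)
        f *= (g.T @ (w * x)) / (g.T @ (w * (g @ f)) + eps)
    return g, f


def factor_shares(g, f):
    """Percent of the total modeled concentration attributed to each factor."""
    mass = g.sum(axis=0) * f.sum(axis=1)
    return 100.0 * mass / mass.sum()


# Synthetic example: 30 hypothetical samples, 6 species, 2 true sources.
rng = np.random.default_rng(1)
g_true = rng.uniform(0.0, 10.0, size=(30, 2))
f_true = np.array([[5.0, 3.0, 1.0, 0.5, 0.2, 0.1],
                   [0.1, 0.3, 0.8, 2.0, 4.0, 6.0]])
x = (g_true @ f_true) * rng.uniform(0.95, 1.05, size=(30, 6))
sigma = 0.1 * x + 0.01                      # assumed uncertainty model

g0, f0 = weighted_nmf(x, sigma, n_factors=2, n_iter=0)   # initial guess only
g, f = weighted_nmf(x, sigma, n_factors=2)
shares = factor_shares(g, f)
print("Q before:", q_value(x, sigma, g0, f0))
print("Q after: ", q_value(x, sigma, g, f))
print("factor shares (%):", shares)
```

Average source contributions such as those reported for the four factors correspond to the kind of per-factor share computed by `factor_shares`, although the exact normalization used in the study is not stated.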

Factor 1 was dominated by Bbf, Bap, Bkf, Bgp, Chr, Pyr, and Fle. Among these contaminants, Bap, Chr, Pyr, and Fle are typical markers of coal combustion (Simcik et al. 1999; Larsen and Baker 2003). Therefore, factor 1 was identified as coal combustion. Factor 2 was dominated by Acy, Ant, and Nap, with a moderate influence from Fle and Baa. Other studies have shown that Acy and Nap are indicators of coke production (Khalili et al. 1995; Yang et al. 2013). Many coking plants and iron and steel plants operate in Changzhi, and during production and transportation, coke oven gas and fly ash are carried to non-industrial areas. Therefore, factor 2 was identified as coke tar. Factor 3 was dominated by Baa, Bap, Bgp, Daa, and Bkf, which can be used as markers of diesel emissions (Wang et al. 2013; Yang et al. 2013). Thus, factor 3 was identified as diesel emissions. Factor 4 was dominated by Fla, Phe, Fle, Nap, and Pyr. Phe and Fle are generally derived from fossil fuel combustion (Khalili et al. 1995), and LMW PAHs have been identified as the dominant component of petroleum. Consequently, factor 4 was identified as petroleum combustion.

The average contributions of each source to the total PAH concentration in the soil samples were 34 % for factor 1 (coal combustion), 23 % for factor 2 (coke tar), 19 % for factor 3 (diesel emissions), and 24 % for factor 4 (petroleum combustion). Therefore, the major sources of PAHs in agricultural soils from Changzhi have pyrogenic origins (including coal combustion, coking plants, iron and steel plants, and other industrial facilities), accounting for at least 76 % of the total PAH burden in soils from Changzhi.

3.6 PAH risk assessment

To assess the carcinogenic risk of soil PAHs, Bap toxic equivalency factors (TEFs) were used to estimate Bap equivalent (Bapeq) concentrations (Collins et al. 1998). The Bapeq of a soil sample is calculated by multiplying the concentration of each PAH in the sample by its TEF and summing the products. The Bapeq concentrations of the 16 PAHs (∑PAH16) and the seven carcinogenic PAHs (∑PAH7c), which are the two main indicators used for estimating the carcinogenic potency of PAHs (Ravindra et al. 2008), are provided in Table 2. The Bapeq concentrations of ∑PAH16 ranged from N.D. to 1683 ng g−1, with an average value of 151 ng g−1 and a median of 29 ng g−1. The Bapeq concentrations of ∑PAH7c ranged from N.D. to 1680 ng g−1, with an average value of 149 ng g−1. The maximum Bapeq concentrations of the seven carcinogenic PAHs (Baa, Chr, Bbf, Bkf, Bap, Daa, and Inp) were 107.7, 7.6, 97.6, 113.4, 1601, 456, and 8.9 ng g−1, respectively. The seven carcinogenic PAHs contributed 99.8 % of the Bapeq, which indicates that they were the main carcinogenic contributors among the 16 PAHs. The mean Bapeq of the 16 PAHs in this study was higher than the values reported for Xinzhou (34 ng g−1, Zhao et al. 2014), the Huanghuai Plain (12 ng g−1, Yang et al. 2012), and the coke production base of Shanxi, China (44.6 ng g−1, Duan et al. 2015), but lower than the values reported for Chinese cities such as Shanghai (236 ng g−1, Wang et al. 2013) and Beijing (181 ng g−1, Liu et al. 2010). There are currently few guidelines for carcinogenic PAH levels in China. We therefore referred to the Canadian Soil Quality Guidelines for the protection of environmental and human health regarding the commonly occurring parent PAHs, which define a safe Bapeq of 600 ng g−1 (CCME 2010). In this study, the Bapeq of 17 % of the soil samples exceeded this value. When considering the Dutch target value of 33 ng g−1 (Netherlands Ministry of Housing, Environment etc., 1994), the Bapeq of 69 % of the soil samples exceeded the target value. The PAH pollution in the study area was therefore serious, and better management is required to protect agricultural products and human health in the region.
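The Bapeq calculation described above can be sketched as follows. The TEF values and sample concentrations are illustrative placeholders, not the values used in this study (the text cites Collins et al. 1998 but does not list them); the two thresholds are those quoted in the text.

```python
# Illustrative TEFs only; the study's actual values are not given in the text.
ILLUSTRATIVE_TEF = {"Bap": 1.0, "Baa": 0.1, "Chr": 0.01}

CCME_GUIDELINE = 600.0   # ng g-1, Canadian guideline value cited in the text
DUTCH_TARGET = 33.0      # ng g-1, Dutch target value cited in the text


def bap_equivalent(concentrations, tef):
    """Sum of concentration x TEF over every PAH in `concentrations`."""
    return sum(conc * tef[name] for name, conc in concentrations.items())


def exceedance_percent(bapeq_values, threshold):
    """Percent of samples whose Bapeq exceeds `threshold`."""
    values = list(bapeq_values)
    return 100.0 * sum(v > threshold for v in values) / len(values)


# Hypothetical concentrations (ng g-1) for one soil sample.
sample = {"Bap": 100.0, "Baa": 50.0, "Chr": 200.0}
sample_bapeq = bap_equivalent(sample, ILLUSTRATIVE_TEF)
print("Bapeq of the sample (ng g-1):", sample_bapeq)

# Hypothetical Bapeq values (ng g-1) for four samples.
bapeq_values = [10.0, 40.0, 700.0, 20.0]
print("above Dutch target (%):", exceedance_percent(bapeq_values, DUTCH_TARGET))
print("above CCME value (%):", exceedance_percent(bapeq_values, CCME_GUIDELINE))
```

Because the TEF of Bap is 1 by definition and the TEFs of most other PAHs are much smaller, Bap tends to dominate the sum, which is consistent with the large maximum Bapeq reported for Bap.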

Table 2 BaPeq concentrations of PAHs in agricultural soils

4 Conclusions

The concentrations, spatial characteristics, hotspots, sources, and potential health risks of 16 PAHs in 203 agricultural soil samples from the Changzhi area of Shanxi, China were analyzed. The ∑PAH16 concentrations ranged from 9 to 10514 ng g−1, with a mean of 917 ng g−1. The seven carcinogenic PAHs accounted for 33.6 % of the total PAH concentration and contributed 99.8 % of the Bapeq concentration. Compared with the pollution levels and potential risk in other areas of the world, the pollution levels identified in this study were moderate. PMF analysis identified four main emission sources of soil PAHs in Changzhi: coal combustion (34 %), coke tar (23 %), diesel emissions (19 %), and petroleum combustion (24 %). Based on the spatial distribution and hotspots of soil PAHs, Jiao and Cheng districts and Lucheng, Xiangyuan, and Pingshun counties were the most contaminated areas. The locations of the hotspots corresponded to the factors influencing PAH pollution, and the contamination in these areas can be attributed to the high number of coking plants, chemical plants, and coal-fired power plants. This study showed that the coal chemical industry and other industrial activities have a significant influence on the pollution of agricultural soils by PAHs. It improves our understanding of the pollution characteristics of PAHs in soils, with reference to the old industrial cities of the developing world, and highlights the need to control PAH contamination and protect the environment and human health.