Introduction

Polycyclic aromatic hydrocarbons (PAHs) originate from both natural (Readman et al. 2002; Micić et al. 2011) and anthropogenic sources (Larsen and Baker 2003; Huang et al. 2012), and they have been found in various environmental media (Larsen and Baker 2003; Ma et al. 2010; Zhang et al. 2012; Yang et al. 2013a). Anthropogenic activities that release PAHs include coal combustion, oil spills, vehicular emissions, and industrial discharge (Stout and Graan 2010). These activities are thought to be the main sources of PAHs in the environment (Zakaria et al. 2002). In the aquatic environment, petrogenic PAHs are usually introduced directly into the water body, while pyrogenic PAHs are first emitted into the air, subsequently deposited into the water body, and ultimately settle into the sediments. The contents and compositions of PAHs in sediment vary over time, and the resulting temporal distribution reflects anthropogenic impacts and economic development (Guo et al. 2010; Liu et al. 2012). Therefore, dated sediment cores are good archives for reconstructing the chronology of PAHs in aquatic systems (Guo et al. 2006; Lin et al. 2012; Xu et al. 2014).

Several studies have identified the possible sources of PAHs in sediment cores using diagnostic ratios and receptor models (Guo et al. 2006; Guo et al. 2010; Wang et al. 2010). PAH isomeric ratio methods identify the source category qualitatively, whereas receptor models apportion both the source category and its contribution quantitatively (Feng et al. 2007; Malik et al. 2011). Three commonly applied receptor models that do not require source profile data are principal component analysis-multiple linear regression (PCA-MLR) (Bzdusek et al. 2004; Shi et al. 2009; Hu et al. 2017), positive matrix factorization (PMF) (Sofowote et al. 2011; Zhang et al. 2012; Yang et al. 2013b), and Unmix (Henry 1997; Yang et al. 2013a; Lang and Yang 2013). These source-unknown receptor models have been widely used in source apportionment studies. Although all three aim at source apportionment, they differ from each other in mechanism. PCA-MLR targets dimension reduction and explains the dataset with a smaller number of independent factors, while PMF makes full use of the available data by retaining missing and below-detection-limit values and estimating the confidence of each input value. Unmix reduces the dimensionality of the data using singular value decomposition (SVD) (Henry 2003). PMF and Unmix were originally developed to identify and quantify the sources of air pollutants, including particulate matter and volatile organic hydrocarbons. In recent years, these receptor models have been applied to PAH source apportionment in surface sediments. For example, Zhang et al. (2012) used all three models to study the spatial distribution of PAH source contributions and risk in sediment from Taihu Lake, China. Similar studies applying receptor models to investigate the spatial distribution of PAH sources and risk in surface sediments can be found elsewhere (Yu et al. 2015; Xu et al. 2016). Inspired by these studies, we propose that the historical distribution of PAH source contributions can be reconstructed by combining receptor models with a dated sediment core.

Honghu Lake, the largest freshwater lake in Hubei province, was chosen as the study area because of its importance for agriculture, ecology, and flood control. In this study, the 16 USEPA priority PAHs were determined in each layer of a sediment core. The potential sources and their average contributions were apportioned using the PCA-MLR, PMF, and Unmix models. Furthermore, the three receptor models were compared in terms of (1) the degree of fit between the observed and predicted (O/P) PAH concentrations, (2) inter-comparison of the different models, (3) source numbers and compositions, and (4) the historical contribution of each source to the total concentrations. The results of this study provide information about the historical variation of PAH source contributions in Honghu Lake and can contribute to developing countermeasures for PAH control.

Materials and methods

Field work

A sediment core (φ 100 mm × 37 cm) was collected in the central area of Honghu Lake (Fig. S1) with a gravity corer in December 2014. The sediment core was sliced at 1-cm intervals, and the slices were wrapped in acetone-cleaned aluminum foil and transported to the laboratory in an ice cooler. The sediment samples were stored at − 20 °C until further treatment.

PAH analysis

The detailed PAH analysis procedure has been reported elsewhere (Grimalt et al. 2004). Briefly, approximately 3–5 g of freeze-dried, homogenized sediment sample spiked with surrogates (a mixture of five deuterated PAHs: naphthalene-d8, acenaphthene-d10, phenanthrene-d10, chrysene-d12, and perylene-d12) was ultrasonically extracted with 1:1 (v/v) dichloromethane-acetone (3 × 20 mL, 15 min). A rotary evaporator was used to concentrate the extract to 10 mL, and hydrolysis was performed by adding 20 mL of KOH in methanol (6%, w/w). The neutral fractions were recovered with 30 mL of hexane (3 × 10 mL), vacuum evaporated to near dryness, and fractionated on a column filled with alumina-silica (1:1, v/v). The PAH fractions were eluted with 15 mL of dichloromethane-hexane (2:1, v/v). High-purity (> 99.999%) N2 was used to concentrate the target eluates (PAH fractions) to 0.2 mL. Prior to GC/MS analysis, 1000 ng of internal standard (hexamethylbenzene) was added.

Sixteen US EPA priority PAHs were analyzed: naphthalene (Nap), acenaphthylene (Acy), acenaphthene (Ace), fluorene (Fl), phenanthrene (Phe), anthracene (Ant), fluoranthene (Fla), pyrene (Pyr), benz[a]anthracene (BaA), chrysene (Chr), benzo[b]fluoranthene (BbF), benzo[k]fluoranthene (BkF), benzo[a]pyrene (BaP), indeno[1,2,3-cd]pyrene (IcdP), dibenz[a,h]anthracene (DBA), and benzo[ghi]perylene (BghiP). Samples (1 μL) were injected in splitless mode into an Agilent GC (6890N) equipped with a DB-5MS capillary column (30 m, 0.25 mm ID, 0.25 μm film). PAHs were separated with helium carrier gas at a constant flow of 1 mL min−1 using the following oven temperature program: initially 50 °C held for 2 min, then 20 °C min−1 to 180 °C, 4 °C min−1 to 250 °C, and 10 °C min−1 to 300 °C, held for 5 min. An Agilent MS (5975) with an EI source (70 eV) operated in selected ion monitoring (SIM) mode was used to analyze PAHs. Injector and mass transfer line temperatures were held at 280 °C.

Procedural blank, blank-spiked, matrix-spiked, and duplicate samples were processed with every 10 field samples for quality assurance (QA) and quality control (QC). Surrogate standard recoveries for QA/QC samples were 88 ± 11, 84 ± 38, 95 ± 38, 95 ± 14, and 92 ± 22% for naphthalene-d8, acenaphthene-d10, phenanthrene-d10, chrysene-d12, and perylene-d12, respectively. Recoveries of the same surrogate standards for field samples were 79 ± 16, 93 ± 13, 102 ± 23, 89 ± 13, and 84 ± 22%, respectively. The instrument detection limit (IDL) was defined as three times the signal-to-noise level at the lowest standard concentration (0.2 mg mL−1) and ranged from 0.10 to 0.73 mg L−1 (Table S1, supplementary materials). All concentrations were normalized to dry weight and blank corrected, but not corrected for surrogate recovery (Zhang et al. 2013).

Receptor model description

Receptor models use statistical or mathematical methods to identify and quantify the source of pollutants at receptor samples. Unlike the dispersion models such as backward trajectory and photochemical models, receptor models do not use the meteorological data and chemical transformation mechanisms. Instead, the PCA-MLR, PMF, and Unmix models generate possible “candidate” source fingerprints, and then identify the source profile by comparing them with the known source profiles. Usually, these receptor models can be expressed as the following equation (Hopke 2003):

$$ {x}_{ij}={\sum}_{k=1}^p{g}_{ik}{f}_{kj}+{e}_{ij} $$
(1)

where x ij is the concentration of the jth compound measured in the ith receptor sample; g ik is the contribution of the kth source to the ith sample; f kj is the concentration of the jth compound in the kth source; p is the number of sources; and e ij is the error.
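In matrix form, Eq. (1) reads X = GF + E. The following minimal sketch uses synthetic numbers and hypothetical variable names, not the data of this study, to show how the dimensions fit together for 37 layers, 16 PAHs, and an assumed three sources.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# 37 layers and 16 PAHs as in this study; 3 sources is an assumption for illustration
n_samples, n_species, n_sources = 37, 16, 3

# G: contribution of each source to each sample (synthetic, non-negative)
G = rng.uniform(0.0, 50.0, size=(n_samples, n_sources))
# F: profile of each source; each row sums to 1 (fraction of each compound)
F = rng.dirichlet(np.ones(n_species), size=n_sources)
# E: small residual term
E = rng.normal(0.0, 0.01, size=(n_samples, n_species))

X = G @ F + E  # Eq. (1) written as a matrix product

# Element-wise check of one entry against the summation in Eq. (1)
i, j = 4, 7
x_ij = sum(G[i, k] * F[k, j] for k in range(n_sources)) + E[i, j]
```

A receptor model works in the opposite direction: only X is observed, and G and F are estimated from it.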

PCA-MLR

SPSS software (22.0, IBM, USA) was used to run the PCA-MLR. The dimension reduction of the data matrix (37 × 16) was conducted using factor analysis. The principal component method was used to extract factors with eigenvalues > 1. Varimax rotation was then applied to make the factor loadings more physically interpretable. The Kaiser-Meyer-Olkin and Bartlett’s test of sphericity results indicated that the raw data were suitable for factor analysis (Table S2). To calculate the contribution of each factor, multiple linear regression analysis was performed on the PCA scores. More details can be found in the studies of Larsen and Baker (2003) and Cao et al. (2011).
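The factor extraction step can be illustrated outside SPSS. The sketch below is an assumption-laden illustration on synthetic data: it performs principal component extraction on the correlation matrix and keeps factors with eigenvalues > 1, but omits the Varimax rotation and does not reproduce the SPSS output of this study. The function name `kaiser_pca` is hypothetical.

```python
import numpy as np

def kaiser_pca(X):
    """Return all eigenvalues (descending) and the unrotated loadings
    of the components whose eigenvalue exceeds 1."""
    R = np.corrcoef(X, rowvar=False)           # species-by-species correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                       # eigenvalue > 1 criterion
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    return eigvals, loadings

# Synthetic 37 x 16 matrix driven by three latent factors (not the study's data)
rng = np.random.default_rng(0)
latent = rng.normal(size=(37, 3))
mixing = rng.normal(size=(3, 16))
X = latent @ mixing + 0.1 * rng.normal(size=(37, 16))

eigvals, loadings = kaiser_pca(X)
explained = 100.0 * eigvals / eigvals.sum()    # percent of variance per component
```

Because the correlation matrix of 16 species has a trace of 16, the eigenvalues sum to 16, so an eigenvalue above 1 marks a component that explains more variance than a single standardized species.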

PMF

The concentration file (a 37 × 16 dataset) and the uncertainty file (equation based) were loaded into the PMF 3.0 model (https://www.epa.gov/air-research/positive-matrix-factorization-model-environmental-data-analyses). All 16 species were initially categorized as strong because their signal-to-noise ratios (S/N) were higher than 2. A base run of 20 iterations was conducted, each starting from a random seed. After the first base run, Nap, Acy, Fl, Ant, and Pyr were recategorized as weak because their residuals exceeded ± 3 standard deviations (Fig. S2). The model was then re-run with the modified species categories to find the optimum number of factors. The sixth run of the four-factor solution reached the lowest Q true value (the goodness-of-fit parameter calculated including all data) of 374.8, which was close to the Q theoretical value (380, calculated as i × j − p × (i + j), where i, j, and p are the numbers of samples, species, and factors, respectively). In addition, the correlation between the observed PAH concentrations and the values predicted by PMF was significant, with r = 1.00 (p < 0.01). After the four-factor solution was selected, bootstrap runs and the F-peak technique (0.7) were performed to estimate the stability and uncertainty of the solution and to examine rotational ambiguity, respectively. More information about the PMF operation can be found elsewhere (USEPA 2008).
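The Q theoretical value quoted above follows directly from the dataset dimensions. As a worked check (the function name is hypothetical), with 37 samples, 16 species, and 4 factors:

```python
def q_theoretical(n_samples, n_species, n_factors):
    """Q theoretical = i * j - p * (i + j), as given in the text."""
    return n_samples * n_species - n_factors * (n_samples + n_species)

q_theo = q_theoretical(37, 16, 4)   # 592 - 212
q_true = 374.8                      # Q true value reported in the text
ratio = q_true / q_theo             # a ratio close to 1 indicates the two values agree
```

Here 37 × 16 = 592 and 4 × (37 + 16) = 212, which gives 380, and the reported Q true value is within about 1.4% of it.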

Unmix

Unmix is another receptor model recommended by the USEPA and is available at https://www.epa.gov/air-research/unmix-60-model-environmental-data-analyses. Unlike PMF, Unmix 6.0 requires only concentration data as input. No species were excluded from the model after the noise of each species was evaluated using the suggested exclusion function. The dataset of fitting PAH species was used in factor analysis to determine the number of sources. Three sources (factors) were extracted by the Unmix model, with Min. R2 and Min. S/N of 0.90 and 4.13, respectively. The scaled residuals were within ± 3 standard deviations (Fig. S3), which satisfies the requirements of the model. More details about the Unmix model can be found elsewhere (Henry 2003, 2007).

Dating of sediment core

The activities of 137Cs, 210Pb, and 226Ra in each sediment sample were measured using an Ortec HPGe GWL series well-type coaxial low-background intrinsic germanium detector. 137Cs was determined from its emission at 662 keV. 210Pb was measured via its gamma emission at 46.5 keV, and 226Ra via the emissions of its daughter isotope 214Pb at 295 and 352 keV. Excess 210Pb (210Pbexe) activities were calculated by subtracting 226Ra activities from total 210Pb activities (Wu et al. 2006). The 210Pbexe activities decreased exponentially with depth (Fig. 1a), indicating that the sediment core could be dated using the constant initial concentration (CIC) model (Krishnaswamy et al. 1971; Zhang et al. 2013). The average sedimentation rate based on the CIC model was 0.46 cm year−1. The peak value of 137Cs was measured at a depth of 24 cm (Fig. 1b), which most likely corresponds to 1963, the peak of worldwide large-scale nuclear weapons testing. The average sedimentation rate based on 137Cs was 0.47 cm year−1, almost equal to the 210Pbexe result. The agreement between the two methods suggested that the dating results were credible. The dating results revealed that the 37-cm sediment core recorded about 80 years of sedimentary history (1934–2012).
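The two sedimentation rates can be checked with simple arithmetic. The sketch below makes several assumptions: a 210Pb half-life of 22.3 years, a sampling year of 2014 (the core was collected in December 2014), and a synthetic excess 210Pb profile rather than the measured one. Under the CIC model, excess activity decays as A(z) = A0 exp(−λz/s), so the rate s follows from the slope of ln A against depth. All function names are hypothetical.

```python
import numpy as np

HALF_LIFE_PB210 = 22.3                      # years; assumed, commonly cited value
DECAY = np.log(2) / HALF_LIFE_PB210         # decay constant lambda (per year)

def cs137_rate(peak_depth_cm, sampling_year, peak_year=1963):
    """Average sedimentation rate (cm per year) from the 137Cs peak depth."""
    return peak_depth_cm / (sampling_year - peak_year)

def cic_rate(depth_cm, excess_activity):
    """Sedimentation rate from a log-linear fit of excess 210Pb versus depth."""
    slope, _ = np.polyfit(depth_cm, np.log(excess_activity), 1)
    return -DECAY / slope

def layer_year(depth_cm, rate, sampling_year):
    """Approximate deposition year of a layer at the given depth."""
    return sampling_year - depth_cm / rate

rate_cs = cs137_rate(24, 2014)              # 24 cm / 51 years

# Synthetic profile generated with a rate of 0.46 cm per year (not measured data)
depth = np.arange(0.5, 37.5, 1.0)           # mid-depths of the 37 one-cm slices
synthetic = 100.0 * np.exp(-DECAY * depth / 0.46)
rate_cic = cic_rate(depth, synthetic)

core_span_years = 37 / 0.46                 # time represented by the whole core
bottom_year = layer_year(37, 0.46, 2014)
```

With these inputs the 137Cs rate rounds to 0.47 cm year−1, and a 37-cm core at 0.46 cm year−1 spans roughly 80 years, in line with the values stated above.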

Fig. 1 Depth profiles of 210Pbexe (a) and 137Cs activities (b) in sediment core from Honghu Lake

Results and discussion

The level of PAHs in sediment core

As shown in Table 1, Phe was the most abundant species with an average of 62.0 ± 32.8 ng g−1, followed by BbF (28.1 ± 13.9 ng g−1) and Fla (26.0 ± 10.9 ng g−1). The total concentrations of the 16 PAHs (∑16PAHs) ranged from 93.0 to 431 ng g−1, with an average value of 244 ng g−1. The concentration range of ∑16PAHs in this study was higher than the values reported for Qinghai Lake (11–279 ng g−1) (Wang et al. 2010) and the Nador Lagoon in Morocco (59.0–107.7 ng g−1) (Giuliani et al. 2015), but lower than those measured in Haizhou Bay (72.5–805 ng g−1) (Zhang et al. 2013), a reservoir in northeast China (243–1004 ng g−1) (Lin et al. 2012), Lake Lille Lungegårdsvannet in Norway (260–58,360 ng g−1) (Andersson et al. 2014), five lakes in western China (626–1398 ng g−1) (Xu et al. 2014), and Lake Baiyangdian (97–2404 ng g−1) (Guo et al. 2011).

Table 1 Summary of PAH concentrations (ng g−1) in sediment core from Honghu Lake

Source apportionment

The diagnostic ratio method was first used to identify the possible PAH sources before the receptor models were applied. The diagnostic PAH ratios that discriminated between sources were BaA/(BaA + Chr) and IcdP/(IcdP + BghiP) (Fig. S4). Ratios of BaA/(BaA + Chr) > 0.35 and IcdP/(IcdP + BghiP) > 0.5 suggest coal or biomass (wood, grass) burning, whereas ratios of BaA/(BaA + Chr) < 0.35 and IcdP/(IcdP + BghiP) < 0.5 indicate liquid fossil fuel and petroleum combustion (Yunker et al. 2002). The BaA/(BaA + Chr) values ranged from 0.23 to 0.42, reflecting both liquid fossil fuel combustion and coal, grass, or wood combustion. The IcdP/(IcdP + BghiP) values, which ranged from 0.49 to 0.55, gave the same result. The diagnostic ratios therefore indicated that PAHs in the sediment core were mainly derived from liquid fossil fuel combustion and biomass/coal combustion.
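The ratio test described above can be written as a short helper. This is only a sketch: the thresholds are those quoted in the text, the concentrations are hypothetical, and the handling of a ratio exactly equal to the threshold is an arbitrary choice because the text does not define it.

```python
def diagnostic_ratios(baa, chr_, icdp, bghip):
    """Return BaA/(BaA + Chr) and IcdP/(IcdP + BghiP) from concentrations."""
    return baa / (baa + chr_), icdp / (icdp + bghip)

def classify(ratio, threshold):
    """Label a ratio using the thresholds quoted in the text.
    A ratio equal to the threshold falls into the second class (arbitrary)."""
    if ratio > threshold:
        return "coal/biomass combustion"
    return "liquid fossil fuel combustion"

# Hypothetical concentrations in ng per g, chosen only to illustrate the calculation
r_baa, r_icdp = diagnostic_ratios(baa=6.0, chr_=14.0, icdp=11.0, bghip=9.0)

label_baa = classify(r_baa, 0.35)    # ratio 0.30 is below 0.35
label_icdp = classify(r_icdp, 0.5)   # ratio 0.55 is above 0.5
```

The example deliberately yields different labels from the two ratios, which mirrors the mixed signal reported for the core, where the observed ranges straddle both thresholds.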

PCA-MLR

Based on the principal component and Varimax rotation methods, three factors with eigenvalues greater than 1 were extracted, and together they accounted for 84.8% of the total variance (Table S3). The rotated factor loadings obtained by the PCA-MLR model are shown in Fig. 2a. Factor 1, accounting for 57.5% of the variance, was highly loaded on BaA, Chr, BbF, BkF, and IcdP, which indicates diesel combustion (Harrison et al. 1996). In addition, factor 1 was also loaded on Fla, Pyr, DBA, and BghiP, which represent the profile of gasoline engine emission (Larsen and Baker 2003; Wang et al. 2009). Therefore, factor 1 was identified as petroleum combustion. Factor 2 (accounting for 18.7% of the total variance) was dominated by Fl, Phe, and Ant, with some loadings on Nap, Fla, and Pyr. Phe and Fl often originate from coal combustion (Mai et al. 2001; Mai et al. 2003), while Ant is used as a marker of wood combustion (Harrison et al. 1996). Nap could also originate from incomplete combustion (Simcik et al. 1999). Therefore, factor 2 probably represents mixed sources of wood and coal combustion. Factor 3 (accounting for 8.67% of the total variance) had high loadings on Acy, Ace, Fl, Phe, and Ant. These low molecular weight PAHs are abundant fractions in petrogenic sources such as crude oil and petroleum (Liu et al. 2009; Yu et al. 2015). Therefore, factor 3 was labeled as oil leakage. The average contributions calculated by MLR were 45.5, 41.4, and 13.1% for petroleum combustion, mixed sources of coal and wood combustion, and oil leakage, respectively.
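One common way to convert the MLR step into percentage contributions is to regress the standardized ∑16PAHs on the factor scores and normalize the regression coefficients so that they sum to 100%. The sketch below assumes this approach and uses synthetic factor scores with hypothetical coefficients; it does not reproduce the SPSS regression of this study.

```python
import numpy as np

def mlr_contributions(scores, z_total):
    """Percent contribution of each factor, taken as its regression
    coefficient divided by the sum of all coefficients.
    Assumes all coefficients are positive."""
    coefs, *_ = np.linalg.lstsq(scores, z_total, rcond=None)
    return 100.0 * coefs / coefs.sum()

rng = np.random.default_rng(1)
scores = rng.normal(size=(37, 3))          # synthetic factor scores, not the study's
true_coefs = np.array([0.5, 0.4, 0.1])     # hypothetical regression coefficients
z_total = scores @ true_coefs              # noise-free synthetic standardized total

contrib = mlr_contributions(scores, z_total)
```

In this noise-free example the regression recovers the hypothetical coefficients, so the contributions come out as 50, 40, and 10%. With real data the fit is not exact, and a negative coefficient would make the percentage interpretation invalid.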

Fig. 2 Source compositions obtained from PCA-MLR (a), Unmix (b), and PMF (c) receptor models

Unmix

Three sources were extracted by the Unmix model, and the source compositions are shown in Fig. 2b. Ace was the dominant species in factor 1, and Acy, Fl, Fla, and Pyr also accounted for some loadings. The source profile of factor 1 was similar to that of factor 3 in the PCA-MLR analysis. Thus, factor 1 was identified as oil leakage. Factor 2 was heavily weighted by Phe and moderately influenced by Fl, Fla, and Nap. Consequently, factor 2 was labeled as mixed sources of coal and biomass combustion. Factor 3 was significantly influenced by Nap, Fla, Pyr, Chr, BbF, BkF, IcdP, and BghiP, which was similar to factor 1 in PCA-MLR. Therefore, factor 3 represented petroleum combustion. The estimated average contributions were 21.3, 39.5, and 39.2% for oil leakage, mixed sources, and petroleum combustion, respectively.

PMF

Four factors were identified by the PMF model, and the source profiles are shown in Fig. 2c. Factor 1, which accounted for 31.0% of the ∑16PAHs, strongly reflected the variation of Phe and was also influenced to some extent by Ant, Fl, and Fla. Phe has been reported as a marker of coal combustion (Sofowote et al. 2008), and Ant can be used as a tracer of wood combustion (Harrison et al. 1996). Therefore, factor 1 indicated mixed sources of coal and wood combustion. Factor 2 was responsible for 20.9% of the ∑16PAHs and was mainly loaded on Nap and moderately on Pyr, Chr, Fla, BbF, BkF, and BaA. Nap can be derived from sources related to incomplete combustion (Simcik et al. 1999) or oil leakage (Dahle et al. 2003), while Fla and Pyr have been used as markers of coal combustion emission (Kulkarni and Venkataraman 2000; Fang et al. 2006). In addition, BbF, BkF, and Chr are the main PAH components emitted by domestic coal combustion in China (Chen et al. 2005). Therefore, factor 2 indicated contributions from domestic coal combustion. Factor 3, which explained 31.8% of the ∑16PAHs, was highly loaded on BaA, BbF, BaP, IcdP, and BghiP. The same profile has been found for diesel and gasoline combustion (Harrison et al. 1996; Mai et al. 2003). Therefore, factor 3 was labeled as petroleum combustion. Factor 4 was predominantly weighted on Acy and Ace, and moderately loaded on Pyr, Fla, Ant, and Fl. Acy and Ace are low molecular weight PAHs, which are abundant fractions in petrogenic sources such as crude oil and petroleum (Liu et al. 2009; Yu et al. 2015). Pyr, Fla, and Phe are also associated with crude oil (Lang et al. 2015). Fishery is one of the most important functions of Honghu Lake, and oil spills from fishing boats are inevitable. Therefore, factor 4 indicated oil leakage, and this factor accounted for 16.3% of the ∑16PAHs.

Comparison of three receptor model results

PCA-MLR, PMF, and Unmix are source-unknown receptor models, which do not require source profile information. These models presume that species with similar variability cluster together in a minimum number of factors that explain the variability of the whole dataset, and that each factor is associated with a source or source type (Larsen and Baker 2003; Yang et al. 2013a). All three are factor analysis-based methods, and they have been widely used in pollutant source apportionment studies because of their convenience. However, each model has its own advantages and disadvantages. For example, PMF is complicated and time consuming because the number of factors is unknown in advance and must be evaluated. By contrast, PCA-MLR and Unmix are relatively simple and easy to operate. To better understand the PAH source apportionment in the sediment core, the results from the three receptor models were compared.

Four aspects should be prioritized when comparing the results of different receptor models: (1) the correlation coefficient between the observed and predicted (O/P) PAH concentrations within a given model (Larsen and Baker 2003; Yang et al. 2013a), (2) the correlation coefficient between the predicted concentrations of different models (Song et al. 2006; 2008), (3) the source numbers and compositions identified by the different models, and (4) the contribution of each source to the total concentrations (Cao et al. 2011; Zhang et al. 2012).

Intra-comparison of observed and predicted concentrations by a certain model

The degree of fit between the observed and predicted concentrations of ∑16PAHs within each model was evaluated by the scatter plots shown in Fig. 3. Significant correlations (p < 0.01) were found between observed and predicted values, with r, slope, and intercept ranging from 0.998 to 1.000, 1.00 to 1.02, and − 1.45 to 0.49, respectively. In particular, PMF showed the best fit, which was almost one-to-one. For the 16 individual PAH species, the correlation coefficients between observed and predicted concentrations ranged from 0.37 to 1.00 (Fig. S5). The Pearson correlation is known to be sensitive to the phase of the trend in two variables, but it reveals little about differences in amplitude (Belis et al. 2015). Therefore, three additional statistical parameters were computed to better quantify the differences between the O/P data (Table 2): the root mean square error (RMSE), the absolute fractional bias (AFB), and the weighted difference (WD), defined by the following equations, respectively (Cesari et al. 2016):

$$ \mathrm{RMSE}=\sqrt{\frac{1}{m}{\sum}_{N=1}^m{\left({X}_N-{Y}_N\right)}^2} $$
(2)
$$ \mathrm{AFB}=\frac{2}{m}{\sum}_{N=1}^m\frac{\left|{X}_N-{Y}_N\right|}{{X}_N+{Y}_N} $$
(3)
$$ \mathrm{WD}=\frac{1}{m}{\sum}_{N=1}^m\frac{\left|{X}_N-{Y}_N\right|}{\sqrt{S_N^2+{r}_N^2}} $$
(4)
Fig. 3 Fits among the observed and predicted concentrations of ∑16PAHs in the sediment core by PCA-MLR (a), Unmix (b), and PMF (c) receptor models

Table 2 Comparison of observed and predicted PAH concentrations using three different receptor models

where m is the total number of samples; X N and Y N are the observed and predicted PAH concentrations; and S N and r N represent their uncertainties. RMSE indicates the spread of the O/P series; AFB is an indicator of the agreement between the concentrations (accepted range is 0 to 2); and WD tests the distance between the two series while taking their uncertainties into account (the range of acceptability is considered to be between 0 and 2). As shown in Table 2, the RMSE values suggested a certain level of scatter in ∑16PAH concentrations (8.83 ± 0.17 ng g−1) for the three models, mainly due to Phe (9.21 ± 2.23 ng g−1). WD and AFB were within the acceptable ranges, indicating similar amplitudes.
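Eqs. (2) to (4) translate directly into a few lines of code. The sketch below uses hypothetical observed and predicted values and hypothetical uncertainties, chosen so that the results can be verified by hand; the function names are not from any particular package.

```python
import numpy as np

def rmse(x, y):
    """Root mean square error, Eq. (2)."""
    return np.sqrt(np.mean((x - y) ** 2))

def afb(x, y):
    """Absolute fractional bias, Eq. (3)."""
    return 2.0 * np.mean(np.abs(x - y) / (x + y))

def wd(x, y, s, r):
    """Weighted difference, Eq. (4); s and r are the uncertainties of x and y."""
    return np.mean(np.abs(x - y) / np.sqrt(s ** 2 + r ** 2))

# Hypothetical observed (x) and predicted (y) concentrations in ng per g
x = np.array([100.0, 200.0, 300.0])
y = np.array([110.0, 190.0, 300.0])
s = np.full(3, 5.0)   # hypothetical uncertainty of the observations
r = np.full(3, 5.0)   # hypothetical uncertainty of the predictions

rmse_val = rmse(x, y)       # sqrt((100 + 100 + 0) / 3)
afb_val = afb(x, y)         # 2 * (10/210 + 10/390 + 0) / 3
wd_val = wd(x, y, s, r)     # (sqrt(2) + sqrt(2) + 0) / 3
```

For these inputs the AFB and WD values both fall inside the 0 to 2 acceptance range quoted above.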

Inter-comparison of modeled ∑16PAHs within different models

Significant correlations (p < 0.01) were also found between the ∑16PAH concentrations predicted by the different models. As shown in Fig. 4, r and slope ranged from 0.98 to 1.00 and from 0.96 to 1.02, respectively. The best fit was found between PMF and Unmix, followed by PCA-MLR/Unmix and PCA-MLR/PMF. Similar results have been reported in source apportionment studies of PAHs in the atmosphere (Larsen and Baker 2003; Ma et al. 2010), sediments (Zhang et al. 2012), and soil (Yang et al. 2013a). The good correlations in the inter-comparison indicated that the source apportionment results obtained from these models were comparable.

Fig. 4 Inter-comparison of predicted concentrations of ∑16PAHs in the sediment core by three different receptor models. PCA-MLR vs PMF (a), Unmix vs PCA-MLR (b), and PMF vs Unmix (c)

Source numbers and compositions

Three sources (mixed sources of coal and biomass combustion, petroleum combustion, and oil leakage) were identified by both the PCA-MLR and Unmix receptor models, as discussed above. The number of sources identified by PCA-MLR is eigenvalue based, while the Unmix model needs a large number of receptor samples to resolve more sources (e.g., 200–300 samples generate five sources and 2000–3000 samples produce seven sources). In this study, only 37 samples were available, and three sources were identified by the Unmix model. The PMF model apportioned four factors, comprising the three commonly identified sources and domestic coal combustion. In PMF analysis, the appropriate number of sources is determined jointly by the Q values, the scaled residuals, the agreement between predicted and observed concentrations, and the physical meaning of the factor profiles (Baudic et al. 2016).

To better compare the source compositions, the source profiles/loadings were normalized and are shown in Fig. 5. The mixed sources were characterized by high loadings on Fl, Phe, and Ant. Petroleum combustion was characterized by high loadings on high molecular weight PAHs (BbF, BkF, BaP, IcdP, DBA, and BghiP), whereas the oil leakage source had high loadings on low molecular weight PAHs. Pearson correlation analysis of the source profiles obtained from the different models showed significant correlations (p < 0.01), with r ranging from 0.51 to 0.68, 0.57 to 0.80, and 0.80 to 0.86 for mixed sources, petroleum combustion, and oil leakage, respectively (Table S4). These results suggested that the source profiles derived from the three receptor models agreed with each other.

Fig. 5 Scatter plots of normalized factor loadings or profiles of identified possible PAH sources by three different models. Mixed sources vs petroleum combustion (a), mixed sources vs oil leakage (b), and petroleum combustion vs oil leakage (c)

Source contributions

Figure 6 shows the average contributions of each identified source to the ∑16PAHs by the different receptor models. Among the three common sources, petroleum combustion had the highest average contribution (31.8 to 45.5%), followed by mixed sources (31.0 to 41.4%) and oil leakage (13.1 to 21.3%). Recent studies indicated that biomass burning and coal combustion were the main PAH sources in surface sediments in China (Table 3); for example, Xu et al. (2006) and Zhang et al. (2007) found that coal combustion and biomass burning were the dominant PAH sources in China. The contributions of mixed sources derived from PCA-MLR and Unmix in this study differed from these previous findings, whereas the PMF results indicated that biomass burning and coal combustion contributed most (50.9%) to PAHs in the sediment core. The average contribution of each source to ∑16PAHs does not convey detailed information about the source contribution in each sample/layer. Therefore, the historical distributions of each identified source contribution estimated by the three receptor models are shown in Fig. 7. It should be noted that the mixed sources in the PMF model included biomass burning and coal combustion.

Fig. 6 Average contribution of each source to ∑16PAHs apportioned from PCA-MLR (a), Unmix (b), and PMF (c) receptor models

Table 3 Published studies of PAH source contributions (%) in surface sediments in China

Fig. 7 Historical variation of concentration contribution (left panel) and percentage contribution (right panel) of different sources identified by three receptor models

As shown in Fig. 7, before 1949, the temporal variations of the three sources deduced from the three models were similar. In this period, the mixed source contributions decreased from 131 ± 26.4 to 70.1 ± 13.9 ng g−1 and petroleum combustion increased from 4.44 ± 13.5 to 23.5 ± 8.06 ng g−1. Although the mixed sources showed the opposite trend to petroleum combustion, they still contributed most (69.8 ± 12.4%) to ∑16PAHs in the sediment of Honghu Lake in this period, followed by oil leakage (19.5 ± 6.28%) and petroleum combustion (10.7 ± 8.26%). From 1949 to 1960, the concentration contributions (ng g−1) of mixed sources and petroleum combustion increased dramatically, while the percentage contribution (%) of mixed sources decreased. In this period, mixed sources still contributed most (62.6 ± 13.9%) to ∑16PAHs. From 1960 to 1978, the variations of source contributions were complex due to social activities. The percentage contributions of petroleum combustion derived from PCA-MLR (61.7 ± 9.93%) and Unmix (59.8 ± 11.6%) were higher than that derived from PMF (41.5 ± 8.03%) during this period. From 1978 to 1995, the mixed sources and petroleum combustion both showed an increasing trend, and petroleum combustion was still the dominant PAH source (> 50%) according to the PCA-MLR and Unmix models. From 1997 to 2005, both the concentration and percentage contributions of mixed sources and petroleum combustion decreased significantly due to floods and pollution control measures. Compared with the temporal variations calculated from the PCA-MLR and Unmix models, the PMF-derived variation was relatively stable, without an outlier, in this period. From 2005 onward, the contributions of mixed sources and petroleum combustion increased again due to another round of urbanization and industrialization (Xu et al. 2006; Zhang et al. 2007).

The oil leakage source showed a relatively stable temporal variation, except for three peak values apportioned by Unmix and PCA-MLR in 1997, 2006, and 2012, which accounted for 17.5–66.4, 54.1–92.0, and 26.8–73.6% of ∑16PAHs, respectively. Larsen and Baker (2003) suggested that the PMF model uses uncertainty files to down-weight outlying variables and allows for additional dimensions that affect the measured concentrations but are not explained by the sources alone, such as weather, additional transient sources, or sampling artifacts. Therefore, the temporal distribution of oil leakage deduced from the PMF model was stable compared with those from the PCA-MLR and Unmix models.

Biomass burning and coal combustion are the main PAH sources in surface sediments in China (Table 3), yet the petroleum combustion contributions calculated by PCA-MLR and Unmix exceeded 50% during some periods (e.g., 1960 to 1978). In contrast, the results from the PMF model agreed with the real situation that coal combustion and biomass burning are the dominant PAH sources in China. The historical statistical data on coal consumption also correlated well with the domestic coal combustion contributions from the PMF model (Fig. S6). Therefore, we consider the source apportionment results from the PMF model to be the most reasonable for the following reasons: (1) the best fit between the observed and predicted PAH concentrations, (2) the segregation of the domestic coal combustion source, and (3) the reasonable temporal distribution of source contributions as discussed above.

Suggestion to source apportionment studies

The identification and quantification of pollutant sources at receptor samples using receptor models are mathematical or statistical procedures. From a purely mathematical or statistical point of view, the source apportionment results from all three receptor models were acceptable. However, source apportionment is more than just mathematics; the results should also be physically reasonable. In reality, coal and biomass combustion still contribute most to PAHs in the environment in China at present, and this was all the more true several decades ago. Therefore, the PMF results were considered more reasonable. In source apportionment studies concerning temporal variation, we suggest checking the correlation between statistical data (e.g., energy consumption) and the corresponding historical source contributions. In source apportionment studies concerning spatial distribution, we suggest taking into account the relationship between the geographic location of pollution sources and the spatial distribution of source contributions.

Conclusion

In this paper, the source apportionment of 16 US EPA priority PAHs in a sediment core from Honghu Lake was carried out using three different receptor models. Four aspects were prioritized in the comparison: the agreement between observed and predicted PAH concentrations within each model, the agreement among the different models, source numbers and compositions, and source contributions. The results suggested that PMF was more reasonable than the PCA-MLR and Unmix models. The high-resolution temporal distribution of source contributions indicated that biomass burning and coal combustion were the main sources before 1960 and after 1993, while petroleum combustion increased from the bottom sediment to the surface. Suggestions were also made for source apportionment studies concerning the temporal/spatial distribution of pollutants.