Introduction

Soil contamination by heavy metals (HMs) has become one of the most serious environmental problems globally, and its impact on human health and ecosystems should not be underestimated (Bi et al. 2020; Han et al. 2020). In particular, the accumulation of HMs in agricultural soils has aroused widespread concern because HMs can not only jeopardize human health through ingestion, skin contact, and inhalation (Jafarzadeh et al. 2022), but also be transferred to crops and thus accumulate in humans along the food chain (Du et al. 2021; Zhang et al. 2022b). Excessive intake of HMs may irreversibly damage the human immune, nervous, and secretory systems (Deng et al. 2023). Therefore, the assessment of pollution and risk of HMs in agricultural soil is essential for the protection of food security and ecosystems.

Pollution source apportionment is the basis for effective intervention and treatment of soil contamination by HMs. HMs are released into agricultural soils from anthropogenic and natural sources, among which anthropogenic sources are an important cause of agricultural soil pollution (Deng et al. 2023; Wang et al. 2022a). The complexity of soil composition, the spatial heterogeneity of concentration distribution, and the diversity of sources pose great difficulties for source analysis of soil contamination with HMs (Zhang et al. 2023a). At present, source apportionment methods can be roughly divided into two categories (Ran et al. 2021). One category identifies sources qualitatively by methods such as geostatistical analysis and multivariate statistics (Su et al. 2023; Wang et al. 2022a). The other category uses receptor models for quantitative analysis of contamination sources, such as chemical mass balance (CMB), UNMIX, the absolute principal component score-multiple linear regression model (APCS-MLR), isotopic tracer technology, and positive matrix factorization (PMF) (Shi et al. 2022; Wang et al. 2022b; Wei et al. 2023b). The PMF model has been widely used for quantitative source apportionment of soil HMs in a variety of environments.

Regional soil contamination identification and risk assessment of HMs can provide theoretical references for environmental prevention and policy development. The geo-accumulation index (Igeo) and the Nemerow integrated pollution index (NIPI) are often used to measure the soil contamination levels of HMs (Xiang et al. 2022; Zhang et al. 2023b). In addition, the potential ecological risk index (PERI) and the health risk assessment method (HRA) are used to evaluate the threats of HMs to the ecosystem and human health. Monte Carlo simulation can reduce the uncertainty in health risk evaluation and make the evaluation results more realistic (Liu et al. 2023b; Yang et al. 2022b). Meanwhile, it is important to combine various indicators to systematically assess soil contamination by HMs. It is also important to consider that HMs from different sources differ greatly in their level of health risk because of differences in bioavailability, concentration, and toxicity (Jing et al. 2023; Liang et al. 2023). Therefore, it is crucial to assess health risks and identify risk zones for specific contamination sources to develop targeted remedial and preventive measures.

Most studies on agricultural soil pollution by HMs have concentrated on industrial areas and economically developed zones, for instance, the northeast, central south, and east coast of China (Liu et al. 2023b; Ran et al. 2021; Wu et al. 2022). The southwest of China is one of the regions with the most concentrated and extensive distribution of karst landforms (Xiao et al. 2022; Zhang et al. 2022a); it is rich in mineral resources and has numerous smelting enterprises and frequent mining activities (Xia et al. 2022). The formation of a large number of slag fields has posed a severe threat to the fragile karst ecotope (Eugenio D’Amico et al. 2023; Kong et al. 2018). The high geological background superimposed on the karst landscape has led to high background values and abnormal enrichment of soil HMs in the southwest region of China (Wen et al. 2020; Yin et al. 2023). In addition, the southwest region is an important eco-security shield for the upper reaches of the Yangtze River in China, and soil contamination with HMs has become a major hidden danger affecting regional ecological and environmental safety (Du et al. 2021; Li et al. 2022a). Therefore, it is of great relevance to carry out research on the distribution and sources of HMs in typical agricultural ecosystems in Southwest China. In particular, Tongren City is known for its rich Hg resources, but the long-term extraction of minerals in the area has left a potential source of environmental pollution. The minerals are mainly located in the Wanshan district and the Songtao and Bijiang counties.
The main aims of this study were to (1) describe the contamination level and spatial distribution of HMs in agricultural soils; (2) quantify and define the contribution sources for soil HMs using PMF and GIS models; (3) evaluate the health risks of HMs for adults and children based on risk modeling and Monte Carlo simulations; and (4) quantify the impact of contamination sources on ecological and human health risks. The results of the study can provide valuable theoretical references for the distribution and risk control of HMs in agricultural soils.

Materials and Methods

Study Area and Sample Collection

Tongren City, in the northeastern part of Guizhou Province, China, is a typical karst landscape area with high elevation in the northwest and low elevation in the southeast. The city spans longitude 107°45′E–109°30′E and latitude 27°07′N–29°05′N (Fig. 1). The study area lies in the central subtropical humid monsoon climate zone, with annual rainfall of 1100.0–1400.0 mm. The main soil is limestone soil developed from carbonate rocks. The minerals mined in the study area are mainly those of Hg, Mn, Cu, Pb, and Sn.

Fig. 1 Distribution map of soil samples, elevation, metal mines and land use in Tongren City

According to the principle of classified soil sampling, and combining a land use type map, a mineral distribution map, and a topographic map, the location and number of sample points were determined on a grid. The mining-affected areas and main grain-producing areas were determined from the mining distribution and agricultural soil distribution, and the sampling density was adjusted according to the cultivated land area and its degree of concentration and contiguity. A total of 467 farmland soils and 83 natural soils were collected. The natural soils were collected from the 10–20 cm soil layer in natural woodland and barren grassland that were far away from industrial and agricultural activities and subject to little or no human activity. A five-point sampling method was used to collect agricultural soils. The 0–20 cm soil layer was sampled, and 1 kg of evenly mixed soil was bagged. GPS was used to record the coordinates of the sampling sites. After removal of plant roots and stones, soil samples were crushed in a porcelain mortar and sieved through a 0.149 mm nylon sieve before being put into envelope bags for chemical measurements.

Chemical Analysis

An inductively coupled plasma mass spectrometer (ICP-MS, PerkinElmer, ELAN DRC-e) was used to determine Cd, Cu, and Ni. An inductively coupled plasma optical emission spectrometer (ICP-OES, PerkinElmer, Optima 8000) was used to determine the Cr, Pb, and Zn concentrations of soil. The samples were digested in an aqua regia bath, and As and Hg were then determined using an atomic fluorescence spectrophotometer (AFS-9700). Soil pH was measured in a 1:2.5 soil-water suspension using the glass electrode technique. A soil reference material (GBW07404) and blank samples were included for quality control during the analysis. The recoveries of the reference material varied from 88 to 106%. The RSDs of the duplicate samples were less than 10%, while the reproducibility of the examined samples varied from 10 to 15%.

Spatial Clustering

Spatial autocorrelation analysis allows users to evaluate the spatial autocorrelation of a certain set of geographic data, i.e., whether they have some regular pattern in the spatial context (Zhang et al. 2022c). Spatial analysis can identify data spatial heterogeneity by determining whether there is a consistent pattern in the data across space. Spatial autocorrelation index (Moran’s I index, I) was used to determine whether the HMs content of agricultural soils in the study area had spatial clustering characteristics (Anselin 2010). The Moran’s I index was calculated using Eqs. (1) and (2):

$$ {I}_{i}=\frac{{x}_{i}-\overline{X}}{{S}_{i}^{2}}\sum _{j=1,j\ne i}^{n}{w}_{i,j}({x}_{j}-\overline{X})$$
(1)
$$ {S}_{i}^{2}=\frac{\sum _{j=1,j\ne i}^{n}{({x}_{j}-\overline{X})}^{2}}{n-1}-{\overline{X}}^{2}$$
(2)

where \( {I}_{i}\) is the value of Moran’s I index of x at location i; \( {x}_{i}\) is the content of x at location i; \( {x}_{j}\) is the content of x at other locations; \( \overline{X}\) is the average content of x; \( {w}_{i,j}\) is the spatial weight between samples i and j, i.e., the reciprocal of their distance; \( {S}_{i}^{2}\) is the variance of x content; and n is the total number of samples.
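As an illustration of Eq. (1), the following minimal Python sketch computes a local Moran’s I value for each sampling point with inverse-distance weights. The function name, the toy coordinates, and the toy contents are hypothetical. One stated assumption: the sketch uses the plain leave-one-out variance for \( {S}_{i}^{2}\), i.e., it omits the trailing squared-mean term printed in Eq. (2), so that \( {S}_{i}^{2}\) stays positive and matches its description as a variance. It is not the GIS implementation used in the study.

```python
import math

def local_morans_i(values, coords):
    """Local Moran's I for each location, following Eq. (1).

    Weights are the reciprocal of the Euclidean distance between points.
    Assumption: S_i^2 is the leave-one-out variance about the global mean
    (the trailing squared-mean term of Eq. (2) is omitted here).
    Coordinates must be distinct, otherwise a distance of zero occurs.
    """
    n = len(values)
    mean = sum(values) / n
    result = []
    for i in range(n):
        # Variance of the other n-1 observations about the global mean.
        s2 = sum((values[j] - mean) ** 2 for j in range(n) if j != i) / (n - 1)
        # Distance-weighted sum of the neighbours' deviations from the mean.
        lag = 0.0
        for j in range(n):
            if j != i:
                w = 1.0 / math.dist(coords[i], coords[j])
                lag += w * (values[j] - mean)
        result.append((values[i] - mean) / s2 * lag)
    return result

# Hypothetical toy data: two nearby high values and two nearby low values.
coords = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
contents = [10.0, 12.0, 1.0, 2.0]
print(local_morans_i(contents, coords))
```

Positive values indicate that a point resembles its neighbours (high-high or low-low clusters), while negative values flag outliers (high-low or low-high).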

Pollution Assessment Methods

According to the contents of various HMs, soil pollution was evaluated by the Nemerow Integrated Pollution Index (NIPI) (Gui et al. 2023; Zhang et al. 2023b) using Eqs. (3) and (4):

$$ {P}_{i}={C}_{i}/{S}_{i}$$
(3)
$$ \text{N}\text{I}\text{P}\text{I}=\sqrt{\frac{{\left({\text{P}}_{i\text{a}\text{v}\text{e}}\right)}^{2}+{\left({\text{P}}_{i\text{m}\text{a}\text{x}}\right)}^{2}}{2}}$$
(4)

where Pi is the single-factor contamination index of soil HM i; Ci is the content of HM i in soil; Si is the soil environmental standard for HM i (GB15618-2018) (MEE 2018); Piave is the mean value of the single-factor contamination indices of HMs in soil; and Pimax is the maximum value of the single-factor contamination indices of HMs in soil. The NIPI was classified into 5 grades: NIPI ≤ 0.7, safe; 0.7 < NIPI ≤ 1.0, precaution; 1.0 < NIPI ≤ 2.0, slightly contaminated; 2.0 < NIPI ≤ 3.0, moderately contaminated; NIPI > 3.0, seriously contaminated.
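Eqs. (3) and (4) and the five grades can be sketched in a few lines of Python. The function names and the sample values are hypothetical and serve only to show the arithmetic; the actual standards Si come from GB15618-2018.

```python
import math

def single_factor_indices(contents, standards):
    """P_i = C_i / S_i for each heavy metal (Eq. 3)."""
    return [c / s for c, s in zip(contents, standards)]

def nipi(contents, standards):
    """Nemerow integrated pollution index of one sample (Eq. 4)."""
    p = single_factor_indices(contents, standards)
    p_ave = sum(p) / len(p)
    p_max = max(p)
    return math.sqrt((p_ave ** 2 + p_max ** 2) / 2)

def nipi_grade(value):
    """Map a NIPI value to the five grades listed in the text."""
    if value <= 0.7:
        return "safe"
    if value <= 1.0:
        return "precaution"
    if value <= 2.0:
        return "slightly contaminated"
    if value <= 3.0:
        return "moderately contaminated"
    return "seriously contaminated"

# Hypothetical sample: two metals with contents of 1.0 and 2.0 mg/kg,
# both compared against a standard of 1.0 mg/kg.
value = nipi([1.0, 2.0], [1.0, 1.0])
print(round(value, 3), nipi_grade(value))
```

Because the maximum single-factor index enters the formula directly, one strongly enriched metal can raise the NIPI of a sample even when all other metals are well below their standards.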

The Igeo is an important parameter for distinguishing the impact of anthropogenic activities (Wei et al. 2023a). The calculation of Igeo is shown in Eq. (5):

$$ {\text{I}}_{\text{g}\text{e}\text{o}}={\text{log}}_{2}\frac{{C}_{i}}{\text{K}{B}_{i}}$$
(5)

where Ci is the measured content of HM i in agricultural soil; Bi is the background value of HM i in the soil of the study area (the natural soil content in Table 1 was used as the background value for the study area); and K is a coefficient that corrects for regional differences in the background value of HMs in the soil (generally a constant of 1.5). Classification standard: Igeo ≤ 0, uncontaminated; 0 < Igeo ≤ 1, uncontaminated to moderately contaminated; 1 < Igeo ≤ 2, moderately contaminated; 2 < Igeo ≤ 3, moderately to heavily contaminated; 3 < Igeo ≤ 4, heavily contaminated; 4 < Igeo ≤ 5, heavily to extremely contaminated; Igeo > 5, extremely contaminated.
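A minimal Python sketch of Eq. (5) and the seven-class scheme is given below. The function names and the example contents are hypothetical; K defaults to 1.5 as stated in the text.

```python
import math

# Upper bound of each class and its label, in ascending order.
IGEO_CLASSES = [
    (0, "uncontaminated"),
    (1, "uncontaminated to moderately contaminated"),
    (2, "moderately contaminated"),
    (3, "moderately to heavily contaminated"),
    (4, "heavily contaminated"),
    (5, "heavily to extremely contaminated"),
]

def igeo(content, background, k=1.5):
    """Geo-accumulation index, Igeo = log2(C_i / (K * B_i)) (Eq. 5)."""
    return math.log2(content / (k * background))

def igeo_class(value):
    """Return the contamination class for an Igeo value."""
    for upper, label in IGEO_CLASSES:
        if value <= upper:
            return label
    return "extremely contaminated"

# Hypothetical example: a content three times the background value.
value = igeo(3.0, 1.0)
print(value, igeo_class(value))
```

Note that the index is computed per sample; averaging the per-sample Igeo values is not the same as computing Igeo from the mean content.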

Risk Assessment

Potential Ecological Risk Index (PERI)

The PERI approach was used to calculate the single potential ecological risk (EI) and integrated potential ecological risk (RI) associated with HMs in the research area (Li et al. 2022b; Wang et al. 2023a; Zhang et al. 2020a). The concentration, properties, and ecological effects of HMs were comprehensively considered to evaluate the risks posed by HMs to the ecological environment. The calculation is given in Eq. (6):

$$ \text{R}\text{I}=\sum _{i=1}^{\text{n}}{\text{E}\text{I}}_{i}=\sum _{i=1}^{\text{n}}({\text{T}\text{r}}_{i}\times \frac{{C}_{i}}{{B}_{i}})$$
(6)

where EIi is the single potential ecological risk posed by HM i, and Tri is the toxicity response coefficient of HM i, which is 10 for As, 30 for Cd, 2 for Cr, 5 for Cu, 40 for Hg, 5 for Ni, 5 for Pb, and 1 for Zn. Ci is the measured content of HM i in soil, and Bi is the local soil background value of HM i (Table 1). The criteria for classifying potential ecological risks are listed in Table S1.
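Eq. (6) can be illustrated with the short Python sketch below, which uses the toxicity response coefficients listed above. The function name and the sample contents and backgrounds are hypothetical; the classification thresholds of Table S1 are not reproduced here.

```python
# Toxicity response coefficients Tr_i as listed in the text.
TR = {"As": 10, "Cd": 30, "Cr": 2, "Cu": 5, "Hg": 40, "Ni": 5, "Pb": 5, "Zn": 1}

def ecological_risk(contents, backgrounds):
    """Return (EI per metal, RI) for one soil sample, following Eq. (6).

    contents and backgrounds map a metal symbol to a value in mg/kg.
    """
    ei = {m: TR[m] * contents[m] / backgrounds[m] for m in contents}
    return ei, sum(ei.values())

# Hypothetical sample in which Hg and Cd are both twice their background.
ei, ri = ecological_risk({"Hg": 0.28, "Cd": 0.30}, {"Hg": 0.14, "Cd": 0.15})
print(ei, ri)
```

With the same enrichment ratio, Hg contributes more to RI than Cd because its toxicity response coefficient is larger (40 versus 30).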

Table 1 Descriptive statistics of heavy metals (HMs) content in soil in the study area (mg/kg)

Health Risk Assessment (HRA)

Using the HRA model published by the U.S. Environmental Protection Agency (USEPA 1997), health risks (carcinogenic and non-carcinogenic) for adults and children were estimated through three pathways: direct ingestion (ing), oral-nasal inhalation of suspended soil particles (inh), and dermal exposure (dermal), respectively (Jafarzadeh et al. 2022; Liu et al. 2023a). The three pathways of average daily human exposure to soil HMs were calculated as Eqs. (7)–(9):

$$ {ADD}_{\text{i}\text{n}\text{g}}=\frac{{C}_{i}\times \text{O}\text{S}\text{I}\text{R}\times {\text{A}\text{B}\text{S}}_{\text{o}}\times \text{E}\text{F}\times \text{E}\text{D}}{\text{B}\text{W}\times \text{A}\text{T}}\times {10}^{-6}$$
(7)
$$ {ADD}_{\text{d}\text{e}\text{r}\text{m}\text{a}\text{l}}=\frac{{C}_{i}\times \text{S}\text{A}\text{E}\times \text{S}\text{S}\text{A}\text{R}\times \text{E}\text{v}\times {\text{A}\text{B}\text{S}}_{\text{d}}\times \text{E}\text{F}\times \text{E}\text{D}}{\text{B}\text{W}\times \text{A}\text{T}}\times {10}^{-6}$$
(8)
$$ {ADD}_{\text{i}\text{n}\text{h}}=\frac{{C}_{i}\times \text{D}\text{A}\text{I}\text{R}\times \text{E}\text{F}\times \text{E}\text{D}}{\text{P}\text{E}\text{F}\times \text{B}\text{W}\times \text{A}\text{T}}\times {10}^{-6}$$
(9)

where ADDing, ADDdermal, and ADDinh (mg/kg/d) are the average daily intake doses of heavy metal i through direct soil ingestion, dermal contact, and oral-nasal inhalation of suspended soil particles, respectively. OSIR (mg/d) is the daily soil ingestion rate, ABSo (unitless) is the oral absorption factor, 10−6 is the unit conversion factor, ED (a) is the soil exposure duration, EF (d/a) is the soil exposure frequency, BW (kg) is the body weight of the exposed individual, AT (d) is the average exposure time, SAE (cm2) is the exposed skin surface area, SSAR (mg/cm2/d) is the soil-to-skin adherence factor, Ev (unitless) is the daily frequency of dermal exposure events, ABSd (unitless) is the dermal absorption factor, DAIR (m3/d) is the daily air inhalation rate, and PEF (m3/kg) is the soil particle emission factor.
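The three exposure equations translate directly into code. The sketch below mirrors Eqs. (7)–(9) term by term; the function names and all parameter values in the example call are hypothetical placeholders, not the values of Table S2.

```python
def add_ingestion(c, osir, abs_o, ef, ed, bw, at):
    """Average daily dose via direct soil ingestion (Eq. 7), mg/kg/d."""
    return c * osir * abs_o * ef * ed / (bw * at) * 1e-6

def add_dermal(c, sae, ssar, ev, abs_d, ef, ed, bw, at):
    """Average daily dose via dermal contact (Eq. 8), mg/kg/d."""
    return c * sae * ssar * ev * abs_d * ef * ed / (bw * at) * 1e-6

def add_inhalation(c, dair, pef, ef, ed, bw, at):
    """Average daily dose via inhalation of soil particles (Eq. 9), mg/kg/d."""
    return c * dair * ef * ed / (pef * bw * at) * 1e-6

# Hypothetical child scenario: soil content of 100 mg/kg, 6 years of exposure.
child = dict(ef=350.0, ed=6.0, bw=20.0, at=6.0 * 365.0)
print(add_ingestion(100.0, osir=200.0, abs_o=1.0, **child))
print(add_dermal(100.0, sae=2000.0, ssar=0.2, ev=1.0, abs_d=0.001, **child))
print(add_inhalation(100.0, dair=10.0, pef=1.0e9, **child))
```

Keeping each pathway in its own function makes it straightforward to swap adult and child parameter sets or to draw the parameters at random in a Monte Carlo run.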

The formulas for calculating the non-carcinogenic risk index (IHQ) and carcinogenic risk index (ICR) are listed in Eqs. (10)–(11):

$$ \text{I}\text{H}\text{Q}=\sum _{i=1}^{\text{n}}{\text{H}\text{Q}}_{i}=\sum _{i=1}^{\text{n}}\sum _{j=1}^{3}\frac{{ADD}_{ij}}{{\text{R}\text{f}\text{D}}_{ij}}$$
(10)
$$ \text{I}\text{C}\text{R}=\sum _{i=1}^{\text{n}}{\text{C}\text{R}}_{i}=\sum _{i=1}^{\text{n}}\sum _{j=1}^{3}{ADD}_{ij}\times {\text{S}\text{F}}_{ij}$$
(11)

where Ci is the content of HM i in soil; HQi is the non-carcinogenic risk index for HM i, which is negligible when HQi is less than 1 and indicates a potential non-carcinogenic risk otherwise (Yang et al. 2022b); ADDij is the dose of HM i through route j; CRi is the carcinogenic risk index of HM i, which indicates no evidence of a carcinogenic risk (CR < 1 × 10−6), evidence of a carcinogenic risk (1 × 10−6 < CR < 1 × 10−4), or evidence of an intolerable carcinogenic risk (CR > 1 × 10−4); SFij is the carcinogenic slope factor, and RfDij is the reference dose of HM i corresponding to pathway j. The reference values and related parameter values (Tables S2 and S3) were obtained from the technical guidelines for risk assessment of soil contamination of land for construction (MEE 2019) and related studies in China and abroad (Liang et al. 2023; Liu et al. 2023a; USEPA 2011).
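The double sums in Eqs. (10) and (11) can be sketched as follows. The function names, the nested-dictionary layout, and every dose, reference dose, and slope factor in the example are hypothetical illustrations rather than the values of Tables S2 and S3. The handling of values that fall exactly on a class boundary is also an assumption, since the text uses strict inequalities.

```python
def hazard_index(add, rfd):
    """Return (HQ per metal, IHQ) following Eq. (10).

    add[metal][route] is the dose and rfd[metal][route] the reference dose.
    """
    hq = {m: sum(dose / rfd[m][r] for r, dose in routes.items())
          for m, routes in add.items()}
    return hq, sum(hq.values())

def cancer_risk(add, sf):
    """Return (CR per metal, ICR) following Eq. (11).

    Metals or routes without a slope factor are skipped.
    """
    cr = {}
    for metal, routes in add.items():
        if metal not in sf:
            continue  # no slope factor: not assessed for carcinogenic risk
        cr[metal] = sum(dose * sf[metal][r]
                        for r, dose in routes.items() if r in sf[metal])
    return cr, sum(cr.values())

def cr_level(cr):
    """Classify a carcinogenic risk value (boundary handling is assumed)."""
    if cr < 1e-6:
        return "no evidence of carcinogenic risk"
    if cr <= 1e-4:
        return "evidence of carcinogenic risk"
    return "intolerable carcinogenic risk"

# Hypothetical doses (mg/kg/d), reference doses, and slope factors.
add = {"As": {"ing": 3e-5, "dermal": 1e-6}, "Cr": {"ing": 6e-5, "dermal": 3e-6}}
rfd = {"As": {"ing": 3e-4, "dermal": 1e-4}, "Cr": {"ing": 3e-3, "dermal": 6e-5}}
sf = {"As": {"ing": 1.5, "dermal": 1.5}}
print(hazard_index(add, rfd))
print(cancer_risk(add, sf))
```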

The Monte-Carlo Model

Monte Carlo simulation relies on the law of large numbers and the central limit theorem: an experiment is repeated many times with randomly drawn inputs so that the distribution of the outputs can be approximated (Liu et al. 2023b). It is used to estimate how uncertainty propagates to the output of a model and to calculate confidence intervals. A computer performs a large number of repeated random samplings of the analyzed data, constructing random variables that follow specified probability distributions. For each parameter in the mathematical model, a possible range of values and an accompanying probability distribution are defined to account for the parameter’s unpredictable fluctuations, and pseudorandom numbers are used to draw values from these distributions. Ultimately, the parameters that most strongly affect the model results are identified.

The specific steps of the Monte Carlo simulation model operation are as follows: (1) configure the variables for HMs data and input the measured HMs values, fitting the distribution function type of HMs data; (2) generate probability density distribution functions for each variable based on the data type by random sampling within the range of values; (3) establish a mathematical model for health risk functions of HMs, and calculate the model findings for each variable, resulting in a probability density distribution function for health risk assessment; (4) ascertain each HM’s sensitivity contribution.
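Steps (1) to (3) above can be sketched with the Python standard library alone. This is a minimal illustration for the ingestion pathway only, not the software or settings used in the study: the function name, the lognormal fit to the measured contents, the triangular and normal distributions for the exposure parameters, and all numerical values are assumptions. The sensitivity analysis of step (4) is not covered.

```python
import math
import random
import statistics

def simulate_ingestion_hq(measured, rfd, n_iter=10000, seed=42):
    """Monte Carlo distribution of the ingestion hazard quotient.

    measured: soil contents (mg/kg) used to fit a lognormal distribution.
    rfd: oral reference dose (mg/kg/d); a hypothetical value in the example.
    """
    rng = random.Random(seed)
    # Step 1: fit a lognormal distribution to the measured contents.
    logs = [math.log(c) for c in measured]
    mu, sigma = statistics.mean(logs), statistics.stdev(logs)
    hq = []
    for _ in range(n_iter):
        # Step 2: draw each input from its (assumed) distribution.
        c = rng.lognormvariate(mu, sigma)            # content, mg/kg
        osir = rng.triangular(50.0, 200.0, 100.0)    # low, high, mode, mg/d
        bw = max(rng.gauss(16.0, 2.0), 5.0)          # body weight, kg
        ef, ed = 350.0, 6.0                          # d/a and a, held fixed
        at = ed * 365.0                              # averaging time, d
        # Step 3: evaluate the risk model (oral absorption factor taken as 1).
        dose = c * osir * ef * ed / (bw * at) * 1e-6
        hq.append(dose / rfd)
    hq.sort()
    return {
        "median": hq[n_iter // 2],
        "p95": hq[int(0.95 * n_iter)],
        "exceedance": sum(h > 1.0 for h in hq) / n_iter,
    }

# Hypothetical measured contents and reference dose.
print(simulate_ingestion_hq([10.0, 12.0, 18.0, 25.0, 40.0], rfd=3e-4))
```

The `exceedance` entry is the simulated share of draws with HQ above 1, which corresponds to the kind of exceedance rate read from a cumulative probability curve.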

Source Apportionment Model

Positive matrix factorization (PMF) is a multivariate factor analysis model that decomposes the matrix of sample data (X) into a factor contribution matrix (G) and a factor spectral matrix (F), identifies the factor spectral matrix, and quantitatively calculates the factor contribution of each sample (Zhou et al. 2023). In the source analysis of soil HMs, PMF was used to organize the HMs content of soil samples, extract several factors, identify the factors as different sources by using marker components, and then calculate the contributions of the different factors (sources) to the soil samples by multiple linear regression. Factor profiles and contributions were derived by minimizing the objective function Q (Eqs. (12) and (13)):

$$ {X}_{ij} = \sum _{k=1}^{\text{p}}{G}_{ik}{F}_{kj}+{e}_{ij}$$
(12)
$$ {Q} = \sum _{i=1}^{\text{n}}\sum _{j=1}^{\text{m}}{\left[\frac{{X}_{ij}-\sum _{k=1}^{\text{p}}{G}_{ik}{F}_{kj}}{{u}_{ij}}\right]}^{2}$$
(13)

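where Xij is the element of the sample data matrix for sample i and HM j; Gik is the element of the factor contribution matrix for factor k and sample i; Fkj is the element of the factor spectral matrix for factor k and HM j; eij is the residual of each sample; uij is the uncertainty; p is the number of factors; i indexes the samples (n in total); and j indexes the HM species (m in total).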
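To make the objective function concrete, the sketch below evaluates Q of Eq. (13) for given matrices. It only scores a candidate factorization; it does not perform the constrained minimization that dedicated PMF software carries out. The function name and the toy matrices are hypothetical.

```python
def pmf_q(x, g, f, u):
    """Objective function Q of Eq. (13).

    x: n-by-m data matrix, g: n-by-p contributions,
    f: p-by-m factor profiles, u: n-by-m uncertainties (all lists of lists).
    """
    n, m, p = len(x), len(x[0]), len(f)
    q = 0.0
    for i in range(n):
        for j in range(m):
            fitted = sum(g[i][k] * f[k][j] for k in range(p))
            residual = x[i][j] - fitted  # e_ij
            q += (residual / u[i][j]) ** 2
    return q

# Hypothetical toy case: 2 samples, 2 metals, 2 factors.
g = [[1.0, 0.0], [0.0, 2.0]]
f = [[1.0, 2.0], [3.0, 4.0]]
u = [[0.5, 0.5], [0.5, 0.5]]
x_exact = [[1.0, 2.0], [6.0, 8.0]]   # equals G times F, so Q is zero
x_noisy = [[2.0, 2.0], [6.0, 8.0]]   # one residual of 1.0 with u = 0.5
print(pmf_q(x_exact, g, f, u), pmf_q(x_noisy, g, f, u))
```

Dividing each residual by its uncertainty means that poorly measured values influence the fit less than precisely measured ones.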

Data Analysis

The Kolmogorov-Smirnov (K-S) test was employed to assess the normality of the data. Variables that were not normally distributed were log-transformed. Principal Component Analysis (PCA) and Correlation Analysis (CA) were carried out using R software (Zhang et al. 2023b).
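The normality check and log transformation can be illustrated with the following standard-library sketch. It computes the K-S statistic D against a normal distribution fitted to the sample and compares it with the large-sample 5% critical value 1.36/√n. The function names and the synthetic data are hypothetical, and because the mean and standard deviation are estimated from the same sample, this critical value is conservative; statistical packages may apply a correction.

```python
import math
import statistics

def ks_statistic_normal(sample):
    """K-S statistic D between the sample and a normal fitted to it."""
    xs = sorted(sample)
    n = len(xs)
    dist = statistics.NormalDist(statistics.mean(xs), statistics.stdev(xs))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = dist.cdf(x)
        # Largest gap between the empirical and the fitted CDF at x.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d

def is_normal(sample, crit_coeff=1.36):
    """Approximate 5% K-S decision using the asymptotic critical value."""
    return ks_statistic_normal(sample) <= crit_coeff / math.sqrt(len(sample))

# Synthetic right-skewed data (hypothetical): exponentiated normal quantiles.
std = statistics.NormalDist()
raw = [math.exp(2.0 * std.inv_cdf((i + 0.5) / 50)) for i in range(50)]
logged = [math.log(v) for v in raw]
print(is_normal(raw), is_normal(logged))
```

Skewed contents typically fail the check on the raw scale and pass after the log transformation, which is the situation in which a geometric mean is the natural summary.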

Results

HMs Content in Soil

Soil HMs Background Values

The contents of As, Cd, Pb, Cu, Hg, and Zn in natural soil in the study area followed a normal distribution after logarithmic transformation, and the Cr and Ni concentrations followed a normal distribution without transformation (Table 1). Therefore, the geometric means were used to represent the contents of Zn (96.74 mg/kg), Pb (25.16 mg/kg), As (18.62 mg/kg), Cu (27.47 mg/kg), Cd (0.15 mg/kg), and Hg (0.14 mg/kg) in the natural soils of the study area (local background values, LBV), and the arithmetic means were used to represent the contents of Cr (69.83 mg/kg) and Ni (37.57 mg/kg) (background values).

HMs Contents in Agricultural Soil

Agricultural soils in the research region had mean Cd, Pb, Hg, Cu, and Zn levels that were all considerably higher than the local soil background values (LBV) (GB15618-2018). pH of the soil varied from 4.40 to 8.56 (mean: 6.84 ± 0.90) (Table 1). The descriptive statistics of soil HMs content in the study area revealed that the mean values of As, Pb, Cd, Hg, Cr, Cu, Ni, and Zn in agricultural soils were 16.5 ± 17.25, 36.06 ± 35.95, 0.34 ± 0.17, 2.51 ± 4.71, 61.88 ± 30.36, 30.76 ± 10.94, 32.52 ± 11.82, and 112.1 ± 62.29 mg/kg, respectively. The proportion of agricultural soils exceeding the risk screening values (GB15618-2018) (MEE 2018) for Hg and Cd was 38.9% and 13.5%, respectively. The results indicated that Hg and Cd in the agricultural soils of the study area need to be given priority attention. Although accumulating to a certain extent, Zn, Ni, and Cu were within a manageable risk range for the quality and safety of agricultural products.

Spatial Distribution and Clustering of Soil HMs Content

The Moran’s I index for soil HMs in the study area is presented in Fig. 2. The concentrations of HMs in agricultural soil in the study area showed a spatial clustering trend, with positive spatial correlation. Local spatial autocorrelation analysis was performed to plot the cluster distribution (Fig. 2). The results showed that the high-high clusters and low-high outliers of Hg in agricultural soils were concentrated in the Bijiang, Wanshan, and Yinjiang areas in the eastern region of Tongren City. The low-low clusters of Hg were mainly distributed in the central region and accounted for a higher percentage (38.1%). The clustering distributions of Cd, Pb, and Zn in agricultural soils were similar, with high-high clusters mainly distributed in the Songtao and Jiangkou areas, and low-low clusters mainly distributed in the central Yinjiang and Sinan areas and the eastern Bijiang and Wanshan areas. The clustering distributions of Ni, Cu, and Cr were similar, with high-high clusters mainly in the Songtao and Sinan areas, and low-low clusters mainly in the Dejiang, Bijiang, and Wanshan areas.

Empirical Bayesian Kriging (EBK) was performed to map the spatial distribution of HMs in agricultural soils (Fig. 2). The results showed that the content of Hg showed a trend of gradually increasing from the central to the surrounding area, and the high values were mainly distributed in the eastern region of the study area. Pb, Cd, Zn, and Cu levels all followed a similar pattern of progressively declining from the center to the edges. The As content gradually increased and subsequently decreased from the center to the eastern area. Low levels of Cr and Ni were present generally. To sum up, the spatial distribution of HMs content in the study area was generally aligned with the spatial clustering results.

Fig. 2 Spatial distribution and clusters of heavy metal (HM) contents in agricultural soil

Evaluation and Spatial Distribution of Agricultural Soils Pollution for HMs

The Igeo approach was used to assess soil contamination caused by single HMs in the research region (Fig. 3a). The soil pollution levels of HMs were represented by the average values of the Igeo. The mean Igeo values for Hg and Cd were larger than 0, coming in at 2.01 and 0.44, respectively. The soil Hg showed a moderate to extremely contaminated level (categories IV, V, VI, VII) with a proportion of 45%, mainly distributed in the eastern part of the study area, including Bijiang, Wanshan, and Yuping regions (Fig. 3b). The soil Cd exhibited a moderate to heavily contaminated level (categories III, IV) with a proportion of 19%, mainly distributed in regions such as Songtao, Jiangkou, and Sinan (Fig. 3b). The soil Pb showed proportions in categories II, III, IV, I, totaling 32%, primarily distributed in Songtao, Jiangkou, and other areas (Fig. 3b). The Igeo method was closely related to soil background values, and the assessment results based on local background values (LBV) compared to those based on reference background values (RBV) (CNEMC 1990) showed a decrease in the pollution levels of HMs (Table S4). The results were more consistent with the actual situation as the pollution levels aligned with the spatial distribution and considered the variations in background values caused by temporal and spatial factors. Therefore, Hg in the study area was at a moderate to heavy pollution level, Cd was at an unpolluted to moderate pollution level, and the other HMs were at an unpolluted level.

The NIPI for agricultural soil HMs in the study area ranged from 0.25 to 38.69, with a mean of 2.52 (Fig. 3c). The proportions of slightly, moderately, and seriously contaminated samples were 20.6%, 6.0%, and 16.5%, respectively. Additionally, there was no significant difference in NIPI values between dryland and paddy fields. The spatial distribution of NIPI indicated that the eastern and central regions of the study area had more severe pollution, while the western region showed no pollution, consistent with the high values of Hg content. These results indicated that the soils of the study area were, overall, moderately contaminated with HMs, mainly Cd and Hg.

Fig. 3 Distribution map of the Nemerow integrated pollution index (NIPI) and the geo-accumulation index (Igeo) for soil HMs

Source Apportionment

According to the CA findings, Pb and Zn had a particularly high correlation coefficient of 0.67, and strong positive correlations were found among Cu, Zn, and Ni (p < 0.01) (Fig. 4a). Significant correlations also appeared between Cr, Cu, and Ni (p < 0.01). Hg and As exhibited significant associations (p < 0.01) with various HMs. The eigenvalues of the three principal components were greater than 1, cumulatively explaining 75% of the total variance (Fig. 4b).

Based on the results of PCA, the PMF analysis was performed with 3–7 factors, and when the number of factors was 4, QRobust and QTrue were close, achieving the best fit between observed and predicted values. Most residual values fell within the range of -3 to 3, and all of the fitting curves’ R2 values were higher than 0.7, suggesting strong analytical performance in general (Fig. 4c). Factor 1 had the main loading for Cd, Zn, Pb, and Cu, with contribution rates of 88.3%, 50.8%, 49.8%, and 47.9% respectively. Factor 1 was the smelting source. Factor 2 had the main loading for As, with a contribution rate of 85.4%. Factor 2 was the agricultural source. Factor 3 had the main loading for Hg, with a contribution rate of 84.3%. Factor 3 was the mining source. Factor 4 exhibited the highest loading for Cr (62.6%), Ni (58.9%), Cu (50.0%), and Zn (48.4%). Factor 4 was the natural source.

The results of PMF analysis for the source apportionment of soil HMs in the study area and their contribution rates were presented in Fig. 4d. The findings suggested that Factor 1 was associated with smelting source, contributing 39.7%, Factor 2 was linked to agricultural source, contributing 15.8%, Factor 3 was related to mining source, contributing 11.3%, and Factor 4 was attributed to natural source, contributing 33.2%.

Fig. 4 Source analysis of HMs in agricultural soils in the study area. (a) A correlation heat map depicting the relationships between soil HMs; (b) A principal component load map of soil HMs; (c) A load map representing the sources of soil HMs; (d) The contribution of the four sources of soil HMs

Risk Assessment

Potential Ecological Risks

Based on the risk screening values for agricultural soil, the potential ecological risk index of the study area was evaluated for the different HMs and sources (Fig. 5). The results showed extremely high risk for Hg and moderate risk for Cd in the study area, while the other elements were classified as low risk (Fig. 5b). The widespread enrichment of Hg and Cd in the study area, coupled with their high toxicity coefficients, contributed to their high ecological risk. The study area exhibited an average potential ecological risk index (RI) of 813.5, signifying a generally high level of potential ecological risk. The proportions of RI falling into the different classes were as follows: Class I (low risk) accounted for 18.4%, Class II (moderate risk) for 30.4%, Class III (considerable risk) for 24.2%, Class IV (high risk) for 11.7%, and Class V (extremely high risk) for 15.3%. The spatial distribution of ecological risk showed that high-risk areas were mainly concentrated in the southeast of the study area (Fig. 5c), consistent with the spatial distribution of soil Hg content. The mean RI values of factor 3 and factor 1 were higher than those of the other factors (Fig. 5d). In summary, Hg and Cd were the primary elements contributing to the potential ecological risk in the study area.

Fig. 5 Distribution maps of potential ecological risk index in the study area. (a) A spatial distribution map of four sources; (b) A map displaying the single-factor ecological risk index (EI) for soil HMs; (c) A spatial distribution map depicting the integrated ecological risk index (RI); (d) Distribution map of RI based on four pollution sources

Monte Carlo-Based Health Risks by Pathways

Based on the results of Monte Carlo simulations, the HQ and CR resulting from different exposure pathways and HMs in the study area were calculated. The results are detailed in Tables S5 and S6. Table S5 shows that the relative magnitude of non-carcinogenic health risks for different HMs was generally consistent. Both the HQ and the IHQ for children were higher than those for adults. Specifically, the exceedance rates of HQ for Cr, Hg, and As in children were 39.2%, 2.73%, and 1%, respectively (Fig. 6). The average IHQ for adults in the study area was 0.11, below the acceptable limit of 1, indicating that the IHQ of HMs was acceptable for adults. The IHQ for children was 1.5, and 83.8% of soil samples had a risk index exceeding 1 (Fig. 6). Furthermore, dermal contact was the main exposure pathway, and Cr, As, and Pb were the primary contributing elements to the non-carcinogenic health risks of agricultural soils in the study area. Although the pollution levels of Cr, As, and Pb were low, low-contamination soils may still pose higher health risks.

Fig. 6 Cumulative probability distribution of non-carcinogenic risk (HQ) in the study area based on the Monte Carlo simulation

Cd, As, Cr, and Ni are recognized as carcinogens by the International Agency for Research on Cancer (IARC). Because only Cd, As, Cr, and Ni have carcinogenic slope factors, this study assessed only the carcinogenic risks associated with these four HMs (Fig. S1). The individual adult CR values for each HM were all less than 1 × 10−4, the maximum acceptable level, and were therefore considered acceptable. The CR for children was higher than that for adults, and the average individual CR for Ni in children exceeded 1 × 10−4, suggesting a potential health risk associated with nickel exposure for children. The average ICR for adults was 4.8 × 10−5, which was below the limit of 1 × 10−4 and considered within an acceptable or tolerable level. The average ICR for children was 1.7 × 10−4, indicating a potentially high risk of cancer among children in the study area. The proportion of sample points exceeding the carcinogenic limit for ICR was 94.9% for children (Fig. S1). Thus, the health risk for adults in the study area was low and fell within an acceptable range, while children were more affected by HM exposure in agricultural soils and should be given special attention.

Discussion

“Soil background values” refer to the contents of elements or constituents in soil that is minimally influenced by human activities, and they reflect underlying geological and soil formation processes (Sun et al. 2019). They vary from location to location and are usually expressed as a range of values for a particular country or region (da Silva et al. 2020; Shi et al. 2023). Therefore, to evaluate the contamination and risk of HMs in regional soils, it is advisable to determine the background soil values for the region. If environmental processes, including natural factors and human activities, affect the capacity of the soil, the background values may change over time (Yang et al. 2022a). Thus, more representative soil HM background values need to be obtained to reasonably evaluate the soil pollution status. The results of the one-sample t-test showed that the As, Cd, Hg, Ni, Cu, and Zn contents of the natural soil in the study area were significantly higher than those in the background soil of Guizhou Province (CNEMC 1990), which might be due to the abundance of metallic minerals such as mercury and manganese and non-metallic minerals such as coal in the study area, and to the frequent production activities such as mineral extraction, separation, and smelting in the southeastern areas. The mean Hg content of the agricultural soils in the study area was higher than that of the paddy soils in Guizhou Province reported by Li et al. (2022a) and significantly higher than that of the Chinese agricultural soils reported by Zhang et al. (2020b).

By combining CA, PCA, and PMF, four contamination sources of soil HMs in the study area were identified: smelting, agricultural, mining, and natural sources. Factor 1 was the smelting source, with major loadings of Cd, Zn, Pb, and Cu. The contents of Cd, Pb, Zn, and Cu in soil varied greatly across the study area, indicating significant anthropogenic influence. Their spatial distributions followed consistent trends (Fig. 4c) and were evenly spread across areas such as Songtao and Jiangkou counties. The spatial distribution of factor 1 also supported this observation (Fig. 5a). Previous research has shown that Cd, Pb, and Zn originate primarily from ore extraction and smelting, with Cd commonly coexisting in Pb/Zn ores and entering the soil during smelting (Peng et al. 2022). Factor 2 was the agricultural source, with a major loading of As. Soil As content varied considerably across the study area, driven by both structural and stochastic factors, suggesting some human-induced influence. Previous studies suggested that As originated primarily from the application of arsenic-containing pesticides and fertilizers, the discharge of arsenic-containing wastewater from mines and factories, and the deposition of airborne arsenic dust emitted from coal combustion and smelting (Zhu et al. 2018). However, there were few reports of arsenic mines in the study area, and the accumulation of As in the soil was not severe, consistent with the spatial distribution of factor 2 (Fig. 5a). Factor 3 was the mining source, with a major loading of Hg. According to Table 1, Hg content in the study area was high and highly variable, indicating significant enrichment influenced primarily by anthropogenic factors.
Previous studies indicated that Hg accumulation in soil derived mainly from industrial activities, including the burning of fossil fuels, nonferrous metal extraction, and smelting. Hg was released to the environment in exhaust gases, wastewater, and sludge, and eventually entered the soil through various deposition pathways. Hg hotspots were observed in the eastern part of the study area, including Wanshan, Bijiang, and Yuping counties (Fig. 2). The study area has abundant Hg mineral reserves, mainly distributed in counties such as Wanshan and Bijiang. In particular, Wanshan has a history of mercury mining and smelting spanning several decades (Li et al. 2022b). Additionally, previous research showed that wastewater, waste rock, and leachate from the upstream Wanshan mercury mines had entered rivers, increasing Hg concentrations in the surrounding rivers and enriching Hg in the adjacent soils (Liu et al. 2021). Factor 4 was the natural source, with major loadings of Cr, Ni, Cu, and Zn. The average contents of Cr and Ni did not differ significantly from the background values, indicating that their spatial heterogeneity was controlled primarily by structural factors. Studies suggested that Cr and Ni in soil were controlled mainly by soil parent materials, with minimal influence from anthropogenic factors (Sun et al. 2019; Wen et al. 2020). During soil formation, rock weathering released certain amounts of HMs, which gave rise to the background values in the soil environment (Yang et al. 2022a).

Based on the potential ecological risk index (RI) and the health risk indices (IHQ, ICR), a Sankey diagram was constructed to illustrate the links among HM contents, sources, and risks (Fig. 7). Factor 4 (natural source) made the largest contributions to IHQ and ICR for children (49.2% and 57.2%, respectively), while factor 1 (smelting source) contributed 28.6% and 35.7%, respectively. This was because Cd, Cr, and Ni were derived mainly from factors 1 and 4, and Cd and Cr have high bioavailability and toxicity (Peng et al. 2022). Notably, although factor 3 (mining source) contributed 84.3% of Hg, it made the lowest contributions to IHQ and ICR (8.75% and 0.15%, respectively). The carcinogenic risk of Hg was not calculated because a carcinogenic slope factor was unavailable (Gui et al. 2023; Wang et al. 2023b). For RI, factor 3 (mining source) had the highest contribution at 56.6%, followed by factor 1 (smelting source) at 29.1%. This is attributed to the higher ecotoxicity response factors of Hg and Cd and the severe contamination of soil with Hg and Cd in the study area (Wu et al. 2020). These results indicated that the contributions of sources to health risks and potential ecological risks differed from their contributions to soil contamination; the risks depended not only on soil HM concentrations but also on the properties of the HMs, such as the high bioavailability and toxicity of Cd, Cr, and Hg. It is therefore necessary to consider different risks comprehensively and formulate corresponding policies. The southwestern part of the study area is characterized by intensive mercury mining activities with a long history. To safeguard human health and protect ecosystems from the adverse effects of HMs, a two-pronged approach is imperative. First, industries involved in metal mining must be identified as priority sources of pollution.
This necessitates a concerted effort to curtail emissions of highly toxic HMs such as Cd and Hg, along with improvements to production processes to minimize environmental impacts. Simultaneously, vigilance is needed regarding excessive levels of HMs in soil, especially in regions with naturally high background concentrations. This comprehensive strategy is vital for addressing and mitigating the risks associated with HMs.
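The link between source contributions and risk contributions can be illustrated with a short sketch. It assumes one common way of combining the two, in which the risk of each metal is split among sources in proportion to the PMF source fractions of that metal; this may not be the exact procedure used in the study. The function name `source_risk_shares` is hypothetical. Of the numbers below, only the 84.3% mining share of Hg comes from the text; all others are placeholders.

```python
def source_risk_shares(metal_risk, source_fraction):
    """Split a summed risk index among sources.

    metal_risk:      {metal: risk value of that metal}
    source_fraction: {metal: {source: fraction of that metal from the source}}
    Returns {source: share of the total risk}; shares sum to 1 when the
    fractions of every metal sum to 1.
    """
    total = sum(metal_risk.values())
    shares = {}
    for metal, risk in metal_risk.items():
        for source, fraction in source_fraction[metal].items():
            shares[source] = shares.get(source, 0.0) + risk * fraction / total
    return shares


# Placeholder inputs. Only the 0.843 mining fraction of Hg is taken from the
# text; every other value is hypothetical and chosen so fractions sum to 1.
metal_risk = {"Hg": 200.0, "Cd": 100.0}
source_fraction = {
    "Hg": {"mining": 0.843, "smelting": 0.057, "natural": 0.100},
    "Cd": {"mining": 0.050, "smelting": 0.700, "natural": 0.250},
}

for source, share in sorted(source_risk_shares(metal_risk, source_fraction).items()):
    print(f"{source}: {share:.1%}")
```

Under this weighting, a source that dominates a single metal with a high risk value can dominate the risk index even when its overall contribution to HM contents is small, which matches the pattern described for the mining source and RI.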

Fig. 7 Sankey diagram for integrated analysis of soil HM source risks

Conclusion

A combination of multivariate statistics, GIS, and PMF was employed to analyze the pollution sources of HMs in agricultural soil, as well as to assess the potential ecological and health risks based on the pollution sources. The conclusions are as follows:

(1) Spatially, the HMs in agricultural soil exhibited a clustering trend and showed positive spatial correlations. Overall, the agricultural soil was moderately contaminated with HMs, with Hg and Cd being the main pollutants. High Hg values were mainly concentrated in the eastern region of the study area, while high Cd values were primarily found in the central region.

(2) The main sources of HMs in agricultural soil in the study area were smelting source (39.7%), natural source (33.2%), agricultural source (15.8%), and mining source (11.3%). The overall potential ecological risk in the study area was categorized as high risk, with high-risk areas concentrated in the southeastern part of Tongren City.

(3) Mining and smelting sources posed higher potential ecological risks. Children in the study area faced higher health risks, particularly from smelting and natural sources. The health risks and potential ecological risks associated with pollution sources varied based on their contributions to HMs contamination. Therefore, it is necessary to consider different risks comprehensively and formulate corresponding policies.