1 Introduction

Potential toxic elements (PTEs) have long been at the center of soil pollution research because they are released by the very human activities that sustain life on earth. Agyeman et al. (2020) stated that PTE is a generic term for hazardous metal(loid)s that damage human health or the ecosystem. Soil pollution by PTEs has escalated across the globe (Solgi et al. 2012; Yang et al. 2018) and has attracted considerable attention in modern times (Chen et al. 2015; Wang et al. 2016). PTEs originate from mineral ores and rocks (Alloway 2013); they enter soils through natural processes such as the weathering of rocks and mineral formations and through anthropogenic impacts linked to urbanization, industrial activities, mining, agriculture, and natural resource extraction (e.g., gold) (Alyazichi et al. 2017; Jones et al. 2019; Bhuiyan et al. 2021). According to Alloway (2013), PTEs typically occur in small amounts, but elevated concentrations are often recurrent and remain bioavailable for a sustained period. Because of their bioavailability and long residence time, these PTEs are injurious to human health and the ecosystem at large. According to Burges et al. (2015), PTEs are pervasive and may be detrimental to the environment and human health because of their toxicity and persistence. Anthropogenic processes such as industrial development, urban sprawl, mining, and agriculture have long been at the forefront of soil pollution. According to Wei and Yang (2010) and Agyeman et al. (2020), over the last decades, toxic substances from various sources, including PTEs, have been introduced into the soil in significant amounts because of accelerated industrial development and urban sprawl.
Kabata-Pendias (2011) reported that soil has the greatest natural environmental effect because it controls the distribution of PTEs to the air, the hydrosphere, and the biosphere, acting not only as a geochemical sink for pollutants but also as a natural safety valve. Much research has been done on PTEs because they are a potential threat to public health (Agyeman et al. 2020) arising from human activities, especially in urban areas, agricultural land, and industrial areas (Guagliardi et al. 2013). Ferri et al. (2012) reported that soil tends to remain contaminated for a period of time following contamination, owing to the sorption of PTEs onto soil particles and their mobility.

According to USEPA (1996), the PTEs most often found in excess at polluted sites such as urban soils are As, Cd, Cr, Cu, Hg, Pb, and Zn. The rate of reaction, transport, and fate of these PTEs depend largely on metal speciation and the chemical forms of the metals (Wuana and Okieimen 2011). Shiowatana et al. (2001) and Buekers (2007) outlined that PTEs are adsorbed within seconds or minutes by a rapid reaction, followed by a gradual adsorption process that can take days or weeks, and are distributed among forms that differ in toxicity, mobility, and bioavailability. PTEs may alter soil properties, particularly biological soil properties (Friedlova 2010). The toxicity of PTEs also has a ripple effect on the flora and fauna of the soil. These effects are profoundly influenced by factors such as pH, organic matter, soil temperature, clay minerals, inorganic cation and anion ratios, and the chemical forms of PTEs (Giller et al. 1998; Šmejkalová et al. 2003). According to Levy et al. (1992), the rate of reaction of PTEs in soils, such as urban soils, can potentially control the following processes: biological immobilization and mobilization, mineral precipitation and dissolution, plant uptake, aqueous complexation and ion exchange, and adsorption and desorption.

Mamut et al. (2017) and Eziz et al. (2018) argued that PTEs could affect humans, flora, fauna, and the food chain. Substantial research in recent years has focused on both health risk and ecological risk as well as the distribution of environmental impacts (Xu et al. 2014; Eziz et al. 2018; Doabi et al. 2018; Rinklebe et al. 2019; Baltas et al. 2020). PTE pollution from either anthropogenic or natural sources most often has devastating health implications for humans. Human exposure to PTEs occurs through several pathways, namely dermal contact, ingestion, and inhalation, by which the pollutants enter the human body (Ayantobo et al. 2014). Standardized procedures used in the field of medicine have characterized human exposure to PTEs at any level, and such exposure can lead to carcinogenic and non-carcinogenic health effects (Lim et al. 2008).

Even though there is a large body of literature published across the globe on health threats, there is a lack of documentation and research in the study area. Moreover, papers by Gržetić and Ghariani (2008), Wang et al. (2010), Maria Figueiredo et al. (2011), Luo et al. (2012), and Bhuiyan et al. (2021) indicate that health-related risk evaluation is limited in Belgrade (Serbia), Changsha (China), Sao Paulo (Brazil), Xiamen (China), and Bangladesh, respectively. According to Kampa and Castanas (2008) and Bempah and Ewusi (2016), health risk assessment is a practical core approach to evaluating the risk that PTEs pose to human health through different exposure pathways.

A number of receptor models are routinely applied in source apportionment studies, including positive matrix factorization (PMF), UNMIX, principal component analysis/absolute principal component score analysis-multi-linear regression (PCA/APCS-MLR), and chemical mass balance (CMB). Recent papers by Salim et al. (2019), Fei et al. (2020), Wu et al. (2020), Zhang et al. (2020), Agyeman et al. (2020), and Bhuiyan et al. (2021) relied mainly on PMF, APCS/PCA-MLR, or both to calculate and identify the elemental source distribution of PTEs. PMF and APCS/PCA-MLR are chosen for the following reasons: (i) the use of effective monitoring procedures, along with the intention of establishing a substantial database, has recently become a universal practice; (ii) unlike CMB, these receptor models do not need pre-quantified source profiles (i.e., backward tracking); and (iii) the receptor models can cope with large monitoring datasets (Lee et al. 2016). Even though PMF and APCS/PCA-MLR are frequently used, some authors have raised concerns about the effectiveness and efficiency of the receptor models for their intended purpose. The constraints they outline include differences in computed source contributions, differences in the estimated contribution of each potential pollution source (Gholizadeh et al. 2016), inability to identify more sources (Zhang et al. 2019), and high percent error (Salim et al. 2019).

The study area places a strong emphasis on the health of its indigenous population; it is therefore necessary and reasonable to evaluate the quality of the soil and the risk it poses across the cities, with priority given to human life, livestock, and soil health (for example, urban soil). The aim of this study is to estimate the degree of soil contamination and the spatial distribution pattern of PTEs, and to propose and apply a novel pollution assessment–based receptor model (contamination factor-positive matrix factorization, CF-PMF) for source distribution and for the assessment of carcinogenic and non-carcinogenic health risks to humans. This study seeks to answer the following research questions: What is the impact of agricultural production and industrial activities on urban soil and the livelihood of the indigenous peoples in the study area? What is the source contribution of the individual PTEs? We hypothesized that the chemical composition of the soil in the study area is unfavorable to the health of the indigenes and that PTE levels are above the normal threshold. Both industrial and agricultural activities are actively carried out in the study area; therefore, the chemical composition of prevailing pollutants and their spatial distribution across the area need to be investigated. The findings of this study would contribute significantly to understanding the risks resulting from human and livestock exposure to PTEs in cities and towns within the Moravian-Silesian Region in the Czech Republic. Furthermore, the results will help interested parties, indigenes of the study area, and legislators to raise awareness of soil toxicity and the health risk exposure level of the urban soil, allowing them to take corrective actions to ensure a safer environment.

2 Materials and methods

2.1 Study area

The study area is situated in the district of Frydek-Mistek in the foothills of the Moravian-Silesian Region of the Czech Republic (Fig. 1). The community is a combination of two formerly independent towns, Silesian Frydek and Moravian Mistek, which were joined in 1943 and have remained a single town since 1955. The study area is located at 49° 41′ 0″ North and 18° 20′ 0″ East, at an altitude between 225 and 327 m above sea level, and is characterized by a cold temperate climate with a high amount of rainfall even in dry months. In Frýdek-Místek, the summers are hot and partially cloudy, and the winters are cold, dry, windy, and mainly cloudy. Over the course of the year, temperatures usually range from 24 to 75 °F and are rarely below 8 °F or above 86 °F, while the average annual precipitation ranges from 685 to 752 mm (Weather Spark 2016). The district covers 1208 km², with 39.38% of the land designated for agricultural activities and 49.36% for forestland. The study area comprises the following towns: Havirov, Terlicko, Trinec, Bystrice, Jablunkov, Mosty u Jablunkova, and Hrcava, which are affected by intensive urban farming and active industries such as the steel industry. Trinec and Vitkovice (a part of Ostrava city), where the steel industry is located, are essential areas for the assessment of PTE distribution and health risk within and around neighboring communities (Agyeman et al. 2020). The soils are clearly differentiated by color, structure, and carbonate content. They show a medium to fine texture derived from the parent materials, which are mostly colluvial, alluvial, or aeolian deposits. Some of the soils show mottles in the topsoil and subsoil, primarily accompanied by concretions and bleaching.
The PTE pollution in the area is expected to originate from atmospheric deposition of emissions from the nearby steel industry, vehicular emissions, tire abrasion, and agricultural activities (e.g., pesticide and insecticide applications) (Agyeman et al. 2020). The dominant soil types are cambisols and stagnosols (Kozák 2010). Cambisols dominate in the Czech Republic, where they are found at elevations of 455.1 to 493.5 m (Vacek et al. 2020).

Fig. 1 Study area map showing sampling points

2.2 Soil sampling and analysis

A total of 49 topsoil samples were obtained across seven towns (Havirov, Terlicko, Trinec, Bystrice, Jablunkov, Mosty u Jablunkova, and Hrcava) situated within the district of Frydek-Mistek. Sampling followed a regular grid design with 2 × 2 km intervals, located with a handheld GPS device (Leica Zeno 5 GPS), at a depth of 0 to 20 cm. The collected samples were placed in Ziploc bags, labeled, and transported to the laboratory. They were then air-dried, crushed with a mechanical device (Fritsch disc mill pulverizer), and sieved (< 2 mm) to obtain pulverized samples. These samples were scanned with a portable X-ray fluorescence spectrometer (Delta Premium XPD 6000, OLYMPUS INNOV-X, USA) using a three-beam system (Weindorf et al. 2013) to determine the elemental concentrations of As, Cr, Cu, Mn, Pb, Ni, and Zn. Each sample was measured in triplicate, and the average of the three measurements was computed for each sample.

2.3 Quality assurance and quality control (QA/QC)

As part of the quality assurance and control process, a standard reference material for portable devices (XRF 2711a NIST, National Institute of Standards and Technology) was used in the analysis to ensure quality compliance. The reference material was measured intermittently together with the soil samples to ensure that the analysis remained precise until completion. The detection limits for the elements under investigation are < 10 mg/kg (Ni), < 10 mg/kg (Cu), < 5 mg/kg (As), < 10 mg/kg (Mn), < 10 mg/kg (Cr), < 5 mg/kg (Pb), and < 5 mg/kg (Zn).

2.4 Contamination assessment indices

2.4.1 Contamination degree (Cdeg)

This is a contamination assessment tool used in computing the degree of contamination index. This was introduced by Håkanson (1980), and it is defined as the summation of contamination factors for all PTEs analyzed reflecting the degree of environmental contamination. The equation is given by

$${C}_{deg}={\sum}_{i=1}^{n}{C}_{f}^{i}$$
(1)

whereby \({C}_{f}^{i}\) represents the contamination factor of the ith PTE and n the number of PTEs analyzed.

\({C}_{f}\) is used to compute the contamination level of toxic chemicals such as PTEs in soil or sediments as the ratio of the concentration in the sample to the geochemical background level. The equation is given as

$${C}_{f}=\frac{{C}_{sample}}{{C}_{geo-background}}$$
(2)

whereby \({C}_{sample}\) denotes the PTE concentration in the soil, and \({C}_{geo-background}\) refers to the geochemical background level. The geochemical background level used was the European average value (EAV) (Kabata-Pendias 2011) (refer to Table 1). The contamination level is categorized according to the Cf value: Cf < 1, low pollution; 1 < Cf < 3, moderate pollution; 3 < Cf < 6, considerable pollution; Cf > 6, very high pollution. The Cdeg values are interpreted as follows: Cdeg < 6, low degree of contamination; 6 < Cdeg < 12, moderate degree of contamination; 12 < Cdeg < 24, considerable degree of contamination; Cdeg > 24, high degree of contamination.
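As an illustration of Eqs. (1) and (2), the following minimal Python sketch computes the contamination factor, the contamination degree, and their classes. The concentrations are hypothetical and are not taken from Table 1; the function names are illustrative, and assigning a value that falls exactly on a class boundary to the higher class is an assumption, since the class limits above leave the boundaries open.

```python
def contamination_factor(c_sample, c_background):
    """Eq. (2): sample concentration divided by the geochemical background."""
    if c_background <= 0:
        raise ValueError("background concentration must be positive")
    return c_sample / c_background


def classify_cf(cf):
    """Cf classes from the text; a value exactly on a boundary is assigned
    to the higher class here (an assumption, the text leaves it open)."""
    if cf < 1:
        return "low pollution"
    if cf < 3:
        return "moderate pollution"
    if cf < 6:
        return "considerable pollution"
    return "very high pollution"


def contamination_degree(sample, background):
    """Eq. (1): sum of the contamination factors of all analysed PTEs."""
    return sum(contamination_factor(sample[e], background[e]) for e in sample)


def classify_cdeg(cdeg):
    """Cdeg classes from the text, with the same boundary assumption."""
    if cdeg < 6:
        return "low degree of contamination"
    if cdeg < 12:
        return "moderate degree of contamination"
    if cdeg < 24:
        return "considerable degree of contamination"
    return "high degree of contamination"


# Hypothetical concentrations in mg/kg, NOT values from Table 1.
sample = {"Pb": 60.0, "Zn": 150.0, "Cu": 20.0}
background = {"Pb": 30.0, "Zn": 50.0, "Cu": 40.0}

cf = {e: contamination_factor(sample[e], background[e]) for e in sample}
cdeg = contamination_degree(sample, background)
print(cf, cdeg, classify_cdeg(cdeg))
```

With these illustrative inputs, the individual factors are 2.0 (Pb), 3.0 (Zn), and 0.5 (Cu), so the contamination degree is 5.5 and falls in the low class.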

Table 1 The concentration of PTEs in the study area, basic statistics, and background level of toxic elements

2.4.2 Modified degree of contamination (mCd)

Abrahim and Parker (2008) first used this index. The index assesses the overall contamination of the soil by PTEs as the sum of the contamination factors (Cf) of all PTEs at a given location divided by the number of PTEs analyzed. This is given by

$$mCd=\frac{\sum_{i=1}^{n}{C}_{f}^{i}}{n}$$
(3)

in which n denotes the number of analyzed PTEs and \({C}_{f}^{i}\) the contamination factor of the ith PTE.

The classification of modified degree of contamination (mCd) values is given as < 1.5 = very low contamination, 1.5–2 = low contamination, 2–4 = moderate contamination, 4–8 = high contamination, 8–16 = very high contamination, 16–32 = extremely high contamination, and ≥ 32 = ultra-high contamination.
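The modified degree of contamination can be sketched in a few lines of Python. The sketch assumes that mCd is the mean of the contamination factors, as described by Abrahim and Parker (2008); the function names and input values are hypothetical, and treating 32 as the lower limit of the ultra-high class is an assumption about the upper boundary.

```python
# Upper class limits for mCd, each paired with its label.
MCD_CLASSES = [
    (1.5, "very low contamination"),
    (2.0, "low contamination"),
    (4.0, "moderate contamination"),
    (8.0, "high contamination"),
    (16.0, "very high contamination"),
    (32.0, "extremely high contamination"),
]


def modified_contamination_degree(cf_values):
    """Mean of the contamination factors of the analysed PTEs."""
    if not cf_values:
        raise ValueError("at least one contamination factor is required")
    return sum(cf_values) / len(cf_values)


def classify_mcd(mcd):
    """Return the first class whose upper limit exceeds mCd."""
    for upper, label in MCD_CLASSES:
        if mcd < upper:
            return label
    return "ultra-high contamination"  # assumed to start at 32


# Hypothetical contamination factors for four PTEs.
cf_values = [2.0, 3.0, 0.5, 2.5]
mcd = modified_contamination_degree(cf_values)
print(mcd, classify_mcd(mcd))
```

For these illustrative factors, the mean is 2.0, which falls in the moderate class because each lower limit is treated as inclusive.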

2.4.3 Nemerow pollution index (\({PI}_{Nemerow}\))

\({PI}_{Nemerow}\) computes the overall degree of pollution of the soil, taking into account the concentrations of all analyzed PTEs (Qingjie et al. 2008). The index is used in the assessment of both the O and A horizons. The formula is given by

$${PI}_{Nemerow}=\frac{\sqrt{{\left(\frac{1}{n}\sum_{i=1}^{n}PI\right)}^{2}+{PI}_{max}^{2}}}{n}$$
(4)

where PI represents the computed values of the single pollution index and PImax the maximum value of the single pollution index among all the PTEs. The PINemerow classes are interpreted as ≤ 0.7 = clean, 0.7–1 = warning limit, 1–2 = slight pollution, 2–3 = moderate pollution, and ≥ 3 = heavy pollution. The single pollution index (PI) is defined as the ratio of the concentration of the PTE in the sample to its geochemical background level. The equation is given by

$$PI= \frac{{C}_{n}}{{B}_{n}}$$
(5)

in which \({C}_{n}\) represents the concentration of the sampled PTE, and \({B}_{n}\) denotes the geochemical background level of the same element. The geochemical background level used was the European average value (EAV).
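The sketch below evaluates the Nemerow index from a list of single pollution indices. It reads Eq. (4) as the square root of the squared mean PI plus the squared maximum PI, divided by n. Other publications place the divisor inside the root or use 2 instead of n, so the placement of the divisor should be treated as an assumption. The PI values and function names are hypothetical.

```python
import math


def single_pollution_index(c_n, b_n):
    """Eq. (5): concentration divided by the geochemical background."""
    return c_n / b_n


def nemerow_index(pi_values):
    """Eq. (4) as read here: sqrt(mean(PI)^2 + max(PI)^2) / n.
    The position of the divisor n is an assumption about the formula."""
    n = len(pi_values)
    if n == 0:
        raise ValueError("at least one PI value is required")
    mean_pi = sum(pi_values) / n
    return math.sqrt(mean_pi ** 2 + max(pi_values) ** 2) / n


def classify_nemerow(value):
    """Classes quoted in the text; shared limits go to the lower class
    except at 3, which the text assigns to heavy pollution."""
    if value <= 0.7:
        return "clean"
    if value <= 1:
        return "warning limit"
    if value <= 2:
        return "slight pollution"
    if value < 3:
        return "moderate pollution"
    return "heavy pollution"


# Hypothetical single pollution indices for four PTEs.
pi_values = [1.0, 3.0, 4.0, 4.0]
pi_nemerow = nemerow_index(pi_values)
print(pi_nemerow, classify_nemerow(pi_nemerow))
```

For these illustrative values, the mean PI is 3.0 and the maximum is 4.0, so the root evaluates to 5.0 and the index to 1.25, which falls in the slight pollution class.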

2.4.4 PMF receptor model

Positive matrix factorization (PMF), implemented in EPA PMF v5.0 (U.S. Environmental Protection Agency 2014), is a mathematical receptor model used to compute the contributions of sources to samples based on the composition or fingerprints of the sources. The input files are composed of the concentration and uncertainty values of the samples. The concentration matrix X is given as

$$X=GF+E$$
(6)

in which G (m × p) is the factor contribution matrix, F (p × n) is the source profile matrix, and E (m × n) is the residual matrix.

The factor contributions and profiles are determined by minimizing the objective function Q, which is given by

$$Q={\sum }_{i=1}^{n}{\sum }_{j=1}^{m}{\left(\frac{{\varepsilon }_{ij}}{{u}_{ij}}\right)}^{2}$$
(7)

whereby m refers to the number of PTEs investigated, n signifies the number of soil samples, \({\varepsilon}_{ij}\) is the residual, and \({u}_{ij}\) is the uncertainty of PTE j in soil sample i.

The function of the minimum Q, the uncertainty, the parameters involved, and the implementation techniques have been described previously by the authors (Agyeman et al. 2020).
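To make the role of Q concrete, the following sketch evaluates the residuals of Eq. (6) and the objective function of Eq. (7) for a given pair of factor matrices. It only evaluates Q and does not perform the constrained minimization carried out by EPA PMF. The matrices are hypothetical, and the layout (rows index soil samples, columns index PTEs) is an assumption made for the illustration.

```python
def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def pmf_q(x, g, f, u):
    """Eq. (7): sum over samples i and PTEs j of (residual_ij / u_ij)^2,
    where the residual is E = X - GF from Eq. (6)."""
    model = matmul(g, f)
    q = 0.0
    for i in range(len(x)):
        for j in range(len(x[0])):
            residual = x[i][j] - model[i][j]
            q += (residual / u[i][j]) ** 2
    return q


# Hypothetical example: 2 samples, 2 PTEs, 1 factor.
x = [[3.5, 4.0], [6.0, 7.0]]   # measured concentrations
g = [[1.0], [2.0]]             # factor contributions (samples x factors)
f = [[3.0, 4.0]]               # factor profiles (factors x PTEs)
u = [[0.5, 0.5], [0.5, 0.5]]   # measurement uncertainties

q = pmf_q(x, g, f, u)
print(q)
```

In this illustration the modeled matrix GF is [[3, 4], [6, 8]], so the residuals are 0.5, 0, 0, and −1; scaled by the uncertainty of 0.5 and squared, they sum to Q = 5.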

2.4.4.1 CF-PMF

The pollution assessment–based receptor model, contamination factor-positive matrix factorization (CF-PMF), is a novel receptor model based on the PMF model; however, it uses the computed CF values of the respective PTEs under investigation as input instead of the raw field data. The CF-PMF receptor model equation is given as

$${C}_{f}=\frac{{({C}_{sample})}_{ij}}{{({C}_{geo-background})}_{i}}$$
(8)

in which \({C}_{f}\) is the calculated contamination factor of the PTE from the jth source in the ith sampling area, \(({C}_{sample}{)}_{ij}\) represents the examined single PTE concentration in the assessed environment in the jth source from the ith sampling location, and \(({C}_{geo-background}{)}_{i}\) denotes the geochemical background value of the respective analyzed PTE in the reference environment.

2.5 Health risk assessment

The presence of industries, productive agriculture, and other anthropogenic factors exposes individuals within the study area to PTEs. Residents can be exposed through three known pathways: inhalation, ingestion, and dermal contact. According to Wang et al. (2017), these three pathways can be used to evaluate the risk of PTEs in urban, peri-urban, and rural areas. The following equations specify the chronic daily intake (CDI) for each pathway of human exposure to PTEs.

$${CDI}_{ing}=\frac{C\times {IR}_{ing}\times EF \times ED}{BW \times AT} \times {10}^{-6}$$
(9)
$${CDI}_{inh}=\frac{C \times {IR}_{inh}\times EF \times ED}{PEF \times BW \times AT}$$
(10)
$${CDI}_{derm}=\frac{C\times SA\times AF\times ABS\times EF\times ED}{BW\times AT} \times {10}^{-6}$$
(11)
$${CDI}_{total}={CDI}_{ing}+{CDI}_{inh}+{CDI}_{derm}$$
(12)

The definitions of the parameters for CDIing, CDIinh, and CDIderm and the reference values used in Eqs. (9), (10), (11), and (12) are listed in Table ST1.
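Equations (9) to (12) translate directly into code. The following sketch is illustrative only: the parameter values are placeholders chosen for the example and are not the reference values of Table ST1, and the function and dictionary names are hypothetical.

```python
def cdi_ingestion(c, ir_ing, ef, ed, bw, at):
    """Eq. (9): ingestion pathway, including the 1e-6 conversion factor."""
    return c * ir_ing * ef * ed / (bw * at) * 1e-6


def cdi_inhalation(c, ir_inh, ef, ed, pef, bw, at):
    """Eq. (10): inhalation pathway, with the particle emission factor PEF."""
    return c * ir_inh * ef * ed / (pef * bw * at)


def cdi_dermal(c, sa, af, abs_factor, ef, ed, bw, at):
    """Eq. (11): dermal pathway, including the 1e-6 conversion factor."""
    return c * sa * af * abs_factor * ef * ed / (bw * at) * 1e-6


def cdi_total(ing, inh, derm):
    """Eq. (12): sum of the three exposure pathways."""
    return ing + inh + derm


# Placeholder parameters for illustration only (NOT the values of Table ST1).
p = {
    "c": 50.0,          # PTE concentration in soil, mg/kg
    "ir_ing": 100.0,    # ingestion rate, mg/day
    "ir_inh": 20.0,     # inhalation rate, m3/day
    "ef": 350.0,        # exposure frequency, days/year
    "ed": 24.0,         # exposure duration, years
    "bw": 70.0,         # body weight, kg
    "at": 24.0 * 365,   # averaging time, days
    "pef": 1.36e9,      # particle emission factor, m3/kg
    "sa": 5700.0,       # exposed skin area, cm2
    "af": 0.07,         # skin adherence factor, mg/cm2
    "abs": 0.001,       # dermal absorption factor, unitless
}

ing = cdi_ingestion(p["c"], p["ir_ing"], p["ef"], p["ed"], p["bw"], p["at"])
inh = cdi_inhalation(p["c"], p["ir_inh"], p["ef"], p["ed"], p["pef"], p["bw"], p["at"])
derm = cdi_dermal(p["c"], p["sa"], p["af"], p["abs"], p["ef"], p["ed"], p["bw"], p["at"])
print(ing, inh, derm, cdi_total(ing, inh, derm))
```

Keeping each pathway in its own function allows the adult and child parameter sets to be swapped in without changing the formulas.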

2.6 Non-carcinogenic risk assessment

The potential non-carcinogenic risk of a single PTE was determined as the hazard quotient (HQ), which is given by

$$HQ= \frac{{CDI}_{total}}{RfD}$$
(13)

where RfD is the reference dose (mg/kg/day), the estimated daily human exposure considered to be without appreciable risk. The computed HQ values were used to assess the health risk of each PTE studied. The values were summed and expressed as a hazard index (HI), which is given by Eq. 14:

$$HI=\sum HQ= {HQ}_{ing}+ {HQ}_{inh}+ {HQ}_{derm}$$
(14)

in which HQing, HQinh, and HQderm correspond to the hazard quotients for ingestion, inhalation, and dermal contact. The USEPA report (US EPA 1989) asserted that if HI < 1, adverse effects on the health of a person exposed to PTEs are unlikely. Eziz et al. (2018), however, reported that non-carcinogenic health risks may occur when HI > 1.
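A minimal sketch of Eqs. (13) and (14) is given below. It computes one hazard quotient per pathway, each with its own reference dose, and sums them into the hazard index. The CDI and RfD values are hypothetical, and the function names are illustrative.

```python
def hazard_quotient(cdi, rfd):
    """Eq. (13): chronic daily intake divided by the reference dose."""
    if rfd <= 0:
        raise ValueError("reference dose must be positive")
    return cdi / rfd


def hazard_index(hq_by_pathway):
    """Eq. (14): sum of the pathway-specific hazard quotients."""
    return sum(hq_by_pathway.values())


# Hypothetical values in mg/kg/day for a single PTE.
cdi = {"ing": 3.0e-4, "inh": 2.0e-8, "derm": 1.0e-6}
rfd = {"ing": 3.0e-3, "inh": 1.0e-4, "derm": 1.0e-4}

hq = {pathway: hazard_quotient(cdi[pathway], rfd[pathway]) for pathway in cdi}
hi = hazard_index(hq)
exceeds_threshold = hi > 1  # HI above 1 flags a possible non-carcinogenic risk
print(hq, hi, exceeds_threshold)
```

In this illustration, ingestion dominates the hazard index, and the total stays well below 1.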

2.7 Carcinogenic risk assessment

The USEPA report (EPA 2002) stressed that exposure to carcinogenic PTEs increases the likelihood of a person developing cancer of any form. Equations 15 and 16 were used to calculate the carcinogenic risk (CR) of PTEs such as As, Ni, and Cr.

$$CR=CDI \times SF$$
(15)
$$TCR= \sum CR= {CR}_{ing}+ {CR}_{inh}+ {CR}_{derm}$$
(16)

where CR, TCR, and SF denote the carcinogenic risk (no unit), the total carcinogenic risk (no unit), and the slope factor for carcinogenic PTEs ((mg/kg/day)⁻¹), respectively. The value of the TCR should range from 1 × 10⁻⁶ to 1 × 10⁻⁴. This is the acceptable criterion, which indicates that human health is not significantly endangered (Hu et al. 2012).
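The carcinogenic risk calculation can be sketched in the same way. The CDI and slope factor values below are hypothetical, the function names are illustrative, and the range check simply encodes the acceptable interval quoted above.

```python
def carcinogenic_risk(cdi, sf):
    """Eq. (15): chronic daily intake multiplied by the slope factor."""
    return cdi * sf


def total_carcinogenic_risk(cr_by_pathway):
    """Eq. (16): sum of the pathway-specific carcinogenic risks."""
    return sum(cr_by_pathway.values())


def within_acceptable_range(tcr, lower=1e-6, upper=1e-4):
    """True when TCR lies inside the acceptable interval quoted in the text."""
    return lower <= tcr <= upper


# Hypothetical CDI (mg/kg/day) and slope factors ((mg/kg/day)^-1) for one PTE.
cdi = {"ing": 2.0e-5, "inh": 1.0e-9, "derm": 4.0e-7}
sf = {"ing": 1.5, "inh": 15.0, "derm": 1.5}

cr = {pathway: carcinogenic_risk(cdi[pathway], sf[pathway]) for pathway in cdi}
tcr = total_carcinogenic_risk(cr)
print(cr, tcr, within_acceptable_range(tcr))
```

For these illustrative inputs, the total risk is about 3.06 × 10⁻⁵, which lies inside the acceptable interval.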

2.8 Data analysis

Statistical analyses were conducted using KyPlot; EPA PMF 5.0 was used for source distribution estimation, Excel for health risk estimation, and RStudio for mapping, principal component analysis (PCA), and Pearson’s correlation matrix estimation. PCA is applied with the aim of finding a small set of orthogonal basis functions known as principal components (PCs) (Jolliffe and Cadima 2016). PCA also displays the similarity among observations and among variables by showing them as points in maps. Furthermore, it yields a smaller new set of uncorrelated variables, the PCA scores, which represent the original variables of interest (John et al. 2021).

The multiple linear regression (MLR) model describes the relationship between a response variable and several predictor variables through linearly combined parameters estimated by the least squares method. Once the explanatory variables are selected, the least squares model serves as a prediction function for a soil property. Here, the factor scores served as predictors, and the PTEs served as the response variable. The number of samples used in this analysis was 49, and the data were scaled between 0 and 1, indicating low and high values. The data were randomly divided into a training dataset (75%, for calibration) and a test dataset (25%, for validation). The models were subjected to a tenfold cross-validation procedure, which was repeated five times. Mean absolute error (MAE), root mean square error (RMSE), and the coefficient of determination (R2) were used to assess the receptor models. Among the validation parameters, the best receptor model is the one with the highest R2: the closer the value is to 1, the higher the accuracy.
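The three validation metrics can be written out explicitly. The sketch below assumes the common definition R2 = 1 − SSres/SStot, which is not stated in the text, and uses hypothetical observed and predicted values; it does not reproduce the cross-validation procedure itself.

```python
import math


def mean_absolute_error(observed, predicted):
    """Mean of the absolute differences between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def root_mean_square_error(observed, predicted):
    """Square root of the mean squared difference."""
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return math.sqrt(mse)


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted values for a held-out test set.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]

print(
    mean_absolute_error(observed, predicted),
    root_mean_square_error(observed, predicted),
    r_squared(observed, predicted),
)
```

For these illustrative values, the single error of one unit gives MAE = 0.25, RMSE = 0.5, and R2 = 0.8.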

The ordinary kriging (OK) geostatistical interpolation technique was used. This interpolation technique enabled us to estimate the spatial distribution of PTEs in the location under investigation. Kriging is an interpolation that predicts values of a variable at locations where data are not available based on the spatial pattern of the available data (Bishop and McBratney 2001). It is expressed by this equation:

$${Z}^{\prime}({x}_{0})={\sum}_{i=1}^{n}{\lambda}_{i}\,Z({x}_{i})$$
(17)

in which \({Z}^{\prime}({x}_{0})\) is the interpolated value at point \({x}_{0}\), \(Z({x}_{i})\) denotes the known value at point \({x}_{i}\), and \({\lambda}_{i}\) represents the kriging weight for \(Z({x}_{i})\). The weights can be computed from the semi-variance function of the variables on the condition that the estimated value is unbiased and optimal. The semivariogram model is expressed as:

$$\gamma \left(h\right)=\frac{1}{2N(h)}{\sum }_{i=1}^{N(h)}{[Z({x}_{i})-Z({x}_{i}+h)]}^{2}$$
(18)

whereby γ(h) signifies the semi-variance, N(h) denotes the number of point pairs separated by distance h, Z(xi) represents the value at position xi, and Z(xi + h) is the value at position xi + h.
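The two kriging ingredients above can be illustrated with a short sketch: an empirical semi-variance for one lag distance (Eq. 18) and the weighted-sum estimate (Eq. 17). The sketch does not solve the kriging system for the weights; the weights, coordinates, values, and lag tolerance are hypothetical, and the function names are illustrative.

```python
import math


def empirical_semivariance(points, values, lag, tol):
    """Eq. (18): half the mean squared difference over all point pairs
    whose separation distance lies within lag +/- tol."""
    squared_diffs = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            distance = math.dist(points[i], points[j])
            if abs(distance - lag) <= tol:
                squared_diffs.append((values[i] - values[j]) ** 2)
    if not squared_diffs:
        return None  # no pairs at this lag
    return sum(squared_diffs) / (2 * len(squared_diffs))


def kriging_estimate(weights, values):
    """Eq. (17): weighted sum of known values. Ordinary kriging weights
    sum to 1 (unbiasedness condition), which is checked here."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("ordinary kriging weights must sum to 1")
    return sum(w * z for w, z in zip(weights, values))


# Hypothetical samples on a 2 km transect (coordinates in km, values in mg/kg).
points = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0), (6.0, 0.0)]
values = [10.0, 14.0, 12.0, 18.0]

gamma_2km = empirical_semivariance(points, values, lag=2.0, tol=0.1)
gamma_4km = empirical_semivariance(points, values, lag=4.0, tol=0.1)
estimate = kriging_estimate([0.5, 0.3, 0.2], [10.0, 20.0, 30.0])
print(gamma_2km, gamma_4km, estimate)
```

In practice, a model is fitted to semi-variances computed at several lags, and the weights are then obtained from that fitted model.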

3 Results

3.1 PTE concentration in soil

The concentrations of the PTEs in the soil decreased in the following order: Mn > Zn > Cr > Pb > Cu > Ni > As (see Table 1). Compared with the European average value (EAV), the mean concentrations of Cu, Mn, Pb, and Zn in the current study were higher, by factors of 1.23, 1.28, 1.59, and 1.4, respectively (see Table 1). Similarly, the mean concentrations of Cr, As, Mn, Pb, and Zn exceeded the corresponding world average values (WAV) (Kabata-Pendias 2011) by factors of 1.04, 1.4, 1.38, 1.86, and 1.36, respectively. Compared with the upper continental crust (UCC) (Table 1), the UCC values exceeded the mean concentrations of the present study for all PTEs except As, Pb, and Zn, whose concentrations were 5.44, 3.33, and 1.36 times higher than the respective UCC values.

According to Karimi Nezhad et al. (2015), the coefficient of variation (CV) indicates the degree of variability within the concentrations of PTEs. CV ≤ 20% indicates low variability, 21% ≤ CV ≤ 50% is considered moderate variability, 50% ≤ CV ≤ 100% suggests high variability, and CV above 100% is regarded as exceptionally high variability. The coefficient of variation (CV %) of the PTEs in the current soils decreases in the order Ni > Pb > As > Cu > Zn > Mn > Cr, with values of 63.47%, 50.17%, 46.41%, 43.68%, 38.75%, and 22.95%, respectively.
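The CV classes can be applied programmatically as sketched below. The sketch assumes the sample standard deviation in the CV and assigns values on a shared class limit to the lower class; both are assumptions, as are the hypothetical concentrations and the function names.

```python
import statistics


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean.
    Using the sample (n - 1) standard deviation is an assumption."""
    mean = statistics.mean(values)
    return 100.0 * statistics.stdev(values) / mean


def classify_cv(cv_percent):
    """Variability classes after Karimi Nezhad et al. (2015); values on a
    shared limit are assigned to the lower class (an assumption)."""
    if cv_percent <= 20:
        return "low variability"
    if cv_percent <= 50:
        return "moderate variability"
    if cv_percent <= 100:
        return "high variability"
    return "exceptionally high variability"


# Hypothetical concentrations of one PTE at three sampling points (mg/kg).
concentrations = [6.0, 10.0, 14.0]
cv = coefficient_of_variation(concentrations)
print(cv, classify_cv(cv))
```

For these illustrative values, the mean is 10 and the sample standard deviation is 4, giving a CV of 40% and the moderate class.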

3.2 Chemometric approach

3.2.1 PCA and Pearson’s correlation matrix

PCA was used for pattern recognition of the principal sources of PTE pollution in the study area. It is a useful tool that can provide information about PTE pathways and primary sources (Hou et al. 2013). In this research, principal component loadings of 0.65 or higher were considered significant (Table 2). Based on the criterion that eigenvalues should be 1 or greater, PC1 and PC2 were retained, together accounting for 74.23% of the total data variance. The first principal component (PC1) explained 54.58% of the total variance and comprised the following PTEs in the order Pb, Zn, As, Mn, and Cu. PC1 can be ascribed to a multiplicity of sources, including geogenic and anthropogenic components. The second principal component (PC2) explained 19.65% of the total variance and showed that the concentrations of Cr and Ni are associated. This suggests that Cr and Ni share a common source of contamination, mainly of geogenic origin with an anthropogenic contribution.

Table 2 The total contribution of PTEs in the principal component of the study area

The correlation matrix (see Fig. SF1) indicated that the investigated PTEs are interrelated. Zinc (Zn) and lead (Pb) showed a strong positive correlation (r = 0.92), as did Pb and As (r = 0.88), Zn and As (r = 0.75), and Mn and As (r = 0.72). It is therefore important to emphasize that these elements probably share the same or closely related sources. Likewise, other pairs such as Pb and Mn (r = 0.58), Zn and Cu (r = 0.55), Cu and Ni (r = 0.48), Ni and Cr (r = 0.48), and Pb and Cu (r = 0.46) showed a moderate connection, suggesting that their sources of pollution might be related. Zn and Pb showed the strongest positive correlation, and the weakest positive correlation was between Ni and Mn (r = 0.06). All the PTEs were positively correlated, with no negative correlations.
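For reference, Pearson’s correlation coefficient between two PTEs can be computed as in the sketch below. The concentrations are hypothetical and do not reproduce the reported r values, and the function name is illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have the same length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical Pb and Zn concentrations (mg/kg) at four sampling points.
pb = [20.0, 35.0, 50.0, 80.0]
zn = [60.0, 90.0, 130.0, 200.0]

r_pb_zn = pearson_r(pb, zn)
print(r_pb_zn)
```

Applying the function to every pair of PTE columns yields the full correlation matrix.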

3.2.2 CF-PMF pollution assessment-based receptor model

The CF-PMF model was used to identify the sources of PTEs in the soil and to apportion their contributions (Fig. 2). The minimum Q controls the residual matrix and ensures that a reasonable number of factors is produced. The CF-PMF model was run 20 times, and run 8 was selected to provide the factor loadings and the percentage contribution of each PTE. Factor 1 was dominated by Ni and Cr (42.7% and 46.3%, respectively). Factor 2 was dominated by Cr, Mn, and Cu with factor loadings of 50.6%, 39.7%, and 31.0%, respectively. Factor 3 comprised As, Zn, and Pb with factor loadings of 51.7%, 50.6%, and 60.4%, respectively. Factor 4 was dominated by Ni, Mn, and Cu with loadings of 53.7%, 32.9%, and 36.5%, respectively.

Fig. 2 Factor profile of PTEs from CF-PMF model analysis showing percentage contributions

3.3 Contamination assessment of PTEs

3.3.1 Contamination and modified contamination degree

The calculated contamination assessment indices (contamination factor, contamination degree, and modified contamination degree) showed varying degrees of contamination among the PTEs (Table 3). The estimated contamination factors indicated low to moderate contamination for all the PTEs. However, Zn, Cu, and Pb showed considerably high contamination at sample point 41 (FM-468–01), as displayed in the box plot (Fig. 3; see Table ST3). The contamination degree (Cdeg) values are given in Table 3: Ni showed a considerable degree of contamination, whereas the other PTEs, including Zn, showed a very high degree of contamination. The modified contamination degree (mCd) indicated that Ni contamination was moderate, whereas Cr and As showed high contamination levels in the soils and Cu, Pb, Zn, and Mn showed very high contamination.

Table 3 Computed modified contamination degrees values of PTEs
Fig. 3 Box plot showing the contamination factor (CF) of PTEs and significance levels in the selected towns

3.3.2 Nemerow pollution index (PInemerow)

The Nemerow pollution index of the PTEs is shown in Fig. 4 and Table ST3. The index showed that some areas were least polluted by PTEs (Fig. 4), some areas fell within the warning limit, and other areas showed slight to moderate pollution. The northeastern (Trinec and Bystrice) and some parts of the central regions (Bystrice and Jablunkov) showed moderate to high pollution classes, as indicated in the map. This hotspot reflects a heterogeneous pollution distribution with a multiplicity of sources. The spatial distribution of PInemerow showed a sectorial pattern in the soils of Trinec, Bystrice, and Jablunkov. The western part and some parts of the central, northwestern, and southwestern areas were classed as clean to low polluted.

Fig. 4 Distribution map of PInemerow_kriging showing pollution levels in soils of the study area

3.4 Potential health risk

3.4.1 Non-carcinogenic risk

The computed chronic daily intake (CDI), HQ, and HI values are displayed in Table 4. The CDItotal of the PTEs in the soils in the present study (adults and children) decreases in the following order: Mn > Zn > Cr > Pb > Cu > Ni > As (Table 4). The computed hazard quotient (HQ) for children is higher than that for adults (see Table 4): it ranges from 4.7E − 01 to 6.81E − 03 for children and from 4.47E − 02 to 7.31E − 04 for adults. The estimated HQ values for As, Cr, Pb, Mn, Ni, Cu, and Zn account for 27.96%, 1.10%, 0.72%, 43.87%, 6.49%, 19.43%, and 0.43% for children and 28.22%, 1.09%, 0.71%, 43.71%, 6.47%, 19.37%, and 0.43% for adults, respectively.

Table 4 Comparative assessment of model quality using multiple linear regression

3.4.2 Carcinogenic risk

Regarding carcinogenic risk, CDI, CR, and TCR were computed as shown in Table 6. The chronic daily intake was calculated for Cr, Ni, Pb, and As. The CDItotal for children and adults follows the ascending order Cr < Pb < Ni < As. The CDI for children ranges between 5.51E − 05 and 6.82E − 05, whereas that for adults ranges from 2.36E − 05 to 2.92E − 05.

Table 5 The non-carcinogenic risk index of PTEs in soils in the study area

3.5 Spatial prediction of PTEs

The spatial distribution of PTEs in the study area is shown in Fig. 5. For Cu and Ni, the distribution showed a sectorial trend of spatial variability skewed toward the northeastern part of the map (toward Trinec and Bystrice). For As and Mn, hotspots appeared in the southeastern part (Trinec and Hrcava) and in a small section of the central part skewed toward the northeast (Trinec and Bystrice). Cr showed high concentrations and spatial variability across a larger area of the central part of the map (Trinec and Bystrice). Pb and Zn, on the other hand, were distributed from the northeastern part downward to the southeastern part of the map (Trinec, Bystrice, and Jablunkov).

Fig. 5 The spatial distribution of PTEs in the soil

4 Discussion

4.1 PTE concentration in soil

PTEs such as Cu, Mn, Pb, and Zn showed high pollution in the study area. Their higher content signifies that they have multiple sources. A geogenic source cannot be ruled out; however, there is enough evidence that anthropogenic activities (steel industry and intensive agriculture) and other factors account for the upsurge in PTE concentrations. Hossain Bhuiyan et al. (2021) reported that human-related pollution is mostly caused by anthropogenic activities such as agriculture and sewage drainage, as well as industrial and air emissions. According to Jia et al. (2018), PTE values that exceed the respective UCC values, as some do in the present study, provide a clear indication that the pollution might also have an anthropogenic origin. Compared with the soil concentrations of Sweden (Kabata-Pendias 2011), the PTE concentrations of the study area were higher at all levels. The coefficient of variability showed moderate variability for As, Cu, Zn, Mn, and Cr, indicating that they are more homogeneously distributed. Ni and Pb showed high variability, which indicates a non-homogeneous distribution and points to a probable human-related activity. The non-homogeneous spatial distribution of Ni and Pb suggests the presence of a local enrichment source.

4.2 Pollution assessment

The contamination factor of the PTEs revealed that 97.95% (48 samples) of the 49 urban soils sampled had low chromium contamination, with Ni displaying 93.87% (46 samples), As 87.75% (43 samples), Cu 34.69% (17 samples), Zn 20.4% (10 samples), Mn 18.38% (9 samples), and Pb 2.05% (1 sample). Lead, manganese, zinc, and copper were mostly at the moderate contamination level, which represented 91.84%, 81.64%, 77.55%, and 61.22% of the total sample, respectively. In contrast, moderate contamination by chromium, nickel, and arsenic was relatively rare, representing 2.05%, 6.13%, and 12.25% of the total sampled data. Contamination by lead, zinc, and copper was considerably high in some locations: 6.47%, 2.05%, and 4.09% of the samples from the 49 locations, respectively.
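The per-class percentages above follow from classifying each sample's contamination factor and counting class membership. The Python sketch below assumes the widely used Hakanson grading (CF < 1 low, 1 to < 3 moderate, 3 to < 6 considerable, ≥ 6 very high). The sample values are hypothetical and the function names are illustrative.

```python
from collections import Counter

def contamination_factor(concentration, background):
    """CF is the measured concentration divided by the background value."""
    return concentration / background

def cf_class(cf):
    """Assumed Hakanson-style contamination classes."""
    if cf < 1:
        return "low"
    if cf < 3:
        return "moderate"
    if cf < 6:
        return "considerable"
    return "very high"

def class_percentages(cf_values):
    """Percentage of samples falling in each contamination class."""
    counts = Counter(cf_class(cf) for cf in cf_values)
    n = len(cf_values)
    return {label: round(100 * count / n, 2) for label, count in counts.items()}

# Hypothetical element: 48 of 49 samples below background, 1 moderately contaminated.
hypothetical_cf = [0.8] * 48 + [1.4]
shares = class_percentages(hypothetical_cf)
print(shares)
```

With 49 samples, every sample contributes roughly 2 percentage points, so a single sample can move a class share noticeably.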

The modified contamination degree (mCd) suggests a moderate enrichment based on the cumulative average of all the PTEs analyzed. According to the results, the overall enrichment of the urban soil and the resultant impact of PTEs on the soil were pervasive in the study area (see Table 3). Compared with the gradations proposed by Abrahim and Parker (2008), the mCd result for Ni indicates moderate contamination, which might be attributed to a geo-anthropogenic source (a blend of geogenic and anthropogenic sources such as the steel industry). Arsenic and chromium, however, showed a high level of contamination that can be related to parent materials and intensive farming. Cu, Pb, Mn, and Zn showed a very high degree of pollution that might be associated with intensive crop production on urban soil, the application of livestock manure on urban farmland, and the steel industry. Although the parent material's contribution to the higher levels of Cu, Pb, Mn, and Zn in urban soil is undeniable, Bhuiyan et al. (2011) posited that PTEs with higher pollution levels demonstrate an anthropogenic impact.
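As a rough illustration of the grading step, the Python sketch below assumes that mCd is the arithmetic mean of the contamination factors and uses the class limits commonly attributed to Abrahim and Parker (2008). The CF values are hypothetical and the function names are illustrative; the study's own computation may differ in detail.

```python
def modified_contamination_degree(cf_values):
    """Assumed definition: mCd is the mean of the contamination factors."""
    return sum(cf_values) / len(cf_values)

def mcd_class(mcd):
    """Gradations commonly attributed to Abrahim and Parker (2008)."""
    upper_limits = [
        (1.5, "nil to very low"),
        (2, "low"),
        (4, "moderate"),
        (8, "high"),
        (16, "very high"),
        (32, "extremely high"),
    ]
    for upper, label in upper_limits:
        if mcd < upper:
            return label
    return "ultra high"

# Hypothetical contamination factors for seven PTEs at one site.
hypothetical_cf = [0.9, 1.2, 2.8, 3.1, 1.0, 4.2, 2.9]
mcd = modified_contamination_degree(hypothetical_cf)
print(round(mcd, 2), mcd_class(mcd))
```

Averaging lets weakly enriched elements offset strongly enriched ones, which is why a site can grade as moderate overall while individual elements fall into higher classes.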

The Nemerow pollution index map displayed various color patterns indicating differing levels of pollution in the urban soil. The most contaminated area on the PInemerow distribution map is where the steel plant is situated. The areas close to the steel plant also revealed a relatively high pollution pattern, implying that the steel plant is the major pollutant source in its surroundings.

4.3 PCA

Figure 6 shows the projection of the clustered PTEs and the relationships between them. The high r values of Pb and As (PC 1) indicate that they may share the same or a closely related source, which might be more anthropogenic than geogenic (see Fig. 6). Lead and arsenic are agronomically related through agrochemicals such as lead arsenate pesticides or herbicides, which are an essential source of chemicals in urban agricultural soil (Franco-Uría et al. 2009). Previous studies by Nicholson et al. (2003) and Luo et al. (2009) outlined that fertilizer and livestock manures are an essential source of both Pb and As, which is coherent with the present findings in the urban soil. The occurrence of Zn, Mn, and Cu (0.90, 0.71, and 0.67, respectively) may be attributed to a combination of geogenic and anthropogenic sources (liming). Zinc displayed a more definite correlation with the other two PTEs (Mn and Cu). According to Mantovi et al. (2003), Cu and Zn concentrations in soil surge with the application of wastes derived from animal farming and with fertilizer application. The enrichment of the PC 2 PTEs (Cr and Ni) suggests that PC 2 might be controlled primarily by the parent material, with a link to an anthropogenic source. The anthropogenic source of Cr and Ni could be attributed to agricultural fertilizer, in accordance with research carried out by USEPA (2002), which counts them among the central metal contaminants.

Fig. 6 The use of principal component analysis in the projection of PTE components

4.4 CF-PMF receptor model

The dominance of Cr and Ni (factor 1) in the urban soil can be ascribed to geogenic and anthropogenic sources (such as the steel industry, where they are predominantly used for alloy formation, and agriculture-related inputs such as slaked lime). The current results are in accordance with similar studies by Veit et al. (2009) and Saha et al. (2011), which outlined that slaked lime NPK plays a role in enriching soils with Cr. Nevertheless, Zhang et al. (2016) revealed that high Cr concentrations in agricultural soils that surpass the maximum acceptable limit are not limited to agro-related sources; rather, the higher Cr level results from a combination with other factors such as the geogenic source. This assertion of a geogenic contribution was confirmed by Manta et al. (2002) and Mamat et al. (2014). Beyond that, Li et al. (2009) and Liu et al. (2015) mentioned that the Cr concentration in agricultural soil increases when sewage irrigation is applied to farmlands. Industrial activities such as steel production and ore smelting also contribute to Cr and Ni enrichment in soil. Several studies in China, for instance in Gansu province, found that non-ferrous metal and steel production pollutes the soil with Cr and Ni (Chen et al. 2016; Qu et al. 2013).

Copper accumulation in factor 2 may be related primarily to livestock manure (Nicholson et al. 2003), because combining Zn with Cu serves as an additive that improves microbial activity (acting as an antibacterial agent in the gut) (Rosen and Roberts 1996) and controls scours after weaning (Holm 1990). According to Cheng et al. (2014) and Xiong et al. (2010), livestock manure (particularly from pigs) and phosphate fertilizers are rich in Cu, which may eventually contribute to its enrichment in agricultural soil. Manganese enrichment is of geogenic origin, and according to the International Manganese Institute, Mn is the 4th most used element in tonnage after Fe, Al, and Cu (Das et al. 2011). According to Goncalves et al. (2014), Mn is ubiquitous and the 2nd and the 12th most abundant element in the Earth’s crust.

The Pb and Zn in factor 3 might have originated largely from agriculture, vehicular traffic, fuel knocking, and abrasion from car tires. Previous studies by Li et al. (2001) and Tepanosyan et al. (2016) make a similar assertion that Pb and Zn enrichment may be attributed to road traffic, tire abrasion, and fuel knocking, as well as a minimal geogenic source. Most pesticides and herbicides, such as calcium arsenate, lead arsenate, and sodium arsenate, are rich in As and are used in diverse ways in agricultural production. Research conducted by Bhattacharya et al. (2007) attests that agrochemicals of this nature are highly rich in inorganic As. Similar studies stress that livestock manure is also a potential source of As enrichment in the soil (Micó et al. 2006; Fang et al. 2011). Factor 4 is a blend of anthropogenic and geogenic sources.

4.5 Comparison of CF-PMF receptor model to EPA-PMF receptor model

The contamination factor (CF) is one of the most used pollution assessment indices for assessing soil quality and the extent of contamination by specific PTEs in an urban area. According to Kowalska et al. (2018), CF is one of the most analytical techniques for assessing the contamination level of a specific PTE. The positive matrix factorization (PMF) receptor model is a robust receptor model, but the hybridization of PMF and CF increases the source apportionment efficiency and minimizes the error. A comparative assessment of the hybrid model and PMF showed that the hybrid model consistently performed better than the parent model. The estimated coefficient of determination (R2), root mean square error (RMSE), and mean absolute error (MAE) indicated that CF-PMF showed superior performance for all seven PTEs evaluated (Table 4). All the PTEs analyzed with CF-PMF have a lower error level than with PMF. According to Li et al. (2016), for a model to be deemed good, its R2 value should be 0.75 or greater, while the RMSE and MAE values should be close to 0. Evaluating the models on these criteria, it was evident that the CF-PMF receptor model performed better than the PMF receptor model. The CF-PMF receptor model is an improvement on the PMF receptor model that raises the efficiency of source apportionment estimation and decreases the marginal error significantly. The RMSE values of the CF-PMF receptor model for As, Cr, Cu, Mn, Ni, Pb, and Zn are 11.56, 97.85, 17.30, 527.26, 37.16, 32.12, and 68.02 lower than those of the EPA-PMF receptor model, respectively. Similarly, the MAE values of the CF-PMF receptor model for As, Cr, Cu, Mn, Ni, Pb, and Zn are 11.58, 95.00, 17.26, 520.85, 37.04, 32.13, and 68.03 lower than those of the EPA-PMF receptor model, respectively.
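The three evaluation criteria can be written out in a few lines of Python. This is a minimal sketch of the standard definitions only; it does not reproduce the study's regression or cross-validation setup. The observed and predicted values, as well as the labels `model_a` and `model_b`, are hypothetical.

```python
import math

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mae(observed, predicted):
    """Mean absolute error."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n

# Hypothetical observed concentrations and predictions from two models.
observed = [10.0, 20.0, 30.0, 40.0]
model_a = [12.0, 18.0, 33.0, 39.0]   # stands in for a baseline model
model_b = [10.5, 19.5, 30.5, 39.5]   # stands in for an improved model

for name, pred in (("model_a", model_a), ("model_b", model_b)):
    print(name, round(r_squared(observed, pred), 3),
          round(rmse(observed, pred), 3), round(mae(observed, pred), 3))
```

A model is preferred when its R2 is higher and its RMSE and MAE are lower. RMSE penalizes large individual errors more heavily than MAE, so reporting both shows whether an improvement comes from fewer outliers or from uniformly smaller errors.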

4.6 Potential health risk

The non-carcinogenic intake (CDItotal) of adults and children is presented in Table 5. The CDItotal values for children are somewhat higher than those for adults, since children are more exposed and more prone to possible harm than adults. For both children and adults, the CDI for oral ingestion was the highest of the CDIs computed. Earlier reports by Fang et al. (2011) and Karim and Qureshi (2014) and a more recent report by Bhuiyan et al. (2021) confirm this result and report that ingestion remains the main exposure pathway of PTEs into the human body. In the present study, the total HI value estimated for children is 9.5 times higher than that for adults (see Table 5). This suggests that children are more susceptible and more sensitive to the health effects of PTEs, because their hand-to-mouth behavior tends to increase their rate of exposure. Numerous health risk studies have reported similarly high HI results for children (Baltas et al. 2020; Rinklebe et al. 2019; Varol et al. 2020; Wu et al. 2018). For instance, Varol et al. (2020) reported 8.44E − 01 for children and 9.85E − 02 for adults, and Baltas et al. (2020) reported 1.21 for children and 0.131 for adults. The computed HI for children was 0.95, which is below the threshold of 1; it is therefore unlikely that the PTEs have a negative impact on an exposed individual (Kusin et al. 2018). Similarly, the calculated HI for adults is below the threshold of 1, which indicates that non-carcinogenic effects on an exposed adult are unlikely. It is vital to note that the HI is the sum of all the elemental HQs assessed; therefore, if most HQs are high, the HI will be high, and vice versa.

The carcinogenic CDItotal for children is higher than that for adults, irrespective of the PTE considered. Children are open to more exposure pathways than adults, and their exposure to PTEs leads to diverse health issues such as cardiovascular disease, poor respiratory function, neurodevelopmental deficits, skeletal damage, and reproductive toxicity (Madrigal et al. 2018). The CDI for oral ingestion was also higher than the other computed CDIs for both adults and children. Moreover, the CDItotal for children was higher than that for adults (Table 6). The CDItotal and TCR for As were lower for adults than for children. The computed TCR for children was 2.33 times higher than that for adults. The total carcinogenic risk for adults and children was 6.9E − 06 and 1.61E − 05, respectively. Both values indicate that the carcinogenic health risk within the study area falls within the acceptable limits (TCR values should range between 1 × 10−6 and 1 × 10−4). Therefore, residents living on the urban soils of the study area are unlikely to face a carcinogenic health risk.
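The screening logic used in this section reduces to two comparisons: HI against the threshold of 1 and TCR against the range 1 × 10−6 to 1 × 10−4. A minimal Python sketch with the values reported above follows; the function and constant names are illustrative, not the study's.

```python
HI_THRESHOLD = 1.0                  # non-carcinogenic threshold used in the text
TCR_LOWER, TCR_UPPER = 1e-6, 1e-4   # acceptable TCR range quoted in the text

def non_carcinogenic_concern(hi):
    """True when the hazard index reaches or exceeds the threshold of 1."""
    return hi >= HI_THRESHOLD

def tcr_within_acceptable_range(tcr):
    """True when the total carcinogenic risk lies inside the quoted range."""
    return TCR_LOWER <= tcr <= TCR_UPPER

# Values reported in the text.
hi_children = 0.95
tcr_adults = 6.9e-06
tcr_children = 1.61e-05

ratio = tcr_children / tcr_adults
print(non_carcinogenic_concern(hi_children))
print(tcr_within_acceptable_range(tcr_adults),
      tcr_within_acceptable_range(tcr_children))
print(round(ratio, 2))
```

The ratio of the two TCR values reproduces the reported factor of about 2.33 between children and adults.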

Table 6 The carcinogenic risk index of PTEs in soils in the study area

4.7 Spatial prediction of PTEs

The concentrations of Cu and Ni indicate that their enrichment can be attributed primarily to the steel industry and agro-related sources (livestock manure); this is coherent with previous research conducted by Facchinelli et al. (2001). Moreover, Salonen and Korkka-Niemi (2007) outlined that some PTEs such as Ni and Cu can be present in the parent materials of the soil, with minute temporal and spatial distribution in soils worldwide. The enrichment of As in the soil is due to the potential application of lead arsenate and sodium arsenate to boost yield (e.g., in fruits and potatoes), which raises the levels of the PTEs (Frank et al. 1976). Manganese is of natural origin, but the continuous application of manganese sulfate to agricultural soil to increase the yield of crops such as vegetables and beans elevates the concentration levels of PTEs (Frank et al. 1976). The enrichment of Cr is due to a multiplicity of anthropogenic sources. The Cr hotspot on the map results from the steel industry's use of chromium in alloy formation as well as from sewage discharge. Goovaerts (1997) noted that the geochemical background of PTEs such as Cr, Cu, and Ni is generally normal, but their elevation in soils may sometimes be influenced by agro-anthropogenic sources. The spatial distribution of Zn and Pb is linked primarily to agricultural fertilizer, vehicular traffic, and fuel knocking. This is coherent with previous research by Kachenko and Singh (2006), Perez-de-Mora et al. (2006), and Rodríguez et al. (2008), who reported that higher Pb and Zn levels in urban agricultural soil result from human-related activities. According to Hettiarachchi and Pierzynski (2004), McBride et al. (2014), and Brown et al. (2016), Pb pollution is one of the critical concerns at almost half of the US Environmental Protection Agency (EPA) Superfund sites.

5 Conclusion

This study showed that the concentrations of some PTEs, such as Cr, As, Mn, Pb, and Zn, exceeded the WAV, whereas the mean concentrations of As, Pb, and Zn were higher than the UCC. In addition, the concentrations of Cu, Mn, Pb, and Zn exceeded the tolerable EAV limit. The PCA established the prime source of pollution in the study area and clarified it with a significant statistic of 74.23%. It suggested that the pollution has multiple origins, both anthropogenic (mostly agricultural practices and the steel industry) and geogenic. The CF-PMF pollution assessment-based receptor model yielded four factors, and the source distribution revealed the dominance of Ni and Cr (factor 1); Cr, Mn, and Cu (factor 2); As, Zn, and Pb (factor 3); and Ni, Mn, and Cu (factor 4). The contamination factor exhibited a low to medium level of contamination for all the PTEs except Pb, Zn, and Cu, which also displayed a considerable contamination level. The contamination degree indicated that Ni was considerably contaminated and that the contamination degrees of the other PTEs (Cr, Cu, As, Pb, Zn, and Mn) were considerably high. The mCd, however, specified that Ni was rather moderately contaminated. In addition, it placed Cr and As in the same high contamination degree category, and Cu, Pb, Zn, and Mn in the very high category for the urban soil.

Nevertheless, the Nemerow pollution index displayed a low to moderate pollution pattern overall, while the northeastern (Trinec and Bystrice) and certain parts of the central regions (Bystrice and Jablunkov) fell in the moderate to high pollution classes. The risk assessment of both non-carcinogenic and carcinogenic health effects for adults and children suggested that exposure to PTEs is unlikely to have a negative effect and that there is no carcinogenic risk to the residents of the study area. The spatial distribution of PTEs in the study area suggested hotspots along Trinec, Bystrice, Jablunkov, and Hrcava. Continuous use of agro-related inputs and other anthropogenic factors, such as the steel industry, is likely to raise the PTE levels in the urban soil.

The comparative assessment of the novel CF-PMF receptor model against PMF showed that combining CF with PMF improved the receptor model's accuracy. Multiple linear regression analysis of both the EPA-PMF model and the CF-PMF model, evaluated by cross validation with the coefficient of determination (R2), root mean square error (RMSE), and mean absolute error (MAE), consistently showed that the error level was reduced significantly across all the PTEs analyzed. The CF-PMF receptor model has been shown to be effective and useful in identifying and apportioning the percentage contribution of the PTEs under investigation.

In parallel with the health risk assessment, the pollution assessment and the CF-PMF receptor model highlighted hotspots and risk-prone areas within the urban area, which are of great concern to the communities under investigation. It is therefore important for the cities to take pragmatic measures to protect the soil from PTE accumulation.