Introduction

Polycyclic aromatic hydrocarbons (PAHs) are a diverse group of toxic substances. They are produced primarily by incomplete combustion, predominantly from anthropogenic sources, including fossil fuel, biofuel, and vegetation fires (Bortey-Sam et al. 2014; Wang et al. 2017c; Tarafdar and Sinha 2017; Klimkowicz-Pawlas et al. 2017). Their physicochemical properties (long-range transport potential and bioaccumulation potential) make them particularly important (Yap et al. 2014; Li et al. 2015). A large number of these substances are mutagenic, carcinogenic, and toxic (Nakonechny et al. 2019). Based on proven mutagenicity and carcinogenicity, the US Environmental Protection Agency (USEPA) has selected sixteen PAHs and defined them as priority pollutants (Yap et al. 2014; Zheng et al. 2014; Wang et al. 2017c; Wang et al. 2018a). Soil is considered an excellent environmental reservoir for these toxic chemicals; it can provide useful information on the presence of PAHs in the environment and reflects their ubiquitous character (Hassanin and Lee 2005; He et al. 2008; Wang et al. 2017b; Feng et al. 2019). Several studies have been carried out to evaluate the levels of these pollutants in surface soil (Niederer 1998; Jiang et al. 2009; Navarro-Ortega et al. 2012; Larsson et al. 2013; Bortey-Sam et al. 2014; Li et al. 2015; EL-Saeid et al. 2015; Wang et al. 2015; Cai et al. 2017; Sosa et al. 2019; Ukalska-Jaruga et al. 2019; Wang et al. 2019; Yurdakul et al. 2019; Li et al. 2019).

PAHs are classified into two groups according to the number of aromatic rings they contain: low molecular weight PAHs (LMWPAH), with two or three aromatic rings, and high molecular weight PAHs (HMWPAH), with at least four aromatic rings (Wang et al. 2017a; Gereslassie et al. 2018). HMWPAH are more persistent in the environment, more carcinogenic, and more difficult to degrade than LMWPAH (Doick et al. 2005b; Ranjan et al. 2012; Wang et al. 2017a). PAHs have deleterious effects on health, such as malformations, mutagenesis, and endocrine disruption (Council 2008; Haddaoui et al. 2016; Wang et al. 2017b; Gereslassie et al. 2018). It has been reported that several factors, mainly the concentration of PAHs, the duration of exposure, and the route of exposure (inhalation, ingestion, or skin contact), directly influence the effects of PAHs on human health (Wang et al. 2015; Manoli et al. 2016; Wang et al. 2017a).

Kim et al. (2013) reported that seven PAH compounds have been classified by the USEPA as probable human carcinogens: benz(a)anthracene, benzo(a)pyrene, benzo(b)fluoranthene, benzo(k)fluoranthene, chrysene, dibenz(ah)anthracene, and indeno(1,2,3-cd)pyrene. For this reason, several studies have paid particular attention to these seven compounds (Wang et al. 2015; Daso et al. 2016).

Apart from sampling and sample storage, sample preparation is an error-prone step in the analytical cycle (Gfrerer et al. 2002). Therefore, special care must be taken in choosing and optimizing extraction methods and cleanup procedures. PAHs are hydrophobic organic compounds characterized by high octanol-water partition coefficients (Kow) (Navarro-Ortega et al. 2012; Kim et al. 2013).

In solvent-solute interactions, the key parameter related to polarity is the dielectric constant (Kronholm et al. 2003). A low dielectric constant favors the solubility of low polarity compounds like PAHs, while a high dielectric constant favors the solubility of high polarity compounds. The dielectric constant of water is about 78.5 at room temperature, decreases to about 50 at 100 °C, and reaches about 28 at high temperature (~ 250 °C) under 50 bar, which is very close to the dielectric constants of ethanol and methanol (24 and 33, respectively, at 25 °C) (Maryott 1956; Teo et al. 2010; Richter and Raynie 2012; Islam et al. 2012). Other physicochemical characteristics of water, such as viscosity and surface tension, are also modified by temperature. Hot water can therefore dissolve a wide range of low and medium polarity compounds (Bursać Kovačević et al. 2018). Considering the importance and multiple advantages of hot water in the extraction process, such as the reduced consumption of organic solvents, its availability, and its non-toxicity, the insertion of a maceration step becomes justifiable. To this end, this paper proposes a preliminary hot-water maceration step followed by conventional liquid-liquid extraction with a water-immiscible organic solvent. Finally, special attention should be paid to the cleanup procedure prior to the final analysis (Oleszczuk and Baran 2004; Wang et al. 2013). In our study, the cleanup procedure was optimized using SPE cartridges.

The statistical analysis, in particular, principal component analysis (PCA), provides reliable results and has been widely applied for the determination of the probable source of PAHs in soil (Sharma et al. 2007; Navarro-Ortega et al. 2012; Feng et al. 2017; Klimkowicz-Pawlas et al. 2017; Wang et al. 2017a; Li et al. 2019).

The main objective of this study is to investigate the level of PAH contamination in one of the most industrialized areas in Algeria by considering the 16 most toxic PAHs. The pollution sources were then identified and the risk to human health assessed. To achieve these objectives, this paper describes the systematic procedure used to collect the soil samples. The samples were extracted with an adapted and original methodology, based on maceration in hot water followed by conventional liquid-liquid extraction with a water-immiscible organic solvent, and the results of the analysis are presented. PCA and molecular ratios are used to identify the probable sources of the 16PAHs. Finally, the toxic potency, the ecological risk, and the human health risk are evaluated quantitatively for each of the investigated soils.

Materials and methods

Soil sampling

The study area, located on the east side of Algiers city, is an industrialized area with a petrochemical industrial zone and numerous chemical and pharmaceutical factories. The residential area in this region is also affected by heavy traffic, and there are many transport facilities (airport, motorways, and railways). In order to understand the spatial variation of 16PAH concentrations over the studied area, the sampling sites were classified into four categories (rural, suburban, urban, and industrial-urban) based on the nature of the sampling location (agricultural or urban fields) and the distance between the sampling points and the organized industrial zones (Table S1 in Supplementary Materials, SM). The prevailing winds in the region are northerly, northwesterly, and northeasterly.

Fifty-five soil samples were collected from April to May 2017 over a total studied area of 72 km2. The topsoil samples (0–10 cm depth) were collected in east Algiers around a petrochemical industrial area on a 1 × 1 km grid. At every sampling location, five subsamples were collected to reach a total sample mass of almost 1000 g. The subsamples were collected with a stainless steel spade and mixed thoroughly into a single composite sample to reduce random variation and produce a representative sample. The location of each sampling point was recorded using a global positioning system (GPS; Garmin MapSource). The soil samples were stored in brown glass bottles at − 40 °C until extraction.

Large particles and organic debris present in the samples were removed by sieving the samples on a 0.5 mm mesh. The moisture content of soil was determined by weighing 20 g sample before and after drying at 105 °C in an oven for 24 h, and organic matter content was determined by loss on ignition in a muffle furnace at 600 °C for 4 h.
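The two gravimetric determinations described above reduce to simple mass-difference ratios. The following is a minimal sketch only: the function names are hypothetical, and the choice of reference mass (wet mass for moisture, oven-dry mass for loss on ignition) is an assumption, since the text does not state which basis was used.

```python
def moisture_content_percent(wet_mass_g, dry_mass_g):
    """Moisture as a percentage of the wet sample mass (assumed basis)."""
    return 100.0 * (wet_mass_g - dry_mass_g) / wet_mass_g


def loss_on_ignition_percent(dry_mass_g, ignited_mass_g):
    """Organic matter estimated as mass lost on ignition, relative to the
    oven-dry mass (assumed basis)."""
    return 100.0 * (dry_mass_g - ignited_mass_g) / dry_mass_g


# Illustrative masses only (not measured values from this study):
# a 20 g sample weighing 17 g after drying at 105 °C.
example_moisture = moisture_content_percent(20.0, 17.0)
```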

Chemicals, solvents, and materials

The 16 EPA PAHs were purchased from Restek as a mixed standard reference solution of 2000 μg/mL in dichloromethane (Cat. 31011, Restek, Bellefonte, PA), including naphthalene (NAP), acenaphthylene (ACY), acenaphthene (ACE), fluorene (FLU), phenanthrene (PHE), anthracene (ANT), fluoranthene (FLT), pyrene (PYR), benzo(a)anthracene (B(a)ANT), chrysene (CHR), benzo(b)fluoranthene (B(b)FLT), benzo(k)fluoranthene (B(k)FLT), benzo(a)pyrene (B(a)P), indeno(1,2,3-cd)pyrene (INP), dibenzo(a,h)anthracene (DBA), and benzo(g,h,i)perylene (B(g,h,i)P). 2-Fluorobiphenyl (2-FBP) was purchased from LGC Standards and used as a surrogate standard. Intermediate solutions (containing 25 μg/mL of each PAH) were prepared by diluting the standard reference solution of 16PAHs with isooctane. Dichloromethane (DCM), n-hexane, and isooctane were high-performance liquid chromatography (HPLC) grade. SPE Bond Elut-C8 3-mL cartridges packed with 500 mg of adsorbent were purchased from Agilent (USA), and SPE HYPERSEP SI 500 mg/3 mL, HYPERSEP Verity-CX 500 mg/3 mL, and Florisil SPE Column 500 mg/3 mL were purchased from Thermo Fisher Scientific.

Sample preparation

Sequential solvent extraction

In ultrasonic extraction methods for non-volatile and semi-volatile organic compounds in a solid matrix such as soil, the solvent volume (in mL) is generally three to four times the sample mass (in g); for example, the USEPA (2007) method uses 30 g of soil and 100 mL of solvent. In our study, we used 10 g of soil and systematically used 30 mL of water. The water temperature was varied between 50 and 100 °C, and the recovery rates did not vary significantly above 75 °C, so the procedure was as follows:

Ten grams of the prepared soil were weighed in a 50 mL brown glass vial with a PTFE cap. The samples were spiked with the surrogate standard 2-FBP to monitor the analytical recovery efficiencies. After 1 day of equilibration, 30 mL of hot water (75 °C) were added and the mixture was shaken for 30 min on a mechanical shaker at 60 rpm. Then, 30 mL of dichloromethane (DCM) were added to the mixture (soil and water); the samples were shaken for 30 min and then sonicated for 30 min. After sonication, the vials were centrifuged for 10 min at 3500 rpm. The supernatant was then recovered in glass vials, the process was repeated two times, and the extracts were combined. The organic phase was recovered by liquid-liquid extraction. The extracts were then reduced to a final volume of less than 200 μL using a rotary evaporator and a flow of high purity N2, and 2 mL of isooctane were added to recover the extract. The extract was then purified by solid-phase extraction (SPE cartridge) before analysis. After cleanup, 1.5 mL of the total extract volume was transferred to GC vials for injection. For each batch of 6 samples, spiked reagent blanks (surrogate standards spiked into solvent) were analyzed to confirm the absence of interference. In addition, an uncontaminated soil spiked with the surrogate standard and the 16 PAHs at 16.5 ng g−1 was analyzed for quality control.

Cleanup of extraction solution

SPE cartridges were used to reduce interferences in the soil extract. The Bond Elut C8, 500 mg 3 mL SPE cartridge, was applied. The solid-phase extraction cartridges were conditioned with 9 mL of DCM, and the adsorbent must not be dried. An aliquot of extract samples was then re-dissolved in 2 × 1 mL of isooctane, loaded to the SPE cartridge, and recovered by vacuum aspiration.

Instrumentation

Samples were analyzed for PAHs using a GC/MS system. For this purpose, a Thermo Fisher Scientific GC/MS spectrometer equipped with electron impact ionization (EI) and a multi-mode inlet (MMI), PTV, and split/splitless injectors was used. Target substances were analyzed in selected ion monitoring (SIM) mode and quantified by the internal standard method. The PTV injector was set to 250 °C, the injection mode was CT splitless with surge (340 kPa), the capillary column was an HP-5MS (30 m × 0.25 mm, 0.25 μm film of stationary phase), and He at 1 mL/min was used as the carrier gas. The oven temperature was initially set to 90 °C (1 min), then increased at 25 °C/min to 280 °C and at 4 °C/min to 300 °C, which was held for 5 min. The transfer-line temperature was 300 °C. The ion source temperature was 250 °C. The ionization energy was set to 70 eV. Xcalibur software was used for instrument control and data processing. Identification was based on the retention time of the quantitative ion for each analyte, compared between the samples and the standard solution. Quantification was performed by a six-point internal calibration method.

Validation of method–limit of detection, limit of quantification, and linearity

It is reported that the addition of a surrogate standard before the implementation of the various stages of the sample processing makes it possible to overcome any losses inherent to the sample preparation (Skoog 2003).

The reliability of the method was established by validating several parameters. The limit of detection (LOD) and the limit of quantification (LOQ) were obtained as 3 and 10 times the signal-to-noise ratio, respectively (Table S2 SM). For the 16 PAHs, the LOD and LOQ range between 0.1–1.5 ng mg−1 and 0.7–3.1 ng mg−1, respectively.

The linearity is tested by considering five solutions containing the 16PAHs at a concentration ranging from 0.2 to 5 mg L−1 analyzed 5 times within 5 different days. The linear model is validated by the determination coefficient (R2) (Fig. S1 in SM), which is systematically greater than 0.98 and residual errors distributed around 0% with an absolute value systematically lower than 60% for the solution with the lowest PAHs concentration, and lower than 20% for the others (Fig. S2 in SM).
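The linearity check described above can be sketched as an ordinary least-squares fit followed by the determination coefficient and the relative residuals of the back-calculated concentrations. This is an illustrative sketch only: the function names are hypothetical, the intermediate calibration levels and responses are invented for the example, and the software actually used for the fit is not specified here.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x, y, slope, intercept):
    """Determination coefficient R2 of the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def relative_residuals_percent(x, y, slope, intercept):
    """Relative error (%) of each back-calculated concentration vs. nominal."""
    return [100.0 * ((yi - intercept) / slope - xi) / xi for xi, yi in zip(x, y)]


# Hypothetical calibration levels (mg/L) and instrument responses.
levels = [0.2, 0.5, 1.0, 2.0, 5.0]
responses = [1.4, 2.0, 3.0, 5.0, 11.0]  # exactly 2 * level + 1
slope, intercept = fit_line(levels, responses)
r2 = r_squared(levels, responses, slope, intercept)
residuals = relative_residuals_percent(levels, responses, slope, intercept)
```

With real data, the acceptance criteria quoted above (R2 greater than 0.98, residuals within 60% at the lowest level and 20% elsewhere) would be applied to `r2` and `residuals`.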

Linear models are then used to analyze 5 replicates of spiked samples at 0.2 mg/L, 1 mg/L, and 5 mg/L in isooctane within 5 different days. The calculated amounts are used to estimate the recoveries and the uncertainties with an alpha risk of 5%. For most compounds and levels, a 100% recovery value is included in the confidence interval, which indicates good reliability of the measurement. Finally, the five amounts determined for the three levels of spiked sample are used to calculate the uncertainties for each level. Linear models are applied to estimate the uncertainties for the other concentrations in the range and are finally used to determine them for the unknown samples (Fig. S3 in SM).

Optimization of the extraction method

Optimization experiments were conducted to study the effect of hot water on analyte (16PAHs) recovery. The recovery rates were compared between our protocol, which adds a preliminary maceration step in hot water at 75 °C before adding DCM, and a protocol using an organic solvent alone. The effect of sorbent polarity was also evaluated using four types of SPE cartridge of different polarity (SPE Bond Elut-C8, SPE HYPERSEP SI, HYPERSEP Verity-CX, and Florisil SPE Column). All these cartridges are packed with 500 mg of adsorbent. Finally, the conditioning solvent of the cartridge was optimized by testing three solvents, namely DCM, hexane, and isooctane. For the optimization study, ten 10 g portions of uncontaminated soil were spiked with 2 ng g−1 of the 16 native PAHs and 1 ng g−1 of surrogate standard.

Statistical analysis and mapping

Statistical analysis was performed with Microsoft Excel (Microsoft Inc., USA). Principal component analysis (PCA) was applied using the XLSTAT software 2014.5.03 to identify the possible sources of PAHs. The input variables were the 16PAH concentrations measured in 55 soil samples. Varimax rotation with Kaiser normalization was used to identify the contribution of variables to the formation of the factorial axes. Only major principal components and significant points (samples) having factor loadings higher than 0.5 (Cos2 ≥ 0.5) were extracted. In data analysis, PAH concentrations below the limit of detection were replaced by half of the method detection limits. The spatial distributions of the Σ16PAH concentrations were mapped with MapInfo version 8.0. To create a continuous contour map of PAH contamination in soil, the inverse distance weighted (IDW) interpolation method was used, because the data are unpredictable and can vary greatly over short distances (Wu et al. 2016).
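For readers unfamiliar with IDW, the idea is that the value at an unsampled location is a weighted mean of the sampled values, with weights that decay with distance. The following is a minimal sketch, not the MapInfo implementation: the function name, the coordinates, the concentrations, and the power exponent of 2 are all assumptions made for illustration, as the interpolation settings are not stated in the text.

```python
import math


def idw_estimate(x, y, samples, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    samples is a list of (xi, yi, value) tuples; power is the distance
    exponent (2.0 is a common default, assumed here).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, value in samples:
        distance = math.hypot(x - xi, y - yi)
        if distance == 0.0:
            return value  # exactly on a sampling point
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical sampling points: (x in km, y in km, concentration in μg/kg).
points = [(0.0, 0.0, 100.0), (2.0, 0.0, 300.0)]
midpoint_value = idw_estimate(1.0, 0.0, points)
```

Because the weights depend only on distance, a location equidistant from two samples receives their plain average, and nearby samples dominate the estimate elsewhere.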

Quality control

Two quality control (QC) criteria were applied to ensure the correct identification of the 16PAHs: (1) a retention time identical to that of the standard compound and (2) a signal-to-noise ratio greater than 3:1. To confirm the absence of interferences or cross-contamination, a blank was prepared with each batch of six samples. In addition, during a GC run, solutions corresponding to two points of the calibration curve and a QC sample were analyzed after each sequence of six samples.

Risk assessment

Toxicity risk

In this study, the toxic potency of each soil sample was evaluated through the toxic equivalence quantity (TEQ), which is the sum of the B(a)P equivalent concentrations. TEQ values were calculated according to Eq. (1).

$$ TEQs=\sum_{i}\left(\left[{PAH}_i\right]\times {TEF}_{PAH_i}\right) $$
(1)

where [PAHi] is the concentration of PAH congener i and TEFPAHi is the toxicity equivalency factor (TEF) of PAH congener i (Manoli et al. 2016; Tarafdar and Sinha 2017; Wang et al. 2018b).

TEF was 0.001 for NAP, ACY, ACE, FLU, PHE, FLT, and PYR, was 0.01 for ANT, CHR, and B(g,h,i)P, was 0.1 for B(a)ANT, B(b)FLT, B(k)FLT, and INP, and was 1 for DBA and B(a)P (Yurdakul et al. 2019).
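As an illustration of Eq. (1), the TEQ of a sample can be computed as a TEF-weighted sum. The TEF values below are those listed in the preceding paragraph; the function name and the example concentrations are hypothetical and are not data from this study.

```python
# Toxicity equivalency factors as listed in the text (Yurdakul et al. 2019).
TEF = {
    "NAP": 0.001, "ACY": 0.001, "ACE": 0.001, "FLU": 0.001,
    "PHE": 0.001, "FLT": 0.001, "PYR": 0.001,
    "ANT": 0.01, "CHR": 0.01, "B(g,h,i)P": 0.01,
    "B(a)ANT": 0.1, "B(b)FLT": 0.1, "B(k)FLT": 0.1, "INP": 0.1,
    "DBA": 1.0, "B(a)P": 1.0,
}


def teq(concentrations):
    """Sum of B(a)P-equivalent concentrations, Eq. (1).

    concentrations maps a PAH abbreviation to its concentration; the
    result is in the same unit as the inputs (e.g. μg/kg).
    """
    return sum(conc * TEF[name] for name, conc in concentrations.items())


# Hypothetical sample (μg/kg), for illustration only.
sample = {"B(a)P": 50.0, "CHR": 100.0, "NAP": 1000.0}
sample_teq = teq(sample)  # 50*1 + 100*0.01 + 1000*0.001
```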

Ecological risk

In this paper, the estimation of the ecological toxicity of the PAH compounds in soil samples was carried out by determination of the risk quotient (RQ) of PAHs. This parameter (RQ) is calculated with Eq. (2) (Sun et al. 2015; Sun et al. 2020):

$$ \mathrm{RQ}=\frac{\mathrm{Monitoring}\ \mathrm{concentration}\ \mathrm{of}\ i}{\mathrm{Allowable}\ \mathrm{concentration}\ \mathrm{of}\ i}=\frac{C_{PAHi}}{C_{QV}} $$
(2)

where

CPAHi: The concentration of PAH congener i;

CQV : The corresponding quality value of PAH i in soil (allowable concentration).

Since the allowable concentration is not reported in the literature for all 16 PAHs, Cao et al. (2010) proposed replacing CQV by the negligible concentrations (NCs) to calculate RQNCs and by the maximum permissible concentrations (MPCs) to obtain RQMPCs.

Below the NC concentrations of PAHs in the environment, the ecological risk is considered to be negligible. Above the MPC concentrations of PAHs in the environment, the ecological risk is considered unacceptable to ecosystems (Kalf et al. 1997).

The risk quotients RQNCs and RQMPCs are defined as follows:

$$ {RQ}_{NCs}=\frac{C_{PAHs}}{C_{QV(NCs)}} $$
(3)
$$ {RQ}_{MPCs}=\frac{C_{PAHs}}{C_{QV(MPCs)}} $$
(4)

where CQV(NCs) is the quality values of the NCs of PAHs in the medium and CQV(MPCs) is the quality values of the MPCs of PAHs in the medium.

The risk quotient RQ∑(PAHs) of the 16PAHs, the risk quotient RQ∑(PAHs(NCs)) of the total NCs, and the risk quotient RQ∑(PAHs(MPCs)) of the total MPCs are defined as follows:

$$ {RQ}_{\sum (PAHs)}=\sum_{i=1}^{16}\left({RQ}_i\right),\quad {RQ}_i\ge 1 $$
(5)
$$ {RQ}_{\sum \left(PAHs(NCs)\right)}=\sum_{i=1}^{16}\left({RQ}_{iNCs}\right),\quad {RQ}_{iNCs}\ge 1 $$
(6)
$$ {RQ}_{\sum \left(PAHs(MPCs)\right)}=\sum_{i=1}^{16}\left({RQ}_{iMPCs}\right),\quad {RQ}_{iMPCs}\ge 1 $$
(7)

The classification given by Cao et al. (2010) was used in this study to classify the ecological risk of individual PAHs and of ∑16 PAHs (Table S3 in SM).
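The risk-quotient calculations of Eqs. (2) to (7) can be sketched as follows. This is a hedged illustration: the function names are hypothetical, and the quality values in the example are placeholders chosen for easy arithmetic, not the NC or MPC values of Kalf et al. (1997), which would have to be supplied by the user.

```python
def risk_quotient(concentration, quality_value):
    """RQ of one congener: measured concentration over its quality value
    (NC or MPC), as in Eqs. (2) to (4)."""
    return concentration / quality_value


def summed_risk_quotient(concentrations, quality_values):
    """Sum of individual RQs, keeping only those with RQ >= 1,
    as in Eqs. (5) to (7)."""
    total = 0.0
    for name, conc in concentrations.items():
        rq = risk_quotient(conc, quality_values[name])
        if rq >= 1.0:
            total += rq
    return total


# Placeholder concentrations and quality values (same unit), illustration only.
measured = {"congener_A": 10.0, "congener_B": 1.0}
placeholder_nc = {"congener_A": 5.0, "congener_B": 2.0}
rq_sum_nc = summed_risk_quotient(measured, placeholder_nc)
```

Calling `summed_risk_quotient` once with the NC values and once with the MPC values gives the two totals that are then compared with the classification thresholds.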

Human health risk

The evaluation of carcinogenic risks associated with chemical exposure to pollutants was based on the deterministic approach from USEPA’s Risk Assessment Guidance for Superfund (RAGS) (USEPA 2004; Cachada et al. 2012; Li et al. 2015; Li et al. 2018; Guidance 2018). Our sampling sites are in a residential area located around an industrial zone. For this reason, potential exposure to the seven carcinogenic PAHs (7CarPAHs) in soils in residential areas was calculated separately by exposure route (ingestion, dermal absorption, and inhalation of particles) as described in the USEPA baseline equations (USEPA 2019). For calculation purposes, the concentrations of the 7CarPAHs of each soil sample were converted to mg/kg. The total human cancer risk associated with the 7CarPAHs was calculated from the chronic oral slope factors (CSFo) and the chronic inhalation unit risks (IUR) established by USEPA (2019). Given the lack of a chronic dermal slope factor (CSFd) for these compounds, this parameter was replaced by the chronic oral slope factor (CSFo) in the calculation of the carcinogenic risk from dermal absorption. Therefore, the carcinogenic risks for ingestion and dermal absorption were calculated by multiplying the predicted oral or dermal exposure by the CSFo (Li et al. 2015; USEPA 2019). For the third exposure route, the cancer risks for inhalation of soil particulates were calculated by multiplying the predicted exposure by the chronic IUR. The total lifetime cancer risks (TLCR) were obtained by summing the individual risks of the three exposure routes. The carcinogenic risk of the 7CarPAHs was calculated individually for each of the seven PAH congeners for which CSFo and IUR values have been established by USEPA (2019), and the sum of these risks was taken as the cancer risk of PAHs (TLCR).

The residential exposure cancer risks were calculated by Eqs. (8), (9), (11), and (13). The values of the parameters used in the health risk assessment models, as reported by Li et al. (2015) and USEPA (2019), are given in Tables S4 and S5 in Supplementary Material (SM).

$$ TLCR={TR}_{\mathrm{ingestion}}+{TR}_{\mathrm{inhalation}}+{TR}_{\mathrm{Dermal}\ \mathrm{contact}} $$
(8)
$$ \kern0.5em {TR}_{\mathrm{ingestion}}=\frac{CSF_o\times {C}_s\times {10}^{-6}\times EF\times {IFS}_{adj}}{AT\times LT} $$
(9)

where

$$ {IFS}_{\mathrm{adj}}=\frac{IRS_c\times {ED}_c}{BW_c}+\frac{IRS_a\times {ED}_a}{BW_a} $$
(10)
$$ {TR}_{\mathrm{Dermal}\ \mathrm{contact}}=\frac{CSF_o\times {C}_s\times {10}^{-6}\times EF\times {DFS}_{\mathrm{adj}}\times {ABS}_d}{AT\times LT\times GIABS} $$
(11)

where

$$ {DFS}_{\mathrm{adj}}=\frac{SA_c\times {ED}_c\times {AF}_c}{BW_c}+\frac{SA_a\times {ED}_a\times {AF}_a}{BW_a}\kern0.75em $$
(12)
$$ {TR}_{\mathrm{inhalation}}=\frac{ED_r\times {C}_s\times EF\times ET\times IUR\times \left(\frac{1}{PEF}\right)}{AT\times LT} $$
(13)

where

TRIngestion: total cancer risk through ingestion

TRDermal contact: total cancer risk through dermal contact

TRInhalation: total cancer risk through inhalation

CSFo: chronic oral slope factor (mg/kg-d)−1

Cs: constituent concentration in soil (mg/kg)

AT: averaging time (d/year)

LT: lifetime (year)

EF: exposure frequency (d/year)

IFSadj: resident soil ingestion rate, age-adjusted (mg-year/kg-d)

10−6: correction factor (kg/mg)

BWc: average body weight, child, ages 1–6 (kg)

BWa: average body weight, adult, ages 7–31 (kg)

EDc: resident exposure duration, child, ages 1–6 (year)

EDa: resident exposure duration, adult, ages 7–31 (year)

EDr: exposure duration (year)

ETrs: soil exposure time (h/h)

IRSc: ingestion rate of soil, child, ages 1–6 (mg/d)

IRSa: ingestion rate of soil, adult, all other ages (mg/d)

PEF: particulate emission factor (m3/kg)

IUR: chronic inhalation unit risk (mg/m3)−1

DFSadj: resident soil dermal contact factor, age-adjusted (mg-year/kg-d)

ABSd: fraction of contaminant absorbed dermally from soil (unitless)

SAc: skin surface area available for contact, child, ages 1–6 (cm2/d)

SAa: skin surface area available for contact, adult, ages 7–31 (cm2/d)

AFc: adherence factor of soil to skin, child, ages 1–6 (mg/cm2)

AFa: adherence factor of soil to skin, adult, ages 7–31 (mg/cm2)

GIABS: fraction of contaminant absorbed in gastrointestinal tract (unitless)
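Equations (8) to (13) can be combined into a short calculation for one congener. The sketch below is illustrative only: the function names and dictionary keys are hypothetical, and the parameter values are placeholders chosen to give round intermediate results; they are not the values of Tables S4 and S5, which should be substituted for any real assessment.

```python
def ifs_adj(p):
    """Age-adjusted soil ingestion factor, Eq. (10)."""
    return p["IRS_c"] * p["ED_c"] / p["BW_c"] + p["IRS_a"] * p["ED_a"] / p["BW_a"]


def dfs_adj(p):
    """Age-adjusted dermal contact factor, Eq. (12)."""
    return (p["SA_c"] * p["ED_c"] * p["AF_c"] / p["BW_c"]
            + p["SA_a"] * p["ED_a"] * p["AF_a"] / p["BW_a"])


def tr_ingestion(cs, csf_o, p):
    """Cancer risk through ingestion, Eq. (9); cs in mg/kg."""
    return csf_o * cs * 1e-6 * p["EF"] * ifs_adj(p) / (p["AT"] * p["LT"])


def tr_dermal(cs, csf_o, p):
    """Cancer risk through dermal contact, Eq. (11), using CSFo in place of CSFd."""
    return (csf_o * cs * 1e-6 * p["EF"] * dfs_adj(p) * p["ABS_d"]
            / (p["AT"] * p["LT"] * p["GIABS"]))


def tr_inhalation(cs, iur, p):
    """Cancer risk through particle inhalation, Eq. (13)."""
    return (p["ED_r"] * cs * p["EF"] * p["ET"] * iur * (1.0 / p["PEF"])
            / (p["AT"] * p["LT"]))


def tlcr(cs, csf_o, iur, p):
    """Total lifetime cancer risk for one congener, Eq. (8)."""
    return tr_ingestion(cs, csf_o, p) + tr_inhalation(cs, iur, p) + tr_dermal(cs, csf_o, p)


# Placeholder parameters (NOT the study's Table S4/S5 values).
PLACEHOLDER = {
    "EF": 350.0, "AT": 365.0, "LT": 70.0,
    "ED_c": 6.0, "ED_a": 20.0, "ED_r": 26.0,
    "BW_c": 15.0, "BW_a": 80.0,
    "IRS_c": 200.0, "IRS_a": 100.0,
    "SA_c": 2000.0, "SA_a": 6000.0, "AF_c": 0.2, "AF_a": 0.07,
    "ABS_d": 0.13, "GIABS": 1.0, "ET": 1.0, "PEF": 1.0e9,
}
```

The risk for the 7CarPAHs would then be the sum of `tlcr` over the seven congeners, each with its own concentration, CSFo, and IUR.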

Results and discussion

Selection of extraction mode and cleanup

A new extraction method was proposed in this study, to improve the extraction efficiency of 16PAHs from soil samples. A preliminary step was included in the extraction protocol using hot water before the organic solvent. The PAH-recovery rates were used to evaluate the effectiveness of the extraction method. The optimization of the extraction protocol was performed with fortified soil samples with 16 PAHs.

Figure 1 shows the recoveries obtained after extraction using the new protocol with hot water and dichloromethane (DCM) and those reported in the literature using only an organic solvent (DCM) (Hollender et al. 2003; Oluseyi et al. 2011; Peng et al. 2012; Li et al. 2015).

Fig. 1

Recoveries obtained after extraction using the new protocol with hot water followed by the solvent dichloromethane (DCM) and the protocol with only solvent. Error bars are 1 RSD

The proposed new protocol gave improved recovery rates for the 16PAHs, particularly for low molecular weight PAHs (LMWPAH). Average recoveries using the new protocol for the 16 PAHs were 92.23% (NAP), 92.94% (ACY), 111.54% (ACE), 102.52% (FLU), 103.22% (PHE), 102.62% (ANT), 113.21% (FLT), 125.75% (PYR), 96.39% (B(a)ANT), 83.95% (CHR), 86.77% (B(b)FLT), 99.61% (B(k)FLT), 68.77% (B(a)P), 80.87% (INP), 89.22% (DBA), and 85.93% (B(g,h,i)P).

The repeatability was calculated as the relative standard deviation (RSD) of the recoveries across ten replicates. The results presented in Fig. 1 (error bars) show that the RSD for all 16PAHs was less than 10%, except for CHR and B(a)P, for which it was larger. These extraction results obtained with the new protocol may be due to the effect of the high temperature on the physicochemical characteristics of the water. Kronholm et al. (2003) reported that increasing the temperature of water decreases its dielectric constant (ε). This decrease in the dielectric constant of water decreases its polarity, so that water becomes more effective at solubilizing compounds of low and medium polarity, such as polycyclic aromatic hydrocarbons (PAHs). The use of hot water before the organic solvent is therefore an appropriate approach for extracting PAHs from soils.

In order to obtain reproducible results of the extraction and to minimize the loss of PAHs in the cleanup step, four SPE cartridges were tested (SPE Bond Elut-C8, SPE HYPERSEP SI, HYPERSEP Verity-CX, and Florisil SPE Column). The results of the recovery rates obtained after the passing of the enriched solutions with 16PAHs through the used SPE cartridges are shown in Table 1. The recoveries of the 16PAHs were high when the SPE Bond Elut-C8 was used with an average value of 82.59%. This result confirms that the polarity is not the only parameter having impact on the efficiency of molecules retention at the stationary phase of an extraction cartridge. It is shown also in Table 1 that the dichloromethane DCM is the best solvent to use for the conditioning of an SPE cartridge. The SPE C-8 cartridge conditioned with 9 mL of DCM solvent has been retained for the purification of the soil sample extracts.

Table 1 Mean recoveries (CV%) and standard deviations (SD%) of Σ16PAHs after SPE with SPE Bond Elut-C8, SPE HYPERSEP SI, HYPERSEP Verity-CX, and Florisil SPE Column cartridges using dichloromethane (DCM), hexane (HEX), and isooctane as conditioning solvents

Levels of PAHs

The sixteen target PAHs were detected in all fifty-five topsoil samples. Individual PAH concentrations for all the samples are summarized in Table S6 in Supplementary Material (SM). The concentrations of total PAHs (Σ16PAHs), including the seven carcinogenic PAHs (Σ7CarPAHs), namely B(a)ANT, CHR, B(b)FLT, B(k)FLT, B(a)P, INP, and DBA, extracted from the fifty-five topsoil samples are shown in Table 2. The concentrations of Σ16PAHs varied from 143.73 to 4575.65 μg kg−1 (dry weight) with a mean value of 1209.56 ± 985.16 μg kg−1. The concentrations of Σ7CarPAHs constitute 8.71 to 58.67% of the concentrations of Σ16PAHs, with an average value of 38%. The concentration of B(a)P, one of the most potent carcinogenic PAHs, varied from 2.18 to 414 μg kg−1 with a mean value of 54.63 μg kg−1.

Table 2 Descriptive summary of the concentration (μg kg−1) of the 16 PAHs in 55 samples of soil

To better understand the variation in PAH concentrations and the origin of the pollution, the sampling sites were classified into rural (11 sites), suburban (12 sites), urban (19 sites), and industrial-urban (13 sites) (Table S1 in SM).

The distribution of Σ16PAHs in the residential areas was mapped with MapInfo Professional (v.8.0) and is illustrated in Fig. 2. The areas far from industrial activities and heavy traffic have relatively low PAH concentrations (< 600 μg kg−1) (e.g., sampling points P14, P15, and P45). The residential sites affected by heavy traffic emissions have quite high concentrations (~ 1500–3000 μg kg−1) (P48, P57, and P64). The sites affected by industrial activities have substantially higher PAH concentrations (> 3000 μg kg−1) (P23, P34, and P72). The concentrations of total PAHs and of the seven carcinogenic PAHs follow the trend industrial-urban > urban > suburban > rural (Fig. 3).

Fig. 2

Map of the study area showing the sampling points and spatial distribution of Σ16PAHs concentration

Fig. 3

Average concentrations of Σ16PAHs and Σ7 CarPAHs by soil categories. Error bars are 1 SD

The number of samples and the total average concentrations for each class of sampling site in this study are summarized in Table 3. The industrial-urban soil showed significantly higher contamination with 16PAHs than the urban, suburban, and rural soils. The total average concentration of 16PAHs found in industrial areas was about 2 times higher than in urban soils and 3 times higher than in rural soil, indicating the substantial influence of anthropogenic activities. The total concentrations of the 16PAHs in industrial-urban soils (pharmaceutical, chemical, and petrochemical industries) were between 336.46 and 5815.84 μg kg−1 with an average of 2237.24 μg kg−1. The 16PAH concentrations were also considerably high at sites with heavy traffic (e.g., sampling points P48 and P58).

Table 3 Number of samples, average concentrations of total and 7 carcinogenic PAHs, and standard deviations (μg kg−1) in four soil types

Ladji et al. (2007) found that Σ16PAH concentrations in the atmosphere of Algiers city varied from 13 to 212 pg m−3. In our study, the highest concentration of Σ16PAHs recorded in the soil at the same sampling point (P72) was 4575.65 ± 985.16 μg kg−1. This agreement between the soil and air results indicates a substantial soil-air exchange in this area, where a pollution source is present nearby, probably the solid-waste landfill closed since 2006 (Besse et al. 2016; Cetin et al. 2017).

In order to evaluate the soil environmental quality in Algiers, the total 16PAH levels in this study were compared with other recent measurements in similar soils around the world. The concentrations of Σ16PAHs found in our study were higher than those found for industrial soils in the Yangtze River Delta region (471.30 μg kg−1) in China (Wang et al. 2017a) and were similar to that reported for the industrial area of Dilovasi (992 μg kg−1) in Turkey (Yurdakul et al. 2019).

Algiers has a Mediterranean climate with mild, rainy winters and hot, sunny summers. These conditions probably reduce the PAH content of Algiers soils through evaporation and runoff.

Composition profiles of PAHs

Figure 4 shows the percentage contribution of PAHs, grouped by number of rings, to the total contamination of the different soil classes. Across the classes of sampling sites, the relative proportion of PAH congeners is between 38 and 45% for PAHs with 2 to 3 rings (LMW) and between 30 and 39% for PAHs with 5 to 6 rings (HMW). The contribution of LMW PAHs was relatively higher in the industrial-urban soils than in the other soil categories; there, LMW PAHs constituted 45% of the total PAH contamination. The urban soil was the most contaminated by HMW PAHs, with a share of 39%. Overall, the contamination of the sampled soils by PAHs was dominated by HMW PAHs: out of the 55 sampling points, the contribution of LMW PAHs exceeds 50% at only five (Fig. S4 in SM).
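As an illustration of how such ring-group contributions can be derived, the following minimal sketch groups the 16 priority PAHs by ring number and computes their percentage shares. The abbreviations B(k)FLT and DBA, the function name, and the sample concentrations are assumptions made for this example and are not data from the study.

```python
# Number of aromatic rings for each of the 16 USEPA priority PAHs.
# Abbreviations follow the text where possible; B(k)FLT and DBA are assumed.
RINGS = {
    "NAP": 2, "ACY": 3, "ACE": 3, "FLU": 3, "PHE": 3, "ANT": 3,
    "FLT": 4, "PYR": 4, "B(a)ANT": 4, "CHR": 4,
    "B(b)FLT": 5, "B(k)FLT": 5, "B(a)P": 5, "DBA": 5,
    "INP": 6, "B(g,h,i)P": 6,
}


def ring_group_percentages(conc):
    """Return the percent contribution of 2-3, 4 and 5-6 ring PAHs."""
    groups = {"2-3": 0.0, "4": 0.0, "5-6": 0.0}
    for name, value in conc.items():
        rings = RINGS[name]
        key = "2-3" if rings <= 3 else "4" if rings == 4 else "5-6"
        groups[key] += value
    total = sum(groups.values())
    if total == 0:
        raise ValueError("total PAH concentration is zero")
    return {key: 100.0 * value / total for key, value in groups.items()}


# Hypothetical sample (ug/kg), not taken from the study.
sample = {"NAP": 100.0, "PHE": 300.0, "FLT": 200.0,
          "PYR": 100.0, "B(a)P": 200.0, "INP": 100.0}
shares = ring_group_percentages(sample)
```

Applied to each soil category, the same function would give the three percentages plotted per category in a figure of this kind.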

Fig. 4

Percentage contribution of 2–3-, 4-, and 5–6-ring PAHs to the total contamination in rural, suburban, urban, and industrial-urban soils

Sources of 16 PAHs

Three successive steps were adopted to identify the sources of the 16PAHs in the analyzed soil samples: (1) test whether a relationship exists between the concentrations of the individual 16PAHs and the percent soil organic matter (SOM%) in the sample; (2) identify the probable sources of the 16PAHs in soil using principal component analysis (PCA); (3) examine the distribution patterns of 16PAH isomers in order to support hypotheses on their origin (Zheng et al. 2014; Wang et al. 2017a; Li et al. 2019).

Determination of the relationship between 16 PAH concentration and percent soil organic matter

Chen et al. (2005) and Doick et al. (2005a) have reported that the organic matter could be considered the most important factor limiting availability and mobility of persistent organic pollutants (POPs) in the soil.

In this study, the percent soil organic matter (SOM%) of the soil samples ranged from 4.2 to 19.35%, with an average value of 9.54 ± 3.10% (AVG ± SD). A scatter plot of SOM% vs Σ16PAH concentration was examined using Pearson’s correlation, and the individual PAHs in the soil samples did not correlate with SOM% (R2 = 0.001, p < 0.05) (Fig. S5 in SM). Consequently, the Σ16PAH concentration is likely controlled by factors other than SOM% (Hassanin et al. 2004).
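The correlation test described above can be sketched as follows. This is a minimal illustration, assuming a plain Pearson coefficient computed from paired SOM% and Σ16PAH values; the function name and the data are hypothetical, and no significance test (p value) is included.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("one of the variables is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired observations, not data from the study.
som_percent = [4.2, 7.5, 9.5, 12.0, 19.35]
sum16_pah = [1200.0, 450.0, 2300.0, 800.0, 1500.0]  # ug/kg

r = pearson_r(som_percent, sum16_pah)
r2 = r * r  # coefficient of determination for a simple linear relation
```

A value of r2 close to zero, as reported in the text, means that SOM% explains almost none of the variation in PAH concentration.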

16 PAH source identification using principal component analysis

Principal component analysis (PCA) with varimax rotation was also used to analyze the relationship between 16PAHs in all samples studied on the basis of concentration in order to identify possible sources of contamination.

PCA is a statistical analysis tool used to reduce the original variables (concentrations of 16PAHs) to two principal components (factors) F1 and F2.

In our study, the factors F1 and F2 together represent 70.4% of the total variance (Fig. 5). The factor loading plot clearly separates the PAHs into two groups (I and II). The low molecular weight PAHs naphthalene (NAP), acenaphthylene (ACY), acenaphthene (ACE), and fluorene (FLU) were in group I and strongly correlated with F2. The remaining twelve (12) PAHs, which are high molecular weight compounds except for PHE and ANT, were in group II and correlated with F1, which explains 54.82% of the total variance.
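To make the notion of variance explained by F1 and F2 concrete, the sketch below computes the share of total variance carried by the first two principal components of a standardized data set. It is an illustration under stated assumptions, not the procedure used in the study: the varimax rotation is omitted, the eigenvalues are obtained by simple power iteration with deflation, and the function names and data are hypothetical.

```python
import math


def correlation_matrix(rows):
    """Correlation matrix of the columns of rows (samples x variables)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]
    return [[sum(zr[i] * zr[j] for zr in z) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_eigenvalues(matrix, k=2, iters=500):
    """Top-k eigenvalues of a symmetric positive semidefinite matrix,
    found by power iteration with deflation."""
    p = len(matrix)
    a = [row[:] for row in matrix]
    eigenvalues = []
    for c in range(k):
        # Asymmetric start vector, to reduce the chance of starting
        # orthogonal to the eigenvector being sought.
        v = [float(i + 1 + c) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(a[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to extract
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * a[i][j] * v[j] for i in range(p) for j in range(p))
        eigenvalues.append(lam)
        # Deflate: remove the component just found.
        a = [[a[i][j] - lam * v[i] * v[j] for j in range(p)] for i in range(p)]
    return eigenvalues


def explained_variance(rows, k=2):
    """Fraction of total variance carried by each of the first k components."""
    corr = correlation_matrix(rows)
    return [lam / len(corr) for lam in leading_eigenvalues(corr, k)]


# Hypothetical concentrations (ug/kg); columns: NAP, FLU, PYR, B(a)P.
rows = [
    [10.0, 12.0, 200.0, 150.0],
    [20.0, 18.0, 180.0, 160.0],
    [15.0, 16.0, 400.0, 390.0],
    [40.0, 35.0, 900.0, 850.0],
    [30.0, 33.0, 300.0, 310.0],
    [25.0, 22.0, 600.0, 640.0],
]
f1, f2 = explained_variance(rows)
```

Because the trace of a correlation matrix equals the number of variables, each eigenvalue divided by that number gives the fraction of variance attributed to the corresponding component. In practice a dedicated statistics package would normally be used for the PCA and the rotation.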

Fig. 5

Principal component analysis (PCA) plot showing the distribution of the 16 PAHs for the 55 samples in the two-dimensional F1/F2 plane. The length of each vector reflects the quality of its representation in the F1/F2 plot. The angles between variables, and between variables and axes, reflect the nature of the relationships (positive or negative) between them. Dots indicate the soil samples

Naphthalene (NAP), acenaphthylene (ACY), acenaphthene (ACE), and fluorene (FLU) are reported to originate primarily from petroleum spills, so the factor F2 could indicate a petrogenic source of PAHs (Marr et al. 1999; Dobbins et al. 2006; Li et al. 2015).

Phenanthrene (PHE), anthracene (ANT), fluoranthene (FLT), pyrene (PYR), benzo(a)anthracene (B(a)ANT), and chrysene (CHR) could be considered markers of coal combustion, while indeno(1,2,3-cd)pyrene (INP) and benzo(g,h,i)perylene (B(g,h,i)P) have been identified as typical markers of a vehicular source of PAHs (Larsen and Baker 2003; Agarwal et al. 2009; Li et al. 2015). Consequently, the factor F1 seems to represent a combination of petroleum and coal combustion sources of the 16 PAHs.

Figure 5 shows that eight points (samples) on the right of the biplot, on the positive side of the factor F1 (HMW), correspond to Σ16PAH concentrations ranging from 2901.10 to 5815.85 μg kg−1 (Table S6 in SM). These samples were collected from sampling sites situated to the south, east, and north of the industrial area (Fig. 2).

Two other points (samples) at the top of the biplot, on the positive side of the factor F2 (LMW), have Σ16PAH concentrations ranging from 937.73 to 971.84 μg kg−1 (Table S6 in SM). These sampling sites are located to the west of the industrial area (Fig. 2).

Figure 5 also shows that some pairs of PAHs (PYR and FLT; B(a)ANT and B(b)FLT; B(a)ANT and B(a)P; INP and B(g,h,i)P) correlated well with each other, with coefficients of determination higher than 0.9. This good correlation indicates that the soil PAHs in the studied area come from similar sources (Banger et al. 2010).

16 PAH source identification using isomeric ratios of PAHs

To identify the sources of 16PAH contaminants of our sampling sites, the diagnostic ratios (Yunker et al. 2002; Bortey-Sam et al. 2014; Zheng et al. 2014; Li et al. 2015; Cachada et al. 2016; Haddaoui et al. 2016; Cai et al. 2017; Wołejko et al. 2018; Guo et al. 2008) were also applied.

Pyrogenic PAHs are reported to be characterized by high molecular weight (4 or more rings), while petrogenic PAHs are characterized by low molecular weight (2–3 rings) (Banger et al. 2010; Haddaoui et al. 2016).

Figure 4 shows that the mean contents of HMW PAHs were higher than those of LMW PAHs (HMW PAHs/LMW PAHs > 1) for the urban soil samples. This suggests that pyrogenic processes are the dominant sources of PAHs in this class of soil. In the industrial-urban soils, the mean contents of LMW PAHs were higher than those of HMW PAHs (HMW PAHs/LMW PAHs < 1). This implies that these PAHs originate essentially from petrogenic sources (Morillo et al. 2007; Wilcke 2007; Zhang et al. 2008; Tobiszewski and Namieśnik 2012).

In this study, the values of seven diagnostic ratios were used to identify the sources of PAHs in the study area. Table 4 summarizes the ratios used as well as the ranges attributed to each type of source.

Table 4 Molecular ratios selected for identifying sources of PAHs in this study

Figure 6a shows that the ratio ANT/(ANT + PHE) is between 0.10 and 0.8 and that the ratio FLT/(FLT + PYR) varies from 0.4 to 0.66. These results indicate that the main sources of PAHs in the rural, suburban, and urban areas are traffic, biomass, and coal combustion. The petroleum source has affected some points of the industrial-urban zone (Yunker et al. 2002; Tobiszewski and Namieśnik 2012).
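The way these two isomer ratios are read can be sketched as follows. The cut-off values used here (0.10 for ANT/(ANT + PHE); 0.40 and 0.50 for FLT/(FLT + PYR)) are assumptions taken from commonly cited ranges (e.g., Yunker et al. 2002), since the ranges of Table 4 are not reproduced in this section; the function names are hypothetical.

```python
def isomer_ratio(a, b):
    """Return a / (a + b), the usual form of a PAH isomer-pair ratio."""
    if a + b == 0:
        raise ValueError("both concentrations are zero")
    return a / (a + b)


def classify_ant_phe(ant, phe):
    """Source suggested by ANT/(ANT + PHE); 0.10 is an assumed cut-off."""
    return "petroleum" if isomer_ratio(ant, phe) < 0.10 else "combustion"


def classify_flt_pyr(flt, pyr):
    """Source suggested by FLT/(FLT + PYR); 0.40 and 0.50 are assumed."""
    r = isomer_ratio(flt, pyr)
    if r < 0.40:
        return "petroleum"
    if r <= 0.50:
        return "petroleum combustion"
    return "grass/wood/coal combustion"


# Hypothetical concentrations (ug/kg) for one sample.
source_a = classify_ant_phe(ant=20.0, phe=80.0)   # ratio 0.20
source_b = classify_flt_pyr(flt=60.0, pyr=40.0)   # ratio 0.60
```

Plotting the two ratios against each other, as in a cross-plot of this kind, simply places each sample in the region defined by these cut-offs.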

Fig. 6

Diagnostic ratio plots of PHE/ANT vs FLT/PYR, ANT/(ANT + PHE) vs FLT/(FLT + PYR), INP/(INP + B(ghi)P) vs B(a)P/B(ghi)P, B(a)ANT/(B(a)ANT + CHR) vs INP/(INP + B(ghi)P), and FLT/(FLT + PYR) vs INP/(INP + B(ghi)P) for the four (04) soil categories. P, petroleum; PC, petroleum combustion; BC, grass/wood/coal combustion; P/C, mixed source; C, combustion; T, traffic source; NT, no traffic; IND, industrial-urban soils; UR, urban soils; SU, suburban soil; and R, rural soil

Figure 6b shows that the ratio B(a)P/B(g,h,i)P exceeds 0.6 for 80% of the 55 points (samples), indicating a source related to traffic emissions. The ratio INP/(INP + B(g,h,i)P) exceeds 0.50 for 89% of the 55 points (samples), suggesting a predominant combustion source. The ratio B(a)ANT/(B(a)ANT + CHR) is higher than 0.35 for 93% of the points (samples), indicating that the major sources of PAHs for the sampling sites are grass, coal, wood, and petroleum combustion (Fig. 6c) (Tobiszewski and Namieśnik 2012).
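The percentages quoted above are shares of samples whose ratio exceeds a given cut-off. A minimal sketch of that count is given below for the B(a)P/B(g,h,i)P ratio with the 0.6 cut-off mentioned in the text; the function name and the concentrations are hypothetical.

```python
def percent_above(values, cutoff):
    """Percentage of values strictly greater than cutoff."""
    if not values:
        raise ValueError("no values supplied")
    return 100.0 * sum(1 for v in values if v > cutoff) / len(values)


# Hypothetical concentrations (ug/kg) for five samples.
bap = [50.0, 30.0, 80.0, 20.0, 90.0]
bghip = [60.0, 70.0, 90.0, 50.0, 100.0]

ratios = [p / g for p, g in zip(bap, bghip)]
traffic_share = percent_above(ratios, 0.6)  # share of samples suggesting traffic
```

The same helper applies to the other ratios by changing the input list and the cut-off (0.50 or 0.35 in the text).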

Figure 6d presents PHE/ANT vs FLT/PYR; the results corroborate those of the PCA and indicate that combustion is the main source of PAHs in our studied area.

Moreover, Fig. 6e clarifies which type of combustion (petroleum, or coal and biomass) is more prevalent in the studied area. It consolidates the results of the previous figures and indicates more clearly that pyrogenic sources predominate in our area, in particular the combustion of coal, wood, grass, and fuel (Yunker et al. 2002; Akyüz and Çabuk 2010; Oliveira et al. 2011; Ravindra et al. 2008; Tobiszewski and Namieśnik 2012; Wołejko et al. 2018). The results obtained for the studied area suggest that the PAH contaminants originate mainly from combustion.

Assessment of the toxicity risk, ecological risk and human health risk of PAHs in soil

The toxicity risk

The toxic equivalence quantities (TEQ) of the 16PAHs, the toxic equivalence quantities of carcinogenic PAHs (TEQ-carc), which are the sum of the TEQs of the seven carcinogenic PAHs, and the toxic equivalence quantities of benzo(a)pyrene (TEQ(B(a)P)) for each soil category are given in Table 5. The values of TEQ, TEQ-carc, and TEQ(B(a)P) calculated for all the soil categories show the following trend: industrial-urban > urban > suburban > rural (Fig. 7). The values for the industrial-urban soil category are about three times higher than those for the rural soil category. The average TEQ values for the industrial-urban, urban, suburban, and rural soils are 211.60, 176.04, 97.63, and 79.09 μg kg−1, respectively. The TEQ values were also calculated individually for the 55 sampling sites, and the results are summarized in Table S7 in the supplementary material (SM). The highest TEQ value, 684.85 μg kg−1, was detected in an industrial-urban soil. The TEQ values (Table 5) show that the activities of the industrial and urban areas (industrial activity, road traffic) generate more toxic components than the agricultural activities of the rural areas. The contribution of the seven carcinogenic PAHs to the total toxicity exceeds 50% for all four soil categories and reaches 63.76% for the industrial-urban soil.
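As an illustration of how TEQ and TEQ-carc are obtained, the sketch below multiplies each concentration by a toxic equivalency factor (TEF) and sums the products. The TEF values are illustrative assumptions for this example (values of this kind are widely used in the literature, but those applied in the study are not restated in this section), and the abbreviations B(k)FLT and DBA, the function name, and the sample are hypothetical.

```python
# Illustrative TEFs relative to B(a)P = 1; assumed for this sketch only.
TEF = {
    "NAP": 0.001, "ACY": 0.001, "ACE": 0.001, "FLU": 0.001,
    "PHE": 0.001, "ANT": 0.01, "FLT": 0.001, "PYR": 0.001,
    "B(a)ANT": 0.1, "CHR": 0.01, "B(b)FLT": 0.1, "B(k)FLT": 0.1,
    "B(a)P": 1.0, "INP": 0.1, "DBA": 1.0, "B(g,h,i)P": 0.01,
}

# The seven PAHs treated as carcinogenic (7CarPAHs).
CARCINOGENIC = ["B(a)ANT", "CHR", "B(b)FLT", "B(k)FLT", "B(a)P", "INP", "DBA"]


def teq(conc, names=None):
    """Sum of concentration x TEF over the selected PAHs (all if names is None)."""
    selected = conc.keys() if names is None else names
    return sum(conc.get(name, 0.0) * TEF[name] for name in selected)


# Hypothetical sample (ug/kg), not taken from the study.
sample = {"B(a)P": 100.0, "DBA": 20.0, "B(a)ANT": 150.0,
          "PHE": 500.0, "PYR": 300.0}

teq_total = teq(sample)
teq_carc = teq(sample, CARCINOGENIC)
teq_bap = teq(sample, ["B(a)P"])
carc_share = 100.0 * teq_carc / teq_total  # percent of total toxicity
```

Because the TEFs of the carcinogenic PAHs are one to three orders of magnitude larger than those of the other compounds, these seven PAHs can dominate the TEQ even when they represent a modest share of the total concentration.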

Table 5 Toxicity assessment of PAHs in rural, suburban, urban, and industrial-urban soil samples (μg kg−1)

Fig. 7

Toxicity assessment of PAHs in rural, suburban, urban, and industrial-urban soil samples (μg kg−1)

Two of the 55 samples, both collected from the industrial-urban soil, have total TEQ values slightly higher than the safe soil level recommended by the Canadian Council of Ministers of the Environment (600 μg kg−1) (Council 2008). These values are observed at P72 (679.84 μg kg−1) and P41 (684.85 μg kg−1) (Table S7 in SM). The main contributors to these TEQs are B(a) (60.9%) and DBA (48.55%), respectively. Soil sample P72 was collected close to a dump in the Oued Smar zone that has been closed since 2006, and P41 was collected near a highway interchange north of a petrochemical facility.

Ecological risk

The ecological risk of PAHs for the four soil categories of the studied area was evaluated using the risk quotient (RQ) (Sun et al. 2015). Table 6 shows the RQ values for individual PAHs relative to negligible concentrations (RQNCs) and to maximum permissible concentrations (RQMPC). The values of the negligible concentrations (NCs) and maximum permissible concentrations (MPCs) for the 16PAHs were taken from the literature (Gereslassie et al. 2018).

Table 6 Results of the ecological risk assessment of PAHs in the four soil categories

The mean calculated values of RQNCs and RQMPC for pyrene (PYR) and fluorene (FLU) in the industrial-urban soil category are above 1 (Table 6), which indicates a high ecological risk (Table S3 in SM). The mean calculated values of RQNCs and RQMPC for chrysene (CHR) are lower than 1 for all soil categories, indicating the absence of ecological risk (Table S3 in SM). For the other 14 PAHs, the mean calculated values of RQNCs are greater than 1 and those of RQMPC are lower than 1, implying that these PAHs pose a moderate ecological risk (Table S3 in SM). The calculated value of RQ∑PAHs(NCs) is below 800 and the mean calculated value of RQ∑PAHs(MPCs) is higher than 1, suggesting that the total ecological risk of the 16 PAHs for the soil samples is moderate (Table S3 in SM).
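The classification of individual PAHs described above can be sketched as follows, assuming that each risk quotient is the measured concentration divided by the corresponding quality value (NC or MPC). The function names, the quality values, and the treatment of a quotient exactly equal to 1 are assumptions made for this example.

```python
def risk_quotients(conc, nc, mpc):
    """Return (RQ_NCs, RQ_MPC) for one PAH: concentration over quality value."""
    if nc <= 0 or mpc <= 0:
        raise ValueError("quality values must be positive")
    return conc / nc, conc / mpc


def classify_individual(rq_nc, rq_mpc):
    """Risk level of a single PAH from its two risk quotients."""
    if rq_mpc >= 1:
        return "high"        # above the maximum permissible concentration
    if rq_nc >= 1:
        return "moderate"    # between the NC and the MPC
    return "negligible"      # below the negligible concentration


# Hypothetical quality values (ug/kg) for one compound.
NC, MPC = 1.2, 120.0

levels = [classify_individual(*risk_quotients(c, NC, MPC))
          for c in (0.6, 60.0, 240.0)]
```

Since the MPC is larger than the NC, a quotient above 1 against the MPC always implies a quotient above 1 against the NC, which is why the two tests can be applied in this order.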

In summary, the two ecological risk parameters (RQNCs and RQMPC) calculated for the sampling sites show the following trend: industrial-urban > urban > suburban > rural. The risk quotient (RQ) of PAHs was also calculated individually for the 55 sampling sites, and the results are summarized in Table S8 in SM.

Point P72 (industrial-urban zone) presents a high ecological risk level for ACE, B(a)ANT, B(b)FLT, and B(a)P. Point P31 (industrial-urban zone) likewise presents a high ecological risk for NAP, ACY, ACE, and FLU (Table S8 in SM). The P31 sample was collected near a liquefied petroleum gas (LPG) system plant for vehicles (point P72 has already been discussed).

Human health risk

The human health risks for the seven carcinogenic PAHs (7CarPAHs) were calculated individually for the 55 sampling sites (Table S9 in SM). The results for the four soil categories are summarized in Table S10 in SM. Table S10 shows that the total lifetime cancer risks (TLCR) calculated for the sampling sites follow the trend industrial-urban > urban > suburban > rural, and that the highest risks from PAHs arose from ingestion, followed by dermal contact and inhalation.

According to the US Environmental Protection Agency (USEPA), the target excess individual lifetime cancer risk is one in one million (10−6); when TLCR ≤ 10−6, the potential cancer risks are negligible (USEPA 2019).

Among the 55 sampling sites in this study, six samples from the urban and industrial-urban soil categories had TLCR values higher than 1 × 10−6, that is, above the negligible threshold, which was nevertheless taken to indicate an acceptable risk level (USEPA 2019).

These sites were located near the highly contaminated valley of El-huarache, into which various domestic and industrial effluents are discharged. This 67-km-long river rises in the Atlas mountains and flows into the Mediterranean Sea in the middle of the bay of Algiers. It is considered one of the most important rivers of northern Algeria, and all kinds of pollutants are discharged into it. Its flow velocity is very low, close to zero in the bends. These sampling points are also near the highway (P57 and P58) and chemical plants in the industrial area of Oued Smar (P72).

In this study, the cancer risk values, calculated as the sum of the three pathway risks (TRingestion, TRinhala, and TRdermal), are 8.83 × 10−07, 7.60 × 10−07, 1.75 × 10−07, and 1.05 × 10−07 for the industrial-urban, urban, suburban, and rural sampling sites, respectively. These values increase from the rural zone to the industrial zone but indicate negligible potential cancer and human health risks.
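The comparison with the 10−6 target can be illustrated with a short sketch. The four category totals are the values quoted in the text; the function names are hypothetical, and the pathway values in the usage line are invented for illustration because the individual pathway risks are not given in this section.

```python
NEGLIGIBLE_TLCR = 1e-6  # target excess lifetime cancer risk quoted in the text


def total_lifetime_cancer_risk(ingestion, dermal, inhalation):
    """TLCR as the sum of the three exposure-pathway risks."""
    return ingestion + dermal + inhalation


def is_negligible(tlcr, threshold=NEGLIGIBLE_TLCR):
    """True when the total risk does not exceed the negligible threshold."""
    return tlcr <= threshold


# Category totals reported in the text.
category_tlcr = {
    "industrial-urban": 8.83e-07,
    "urban": 7.60e-07,
    "suburban": 1.75e-07,
    "rural": 1.05e-07,
}
negligible = {name: is_negligible(v) for name, v in category_tlcr.items()}

# Hypothetical pathway risks for a single site (illustration only).
example_tlcr = total_lifetime_cancer_risk(6e-07, 2e-07, 1e-11)
```

All four category totals fall below the threshold, whereas an individual site can still exceed it, which is consistent with the six samples singled out above.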

Conclusion

The impact of a petrochemical factory on the surrounding soil was assessed for the first time in a mostly industrial area of the Algerian capital. This impact was evaluated by the systematic analysis of the 16PAH content in the soil of a 72 km2 area. After extraction by an adapted and original methodology, which gave very high recoveries, the concentrations of the 16 PAHs were measured by GC/MS.

The contamination of the soil by the 16PAHs follows a downward trend with distance from the industrial and urbanized areas. Indeed, the concentration in the industrial-urban soil is 2 times higher than in the urban soils and 3 times higher than in the rural soils, indicating a substantial influence of anthropogenic activities. The PAH concentrations were also high at the sites with heavy traffic. The associated TEQ is generally below 300 μg kg−1 and clearly lower than the recommendation of the Canadian Soil Quality Guidelines (TEQ < 600 μg kg−1), with the exception of some areas close to a closed dump or industrial facilities.

PCA was used to provisionally define the sources of pollution; it revealed two types of potential sources: one related to the HMW PAHs in samples from sites located to the south, east, and north of the industrial zone, and a second related to the LMW PAHs in samples collected to the west of the industrial zone.

The diagnostic ratios were also applied to identify the sources of PAH contaminants at the sampling sites. The results suggested that pyrogenic processes, such as the combustion of coal, wood, grass, and fuel, are the dominant sources of PAHs in the majority of the sampled soils. Nevertheless, some points show a typically petrogenic source of contamination, especially in the industrial-urban soils.

No ecological risk from the 16 PAHs was recorded in the study area except at two sampling points, which slightly exceed the thresholds and are both located in the vicinity of industrial zones. The potential cancer and human health risks calculated through the TLCR indicated that exposure to soils contaminated with the 7CarPAHs poses a negligible cancer risk to human health.