1 Introduction

Mercury (Hg) is considered one of the most toxic substances and is ranked third among all potentially toxic metals (PTMs). Human exposure to Hg is a health concern, as declared by the US Agency for Toxic Substances and Disease Registry (ATSDR 2012). According to WHO and USEPA, many toxic elements, viz. Pb, Cd and Hg, cause environmental health problems in the form of skin lesions, weight loss, reduced cognitive abilities, and brain and neurological disruption. Therefore, these contaminants are listed among the most hazardous inorganic substances (USEPA 2000; WHO 2011). Once these elements are released into the environment, they remain there.

Natural resources, including drinking water (surface water and groundwater), soil and plants, contain Hg from different sources, which are present in various segments of the environment. The most important segment is the lithosphere, which contains a wide variety of Hg-bearing materials. The lithological deposits include limestone, sandstone, granite, andesite, rhyolitic tuffs, diabase dikes, schists and phyllite. Moreover, ore and gangue minerals also contain Hg in veins of breccia and silicified rocks. In addition, some volcanic rocks containing quartz and opal, siliceous sinter deposits, cinnabar (HgS), metacinnabar (HgS), calomel (Hg2Cl2) and mercury oxychlorides (Hg2ClO and Hg4Cl2O) contain abundant Hg. Hg enrichment is also found in deposits of marcasite (FeS2), pyrite (FeS2), stibnite (Sb2S3), sphalerite ((Zn, Fe)S), barite (BaSO4), alunite (KAl3(SO4)2(OH)6), muscovite (KAl2(AlSi3O10)(F,OH)2) and clay minerals (Al2O3·2SiO2·2H2O) (Johnson et al. 1977). The anthropogenic sources include industrialization, urbanization, transportation, smelting, burning of waste and mining activities (Muhammad et al. 2018; Rashid et al. 2019a).

Drinking water is obtained from a variety of sources in Pakistan. These sources include glaciers, rivers and lakes; groundwater wells and spring water are used for domestic purposes. Nowadays, drinking water sources are mostly contaminated with microbes and dissolved inorganic and organic substances (Nriagu and Pacyna 1988; Rashid et al. 2019a). Pakistan faces both contamination of water sources and a shortage of fresh water, and water contamination has recently gained the attention of environmental scientists (Rashid et al. 2018). A recent study shows that mercury emission from geogenic inputs, viz. weathering of rocks, volcanoes, forest fires, fossil fuels and evaporation from the ocean surface, is significantly higher than mercury emission from anthropogenic sources (Pirrone et al. 2009). Thus, both geogenic and anthropogenic processes release mercury into the surrounding aquatic environment. Long-term exposure to Hg is the most harmful and can ultimately cause cancer in humans (Martinis et al. 2009).

Mercury is a toxic element of environmental concern at higher concentrations (WHO 2011). It is persistent, bio-accumulative and toxic in its natural abundance (Adeniji 2004). This study was designed to determine the Hg concentration in groundwater sources. Groundwater is an important source of drinking water, and the concentration of potentially toxic substances in it indicates the possible harm to human health (Duruibe et al. 2007). Continuous use of contaminated drinking water causes problems such as skin injury, kidney and nervous system damage, and numbness of fingers and toes (Martin and Griswold 2009). Mercury remains the third most toxic element impairing human living conditions (Rajeswari and Sailaja 2014). The current population of Pakistan is 191.7 million, which is expected to reach about 240 million by the year 2030. This population growth is directly linked to the quality of the water resources needed to meet domestic needs (Mohsin et al. 2013; Rashid et al. 2019b).

Multivariate statistical techniques such as principal component analysis (PCA), multilinear regression (MLR), factor analysis (FA) and cluster analysis (CA) were employed to better understand the groundwater parameters of the studied aquifers. These tools help determine factor scores that explain the potential pollution/contamination sources, whereas CA arranges the groundwater data into clusters/groups according to the severity of contamination. Overall, these methods help identify the variables controlling groundwater quality by eliminating redundancy in the data matrix and support prompt solutions to pollution problems (Dao et al. 2001).

The results of different studies in Pakistan reflect water pollution problems. The possible causes of water pollution reported by scientists are urbanization, mineral dissolution, weathering of parent material, volcanic eruption, mismanagement, industrialization and rapid withdrawal (Azizullah et al. 2011; Rashid et al. 2019b). Different industries release hazardous waste into water bodies, viz. rivers, streams, canals, springs, lakes and the ocean. The hazardous effluent contains persistent toxic elements (PTEs), pesticides, arsenic, fluoride, mercury (Hg) and other toxic heavy metals, which pollute groundwater aquifers. Poor water quality and weak water management and sanitation practices play a major role in the spread of waterborne diseases in Pakistan (Muhammad et al. 2018).

This study was designed to investigate Hg contamination in the drinking groundwater sources of three hydrological environments in District Swabi, Pakistan. In detail, we compared the geochemistry of Hg in the shallower, middle depth and deeper aquifers with the following objectives: (1) to measure the geochemical profile of Hg in three hydrological environments, viz. shallower, middle depth and deeper aquifers; (2) to understand the geogenic and anthropogenic origin of groundwater pollution by applying multivariate statistical techniques; and (3) to determine the pollution load index (PLI) and the health risk of Hg.

2 Location of study area

2.1 Site selection

The study area was District Swabi, Pakistan (Fig. 1), located within 140 km of the Peshawar basin and covering an area of about 4.516 km2. District Swabi is located at 34° 19′ 07.17″ N and 72° 24′ 59.12″ E. The area comprises four small villages, viz. Naranji, Mirali, Parmoli and Sher Dara. Swabi District is divided into northern hilly areas and southern plain areas. The important hills, known as the Narangi hills, are situated on the northwest side, and the plain area of the district is intersected by numerous streams, an important one of which is known as “Narangi Khawar.” Narangi is located 65 km northeast of the city of Peshawar.

Fig. 1 Map showing the location of the groundwater of District Swabi, Pakistan

2.2 Climatology, hydrology and geological setting

The climate of the study area is warm and temperate. Less rainfall occurs during winter, while most rainfall takes place during summer. The mean annual temperature is 22.2 °C. June and July are the hottest months of the year, with a mean temperature of 33 °C. The mean annual rainfall is 650 mm; November and December are the driest months, with a mean rainfall of 15 mm, and August is the wettest month, with a mean rainfall of 140 mm (GOP 2017). The average precipitation in 2017 (the year of the present study) was 20% less than normal, though the monsoon still brought the highest rainfall.

The local hydrology reflects how much drinking water from different sources is consumed by local residents. Most of the drinking water sources are recharged by precipitation and snowfall. The shallower drinking water sources have a lower water table and are predominantly used for domestic purposes (Rashid et al. 2018). Most local people use groundwater from bore wells, dug wells, storage tanks, hand pumps, springs and tube wells. Drinking water from the municipal community system (tube wells) is distributed via supply lines.

The geological setting of the area comprises the alkali granite known as the Ambela granitic complex, composed of alkali granites, syenites, quartz syenites, basic dikes and feldspathoidal syenites (Shah and Danishwar 2003). The granites consist predominantly of plagioclase and alkali feldspar, with lesser quantities of minerals including apatite, epidote, biotite, muscovite, zircon, chlorite, quartz and clay minerals (Rafiq and Jan 1988). The Koga complex lies on the northeast side of the study area (Fig. 2) and is composed of feldspathoidal syenites, carbonatites, fenites, syenites and associated rocks (Chaudhry 1982). Shah and Danishwar (2003) gave detailed petrographic accounts of the Koga complex. Their study shows that most of the exposed rocks of the Ambela and Koga complexes contain fluoride-bearing minerals, viz. hornblende, fluorite, micas, tourmaline and apatite, which ultimately contribute to the release of higher contents of Hg into groundwater.

Fig. 2 Geological map of District Swabi and its surrounding areas. Modified from Chaudhry (1982)

3 Methodology

3.1 Field visit and sampling

A field survey was conducted to collect drinking water from four different localities, viz. Naranji, Parmoli, Mirali and Sher Dara, of District Swabi, Pakistan. Groundwater samples (n = 38) were collected from three hydrological environments, viz. shallower (n = 16), middle depth (n = 12) and deeper depth (n = 10). The drinking water sources included hand pumps, tube wells, bore wells, dug wells and springs. A 100 mL water sample was collected for basic parameters (pH, EC and TDS), which were measured in situ with portable pH and EC meters. The drinking water samples for Hg and toxic trace element (TTE) analysis were filtered through 0.42-µm filter paper (Rashid et al. 2018; Rashid et al. 2019a). The groundwater samples were preserved in clean washed bottles with potassium dichromate (K2Cr2O7) and ultrapure HNO3, whereas water samples for Pb, Cd and Zn analysis were collected in polypropylene bottles and preserved with ultrapure HNO3. The groundwater samples were transferred to the geochemical laboratory of NCEG, University of Peshawar, where they were kept in a refrigerator below 5 °C before analysis.

Hg in the drinking water samples was determined by cold vapor atomic absorption (CVAA) spectrophotometry, and the rest of the TTEs were measured with a graphite furnace atomic absorption spectrophotometer (PerkinElmer HGA, 700). For the determination of Hg, 50 µg/L gold solution was added to make up 20 mL of water sample in order to amalgamate Hg within the sample. After this, 10 mL of HCl (1.5%) and 2 drops of KMnO4 (5%) were added to the reaction vessel; after shaking, 100 µL of NaBH4 (3%) solution was added to the reaction mixture, which produced bubbling. Afterward, Hg was measured at a wavelength of 253.7 nm and 180 mA using the CVAA technique.

3.2 Pollution load index (PLI)

Groundwater pollution was assessed through the pollution load index (PLI), which compares the measured concentration of an element with a reference concentration. The PLI was calculated with the following equation (Liu et al. 2005):

$${\text{PLI}} = \frac{{\rm Cw}}{{\rm Cr}}$$
(1)

where “Cw” is the elemental concentration of Hg in the water sample and “Cr” is the reference concentration of Hg (0.5 µg/L; the lowest possible concentration among all the samples), as suggested by the World Health Organization (WHO 2011). PLI < 1 suggests no pollution, whereas PLI > 1 indicates the presence of pollution (Yang et al. 2011).
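As a minimal sketch of Eq. (1), the snippet below computes the PLI and applies the PLI > 1 criterion. It assumes that Cw is the measured Hg concentration and that Cr = 0.5 µg/L is the reference concentration; the function names and the sample labels are hypothetical, and the three concentrations are only the range end-points quoted in this paper, not the full data set.

```python
# Reference concentration Cr for Hg in µg/L (value quoted in the text).
CR_HG = 0.5


def pollution_load_index(cw, cr=CR_HG):
    """Return PLI = Cw / Cr for a measured concentration cw (µg/L)."""
    if cr <= 0:
        raise ValueError("reference concentration must be positive")
    return cw / cr


def is_polluted(pli):
    """PLI > 1 indicates pollution; PLI < 1 suggests no pollution."""
    return pli > 1


# Hypothetical sample labels with concentrations (µg/L) used for illustration.
samples = {"S1": 0.85, "S2": 2.00, "S3": 0.28}
pli_values = {name: pollution_load_index(c) for name, c in samples.items()}

for name, pli in pli_values.items():
    status = "polluted" if is_polluted(pli) else "not polluted"
    print(f"{name}: PLI = {pli:.2f} ({status})")
```

With these inputs, the shallower range of 0.85–2.00 µg/L maps onto PLI values of 1.70–4.00, which agrees with the shallower PLI range reported in Sect. 4.4.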

3.3 Groundwater risk assessment (GRQ)

Groundwater risk assessment is a fundamental method used to identify the level at which a drinking water contaminant poses a threat to the local community. The groundwater risk quotient was determined with the following equation (Odukoya and Abimbola 2010):

$${\text{GRQ}} = \frac{{\rm GC}}{{\rm GTV}}$$
(2)

where “GRQ” is the groundwater risk quotient, “GC” is the chemical concentration in the groundwater sample, and “GTV” is the groundwater threshold value. A GTV of up to 0.75 is considered safe for groundwater consumption. If GRQ ≤ 1, Hg is considered a low priority pollutant; if GRQ is above 1 and up to 10, Hg is considered a medium priority pollutant; and if GRQ > 10, Hg is considered the highest priority pollutant.
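Equation (2) and the three priority classes can be sketched as follows. This is an illustration only: the unit of the threshold (µg/L) is an assumption inferred from the concentration units used elsewhere in the paper, and the function names are hypothetical.

```python
# Groundwater threshold value for Hg from the text; unit assumed to be µg/L.
GTV_HG = 0.75


def groundwater_risk_quotient(gc, gtv=GTV_HG):
    """Return GRQ = GC / GTV for a measured concentration gc."""
    if gtv <= 0:
        raise ValueError("threshold value must be positive")
    return gc / gtv


def priority_class(grq):
    """Map a GRQ value onto the priority classes described in the text."""
    if grq <= 1:
        return "low"
    if grq <= 10:
        return "medium"
    return "highest"


# Range end-points of Hg (µg/L) quoted in this paper, used for illustration.
for gc in (0.16, 0.85, 2.00):
    grq = groundwater_risk_quotient(gc)
    print(f"GC = {gc:.2f} -> GRQ = {grq:.2f} ({priority_class(grq)} priority)")
```

Under this assumption, 0.85 and 2.00 µg/L give GRQ values of about 1.13 and 2.67, matching the shallower GRQ range reported in Sect. 4.5.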

3.4 Statistical analysis

3.4.1 Cluster analysis

Cluster analysis (CA) is an important statistical tool for calculating similarity and dissimilarity indices. Groundwater data were assembled into groups by clustering, which reduces repetition in the data set. The variables form clusters on the basis of a similarity index: water samples of the same nature fall within the same group, whereas samples that differ in characteristics form a different group. The most convenient clustering technique is hierarchical agglomerative clustering, which provides similarity relationships for the overall data set. The results are represented by a plot called a dendrogram, which shows three clusters, C1, C2 and C3. The Euclidean distance measured the differences and resemblances between water samples (Otto 1998; McKenna Jr 2003). Ward’s method was used to measure the distance between clusters in such a way that the increase in the sum of squares on merging two clusters is minimized (Singh et al. 2005).
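The idea behind Ward's criterion can be illustrated with a small pure-Python sketch. It is not the software routine used in this study: the z-score standardization step, the function names and the six two-variable samples are all assumptions made for illustration. At each step the pair of clusters whose merger gives the smallest increase in the within-cluster sum of squares is joined.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column to zero mean and unit variance (assumed step)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def centroid(points):
    """Column-wise mean of a list of points."""
    return [mean(c) for c in zip(*points)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))  # squared Euclidean distance
    return len(a) * len(b) / (len(a) + len(b)) * d2


def ward_clusters(rows, k):
    """Merge clusters until k remain; returns lists of row indices."""
    data = zscore_columns(rows)
    clusters = [[i] for i in range(len(data))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost([data[p] for p in clusters[i]],
                                 [data[p] for p in clusters[j]])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical samples: (EC in µS/cm, Hg in µg/L); not the study data.
rows = [(430, 0.20), (450, 0.30), (440, 0.25),
        (600, 1.00), (620, 1.10),
        (750, 2.00)]
groups = ward_clusters(rows, 3)
print(sorted(sorted(g) for g in groups))
```

The brute-force pair search is quadratic per merge, which is acceptable for a data set of a few dozen samples; a statistical package would normally be used for the dendrogram itself.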

3.4.2 Principal component analysis and multilinear regression

Principal component analysis (PCA) and multilinear regression (MLR) were combined to identify the possible pollution sources of drinking water constituents in terms of percentage contribution (Jehan et al. 2018; Rashid et al. 2019a). First, the pollution load (PI) was calculated. Then, PCA was performed to obtain factor scores through varimax rotation and dimension reduction (Ellison 2004). Afterward, regression was performed with PI as the dependent variable and all calculated factors as independent variables (Dragović et al. 2008). The R2 value was obtained from the model summary. Moreover, factors F1, F2 and F3 were used as dependent and independent variables. Finally, the percentage contribution of each individual component was measured by subtracting the R2 value of each component from the overall R2 value.

3.5 Mapping

A handheld GPS was used to record the geographic coordinates of the drinking water samples. The coordinates were tabulated in an Excel sheet and imported into ArcGIS version 10.3 to produce the location map (Fig. 1). The geological map was extracted and modified from the map of Chaudhry (1982) (Fig. 2). A spatial interpolation map for Hg was generated to describe the pattern of drinking groundwater contamination (Fig. 3).

Fig. 3 Iso-concentration map showing the distribution of Hg in groundwater of study area

3.6 Quality control measure

The analytical techniques were applied under strict quality control measures. Laboratory glassware and Teflon vessels were used in testing, and all glassware was soaked in 10% HNO3. As described earlier, during the cold vapor atomic absorption (CVAA) analysis, 50 µg/L gold (Au) solution was added to the groundwater samples to amalgamate the Hg. A standard rinse solution was used according to the standard method introduced by Zhou et al. (2018). The reproducibility and reliability of the water data were checked by analyzing a blank (double-deionized water) and pre-analyzed samples after every 6 samples (APHA 2005).

4 Results and discussion

4.1 Hydrochemistry of drinking water

Table 1 presents the hydrochemistry of drinking water and its comparison with the WHO (2011) guidelines. For convenience of discussion, the drinking water sources were categorized into three hydrological environments based on their depth profile, viz. shallower (10–20 m), middle depth (25–45 m) and deeper depth (50–90 m).

Table 1 Descriptive statistics of groundwater (n = 38) of District Swabi, Pakistan

The shallower drinking water was slightly alkaline, with pH values in the range of 6.80–8.20 (Jordana and Batista 2004). Variation in the pH of groundwater samples causes changes in the hydrochemistry of the water. The pH values of the drinking groundwater sources were within the WHO guideline values. The depth, EC and TDS values of the shallower aquifer water were in the ranges of 10–20 m, 425–750 μS/cm and 265–460 mg/L, respectively. Spatial variation in electrical conductivity (EC) and TDS content reveals heterogeneity in the hydrogeochemistry of the shallower aquifer. Thus, the chemistry of the groundwater was not homogeneous and was governed by various geochemical mechanisms (Nagarajan et al. 2010). The Pb, Cd and Zn concentrations in the shallower groundwater were in the ranges of 8.50–30.80, 0.40–2.39 and 150–2200 μg/L, respectively (Table 1). Shallower water is mostly contaminated with lead: 13 drinking water samples exceeded the WHO guideline value for Pb in the shallower aquifer, with an individual percentage contribution of 81% and an overall contribution of 31%. The findings of this research were compared with those of Rashid et al. (2019a), and the concentration values of most variables were observed to be lower (Rashid et al. 2019a).

The middle depth drinking water was also slightly alkaline, with pH values in the range of 7.10–7.80 (Jordana and Batista 2004). The depth, EC and TDS were in the ranges of 25–45 m, 450–630 μS/cm and 280–385 mg/L, respectively. The Pb, Cd and Zn concentrations in middle depth drinking water were in the ranges of 5.5–8.5, 0.70–1.40 and 100–650 μg/L, respectively (Table 1). The middle depth drinking water sources show satisfactory results regarding Pb, Cd and Zn concentrations. However, when this water mixes with shallower aquifer water, the underground sources can become rapidly contaminated.

The deeper drinking water was slightly alkaline, with pH values in the range of 7.20–8.10 (Rashid et al. 2018; Rashid et al. 2019a). The depth, EC and TDS were in the ranges of 50–90 m, 475–745 μS/cm and 285–445 mg/L, respectively. Similarly, the Pb, Cd and Zn contents were in the ranges of 0.21–3.50, 0.10–0.45 and 45–245 μg/L, respectively (Table 1). The deeper drinking water sources show satisfactory results regarding Pb, Cd and Zn concentrations. The comparison shows that the Pb, Cd and Zn results were similar to those reported by Rashid et al. (2019b) and different from those documented by Sultan et al. (2011) and Khan et al. (2016).

4.2 Depth profile of mercury (Hg)

The Hg concentrations in the shallower, middle depth and deeper aquifers ranged over 0.85–2.00, 0.28–1.85 and 0.16–1.80 μg/L, respectively. Among the shallower water samples, the highest concentration (2.00 μg/L) was found in the spring water of the Naranji area, whereas the lowest (0.85 μg/L) was found in a hand pump of Mirali Banda. Overall, fourteen shallower drinking water samples exceeded the guideline concentration of Hg, corresponding to an individual percentage contribution of 87.5% and an overall contribution of 37%. The middle depth water is also mostly contaminated with mercury. A dug well with a depth of 45 m in the Parmoli area shows the highest concentration (1.85 μg/L), and a hand pump with a depth of 27 m in the Mirali area shows the lowest (0.28 μg/L). Overall, seven middle depth samples show higher concentrations of Hg, with an individual percentage contribution of 58% and an overall contribution of 18.4%. The Hg contamination in the middle depth aquifers resulted from mixing with the shallower aquifers of the vadose zone. Among the deeper drinking water samples, the highest concentration (1.80 μg/L) was documented in a bore well of the Sher Dara area, whereas the lowest (0.16 μg/L) was documented in a hand pump with a depth of 90 m in the Parmoli area. Five deeper samples (n = 5) reveal higher concentrations of Hg, with an individual percentage contribution of 50% and an overall contribution of 13%. Thus, the Hg contamination in the deeper aquifers resulted from mixing with shallower and middle depth water in the vadose zone.
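The "individual" and "overall" percentage contributions quoted above can be reproduced from the sample counts. The sketch below assumes that "individual" means the share within one depth group and "overall" means the share of all 38 samples; the dictionary and function names are hypothetical.

```python
# Samples per depth group (Sect. 3.1) and samples with elevated Hg (this section).
n_samples = {"shallower": 16, "middle depth": 12, "deeper": 10}
n_exceeding = {"shallower": 14, "middle depth": 7, "deeper": 5}

total = sum(n_samples.values())  # all groundwater samples in the study


def share(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


for group, n in n_samples.items():
    individual = share(n_exceeding[group], n)    # within the depth group
    overall = share(n_exceeding[group], total)   # relative to all samples
    print(f"{group}: individual {individual:.1f}%, overall {overall:.1f}%")
```

Under this reading, the shallower group gives 14/16 = 87.5% and 14/38 ≈ 36.8%, consistent with the rounded values in the text.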

Spatially, the concentrations of Hg in drinking water followed the order Naranji > Sher Dara > Parmoli > Mirali. Similarly, the depth-wise pattern of Hg was shallower > middle depth > deeper depth (Fig. 1). The granitic and gneissic rocks, andesite, sandstone, gangue minerals, silicified rocks, breccia and volcanic rocks that interact with groundwater contain various Hg minerals. Cinnabar (HgS) and metacinnabar are considered the most dominant forms of Hg in the granitic terrain of the study area; they have relatively higher dissolution in acidic aquifers, though the solubility of cinnabar is very low in neutral to alkaline environments. Moreover, in most cases, the rate of water flow plays a significant role in producing Hg-rich water. Additionally, shorter residence time, the water regime and insufficient interaction between the bedrock and the aquifer water are considered the most significant factors that decrease the dissolved Hg content in drinking water (Kim and Jeong 2005). In the present study, the regions located far from the River Indus and River Kabul were more saturated because of the longer distance and higher residence time. The Hg content of drinking water sources located far from the River Indus gradually increases as a result of higher retention time. Therefore, the occurrence of Hg in the current study was attributed to the host granite and gneissic rocks.

The existence of Hg in the middle and deeper aquifers results from the inter-mixing of these water sources with shallower water in the vadose zone. Moreover, weathering of granitic and gneissic rocks and dissolution of the mineral cinnabar (HgS) increase Hg in the hydrological environment. The sources of water and their interaction with the underlying Hg-bearing minerals define the hydrochemistry of drinking water. Thus, the principal factor of the PCA reflects the dissolution of toxic elements such as Hg, Pb, Cd and Zn, which contributes to the formation of contaminated water. The results of the current study were compared with those of Sultan et al. (2011) and Khan et al. (2016), whose results were many-fold lower than those of this study.

4.3 Inter-elemental correlation between drinking water variables

The correlation matrix measures the degree of closeness and linearity among dependent and independent drinking water parameters (Singh et al. 2005). The correlation coefficient (r) indicates how closely the drinking water variables follow a linear relationship. The Pearson correlation coefficients (r) of the drinking water variables are presented in Table 2 and Fig. 4. The most notable correlations were reported for pH and EC (r = 0.82), TDS and pH (r = 0.74), EC and TDS (r = 0.96), Pb and depth (r = −0.77), Cd and depth (r = −0.80), Zn and depth (r = −0.48), Cd and Pb (r = 0.82), Zn and Pb (r = −0.83), Hg and Pb (r = −0.45), Zn and Cd (r = −0.67), Hg and Cd (r = −0.45), and Hg and Zn (r = −0.48). These correlation pairs are shown in Fig. 4.
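For reference, the Pearson coefficient used in Table 2 can be computed as in the short sketch below. The function name and the depth and Pb values are hypothetical and serve only to show a negative depth relationship of the kind reported above.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))  # co-deviation
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical values, not the study data: depth (m) and Pb (µg/L).
depth = [10, 20, 30, 50, 90]
pb = [30.0, 25.0, 8.0, 3.0, 1.0]
r_depth_pb = pearson_r(depth, pb)
print(f"r(depth, Pb) = {r_depth_pb:.2f}")
```

A value of r close to −1 or +1 indicates a nearly linear relationship, whereas values near 0 indicate little linear association.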

Table 2 Pearson correlation matrix of hydro-chemical parameters in groundwater of District Swabi

Fig. 4 Concentration profile of Hg vs depth, pH, EC, TDS, Pb, Cd and Zn, respectively

4.4 Pollution load index

Table 3 presents the pollution load indices (PLI) of the drinking groundwater of the shallower, middle depth and deeper aquifers of District Swabi, Pakistan. The PLI ranges and mean values for Hg in shallower, middle depth and deeper water sources were 1.70–4.00 (2.95 ± 0.71), 0.57–3.62 (2.11 ± 0.89) and 0.33–3.98 (2.00 ± 1.13), respectively. The PLI values of most samples lie above the recommended value of 1. However, all water samples lie within a PLI of 4.00, representing moderate pollution in the research area. The percentage contribution of shallower and middle depth water was 100%, and that of deeper water was 83%; the highest contribution was thus shown by shallower and middle depth water and the lowest by deeper water. The overall contribution was 94.7%, i.e. 94.7% of the samples show PLI > 1. Therefore, according to the PLI values, the drinking water of this study was highly contaminated with Hg and PTEs (Yang and Li 2011) (Fig. 5).

Table 3 Pollution load index (PLI) and groundwater risk quotient (GRQ) via mercury (Hg) in groundwater (n = 38) of District Swabi, Pakistan

Fig. 5 Dendrogram shows the clustering of groundwater into three classes C1, C2 and C3

4.5 Health risk assessment and risk quotient (GRQ) of mercury

Local people of the research area were interviewed to obtain information about the shallower, middle depth and deeper drinking water sources, viz. springs, hand pumps, bore wells, dug wells and tube wells. The residents were also asked about their education, income source, body weight, age, eating habits and health-related problems. The interviewer recorded how residents used the water sources. They reported that shallower, middle depth and deeper groundwater was frequently used for drinking, domestic and agricultural needs.

The health risk of Hg was calculated using the standard method, and the results for shallower, middle depth and deeper water samples are recorded in Table 3. The GRQ findings classify drinking water samples above log 10 as higher priority pollution, samples below log 10 as medium priority pollution and samples below log 2 as no pollution.

Table 3 presents the drinking groundwater risk quotient (GRQ) in the shallower, middle depth and deeper aquifers. The GRQ method is the most feasible for characterizing the potential risk of Hg using the risk quotient index (RQ). The ranges and mean values of GRQ for Hg in the three hydrological environments, viz. shallower, middle depth and deeper aquifer water, were 1.13–2.67 (1.97 ± 0.47), 0.38–2.41 (1.41 ± 0.60) and 0.22–2.65 (1.34 ± 0.75), respectively. Consumption of Hg-contaminated water mostly affects children and teenagers. It has been shown to cause weight loss, permanent brain and neurological tissue disruption and cancer risk. It also reduces cognitive capacity and causes developmental concerns. Pregnant women and young children are the most affected by drinking the groundwater of the study area.

The percentage contribution of Hg for GRQ in shallower, middle depth and deeper water was 100%, 66.6% and 70%, respectively, whereas the overall percentage contribution of GRQ was 81.5%. Most groundwater samples of the study area fall within medium priority pollution, i.e. GRQ values above 1 (Odukoya and Abimbola 2010), whereas seven drinking water samples fall within low priority pollution (GRQ ≤ 1). None of the drinking water samples reached a GRQ of 10, above which Hg is considered the highest priority pollutant.

4.6 Cluster analysis

Cluster analysis classified the drinking water samples into classes. The data set was arranged into three clusters: less polluted, moderately polluted and severely polluted. Hierarchical agglomerative clustering used Ward’s method with squared Euclidean distances to calculate the similarity and dissimilarity indices. The factor scores obtained during cluster analysis reduce the clustering error of groundwater samples. The distance between two clusters was evaluated so as to minimize the sum of squares of the two clusters, following an analysis of variance (ANOVA) approach. The clustering results are presented as a dendrogram. The distance in the dendrogram is expressed as (Dlinkage/Dmax) × 100, i.e. the linkage distance for a particular case divided by the maximum linkage distance; the quotient is multiplied by 100 to standardize the distance (Simeonov et al. 2002).
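The rescaling of dendrogram distances described above amounts to a one-line normalization. In the sketch below, the function name and the linkage distances are hypothetical and are not taken from the dendrogram of this study.

```python
def rescaled_linkage(distances):
    """Express each linkage distance as (Dlinkage / Dmax) * 100."""
    d_max = max(distances)
    if d_max <= 0:
        raise ValueError("maximum linkage distance must be positive")
    return [100.0 * d / d_max for d in distances]


# Hypothetical linkage distances for four successive merges.
linkage = [2.0, 5.0, 8.0, 20.0]
scaled = rescaled_linkage(linkage)
print(scaled)  # the largest merge distance maps to 100
```

On this scale the final merge always sits at 100, so dendrograms built from different variables or units can be compared directly.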

The cluster results obtained in this study were grouped into three: less polluted (C1), moderately polluted (C2) and severely polluted (C3). Variability within the classes is 15.76% and between the classes is 84.24%. The distances between the class centroids were 0, 547 and 2027 for C1; 547, 0 and 1483 for the moderately polluted cluster C2; and 2027, 1483 and 0 for the severely polluted cluster C3. The clusters C1, C2 and C3 contained 26, 11 and 1 drinking water samples, respectively. The within-cluster variance for C1, C2 and C3 was 15,030, 62,342 and 0, respectively. The ranges of the distance to the centroid for clusters C1, C2 and C3 were 29.36–208, 101.45–386.28 and 0–0, and the mean distances were 110.3, 218.3 and 0, respectively. The numbers of shallower, middle depth and deeper water samples were 6, 12 and 8 in C1; 10, 1 and 0 in C2; and 1, 0 and 0 in C3, respectively.

4.7 Distribution of pollution sources

PCA-MLR analysis was carried out on the factor matrix after varimax rotation. It is a linear correlation-based method that makes it possible to inter-correlate the drinking water variables: variables with similar geochemical composition fall within the same group, and those that differ fall into another group. This technique is commonly used to identify and apportion the origin of various elements in a study area. Thus, we used principal component analysis and multilinear regression for pollutant source distribution and percentage contribution. PCA-MLR arranged the water variables into three significant factors, F1, F2 and F3. Accordingly, the contribution of the drinking water sources was divided into three groups: (1) factor F1, caused by predominantly natural processes; (2) factor F2, reflecting strong anthropogenic influence; and (3) factor F3, based on mixed sources (Table 4).

Table 4 Principal component analysis after varimax rotation reduction for groundwater of District Swabi, Pakistan

To better understand the factor loadings and the distribution of the drinking water variables, principal component analysis and multilinear regression were employed (Table 4). After varimax rotation, three significant factors (F1, F2 and F3) were obtained, which explain the variability of the groundwater samples. The total variability explained was about 75.41%, with eigenvalues of 3.65, 2.45 and 1.20 for F1, F2 and F3, respectively.

Factor F1 explains 35.53% of the total variability (75.41%). It shows positive loadings for Pb, Cd, Zn and Hg and a negative loading for depth (Table 4 and Fig. 6). The overall percentage contribution of factor F1 according to the MLR results was 78%, which indicates natural pollution in the groundwater system of the research area (Fig. 7). The correlation coefficients (r) were 0.45 for Pb, 0.40 for Cd, 0.39 for Zn, 0.32 for Hg and −0.39 for depth (Table 4). This factor shows a strong association of Hg, Pb, Cd and Zn with each other, and their concentrations show greater variability. However, the depth profile of these elements shows no significant relation. Although it is challenging to separate the background concentrations of Pb, Cd, Zn and Hg in the drinking water sources because of geogenic input and high variability in the analytical data, the mercury concentrations were found to be higher in the majority of the samples. The downward migration of mercury affected the drinking water of the study area. Overall, this factor reflects geological influences on Pb, Cd, Zn and Hg in the water environment. Mercury in the drinking water system originates from water–rock interaction, evaporation, weathering of granite and volcanic rocks and dissolution of minerals, viz. cinnabar and metacinnabar (HgS), whereas Pb, Cd and Zn originate from mafic and ultramafic rocks via weathering processes (Johnson et al. 1977; Jehan et al. 2018; Rashid et al. 2019b).

Fig. 6

a Overall factor loadings. b Relationship of the first two factors (F1 and F2) after varimax rotation and dimension reduction

Fig. 7

Percentage contribution of pollution sources in the groundwater of the study area

Factor F2 explains 20.7% of the total variability (Table 4 and Fig. 6), with positive loadings on pH, EC, TDS and Cd. The correlation coefficients (r) of these variables were 0.45, 0.52, 0.53 and 0.32, respectively. F2 contributes up to 14%, which indicates anthropogenic pollution in the water environment of the area (Fig. 7). Thus, the major contribution to these variables, and to Cd in particular, was attributed to anthropogenic origin.

Factor F3 explains 9.18% of the total variability (Table 4 and Fig. 6), with positive loadings for depth and Hg. The estimated correlation coefficient (r) values for depth and Hg were r = 0.45 and 9.84, respectively. Moreover, the percentage contribution of F3 was 8%. This factor more or less reflects mixed pollution sources, viz. natural and anthropogenic (Fig. 7). Natural sources mostly explain the depth profile of Hg in the study area. The possible geogenic sources of Hg in the drinking groundwater are reported to be the weathering and dissolution of cinnabar (HgS) minerals that occur in the granite and gneissic rocks of the area (Gray et al. 2002), whereas the anthropogenic sources include fossil fuel combustion, mining, transportation and smelter activities (Hough et al. 2004; Jehan et al. 2018; Rashid et al. 2018). Overall, this factor indicates that Hg in the drinking groundwater can increase progressively across the shallower, middle-depth and deeper aquifers.
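The percentage contributions attributed to F1, F2 and F3 come from the MLR step. As a rough illustration only, one common simplification regresses a standardized pollutant total on the standardized factor scores and normalizes the absolute coefficients to 100%. The function name `source_contributions` is hypothetical, and this sketch assumes that simplified scheme; the exact PCA-MLR formulation used in the study is not specified in this section.

```python
import numpy as np

def source_contributions(factor_scores, target):
    """Percentage contribution of each factor from a multilinear regression.

    factor_scores: (samples x factors) array of PCA factor scores.
    target: (samples,) array, e.g. a summed or single-element concentration
            (assumed choice of dependent variable, for illustration only).
    """
    # Standardize both sides so the regression coefficients are comparable
    zs = (factor_scores - factor_scores.mean(axis=0)) / factor_scores.std(axis=0, ddof=1)
    zt = (target - target.mean()) / target.std(ddof=1)
    # Least-squares fit; no intercept is needed because all variables are centered
    coef, *_ = np.linalg.lstsq(zs, zt, rcond=None)
    weights = np.abs(coef)
    return 100 * weights / weights.sum()
```

By construction the returned shares sum to 100%, which matches the way the three factor contributions reported above (78%, 14% and 8%) partition the total.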

5 Conclusions

This study reports the occurrence of Hg in the drinking groundwater of District Swabi. Shallower groundwater sources were found unfit for drinking, domestic and agricultural purposes. The guideline value for Hg was exceeded in 87.5% of the shallower, 58.3% of the middle-depth and 50% of the deeper groundwater samples. Geographically, Hg concentrations in the drinking water aquifers follow the order Naranji > Sher Dara > Parmoli > Mirali, and a similar trend is followed with depth: shallower > middle depth > deeper. Long-term water-rock interaction and the presence of ore mineral deposits of cinnabar (HgS), gangue and clay minerals can contribute significantly to Hg in the water system. Once these minerals release Hg into the groundwater aquifer, it remains in the groundwater system. The existence of Hg in the middle and deeper aquifers results from the inter-mixing of water sources with the shallower water in the vadose zone. The PLI values of most samples lie above the recommended value of 1. Thus, shallower and middle-depth water showed the highest pollution load index. Risk assessment revealed a moderate risk posed by Hg ingestion via groundwater consumption. The health indices GRQ indicate that 81.5% of the drinking water samples are unfit for drinking and domestic needs. The PCA-MLR results for the drinking water sources show natural and anthropogenic pollution in the study area. To reduce the health risk from Hg contamination, drinking water treatment plants must be installed in the surrounding areas, and awareness campaigns for the local people are indispensable for safe drinking water use and management. Further research is urgently needed to monitor Hg concentrations around Swabi District, one of the areas of Pakistan with very rapid socioeconomic growth. It is highly recommended that the local authorities, the government sector and NGOs take the necessary actions to control Hg contamination in the groundwater of the study area.