Introduction

Iran is one of the countries that have faced a water crisis in recent decades. Regionally, groundwater is the main resource for irrigation and domestic supply because surface water is scarce and rainfall is erratic (Sakizadeh and Ahmadpour 2016). More than 90% of Iran is located in arid and semi-arid regions, and 55% of water consumption is supplied from groundwater resources (Sakizadeh et al. 2016). Owing to low rainfall and high evaporation rates, groundwater has been at risk of elevated salinity in recent years (Umar and Absar 2003); in addition, groundwater contamination and water shortages have increased with population and industrial growth and with the increased use of synthetic fertilizers and pesticides. Therefore, hydrogeochemical investigations and groundwater quality assessments are essential to protect groundwater from contamination triggered by natural (geogenic) factors or anthropogenic activities (Chung et al. 2015; Barzegar et al. 2017b) and to manage these resources effectively.

Groundwater is a natural resource that can be renewed by different processes (Chung et al. 2015), and its hydrogeochemical characteristics are controlled by various natural and anthropogenic factors (Kelepertsis 2000; Siegel 2002; Stamatis et al. 2001; Sullivan et al. 2005; Barzegar et al. 2016b, 2017a, b). These include the general conditions of the study area (e.g., precipitation, seasonal hydrology, climate), ongoing geochemical processes (rock weathering, redox, complexation, sorption processes, etc.), and man-made activities related to agriculture, industry, mining, urbanization, etc. (Pejman et al. 2009; Kumaresan and Riyazuddin 2006; Singh et al. 2015; Srivastava et al. 2011; Bharose et al. 2013).

A variety of methods are available to understand the processes that govern groundwater chemistry, including graphical techniques (Piper, Durov, and Stiff diagrams, etc.); bivariate or scatter plots (Barzegar et al. 2016b); and multivariate statistical techniques such as correlation analysis, cluster analysis (CA), and factor analysis/principal component analysis (FA/PCA). Multivariate data analysis methods are gaining popularity in environmental research that deals with surveillance and quantification plans (Belkhiri and Narany 2015); hence, these methods have become common in hydrogeochemical studies in recent years (e.g., Tziritis 2014; Barzegar et al. 2015; Voutsis et al. 2015; Tziritis et al. 2016b; Barzegar et al. 2016a). Correlation analysis is a bivariate statistical method that determines the mutual relationships and strength of association between pairs of variables through calculation of the linear Pearson’s correlation coefficient (Sivakumar et al. 2014). Cluster analysis is a multivariate method that groups data according to predetermined selection criteria. In this method, the data are placed in clusters that show high internal (within-cluster) homogeneity and high external (between-cluster) heterogeneity (McKenna 2003). Factor analysis reduces the number of variables, which makes it easier to describe the behavior of the system and to identify homogeneous subgroups (Lawrence and Upchurch 1982; Al-Qudah et al. 2011).

Environmental indices are robust and efficient tools that aim to provide accurate results which are easily comprehensible and scientifically sound; hence, they may play a critical role in the implementation of water resources management schemes that underpin policy making. Environmental indices summarize water quality data in an easily expressible and understandable format and by design contain less information than the raw data on which they are built. They can be used to monitor different areas (e.g., river basins) or the same area over different time spans. Their ability to support direct comparison enhances their strategic role in environmental monitoring programs, and they may be used as an aggregation and communication tool for water quality and for the definition of potential environmental and hydrogeochemical trends.

So far, several indices have been employed to determine groundwater quality; however, the “PoS” index (Tziritis et al. 2014, 2016a, 2016c) offers significant strategic advantages, since it can be used for water quality assessment regardless of the parameters used, the time span of the sampling period, and the characteristics of the studied basin. It is envisaged to provide critical information regarding the quality status and the hydrogeochemical trends of surface water and groundwater as well as their potential interconnection. Its goal is to improve the understanding of water quality issues by integrating complex parametric data and generating a score that describes the overall water quality status and water quality trends independently of any spatial, temporal, or arithmetic variation in the total number of examined parameters. A detailed description of the PoS index method is given in Tziritis et al. (2014).

The Kordkandi-Duzduzan plain is one of the fertile plains of East Azarbaijan Province, NW Iran, where groundwater is an important resource for drinking and agricultural purposes because surface water resources are lacking in the region. In this frame, the main objectives of the present study are to identify the major processes affecting the hydrogeochemistry of the groundwater resources, to evaluate the potential sources of major, minor, and trace elements, and to assess groundwater quality deterioration with the use of the PoS environmental index. The results are expected to provide valuable insight into the environmental aspects of the aquifer bodies in the Kordkandi-Duzduzan plain and to contribute significantly to the sustainable management of groundwater resources.

Material and methods

Study area

The Kordkandi-Duzduzan plain, with an area of approximately 127 km², is located northwest of Iran’s East Azarbaijan province. The plain is part of the Urmia Lake basin and is bordered to the north by the Mehraban plain, to the west by the Aji-Chay River, to the east by the Oujan-Chay River, and to the south by the Qamlar and Qasem-Daghi Mountain ranges (Fig. 1). According to the Barazin meteorological station, the mean annual temperature in the area is 17.8 °C, and the long-term mean annual precipitation is 269.19 and 274.9 mm at the Kordkandi and Duzduzan rain gauge stations, respectively. The climate of the area is defined as cold semi-arid.

Fig. 1 Location of the study area and drainage system

Geological and hydrogeological setting

The oldest rock outcrops in the study area are Eocene. The Eocene and Oligocene series surround the southern part of the region. The Quaternary sediments of the plain consist of conglomerate, sandstone, clay, gravel, and sand, together with recent alluvial terraces with gentle slopes that have a very high discharge capacity. These terraces host most of the deep wells, which are used for various purposes, especially agriculture. Therefore, it can be said that the entire area is covered with terraces and that these sediments supply the region’s water needs.

Most of the area surrounding the plain is covered by the Miocene formation series. Miocene sediments include alternations of sandstone, marly shale, reef limestone, conglomerate, tuff, marl, and silt. The main outcrops of the Miocene formations can be seen on both sides of the Aji-Chay River, northwest of Sharabian, and around the Duzduzan Plain. Because the Miocene formations contain interbedded layers of gypsum and salt and are crossed by the Aji-Chay River, the river water is saline.

Pliocene sediments sporadically cover the highlands located at the center, north, and east of the plain. These sediments are a series of rounded conglomerates that unconformably overlie the Miocene unit.

The Eocene and Oligocene formations of the study area consist of volcanic and sub-volcanic rocks, e.g., tephrite, rhyodacite, tuff, syenite, and andesite basalt which occupy mostly southeast of the area. Generally, the stratigraphy of Kordkandi-Duzduzan plain includes Quaternary formations (recent sediments, young terrace, older and higher terrace, terrace, and alluvial fan), Pliocene formations (conglomeratic sandstone and clay), and Miocene formations (sandstone, siltstone and marl). Among the rock units older than Quaternary, it seems that the sandstone and conglomerate units of Miocene have good groundwater discharge according to their lithology. Figure 2 shows the geological map of the study area.

Fig. 2 Geological map of the study area, water type and sampling sites

Surface water of the study area is collected from the south by the permanent Oujan-Chay River and two intermittent streams (the Karam-Javan-Chay and Yonjali-Chay rivers), which then join the Aji-Chay River and eventually discharge to Urmia Lake. The flow direction of the Aji-Chay River is from the northeast to the northwest in the study area, and the Oujan-Chay River joins it in the northwest region. Figure 1 shows the drainage system of the study area.

The results of the drilled wells and geophysical investigations show that the aquifer is unconfined. The unconfined aquifer is composed of old and recent alluvial terraces, alluvial fans, and stream sediments, and the sediments generally consist of gravel, sand, silt, mud, and clay. The sediments formed in the mountain range pediments, at the border of the plain, and in the alluvial fans are coarse, and the grain sizes gradually decrease towards the central parts of the plain. The results of geophysical investigations show that the grain sizes of Sharabian’s alluvial fan sediments gradually decrease from south to north, so that clay minerals and coarse-grained alluvial deposits cover the aquifer in the northern and southern parts of the plain, respectively. Conglomeratic and clay terraces of the Plio-Quaternary period were detected as the bedrock of the aquifer at the eastern and western borders. Layers of marl and clay with gypsum and salt of the Miocene period also form the bedrock of the aquifer (Alizadeh 2008).

Figure 3 illustrates geological cross-sections of the study area, whose locations are shown in Fig. 2. In the A-A′ and B-B′ cross-sections, alluvial sediments are located on the surface and the bedrock is made of conglomerate and sandstone. The thickness of alluvium in the A-A′ and B-B′ cross-sections decreases by about 100 and 30 m, respectively. According to the drilling logs, most sedimentary layers of the Kordkandi plain consist of clay, silt, sand, coarse-grained gravel, and conglomerate sediments, whereas in the Duzduzan plain most sedimentary layers are composed of silt and clay, silty sand, marl, sand, and conglomerate sediments.

Fig. 3 Geological cross sections in the study area

The direction of groundwater flow in the Kordkandi plain is mainly from south to north and southwest to northeast; it should be noted that the groundwater flow is diverted from its original path in the eastern part of Kordkandi plain due to local discharge of the groundwater. The direction of groundwater flow in the Duzduzan plain is from the northern and southern borders towards the central and western part of the plain (Fig. 4).

Fig. 4 Groundwater flow direction in the plain

Groundwater sampling and analysis

In total, 23 water samples were collected from shallow and deep wells in September 2015 to assess the hydrogeochemical characteristics of the Kordkandi-Duzduzan plain. The locations of the sampling points are shown in Fig. 2. Water samples were collected in two separate polyethylene bottles, one for major ions and one for trace elements. For the trace element samples, 2 ml of 50% HNO3 was added to prevent precipitation. The pH and electrical conductivity (EC) were measured in the field. Cations such as Na+ and K+ were analyzed by flame photometer; SiO2 and anions such as SO42−, NO3−, and F− by spectrophotometer; and other cations (e.g., Mg2+, Ca2+) and anions (e.g., HCO3−, Cl−) by titration in the Hydrogeology Laboratory of the University of Tabriz. Concentrations of trace metals and metalloids such as Cr, Mn, Cd, Fe, Al, and As were analyzed by the atomic absorption method in the Water Quality Control Laboratory of East Azarbaijan Province.

The analytical accuracy was assessed through ion balance error (Eq. 1), which was found to be within the range of ± 5% (Domenico and Schwartz 1998).

$$ BE=\left[\frac{\left( TCC- TCA\right)}{TCC+ TCA}\right]\times 100 $$
(1)

where TCC and TCA express concentrations of total cations and total anions in meq L−1 and BE denotes the percentage of ionic balance error.
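As an illustration of Eq. 1, the following minimal Python sketch computes the ion balance error for a single sample and checks it against the ± 5% criterion. The function name and the sample concentrations are hypothetical and are not taken from the data of this study.

```python
def ion_balance_error(total_cations_meq, total_anions_meq):
    """Percentage ion balance error (Eq. 1): (TCC - TCA) / (TCC + TCA) * 100."""
    total = total_cations_meq + total_anions_meq
    if total == 0:
        raise ValueError("total ionic concentration must be non-zero")
    return (total_cations_meq - total_anions_meq) / total * 100.0


# Hypothetical totals in meq/L (illustrative only, not measured values).
tcc = 10.5
tca = 10.0

be = ion_balance_error(tcc, tca)
acceptable = abs(be) <= 5.0  # acceptance range of +/- 5% stated in the text
```

A positive value indicates an excess of cations and a negative value an excess of anions.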

Saturation index

The degree of equilibrium between water and minerals is expressed by the saturation index (SI). In this study, the saturation indices of some of the potential minerals were calculated using Eq. 2 (Parkhurst and Appelo 1999):

$$ \mathrm{SI}=\log_{10}\left(\frac{\mathrm{IAP}}{K_{sp}}\right) $$
(2)

where IAP is the ionic activity product of the dissociated chemical species in solution and K_sp denotes the equilibrium solubility product for the chemical involved at the sample temperature. SI < 0 and SI > 0 indicate under-saturation (dissolution) and super-saturation (precipitation) of groundwater with respect to the particular mineral, respectively (Barzegar et al. 2015). PHREEQC version 3.0 (Parkhurst and Appelo 1999), with thermodynamic values of phases from the MINTEQ database (Allison et al. 1990; Mukherjee and Fryar 2008), was used to calculate the SI values of the groundwater samples.
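The final step of Eq. 2 can be sketched in a few lines of Python. This is only an illustration of the logarithmic ratio and of the sign convention described above; it does not reproduce the speciation and activity calculations that PHREEQC performs to obtain the IAP. The function names and the numerical values are hypothetical.

```python
import math


def saturation_index(iap, ksp):
    """Saturation index (Eq. 2): log10 of the ratio IAP / Ksp."""
    if iap <= 0 or ksp <= 0:
        raise ValueError("IAP and Ksp must be positive")
    return math.log10(iap / ksp)


def saturation_state(si):
    """Interpret the sign of SI as described in the text."""
    if si < 0:
        return "under-saturated"
    if si > 0:
        return "super-saturated"
    return "equilibrium"


# Hypothetical activity product and solubility product (illustrative only).
si = saturation_index(iap=1.0e-9, ksp=1.0e-8)
state = saturation_state(si)
```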

Statistical analysis

Multivariate statistical analysis has been widely used to investigate the hydrogeochemical properties of groundwater, e.g., to recognize the origin of solutes, to identify the hydrogeochemical processes in a groundwater system, and to simplify the interpretation of datasets (Hassen et al. 2016; Yidana et al. 2008; Barzegar et al. 2015, 2016a, 2017b). Among the most widely used techniques are correlation, cluster, and factor analyses. Pearson correlation is a bivariate technique that determines the mutual relationship and strength of association between pairs of variables through the linear Pearson’s correlation coefficient (r) (Sivakumar et al. 2014). Values of the correlation coefficient greater than 0.7, between 0.5 and 0.7, and less than 0.5 indicate strong, moderate, and weak correlation, respectively (Oinam et al. 2012; Barzegar et al. 2015, 2016a).
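A minimal Python sketch of the Pearson coefficient and of the strength classes given above is shown below. Two points are assumptions of this sketch rather than statements of the study: the classes are applied to the absolute value of r so that negative coefficients are classified as well, and values exactly equal to 0.5 or 0.7 are assigned to the lower class. The function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Linear Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def correlation_strength(r):
    """Classify |r| as strong (> 0.7), moderate (0.5 to 0.7), or weak."""
    magnitude = abs(r)  # assumption: the sign is ignored for classification
    if magnitude > 0.7:
        return "strong"
    if magnitude > 0.5:
        return "moderate"
    return "weak"


# Hypothetical, perfectly proportional series (illustrative only).
r_example = pearson_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```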

Cluster analysis (CA) is a class of statistical methods with two modes, Q and R. Q-mode CA is used to identify the spatial relationships between sample points, while R-mode CA is used to classify parameters into groups based on their similarity (Banoeng-Yakubo et al. 2009). A dendrogram shows the similarity levels at which observations are merged and is used to determine the number of clusters (Lokhande et al. 2008). The dendrogram provides a visual summary of the clustering process, presenting a picture of the groups and their proximity, with a dramatic reduction in the dimensionality of the original data (Machiwal and Jha 2015; Bodrud-Doza et al. 2016). In this study, hierarchical cluster analysis was performed in Q-mode using the Ward method and squared Euclidean distance.
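The idea behind Q-mode clustering with the Ward method can be sketched in plain Python as follows. This is a naive illustration and not the SPSS procedure used in the study: at each step it merges the two clusters whose union gives the smallest increase in the within-cluster sum of squares, which for clusters A and B equals (n_A n_B / (n_A + n_B)) times the squared Euclidean distance between their centroids. The function names and the standardized sample values are hypothetical, and no dendrogram is drawn.

```python
def squared_euclidean(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    """Mean of a list of points, variable by variable."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def ward_clusters(samples, n_clusters):
    """Merge samples agglomeratively with Ward's criterion until n_clusters remain."""
    if not 1 <= n_clusters <= len(samples):
        raise ValueError("n_clusters must be between 1 and the number of samples")
    clusters = [[i] for i in range(len(samples))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a = [samples[k] for k in clusters[i]]
                b = [samples[k] for k in clusters[j]]
                # Increase in within-cluster sum of squares if a and b merge.
                cost = (len(a) * len(b) / (len(a) + len(b))) * squared_euclidean(
                    centroid(a), centroid(b)
                )
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(cluster) for cluster in clusters]


# Hypothetical standardized values of two variables for five samples.
standardized_samples = [
    [-1.0, -0.9],
    [-0.8, -1.1],
    [-1.1, -1.0],
    [1.2, 0.9],
    [0.9, 1.3],
]
groups = ward_clusters(standardized_samples, n_clusters=2)
```

Each returned list holds the indices of the samples assigned to one cluster.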

Factor analysis (FA) helps to identify the variations in groundwater quality caused by natural and anthropogenic processes. In FA, the contribution of less significant variables is reduced by further simplifying the data structure resulting from principal component analysis (PCA) (Nosrati and Van Den Eeckhaut 2012). PCA uses an orthogonal transformation to convert the original variables into new ones called principal components, which are uncorrelated variables (axes) and linear combinations of the original variables. The principal components are extracted so that the first component accounts for the maximum variance, the second component for the second greatest variance, and so on (Purushothaman et al. 2014). To provide information based on the most meaningful parameters with minimum loss of original information, the axes defined by PCA are rotated to create new variables, also called varifactors (VFs) (Brumelis et al. 2000; Rogerson 2001). Thus, the varimax rotation procedure makes the interpretation of FA easier and more reliable (Sakizadeh and Ahmadpour 2016). In this study, the measured data were standardized for both CA and FA to avoid misclassification resulting from different measurement units.
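The standardization step mentioned above can be illustrated with a z-score transformation, which removes the measurement units so that variables such as EC and pH become comparable. The sketch below assumes the sample standard deviation (n − 1 denominator); the exact convention applied in the study is not stated. The function name and the values are hypothetical.

```python
import statistics


def standardize(values):
    """Return z-scores: (value - mean) / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: n - 1 denominator
    if sd == 0:
        raise ValueError("a constant variable cannot be standardized")
    return [(v - mean) / sd for v in values]


# Hypothetical EC (uS/cm) and pH readings for three samples.
ec_z = standardize([400.0, 1000.0, 1600.0])
ph_z = standardize([7.0, 7.5, 8.0])
```

After the transformation, both variables have zero mean and unit standard deviation, so neither dominates the distance or variance calculations because of its units.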

Analysis of variance (ANOVA) is a statistical technique that evaluates potential differences in a scale-level dependent variable by a nominal-level variable having two or more categories. This study employed one-way ANOVA to determine and quantify associations and differences between cluster variables. A detailed description of ANOVA can be found in Molugaram and Rao (2017). The software package SPSS 21.0 was used for statistical analysis.
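For reference, the F statistic of a one-way ANOVA can be sketched in Python as the ratio of the between-group mean square to the within-group mean square. This sketch is not the SPSS routine used in the study, and it stops at the F statistic; the significance value additionally requires the F distribution, which is not included here. The function name and the group values are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    ss_within = sum(
        sum((v - m) ** 2 for v in g) for g, m in zip(groups, group_means)
    )
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical values of one variable in two clusters (illustrative only).
f_value, df_b, df_w = one_way_anova_f([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
```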

PoS index classification

The evaluation of groundwater quality was performed with the aid of PoS index (Tziritis et al. 2014), which depicts the footprint of a water sample’s environmental quality due to the presence of its various parameters (each of which may have the potential to cause adverse effects to humans and the natural systems) and their potential cumulative effect. It is an alternative approach for the assessment and characterization of water quality that allows temporal comparisons between different periods of time at the same basin, but also enables assessments between basins that are subject to different pressures and controlled by diverse mechanisms. Classification according to PoS index embraces the initial sample characterization through a common ranking system of six classes referring to quality degradation level (Table 1).

Table 1 Characterization of PoS index classes according to their quality degradation level

A brief description of the methodology follows. In this study, 15 parameters (EC, Ca2+, Mg2+, K+, Na+, SO42−, Cl−, F−, NO3−, As, Cd, Cr, Al, Mn, and Fe) were selected to calculate the PoS index; these parameters were classified (Table 2) according to their toxicity and overall adverse environmental effects, which are related to the “P-class” of the original PoS method. Their classification was based on the “Priority List of Hazardous Substances” (ATSDR 2011), but it was further modified with data obtained from various sources (e.g., WHO 2012) so as to consider additional aspects of water quality (e.g., ecosystem functioning).

Table 2 Classification of the selected parameters for the calculation of PoS index; based on the Priority List of Hazardous Substances (ATSDR 2011) and further modified by other relevant sources (e.g., WHO 2012)

A specific “partial score” (ps) is assigned to each “P-class” (Table 2). The partial score is obtained by a trial and error method, and an individual weight is determined for each parameter using “ps” (Eq. 3):

$$ {W}_i={ps}_i/\varSigma ps $$
(3)

where W_i is the weight factor of the ith parameter, ps_i is the partial score of the ith parameter, and Σps is the sum of ps_i over all parameters.
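Equation 3 amounts to a simple normalization of the partial scores, as the following Python sketch shows. The partial scores used here are hypothetical placeholders; the values actually applied in the study are those assigned to the P-classes in Table 2. The function name is also hypothetical.

```python
def parameter_weights(partial_scores):
    """Weight of each parameter (Eq. 3): W_i = ps_i / sum of all ps."""
    total = sum(partial_scores.values())
    if total <= 0:
        raise ValueError("the sum of partial scores must be positive")
    return {name: ps / total for name, ps in partial_scores.items()}


# Hypothetical partial scores (illustrative only, not the values of Table 2).
partial_scores = {"As": 5.0, "Cd": 5.0, "NO3": 3.0, "Cl": 1.0, "EC": 1.0}
weights = parameter_weights(partial_scores)
```

By construction the weights sum to one, so a parameter with a higher partial score contributes proportionally more to the final index.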

In the next step, the influence of each parameter to the overall assessment of sample’s quality is assigned by means of the individual quality contribution factors (Qfs), calculated by Eq. 4:

$$ Q{f}_i=\left[\frac{C_i\cdot W_i}{MAC_i}\right]\times 1000 $$
(4)

where Qf_i is the quality contribution factor of the ith parameter, C_i is the concentration of the ith parameter (units according to the studied parameter), W_i is the weight factor of the ith parameter, and MAC_i is the maximum parametric value of the ith parameter.

The maximum admissible concentration (MAC) is introduced solely to normalize the concentration of a parameter against a well-established threshold or trigger value and does not by any means relate the concluded water quality assessment to a particular use, nor does it create any dependency to use specific characterizations. The MAC values used in this case followed the thresholds imposed by the potable water directive (European Council 1998).

The final step of PoS index calculation consists of summing up all the individual Qf factors (Eq. 5). The derived score is a dimensionless number which may be used for the qualitative assessment of a sample.

$$ PoS=\sum Qf $$
(5)
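Equations 4 and 5 can be combined in a short Python sketch that computes the quality contribution factor of each parameter and sums them to the PoS score. The parameter names, concentrations, weights, and MAC values below are hypothetical placeholders chosen for easy arithmetic; they are not the values used in this study, and each concentration is assumed to be expressed in the same unit as its MAC.

```python
def quality_factor(concentration, weight, mac):
    """Quality contribution factor (Eq. 4): Qf_i = (C_i * W_i / MAC_i) * 1000."""
    if mac <= 0:
        raise ValueError("MAC must be positive")
    return concentration * weight / mac * 1000.0


def pos_index(concentrations, weights, macs):
    """PoS score (Eq. 5): sum of the Qf values of all parameters."""
    return sum(
        quality_factor(concentrations[name], weights[name], macs[name])
        for name in concentrations
    )


# Hypothetical two-parameter sample (illustrative only).
concentrations = {"param_A": 5.0, "param_B": 100.0}
weights = {"param_A": 0.75, "param_B": 0.25}
macs = {"param_A": 10.0, "param_B": 250.0}

pos = pos_index(concentrations, weights, macs)
```

In this example, param_A contributes 375 and param_B contributes 100 to the dimensionless score, which shows how a heavily weighted parameter at half of its MAC can outweigh a lightly weighted one.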

For the final evaluation, a reference water sample (R-sample) is assumed, which reflects the mean typical concentrations of the focal parameters as expected in natural groundwater (Kelepertsis 2000), according to the original PoS method. Based on this, the collected samples were classified into the categories of Table 1.

Results and discussion

Physicochemical characteristics of the water samples

Descriptive statistics of the measured parameters and the computed SI values of some minerals, including calcite, gypsum, halite, sanidine, and kaolinite, are given in Table 3. The electrical conductivity (EC) of the water samples ranges from 400 to 5000 μS/cm (median value of 980 μS/cm), and pH ranges from 6.5 to 7.85 (median value of 7.49), which indicates that the water is slightly alkaline. The high EC values of the samples are mainly attributed to the impact of irrigation water return flow and evaporites. The computed SIs show that all groundwater samples are under-saturated with respect to halite (−7.73 ≤ SI ≤ −4.7) and gypsum (−2.12 ≤ SI ≤ −0.48), while they are super-saturated with respect to sanidine (0.58 ≤ SI ≤ 3.44) and kaolinite (2.24 ≤ SI ≤ 5.97). With respect to calcite, the SIs of the samples vary (−1.33 ≤ SI ≤ 0.65). Based on median values, the order of abundance of the major cations is Ca2+ > Na+ > Mg2+ > K+, of the major anions is HCO3− > SO42− > Cl−, and of trace metals and metalloids is Al > Fe > As > Cr > Mn > Cd. High standard deviations were found for EC (1075.06) and Cl− (347.55) as a result of the wide variation of their values. Most of the major ions and trace elements show high skewness and kurtosis, which indicates that their statistical populations do not follow a normal (Gaussian) distribution. Values of these statistics outside the range of −2 to +2 indicate significant deviations from normality (Wu et al. 2010; Barzegar et al. 2016a).
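The normality screening described above can be sketched in Python with moment-based skewness and excess kurtosis. This is an assumption of the sketch: statistical packages such as SPSS report bias-corrected estimators, so their values differ slightly from the simple moment ratios used here, especially for small samples. The function names and the data are hypothetical.

```python
def central_moment(values, order):
    """Central moment of the given order (population form, divided by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** order for v in values) / n


def skewness(values):
    """Moment-based skewness: m3 / m2^1.5."""
    m2 = central_moment(values, 2)
    if m2 == 0:
        raise ValueError("values must not all be equal")
    return central_moment(values, 3) / m2 ** 1.5


def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2^2 - 3 (zero for a normal distribution)."""
    m2 = central_moment(values, 2)
    if m2 == 0:
        raise ValueError("values must not all be equal")
    return central_moment(values, 4) / m2 ** 2 - 3.0


def deviates_from_normal(statistic, limit=2.0):
    """Apply the -2 to +2 screening range quoted in the text."""
    return abs(statistic) > limit


# Hypothetical data: one symmetric series and one with a single high value.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_skewed = [1.0, 1.0, 1.0, 1.0, 10.0]

skew_symmetric = skewness(symmetric)
skew_right = skewness(right_skewed)
```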

Table 3 Statistics of the measured parameters for the collected groundwater samples

Groundwater type

The Piper diagram (Piper 1944) was used to determine groundwater type (Fig. 5). According to the diagram, four types were detected in the area. The dominant type is mixed Ca–Mg–Cl, corresponding to 15 samples (65.2%). The spatial distribution of the water types is illustrated in Fig. 2. These samples are mostly located between the recharge and discharge regions along the groundwater flow path, which indicates an area where secondary processes chiefly control groundwater hydrogeochemistry. The Ca–HCO3 and Ca–Cl types contain the same number of samples. The three samples of Ca–HCO3 type are located in the recharge zones in the southwest and represent fresh water. Samples 14, 18, and 20 are of Ca–Cl type; they are located in the northwest of the study area and are associated with the Miocene formations (e.g., sandstone, siltstone) existing in the region. The Na–Cl type, which contains two samples (22 and 23), has a chemical composition comparable to that of seawater and is located in the discharge area in the north. High salinity is the main feature of this type and can also be attributed to the Miocene formations.

Fig. 5 Piper diagram of the groundwater samples

Correlation analysis

The results of the correlation analysis are shown in Table 4. High correlation between ions that have similar ionic radii and can substitute for one another, including calcium, magnesium, sodium, and potassium, may indicate a common origin, which is very likely related to the geological characteristics of the sampling locations (Drever 1997; Mahlknecht 2003). There is a strong correlation between EC and all of the major ions except K+ (r = 0.639, p < 0.01) and HCO3− (r = 0.003, p > 0.05), which indicates the contribution of these ions to the salinity of the water. The EC is significantly correlated with the concentrations of some trace metals, including Cd (r = 0.841, p < 0.01), and moderately correlated with Cr (r = 0.571, p < 0.01) and Mn (r = 0.564, p < 0.01). The weak correlation between HCO3− and Ca2+ (r = 0.021, p > 0.05) indicates that Ca2+ possibly did not originate from calcite, while the moderate correlation between SO42− and Ca2+ (r = 0.693, p < 0.01) may indicate gypsum as the origin of Ca2+; however, the dissolution of other Ca-rich minerals (e.g., amphiboles, pyroxenes, and Ca-feldspars) cannot be excluded. Chloride concentrations can be attributed to the dissolution of halite, to re-concentration by evaporation, and to contamination from anthropogenic activities (Jalali 2010; Srivastava and Ramanathan 2008). The correlation of Cl− with Na+ (r = 0.936, p < 0.01), Ca2+ (r = 0.846, p < 0.01), and Mg2+ (r = 0.837, p < 0.01) may suggest that the sources of Cl− can be explained by weathering and dissolution of Cl-bearing minerals, e.g., halite and igneous rocks. The strong correlation of Na+ with Ca2+ and Mg2+ (r = 0.838, p < 0.01 and r = 0.774, p < 0.01, respectively) expresses the possibility of a cation exchange process. The moderate correlation of HCO3− with NO3− (r = 0.568, p < 0.01) indicates the effect of nitrogen fertilizers used for farming and the increase in HCO3− due to their use (Bohlke 2002).

Table 4 Matrix of correlation coefficient for water variables

There is a negative and weak correlation between SiO2 and Fe, Al, Mn, and Mg2+. It can be deduced that the dominant type of silicate in the area is quartz (SiO2); therefore, silicates cannot be the origin of Fe, Al, Mn, and Mg. As a result, a possible source of Fe, Al, Mn, and Mg can be attributed to hydroxide and oxide minerals or sulfate minerals.

Strong correlation between Fe and Mn (r = 0.724, p < 0.01) may be due to their common sensitivity to changes in redox potential and similar geochemical characteristics (Navratil et al. 2007; Barzegar et al. 2015); also, moderate correlation between Fe and Al (r = 0.696, p < 0.01) can be justified according to the similar geochemical processes and conditions for the release of these metals in the groundwater (Barzegar et al. 2015, 2016a).

The moderate to strong correlation between Cd and parameters such as Mg2+ (r = 0.679, p < 0.01), K+ (r = 0.694, p < 0.01), Cl− (r = 0.795, p < 0.01), Na+ (r = 0.829, p < 0.01), and Ca2+ (r = 0.869, p < 0.01) can be attributed to their common enrichment process. Moreover, the correlation of Cd with Na+ and Cl− suggests that Cd readily forms complexes in solution with evaporites (e.g., NaCl) (Salminen et al. 2005) and that the possible origin of Cd is the evaporitic (Miocene) formations.

The correlation of Cr with Ca2+ and Na+ (r = 0.569, p < 0.01 and r = 0.701, p < 0.01, respectively) suggests that the cation exchange process, along with increasing pH and sodium, can contribute to the presence of chromium in the groundwater (Bertolo et al. 2011a). The negative moderate correlation of Cr with NO3− (r = −0.545, p < 0.01) suggests that Cr is the result of natural and geological processes and is not related to agricultural fertilizers. The correlation of Cr with Cd (r = 0.750, p < 0.01) may be explained by the role of sodium as an effective ion in the release of both elements.

The correlation coefficients between As and HCO3− (r = −0.444, p > 0.05), SiO2 (r = 0.389, p > 0.05), Mn (r = −0.333, p > 0.05), and Fe (r = −0.242, p > 0.05) suggest that HCO3− does not affect the release of As into groundwater and that hydroxides of iron and manganese may adsorb arsenic at high concentrations of iron.

Cluster analysis

Groundwater samples were divided into two groups, clusters 1 and 2, using hierarchical cluster analysis (HCA); the dendrogram is shown in Fig. 6. Cluster 1 includes 17 groundwater samples (74%), while cluster 2 includes six samples (26%). One-way ANOVA was used to test the differences between the clusters. As shown in Table 5, there is a significant (sig. < 0.05) difference for EC (and its dependent variables such as Na+, K+, Ca2+, Mg2+, SO42−, and Cl−), Cd, and Cr between clusters 1 and 2, whereas pH, HCO3−, NO3−, F−, SiO2, Mn, Al, Fe, and As show insignificant (sig. > 0.05) differences between the clusters. The most significant differences are for EC, Ca2+, Mg2+, Cl−, SO42−, and Cd, which have significance values equal to zero. Box and whisker plots of the physicochemical parameters for the two clusters are shown in Fig. 7. Based on the median values, pH, NO3−, SiO2, and As are elevated in cluster 1 compared with cluster 2, while their maximum values occur in cluster 2. Samples of cluster 1 are scattered across the plain without a specific pattern, while samples of cluster 2 are located mostly in the northern part (almost at the end of the groundwater flow paths), except sample 3, which is located in the southwestern part.

Fig. 6 Hierarchical cluster analysis: dendrogram of groundwater samples

Table 5 The results of the one-way ANOVA analysis between clusters 1 and 2

Fig. 7 Box and whisker plot of the physicochemical parameters for the groundwater samples of clusters 1 and 2

The saturation indices of selected minerals, including calcite, gypsum, halite, sanidine, and kaolinite, for the samples of cluster 1 (sub-clusters 1-1, 1-2, and 1-3) and cluster 2 are shown in Fig. 8. Calcite was selected as the dominant mineral in Ca-rich formations (mainly limestones); gypsum and halite, as the dominant minerals in evaporites; sanidine, as the dominant mineral in most of the volcanic formations (a feldspar formed at high temperature); and kaolinite, as a clay mineral typically found in Quaternary deposits and unconsolidated formations. Hence, they may be used as proxies (jointly with the other data) for the identification of dominant hydrogeochemical processes and impacts, e.g., substrate rock weathering. Based on their values, it can be argued that volcanic, evaporite, and calcareous formations have the least impact on sub-cluster 1-3, and the clay formation has the least impact on sub-cluster 1-2, while clay and evaporite formations have the maximum impact on cluster 2, and volcanic and calcareous formations have the maximum impact on sub-cluster 1-1. The low and negative SI values of sub-cluster 1-3 can be explained by the location of its samples at the beginning of the flow path (fresh water) and by the minor presence or absence of effective formations. The higher saturation indices of sub-cluster 1-1 and cluster 2 indicate the influence of the northern formations and of the end of the flow path on the groundwater samples of these groups.

Fig. 8 Saturation indices of clusters 1 (sub-clusters 1-1, 1-2, 1-3) and 2

Factor analysis

Factor analysis (FA) was used to determine the factors affecting the regional groundwater. The results of the FA are given in Table 6. Four factors with eigenvalues higher than 1 were retained as the factors affecting the system, and together they explain 81.86% of the total variance. To select the high-loading variables in each factor, absolute loadings greater than 0.5 were considered.
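The two selection rules stated above (eigenvalues higher than 1 and absolute loadings greater than 0.5) can be sketched in Python as follows. The sketch assumes that the analysis is run on standardized variables, so that the total variance equals the number of variables. The eigenvalues and loadings shown are hypothetical and are not those of Table 6; the function names are also hypothetical.

```python
def retained_factors(eigenvalues, threshold=1.0):
    """Return the 1-based positions of factors with eigenvalue above the threshold."""
    return [i for i, ev in enumerate(eigenvalues, start=1) if ev > threshold]


def explained_variance_percent(eigenvalues, n_variables):
    """Percentage of total variance per factor, assuming standardized variables."""
    return [ev / n_variables * 100.0 for ev in eigenvalues]


def high_loading_variables(loadings, cutoff=0.5):
    """Return the variables whose absolute loading exceeds the cutoff."""
    return [name for name, value in loadings.items() if abs(value) > cutoff]


# Hypothetical eigenvalues for six standardized variables (illustrative only).
eigenvalues = [3.0, 1.5, 0.9, 0.6]
kept = retained_factors(eigenvalues)
percent = explained_variance_percent(eigenvalues, n_variables=6)
cumulative_kept = sum(percent[i - 1] for i in kept)

# Hypothetical rotated loadings of one factor (illustrative only).
selected = high_loading_variables({"EC": 0.95, "SiO2": -0.62, "pH": 0.31})
```

Because the cutoff is applied to the absolute value, a variable with a strong negative loading is retained in the interpretation of a factor alongside the positively loaded variables.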

Table 6 The results of the factor analysis after varimax rotation

The first factor (FA1) explains about 38% of the variance in groundwater chemistry and includes EC, Ca2+, Mg2+, Na+, K+, SO42−, Cl−, Cr, and Cd. This factor reflects the general trend of the groundwater, the impact of evaporation and salt formations, and the dissolution of minerals of the Miocene and Pliocene formations in the area. The high loadings of Ca2+, Mg2+, K+, and Na+ suggest ion exchange processes. The positive loadings of Cd and Cr could indicate a common origin; since no obvious anthropogenic source is evident, the most probable source is the volcanic bedrock, as similarly reported in other areas (Tziritis 2009). The spatial distribution of FA1 scores (Fig. 9a) shows that the highest values occur in the northwestern part of the plain (the end of the groundwater flow path), while the lowest values occur in the southern part of the plain (the beginning of the groundwater flow path). This spatial distribution matches that of the SI values of volcanic, evaporite, and calcareous minerals for all samples.

Fig. 9 Spatial distribution of factor scores of the factor analysis

The second factor (FA2), with a variance of about 20%, includes pH, Mg2+, SiO2, Mn, Al, and Fe. This factor indicates weathering and dissolution of quartz minerals and the origin of Fe, Al, Mn, and Mg. The negative loading of SiO2 could be attributed to acidic pH values, since groundwater with low pH is saturated with respect to amorphous silica (Lee and Gilkes 2005). The negative loading of SiO2 also indicates that the parameters of this factor do not share a common source and that Fe, Al, Mn, and Mg have not originated from silicates. The spatial distribution of FA2 scores (Fig. 9b) is consistent with the geology: the geological units in the west and southwest of the plain (zones with the lowest FA2 values) consist of sandstone, conglomerate, and siltstone, in which quartz is the main constituent mineral, whereas the units in the north and northeast of the plain (zones with the highest FA2 values) consist of gypsum and limestone.

The third factor (FA3), with a variance of about 13%, includes HCO3−, SO42−, NO3−, and Cr and reflects anthropogenic and agricultural activities. The parameters of this factor indicate the composition of agricultural fertilizers; the negative loading of Cr, however, suggests that Cr originates from natural geological processes. As shown in Fig. 9c, groundwater affected by agricultural activities and contaminated with nitrate is located in the eastern, western, and northwestern areas, whereas high concentrations of Cr occur in the vicinity of the central highlands of the plain. Since clay minerals are among the constituents of the central highlands, clay is a possible source of Cr (Callender 2004; Bertolo et al. 2011b). The SI of the clay mineral (kaolinite) points to the effect of clay minerals in this area, where the samples of cluster 2 are located.

The fourth factor (FA4), with a variance of about 11%, includes HCO3− and Mn with positive loadings and As with a negative loading, indicating the parameters that affect the release of As. Consistent with the interpretation given for As in the correlation analysis section, this factor suggests that HCO3− does not play a role in the release of arsenic from sulfated minerals and that hydroxides of iron and manganese adsorb As. The spatial distribution of FA4 scores (Fig. 9d) indicates that arsenic is concentrated in parts of the south, northwest, and west of the plain. Thus, it can be concluded that arsenic originates from sandstone and shale (Miocene formations).

Evaluation of water quality according to PoS index

Applying the PoS index to the water samples classified them according to their overall quality degradation. Note that the PoS classification is not intended to characterize suitability for a specific use (e.g., potability or irrigation) but to provide a generic quality assessment.

Based on the PoS results, the 23 samples fall into two categories: medium (class 3) and high (class 4) quality degradation (Fig. 10). Specifically, ten samples (2, 4, 5, 6, 7, 14, 15, 17, 20, and 23) belong to class 3 (medium quality degradation) and the remaining samples belong to class 4 (high quality degradation). The samples with the lowest PoS values are located in the southern and northern parts (recharge area), while the samples with the highest values are located in the discharge area and the eastern part.

Fig. 10 Classification of the samples based on PoS index values

According to the cumulative quality factors (ΣQf) of the PoS index (Fig. 11), which show the percentage contribution of each examined parameter, the dominant parameter controlling quality deterioration is As, with a contribution of 60%. It is followed by Cd and F, with 11.3% and 8%, respectively. Each of the remaining parameters contributes less than 5%, and their contribution is considered minor or local.

Fig. 11 Percentage of contribution of analyzed parameters to the overall quality status of plain based on PoS index
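
The percentage contributions discussed above can be sketched as a simple normalization. This sketch assumes that each percentage in Fig. 11 is the share of one parameter's cumulative quality factor in the sum over all parameters; the definition of Qf is not restated in this section, and the function name and input values below are hypothetical, not the study's data.

```python
def contribution_percent(quality_factors: dict) -> dict:
    """Express each cumulative quality factor as a percentage of the total."""
    total = sum(quality_factors.values())
    if total <= 0:
        raise ValueError("the sum of quality factors must be positive")
    return {name: 100.0 * q / total for name, q in quality_factors.items()}


# Hypothetical cumulative quality factors in arbitrary units (illustration only).
example = {"As": 12.0, "Cd": 4.0, "F": 3.5, "NO3": 0.5}
shares = contribution_percent(example)

# Parameters below the 5% level are treated as minor or local contributors.
minor = [name for name, pct in shares.items() if pct < 5.0]
print(shares)
print("minor contributors:", minor)
```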

Conclusions

The water samples have electrical conductivity (EC) values ranging from 400 to 5000 μS/cm and pH values ranging from 6.5 to 7.85, indicating that the water is near-neutral to slightly alkaline. The high EC values can be attributed to the Miocene formations in the study area. The Piper diagram shows that mixed Ca–Mg–Cl is the dominant groundwater type, while some samples are of the Ca–HCO3, Ca–Cl, and Na–Cl types. The computed SIs for calcite, gypsum, halite, sanidine, and kaolinite show that all groundwater samples are under-saturated with respect to halite and gypsum and super-saturated with respect to sanidine and kaolinite. With respect to calcite, the samples range from under-saturated to super-saturated. Multivariate statistical analysis indicates that the weathering and dissolution of different rocks and minerals (e.g., silicates, gypsum, and halite), ion exchange, and agricultural activities influence the hydrogeochemistry of the study area. Correlation analysis suggests that Fe, Al, Mn, and Mg may originate from hydroxide and oxide minerals or from sulfated minerals. The correlation of Cd with Na+ and Cl− suggests that Cd readily forms complexes in solution with evaporites (e.g., NaCl) and that Cd possibly originates from evaporitic formations (Miocene formations). The moderate negative correlation of Cr with NO3− suggests that Cr results from natural geological processes and is not related to agricultural fertilizers. In addition, the correlation of Cr with Cd may be explained by the role of sodium as an effective ion in the release of both elements. The correlation coefficients between As and HCO3−, SiO2, Mn, and Fe suggest that HCO3− does not affect the release of As into groundwater and that hydroxides of iron and manganese may adsorb arsenic when high concentrations of iron are present.

The cluster analysis divides the samples into two distinct clusters. According to the ANOVA test, EC (and its dependent variables Na+, K+, Ca2+, Mg2+, SO42−, and Cl−), Cd, and Cr differ significantly (sig. < 0.05) between clusters 1 and 2, whereas pH, HCO3−, NO3−, F−, SiO2, Mn, Al, Fe, and As do not (sig. > 0.05). Based on the median values, pH and the concentrations of NO3−, SiO2, and As are higher in cluster 1 than in cluster 2, although their maximum values occur in cluster 2.

Factor analysis shows that four factors control the hydrogeochemistry of the study area. FA1, which includes EC, Ca2+, Mg2+, Na+, K+, SO42−, Cl−, Cr, and Cd, reflects the general trend of the groundwater, the impact of evaporation and salt formations, and the dissolution of minerals of the Miocene and Pliocene formations in the area. FA2 includes pH, Mg2+, SiO2, Mn, Al, and Fe and indicates weathering and dissolution of quartz minerals and the origin of Fe, Al, Mn, and Mg. FA3 includes HCO3−, SO42−, NO3−, and Cr and reflects anthropogenic and agricultural activities. FA4 includes HCO3− and Mn with positive loadings and As with a negative loading, indicating the parameters that affect the release of As.

The joint assessment and compilation of diverse data through different methodologies provided significant information about the hydrogeochemistry and water quality of the Kordkandi-Duzduzan plain. Factor analysis identified the dominant factors that control the hydrogeochemistry, and spatial analysis delineated them. The hydrogeochemical signature of the groundwater resources was mainly attributed to the impact of the geological substrate and, in general, to the interaction with natural factors; locally, however, the effects of anthropogenic stresses were identified. In spatial terms, two main clusters were distinguished, both reflecting significant impact from different sources. According to the PoS index, the main contaminant that seriously deteriorates water quality is As. The samples with the lowest PoS values are located in the southern and northern parts (recharge area), while the samples with the highest values are located in the discharge area and the eastern part. In addition, minerals associated with the volcanic formations (e.g., sanidine) are super-saturated, whereas minerals related to evaporitic formations (e.g., gypsum) are under-saturated, which implies that dissolution of the latter is ongoing. In conclusion, the integrated approach combining different methodologies was applied successfully; the evaluations confirmed and supplemented one another where needed, providing critical information for strategic planning and overall water resource management in the area.