1 Introduction

Water is of paramount importance for human existence. It plays a critical role in the development and maintenance of the ecosystem. Under limited surface water resources availability, groundwater is considered to be a dependable alternative for industrial, domestic, and agricultural uses, especially in arid and semi-arid regions.

Rapid growth in population, urbanization, and industrialization, together with unregulated and uncontrolled groundwater abstraction, is depleting aquifers, with serious consequences such as a falling water table and increasing saltwater intrusion.

Variation in groundwater chemistry is mostly a function of the interaction between the groundwater and the mineral composition of the aquifer constituents through which it moves. In this regard, several different methodologies have been applied to study, characterize, and evaluate the sources of variation in groundwater geochemistry. Amongst these methods are the multivariate statistical methods, geostatistical techniques, and water quality index.

Multivariate statistical techniques such as hierarchical cluster analysis (HCA) and factor analysis (FA) are effective means to resolve hydrological factors such as aquifer boundaries, groundwater flow paths, and hydrochemical parameters (Wang et al. 2001; Locsey and Cox 2003; Belkhiri et al. 2011; Mostafaei 2014; Mohamed et al. 2015; Teikeu et al. 2015), recognise geochemical controls on groundwater composition (Alberto et al. 2001), separate anomalies such as anthropogenic effects (Helena et al. 2000; Pereira et al. 2003), and differentiate groundwater signatures, including uncontaminated groundwater, sewage pollution, mining activities, and agricultural activities (Love et al. 2004). HCA has also been frequently employed as a classification tool and to formulate geochemical models from the available data on the basis of factor scores (Meng and Maynard 2001).

Recent advances in geographic information systems (GIS) have extended their functionality for spatio-temporal data, making it possible to determine the spatial distribution of groundwater quality parameters and to map groundwater quality using geostatistics (Chen and Feng 2013; Montero et al. 2015; Venkatramanan et al. 2016). Geostatistics employs Kriging, the best linear unbiased estimator (Journel and Huijbregts 1978), to predict values at unsampled points and so map spatial variability (Ella et al. 2001). Ordinary Kriging (OK), a type of Kriging, is the most commonly adopted method in environmental studies, especially water quality studies (Wackernagel 1995).

The water quality can be evaluated using physico-chemical parameters compared to permissible limits prescribed at an international scale (WHO 2011; Ayers and Westcot 1994). The best way to express the quality of water resources for drinking or irrigation is the Water Quality Index (WQI), as it is one of the most effective tools by which water quality data is summarized and well presented (Tiri et al. 2018).

The spatial distribution of water quality index can be mapped for better visualization of the potential zones, and for estimating the extent of the problem prevalence (Enwright and Hudak 2009).

The present work, therefore, focuses on the use of the multivariate statistical analysis, geostatistical modelling, and water quality index of groundwater chemistry data to characterize the spatial groundwater quality evolution process and to identify the controlling factors, which dominate the chemical composition of Boumerzoug-El Khroub valley groundwater, Northeast Algeria.

2 Study area

The Boumerzoug-El Khroub valley is located in northeastern Algeria between the Constantine and Ain Mlila mountains (Fig. 1). This vast Mio-Plio-Quaternary plain is surrounded by isolated, steep reliefs oriented southwest–northeast, which correspond to the neritic limestone massifs. It forms a cauldron-shaped subsidence basin lying at an altitude of 600 to 700 m. The study area has a semi-arid climate. Precipitation is irregular, and the rainy season extends from October to May. The average rainfall is around 512 mm/year and the mean annual temperature is around 15.15 °C.

Fig. 1 Map showing sampling sites and geology of the study area

The Boumerzoug Wadi, the main stream of the study area, rises from the “Aioune Boumerzoug” springs (Aioune meaning spring water), located in the southern part of the city of Ouled Rahmoun. Its course is sinuous over a more or less flat topography. Along the valley, the soil is generally alluvial and rather favorable to arboriculture. The remaining soils are better suited to cereals (corn and barley). Most of the inhabitants are concentrated in the cities of Constantine (> 500,000 inhabitants) and El-Khroub (> 100,000 inhabitants) and work mainly in the industrial and administrative sectors (ONS 2017).

The geology of the plain is characterized by three lithostratigraphic sets (Raven 1957; Voute 1967; Coiffait and Villa 1977; Vila 1980; Lahonder 1987): a Lower Jurassic-Cretaceous neritic carbonate complex, overlain by a predominantly marly group of Upper Senonian to Paleocene age, and an upper set comprising a heterogeneous detrital Mio-Plio-Quaternary series.

The aquifer of the Boumerzoug-El Khroub valley consists of Mio-Plio-Quaternary alluvial deposits. The lithology is a detrital set consisting mainly of Miocene conglomerate and sandstone, Pliocene lake limestone, and finally Quaternary conglomerates and sand along the valley of Boumerzoug, especially in the immediate vicinity of El Khroub. The aquifer is recharged by meteoric water (vertical infiltration) and by stream water coming from the surrounding limestone mass. Groundwater hydraulic properties vary in vertical and horizontal dimensions. The flow is from East to West towards the Boumerzoug Wadi, and water table depth ranges from 0.3 to 24 m. The pumping tests applied to wells in different parts of the plain reveal that transmissivity and permeability are about \(10^{-4}\) m\(^{2}\)/s and \(10^{-5}\) m/s, respectively (Boularak 2003), indicating low to medium yields.

3 Materials and methods

3.1 Sampling analysis

The field investigation led us to choose twenty-six wells distributed so as to cover the whole plain. These wells are used primarily for domestic and agricultural purposes. Sampling was done during the high-water period (March 2013). Samples were collected in polyethylene bottles after a short pumping period and stored in an icebox at a temperature < 4 °C. Hydrogen potential (pH), electrical conductivity (EC), and temperature (T °C) were measured in the field immediately after sampling using a HANNA Hi-9813-6 multi-parameter meter. Subsequently, the samples were transported to the laboratory and analyzed for their major chemical constituents (Ca, Mg, Na, K, Cl, SO4, HCO3, and NO3). The methods of analysis are those recommended by the American Public Health Association (Eaton et al. 2005). The concentrations of Ca and Mg were measured by the volumetric method using an aqueous EDTA solution; bicarbonates were also determined volumetrically, by titration with 0.1 N hydrochloric acid. Chlorides were determined in a neutral medium by titration with a silver nitrate solution in the presence of potassium chromate. Sulphates and nitrates were measured by a spectrophotometric method, and sodium and potassium by flame photometry.

3.2 Statistical analysis

Hierarchical cluster analysis (HCA) and factor analysis (FA)/principal component analysis (PCA) are multivariate statistical techniques commonly used in hydrochemistry to classify water samples (Riley et al. 1990; Da Silva and Sacomani 2001; Güler et al. 2002; Demirel and Güler 2006; Belkhiri et al. 2010; Varol et al. 2012; Salman et al. 2014; Foued et al. 2017). These techniques allow the exploration of multivariate data sets containing several variables.

Cluster analysis is a powerful tool for hydrochemical investigation: it groups water samples into distinct groups that are significant in the geologic and hydrologic context, which helps to understand the hydrogeochemical processes occurring in the study area (Güler et al. 2004; Singh et al. 2017). This unsupervised classification of water samples on the basis of their similarities is termed Q-mode classification. Q-mode HCA was performed on the normalized data set using Ward’s method with Euclidean distance as the measure of similarity.
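The Q-mode procedure can be illustrated with a minimal sketch. It assumes that "normalized" means z-scored columns and uses a naive agglomerative loop that merges, at each step, the pair of clusters with the smallest Ward cost (the increase in within-cluster sum of squares). The function names and the toy data are hypothetical; the authors' software and settings are not stated in the text.

```python
from itertools import combinations
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each variable (column) to zero mean and unit variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def centroid(points):
    return [mean(c) for c in zip(*points)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))  # squared Euclidean distance
    return len(a) * len(b) / (len(a) + len(b)) * d2


def ward_clusters(rows, k):
    """Group samples (rows) into k clusters; returns lists of row indices."""
    data = zscore_columns(rows)
    clusters = [[i] for i in range(len(data))]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: ward_cost(
                [data[m] for m in clusters[p[0]]],
                [data[m] for m in clusters[p[1]]],
            ),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return [sorted(c) for c in clusters]


# Hypothetical samples: [EC in uS/cm, Ca in mg/L]
samples = [[880, 95], [900, 100], [1650, 160], [1700, 165], [2780, 300], [2800, 310]]
groups = ward_clusters(samples, 3)
```

For real data sets a library routine would normally be preferred; the loop above is only meant to make the merging criterion explicit.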

FA, which uses PCA, is widely used to reduce sets of observations of many variables by exploiting the associations between them. This reduction is achieved by diagonalization of the correlation matrix, which yields a new set of uncorrelated (orthogonal) variables, named principal components (PCs), arranged in decreasing order of importance (Helena et al. 2000; Singh et al. 2004). Only PCs with eigenvalue > 1 are taken into consideration (Kaiser 1960). Varimax rotation was applied to these PCs to make the factors easier to interpret in terms of the hydrochemical or anthropogenic processes controlling groundwater quality. Factor loadings are termed “strong,” “moderate,” and “weak” for absolute loading values > 0.75, 0.75–0.50, and 0.50–0.30, respectively (Liu et al. 2003).
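The two selection rules quoted here (the Kaiser criterion and the loading classes of Liu et al. 2003) are simple enough to sketch directly. The function names are hypothetical, and the label "negligible" for loadings below 0.30 is an assumption, since the text does not name that class.

```python
def retained_components(eigenvalues):
    """Indices of components kept under the Kaiser criterion (eigenvalue > 1)."""
    return [i for i, ev in enumerate(eigenvalues) if ev > 1.0]


def loading_strength(loading):
    """Classify a factor loading by its absolute value."""
    a = abs(loading)
    if a > 0.75:
        return "strong"
    if a >= 0.50:
        return "moderate"
    if a >= 0.30:
        return "weak"
    return "negligible"  # assumed label, not from the source


# Hypothetical eigenvalues: the first three exceed 1 and would be retained.
kept = retained_components([3.4, 3.3, 1.2, 0.9, 0.5])
```

The sign of a loading is ignored for the strength class, so a loading of -0.8 is as "strong" as +0.8; the sign only tells the direction of the association.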

3.3 Water quality index

The Water quality index WQI is a recognized technique that offers powerful tools that simplify the expression of water quality to the concerned citizens and policymakers (Chauhan and Singh 2010).

It is a numerical expression in which a water quality data set is summarized into simple terms (excellent, good, poor, etc.). Various water quality indices (WQI) have been developed by governmental agencies around the world. Here, WQI was used to assess the suitability of the groundwater of Boumerzoug valley for human consumption and irrigation using the weighted arithmetic index method given by Brown et al. (1972). This method has been widely used (Amadi 2011; Gebrehiwot et al. 2011; Desai and Desai 2012; Aly et al. 2014; Amaliya and Kumar 2015; Goher et al. 2015; Paul et al. 2015). WQI is calculated with the following formula:

$${\text{WQI}} = \mathop {\sum} \limits_{{{\mathrm{i}} = 1}}^{\text{n}}\,{\text{Q}}_{\mathrm{i}}\,{\text{W}}_{\mathrm i} /\mathop{\sum} \limits_{{{\mathrm{i}} = 1}}^{\text{n}}\,{\text{W}}_{\mathrm{i}}$$
(1)

where \({\text{Q}}_{\text{i}}\) is the quality rating for the ith parameter, \({\text{W}}_{\text{i}}\) is the unit weight of the ith parameter, and n is the number of parameters.

Calculation of \(Q_{i}\) value

$${\text{Q}}_{\text{i}} = \left( {{\text{V}}_{\text{i}} - {\text{V}}_{0} } \right)/\left( {{\text{S}}_{\text{i}} - {\text{V}}_{0} } \right)$$
(2)

where \({\text{V}}_{\text{i}}\) is the observed value of the ith parameter, \({\text{V}}_{0}\) is the ideal value of the ith parameter in pure water (zero for all parameters except pH, for which \({\text{V}}_{0}\) = 7.0), and \({\text{S}}_{\text{i}}\) is the standard permissible value of the ith parameter.

Calculation of \(W_{i}\) value

The unit weight \({\text{W}}_{\text{i}}\) is inversely proportional to the standard permissible value \({\text{S}}_{\text{i}}\) of the corresponding water quality parameter.

$${\text{W}}_{\text{i}} = {\text{K}}/{\text{S}}_{\text{i}}$$
(3)

where \(K\) is the proportionality constant of the weights

$${\text{K}} = \frac{1}{{\mathop \sum \nolimits_{{{\text{i}} = 1}}^{\text{n}} \frac{1}{{ {\text{S}}_{\text{i}} }}}}$$
(4)

Water quality is considered excellent, good, poor, very poor, and unsuitable when the value of the index falls within 0–25, 26–50, 51–75, 76–100, and > 100, respectively (Goher et al. 2015) (Table 1).

Table 1 Water quality assessment as per weight arithmetic WQI method
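Equations (1) to (4) and the classes of Table 1 can be combined in a short sketch. The function names, the dictionary layout, and the example values are hypothetical. Eq. (2) is implemented as printed; the weighted arithmetic method is often stated with the quality rating multiplied by 100, so that factor is exposed as an assumed `scale` argument rather than hard-coded.

```python
def wqi(observed, standards, ideal=None, scale=1.0):
    """Weighted arithmetic WQI following Eqs. (1)-(4).

    observed, standards: dicts keyed by parameter name.
    ideal: ideal values V0 (defaults to 0; pH would use 7.0).
    scale: assumed multiplier on Q_i (Eq. 2 as printed has none).
    """
    ideal = ideal or {}
    k = 1.0 / sum(1.0 / s for s in standards.values())          # Eq. (4)
    weights = {p: k / s for p, s in standards.items()}           # Eq. (3)
    weighted_sum = 0.0
    for p, s in standards.items():
        v0 = ideal.get(p, 0.0)
        q = scale * (observed[p] - v0) / (s - v0)                # Eq. (2)
        weighted_sum += q * weights[p]
    return weighted_sum / sum(weights.values())                  # Eq. (1)


def classify_wqi(value):
    """Map an index value to the classes of Table 1 (upper bounds inclusive)."""
    if value <= 25:
        return "excellent"
    if value <= 50:
        return "good"
    if value <= 75:
        return "poor"
    if value <= 100:
        return "very poor"
    return "unsuitable"


# Hypothetical sample sitting exactly at the permissible limits.
standards = {"Ca": 75.0, "NO3": 50.0, "pH": 8.5}
sample = {"Ca": 75.0, "NO3": 50.0, "pH": 8.5}
index = wqi(sample, standards, ideal={"pH": 7.0}, scale=100.0)
```

Because the weights are proportional to 1/S_i, parameters with strict limits dominate the index, and a sample exactly at every limit scores 100 when `scale=100`.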

3.4 Geostatistical modeling

Geostatistics was developed by Matheron (1965) for estimating the characteristics of mineral deposits. These robust techniques of applied statistics are widely used in Earth sciences such as hydrogeology to assess the spatial distribution of groundwater quality.

Kriging and semivariogram models were used to map the spatial distribution of hydrochemical parameters. Kriging, especially ordinary Kriging (OK), is one of the most popular and powerful linear interpolation techniques available in the ArcGIS geostatistical extension. OK generates predictive maps by interpolating the regionalized variables at unsampled locations with a minimum square error (Sheikhy Narany et al. 2014; Bodrud-Doza et al. 2016). The estimate is calculated by the following equation (Delhomme 1978):

$$\hat{Z}\left( {x_{0} } \right) = \mathop \sum \limits_{i = 1}^{n} \lambda_{i} Z\left( {x_{i} } \right)$$
(5)

where \(\hat{Z}\left( {x_{0} } \right)\) is the estimated value at point \(x_{0}\), \({\text{n}}\) is the number of sampled points, \(Z\left( {x_{i} } \right)\) is the known value at sampled point \(x_{i}\), and \(\lambda_{i}\) is the weight attributed to that sampled point.
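How the weights in Eq. (5) are obtained can be sketched as follows. In ordinary Kriging they solve a small linear system built from the semivariogram, with a Lagrange multiplier forcing the weights to sum to one. The sketch assumes an exponential model with the "practical range" convention (factor 3 in the exponent) and hypothetical parameter values; it is not the ArcGIS implementation used in the study.

```python
import math


def exp_semivariogram(h, nugget=0.0, sill=1.0, rng=1.0):
    """Exponential model; parameters here are illustrative assumptions."""
    if h == 0:
        return 0.0
    return nugget + (sill - nugget) * (1.0 - math.exp(-3.0 * h / rng))


def solve(a, b):
    """Gauss-Jordan elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ordinary_kriging(coords, values, target, gamma=exp_semivariogram):
    """Return (estimate, weights) at target using the Eq. (5) estimator."""
    n = len(coords)
    # Semivariances between samples, bordered by the unbiasedness constraint.
    a = [[gamma(math.dist(coords[i], coords[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [gamma(math.dist(c, target)) for c in coords] + [1.0]
    weights = solve(a, b)[:n]  # last unknown is the Lagrange multiplier
    estimate = sum(w * z for w, z in zip(weights, values))
    return estimate, weights


# Hypothetical wells (x, y) with EC-like values.
wells = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
ec = [10.0, 20.0, 30.0]
estimate, weights = ordinary_kriging(wells, ec, (0.3, 0.3))
```

With a zero nugget the estimator honours the data: predicting at a sampled location returns the measured value there.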

For the best performance of Kriging, it is necessary to check the spatial dependence of the regionalized variables using variographic analysis. The main tool for this analysis is the semivariogram \(\gamma \left( h \right)\), which describes how the semivariance evolves with the distance between samples and thus makes it possible to study the spatial relationship between the data (Hennequi 2010; Arslan 2012). The commonly used semivariogram formula is as follows:

$$\gamma \left( h \right) = \frac{1}{2n\left( h \right)}\mathop \sum \limits_{i = 1}^{n\left( h \right)} \left( {Z\left( {x_{i} } \right) - Z\left( {x_{i} + {\text{h}}} \right)} \right)^{2}$$
(6)

where \(n\left( h \right)\) is the number of pairs of sample points separated by the distance h, called the lag, \({\text{Z}}\left( {x_{i} } \right)\) is the value of the variable \({\text{Z}}\) at location \(x_{i}\), and \({\text{Z}}\left( {x_{i} + {\text{h}}} \right)\) is the value of the variable \({\text{Z}}\) at location \(x_{i} + {\text{h}}\).
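Equation (6) translates almost line by line into code. The sketch below computes the empirical semivariance for one lag, pairing points whose separation falls within a tolerance of that lag; the tolerance argument and the transect data are hypothetical, since the binning used by the authors is not described.

```python
import math
from itertools import combinations


def empirical_semivariogram(coords, values, lag, tol):
    """Semivariance gamma(h) for pairs separated by lag +/- tol (Eq. 6)."""
    sq_diffs = [
        (values[i] - values[j]) ** 2
        for i, j in combinations(range(len(coords)), 2)
        if abs(math.dist(coords[i], coords[j]) - lag) <= tol
    ]
    if not sq_diffs:
        return None  # no pairs at this lag
    return sum(sq_diffs) / (2.0 * len(sq_diffs))


# Hypothetical transect with a linear trend in the measured variable.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
z = [1.0, 2.0, 3.0, 4.0]
gamma_1 = empirical_semivariogram(points, z, lag=1.0, tol=0.1)
gamma_2 = empirical_semivariogram(points, z, lag=2.0, tol=0.1)
```

On this transect the semivariance grows with lag (0.5 at lag 1, 2.0 at lag 2), which is the signature of spatial dependence that a fitted model then summarizes through its nugget, sill, and range.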

3.5 Geochemical modeling

The saturation index calculation was done using PHREEQC for groundwater samples. PHREEQC is a computer program which uses equilibrium chemistry of aqueous solutions to simulate chemical reactions and transport processes (Parkhurst and Appelo 1999). The saturation index (SI) can be defined as:

$${\text{SI}} = {\text{Log }}\left( {\text{IAP/Ksp}} \right)$$
(7)

where IAP is the ion activity product and Ksp is the solubility product at a given temperature.

The mineral is at equilibrium with the water when SI = 0. A positive SI indicates that the groundwater is oversaturated with respect to the mineral, which may precipitate, whereas a negative SI indicates undersaturation, so that the mineral phase may dissolve.
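Equation (7) and its interpretation can be written as two small functions. This is only an illustration of the definition: PHREEQC computes the ion activity product from a full speciation model, which is not reproduced here. The function names and the optional tolerance band around zero are assumptions.

```python
import math


def saturation_index(iap, ksp):
    """SI = log10(IAP / Ksp), Eq. (7)."""
    return math.log10(iap / ksp)


def saturation_state(si, tol=0.0):
    """Interpret SI; tol is an assumed band treated as equilibrium."""
    if si > tol:
        return "oversaturated (may precipitate)"
    if si < -tol:
        return "undersaturated (may dissolve)"
    return "equilibrium"


# Hypothetical values: IAP ten times larger than Ksp gives SI close to 1.
si_example = saturation_index(1e-7, 1e-8)
```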

4 Results and discussion

4.1 Groundwater chemistry and water type

The chemical facies of the groundwater is extremely varied. The temperature of the groundwater samples varies from 13 to 19 °C. The EC values of the samples vary from 460 to 2950 μS/cm and the pH values range from 7.3 to 7.9, indicating strongly mineralized and slightly alkaline waters.

Major ion concentrations in the normalized data set were used to determine possible hydrochemical groups by Q-mode hierarchical cluster analysis (HCA), carried out using Ward’s linkage method with Euclidean distance as the similarity measure between water samples. The HCA generated a dendrogram (Fig. 2) in which the groundwater samples fall into three groups. EC seems to be a determining factor in differentiating these groups: it increases from group 1 to group 3. These groups are plotted on the Piper diagram (Piper 1944) to identify the geochemical evolution of water type.

Fig. 2 Dendrogram of Q-mode cluster analysis

Physical and chemical parameters of the water groups, including statistical measures, were compared with the guidelines of the World Health Organization (2011) and the Food and Agriculture Organization (Ayers and Westcot 1994) and are reported in Table 2.

Table 2 Parameter values of the three principal water groups

Group 1 is formed by eleven wells (wells 7, 10, 11, 12, 16, 17, 18, 19, 21, 22, and 24) with a mean EC of 887.27 μS/cm, indicating low salinity and therefore fresh water. The majority of these wells are located in the recharge area. The order of abundance of the major ions is Ca > Mg > Na > K and HCO3 > SO4 > Cl > NO3 (Fig. 3), and the hydrochemical type is characterized by Ca–Mg–HCO3 and Ca–Mg–SO4–Cl facies (Fig. 4). This group is dominated by bicarbonates (min = 140.30 mg/L, max = 347.7 mg/L, and mean = 239.01 mg/L); however, calcium (min = 72.14 mg/L, max = 140.28 mg/L, and mean = 97.65 mg/L) and sulphates (min = 65 mg/L, max = 165 mg/L, and mean = 91.91 mg/L) are also present. Most samples exceeded the desirable limits for calcium (75 mg/L) (WHO 2011) and bicarbonates (120 mg/L) in drinking water, whereas only two wells (wells 16 and 18) exceed the nitrate standard for consumption (50 mg/L).

Fig. 3 Stiff diagram for three water groups

Fig. 4 Piper diagram for water samples

Group 2 consists of wells 1, 2, 3, 4, 5, 6, 8, 9, 13, 20, 23, 25, and 26. These wells represent 50% of the water samples and are mainly located along Boumerzoug Wadi and its tributary El Berda Wadi (wells 1 and 2). This group is characterized by high salinity (1310 < EC < 1980 μS/cm, mean = 1651.54 μS/cm), with a clear dominance of calcium, bicarbonate, chloride, and sulphates. The concentrations of Ca and HCO3 vary from 100.2 to 220.44 mg/L and from 231.8 to 445.3 mg/L, with mean concentrations of 160.32 and 327.57 mg/L, respectively. Chloride and sulphate values range from 53.25 to 319.5 mg/L and from 52 to 405 mg/L, with average values of 163.85 and 141.54 mg/L, respectively. All samples exceeded the desirable limits for Ca and HCO3, whilst only one well each exceeds the standards required for consumption for chloride (well 26) and sulphates (well 2). Water type is strongly influenced by the geology of the study area (water–rock interaction), but also by the mixing of surface water and groundwater through irrigation and during this high-water period, when the main stream reaches its floodplain.

Two wells (wells 14 and 15) represent group 3. Their groundwater is highly mineralized (EC = 2780 μS/cm). Ca and SO4 are the most dominant ions, which indicates SO4–Cl–Ca water facies. These two wells do not reflect recharge area chemistry; they are strongly influenced by their environment, in particular agricultural land use (chemical fertilizers and livestock). The great well depths observed in this group could also influence the mineralization through dissolution of Triassic formations (clay, marl, and salt).

On the other hand, the Chadha diagram (Chadha 1999), which is a modified version of the Piper diagram, shows that most groundwater samples are characterized by dominance of alkaline earths (Ca + Mg) over alkalis (Na + K) and of strong acids (SO4 + Cl) over weak acids (HCO3); however, some samples (31%) indicate dominance of weak acids over strong acids. Therefore, most of the groundwater groups fall in the field of Ca–Mg–HCO3 and Ca–Mg–Cl/Ca–Mg–SO4 water type (Fig. 5). These facies indicate the coexistence of the dissolution of both calcite and dolomite and the Ca–Na cation exchange.

Fig. 5 Chadha diagram of the groundwater samples

Suitability of the data for FA/PCA was checked using Bartlett’s sphericity and Kaiser–Meyer–Olkin (KMO) tests. Bartlett’s sphericity test on the normalized data set reveals that \(\chi^{2}\) (cal) = 280.39 is greater than \(\chi^{2}\) (crit) = 73.31 at 55 degrees of freedom and a significance level of 0.05 (p value < 0.0001). The value of KMO was 0.61. Hence, these tests indicate that the data are adequate for factor analysis. According to the Kaiser criterion (Kaiser 1960), the first three PCs, explaining 76.80% of the total variance, are chosen to represent the hydrochemical processes of groundwater (Table 3).
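The 55 degrees of freedom quoted for Bartlett’s test can be checked against the usual form of the statistic, assuming the test was run on eleven variables (EC, pH, T, and the eight major ions; this count is inferred from the text rather than stated by the authors):

$$\chi^{2} = - \left[ {\left( {n - 1} \right) - \frac{2p + 5}{6}} \right]\ln \left| R \right|,\quad {\text{df}} = \frac{{p\left( {p - 1} \right)}}{2} = \frac{11 \times 10}{2} = 55$$

where \(n\) is the number of samples, \(p\) is the number of variables, and \(\left| R \right|\) is the determinant of the correlation matrix.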

Table 3 Score of PCA after Varimax rotation

Factor 1 represents 30.70% of the total variance and has strong positive loadings on Na and K, moderately positive loadings on HCO3 and Ca, and a strong negative loading on pH, which is probably associated with carbonate and evaporite minerals.

Factor 2 explains about 30.98% of the variance and shows strong positive loadings on EC, Mg, and SO4, a moderately positive loading on Ca and Cl. This factor shows that the EC is due to hardness, chloride, and sulfates, and probably indicates the signatures of water–rock interaction.

Factor 3 has a total variance of 11.2% and shows high positive loading on temperature and a moderately positive loading on NO3. Nitrate is related to anthropogenic activities, such as the agricultural practice (fertilizers, animal waste, etc.).

A scatter plot (Fig. 6) of PC1 versus PC2 reveals that all water groups are well distinguished from each other in the PC space and fully consistent with the groupings extracted from Q-mode HCA.

Fig. 6 Plots of PC scores for PC1 versus PC2

4.2 Water quality index

The weighted arithmetic water quality index method developed for groundwater parameters represents the overall quality of water according to the degree of purity for any intended use. For the study area, WQI value was computed for drinking and irrigation water using the guidelines of WHO (2011) and of Ayers and Westcot (1994).

Table 4 represents WQI values for groundwater samples.

Table 4 Suitability of groundwater for drinking and irrigation based on WQI

EC, pH, Ca, Mg, Na, K, Cl, SO4, HCO3, and NO3 were used to obtain the WQI for drinking. Results revealed that all wells had a WQI above 50: 42.3% of wells had poor water quality and 50% had very poor water quality. Only one well (well 14) was unsuitable for drinking purposes.

Sodium adsorption ratio (SAR) measures the proportion of sodium relative to calcium and magnesium in irrigation water. A higher SAR may cause long-term damage to the soil structure, which leads to a decrease in crop production. The US Salinity Laboratory Staff (USSL 1954) recommended the equation given below to calculate SAR:

$${\text{SAR}} = {\text{Na}}^{ + } /\left[ {\left( {{\text{Ca}}^{2 + } + {\text{Mg}}^{2 + } } \right)/2} \right]^{1/2}$$
(8)

where the concentrations are reported in meq/L.
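Since laboratory results are usually reported in mg/L while Eq. (8) requires meq/L, a short sketch of the conversion and the ratio may help. The equivalent weights are the atomic mass divided by the ionic charge; the function names and the example concentrations are hypothetical.

```python
import math

# Equivalent weight in mg/meq = atomic mass / charge.
EQUIVALENT_WEIGHT = {"Na": 22.99, "Ca": 40.08 / 2, "Mg": 24.305 / 2}


def to_meq_per_l(mg_per_l, ion):
    """Convert a concentration from mg/L to meq/L."""
    return mg_per_l / EQUIVALENT_WEIGHT[ion]


def sar(na_meq, ca_meq, mg_meq):
    """Sodium adsorption ratio, Eq. (8); all inputs in meq/L."""
    return na_meq / math.sqrt((ca_meq + mg_meq) / 2.0)


# Hypothetical sample: 2 meq/L Na, 4 meq/L Ca, 4 meq/L Mg.
sar_example = sar(2.0, 4.0, 4.0)
```

For this sample the denominator is the square root of 4, so the SAR is 1.0, well within the range reported for the study area.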

The SAR values vary from 0.23 to 1.38 and are less than the permissible limit of 15 (Ayers and Westcot 1994) in irrigation water. SAR has been used with EC, pH, Ca, Mg, Na, K, Cl, SO4, HCO3, and NO3 to calculate WQI for irrigation use. WQI intended for irrigation purposes appears to have a low average of 3.51, with minimum and maximum values of 1.36 and 5.78, respectively. Consequently, all the samples are suitable for irrigation purposes.

4.3 Geostatistical modeling

The normality of the analyzed water parameters was checked to obtain the best performance from the Kriging methods. The best-fitted semivariogram models were chosen based on the mean error (ME) and root mean square standardized error (RMSSE) values. A model is considered efficient, with the most accurate estimations, when ME is at a minimum and RMSSE is close to unity. The exponential semivariogram model fitted best for all hydrochemical parameters. The nugget/sill ratio is used to classify the spatial dependence of groundwater parameters into three classes: strong when the ratio is < 25%, moderate when it is between 25 and 75%, and weak when it is more than 75% (Table 5). According to the nugget/sill ratios, groundwater chemistry in the study area has a strong spatial structure for Mg, Na, Cl, and NO3, and a moderate spatial structure for EC, Ca, K, HCO3, and SO4.

Table 5 Best-fitted variogram models of water quality parameters
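The nugget/sill classification described above reduces to a single ratio test, sketched below. The function name and the example values are hypothetical, and placing exactly 25% and 75% in the moderate class is an assumption about the boundary cases.

```python
def spatial_dependence(nugget, sill):
    """Classify spatial dependence from the nugget/sill ratio (in percent)."""
    ratio = 100.0 * nugget / sill
    if ratio < 25.0:
        return "strong"
    if ratio <= 75.0:
        return "moderate"
    return "weak"


# Hypothetical fitted parameters for two variables.
dependence_a = spatial_dependence(nugget=0.1, sill=1.0)   # ratio 10%
dependence_b = spatial_dependence(nugget=0.5, sill=1.0)   # ratio 50%
```

A low ratio means that little of the total variance is unexplained at zero distance, so neighbouring wells are informative about each other and Kriged maps are more reliable.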

Spatial variability of EC (Fig. 7) shows that mineralization increases (> 1500 μS/cm) towards the north, the south, and the central part of the study area, as a result of leaching of the Tellian geologic formations and of agricultural practices such as livestock farming and extensive irrigation.

Fig. 7 Spatial distribution map for EC

Ca and HCO3 concentrations increase towards the foothills of the carbonate mountains (Figs. 8a, 9a), which comprise the marl and limestone formations of the Tellian domain (Djebel tikbeb), the carbonate neritic nappe (Djebel Oum Settas and Mazela), and the Pliocene lake limestone surrounding El Khroub.

Fig. 8 Spatial distribution maps for the concentrations of major cations: a calcium, b magnesium, c sodium, and d potassium

Fig. 9 Spatial distribution maps for the concentrations of major anions: a bicarbonate, b chloride, c sulphate, and d nitrate

Magnesium and sulfates have the same homogenous spatial distributions, which increase into the southwestern part of the valley (Figs. 8b, 9c). SO4 might result from leaching of clays and the dissolution of gypsum and anhydrite present in clay levels (Bouteraa 2008).

Figures 8c–d, 9b show an increasing trend of Na, K, and Cl towards the southern part of the valley and along Boumerzoug Wadi and its tributary El Berda Wadi. The increasing concentration of Na and Cl in the southern part of the study area is assumed to result from leaching of Triassic clays. Along Boumerzoug valley, Na and Cl concentrations may indicate inputs of untreated sewage and animal manure (Wang et al. 2016).

The nitrate map (Fig. 9d) shows that wells that exceed the standards required for consumption (50 mg/L) are localized around irrigated land indicating the influence of farming inputs.

Table 5 shows that the best-fit semivariogram model for WQI, giving the most accurate estimations, was an exponential model. The nugget to sill ratio was < 25%, which represents a strong spatial dependency. The WQI map (Fig. 10a) exhibits the spatial variability of groundwater quality for drinking purposes. Most wells situated in the recharge area had poor water quality, whereas those located along Boumerzoug Wadi had very poor water quality. Poor water quality may be related to geogenic processes and anthropogenic sources. On the other hand, the distribution of the water quality index for irrigation (Fig. 10b) is homogeneous and lies below the lower limit, which explains why all groundwater samples are safe for irrigation.

Fig. 10 Spatial distribution map for WQI: a WQI for drinking, b WQI for irrigation

4.4 Hydrogeochemical process

4.4.1 Origin of mineralization

Calcium and magnesium in groundwater result from the leaching of limestone, dolomites, gypsum, and also from the cation exchange process. The reaction of carbonate minerals (calcite and dolomite) with water and carbon dioxide is written as follows:

$${\text{CaCO}}_{{ 3 \left( {\text{calcite}} \right)}} + {\text{H}}_{2} {\text{O}} + {\text{CO}}_{2} \to {\text{Ca}}^{2 + } + 2{\text{HCO}}_{3}^{ - }$$
(9)
$${\text{CaMg}}\left( {{\text{CO}}_{3} } \right)_{{2 \left( {\text{dolomite}} \right)}} + 2{\text{H}}_{2} {\text{O}} + 2{\text{CO}}_{2} \to {\text{Ca}}^{2 + } + {\text{Mg}}^{2 + } + 4{\text{HCO}}_{3}^{ - }$$
(10)

Dissolution of calcite and dolomite can be identified by calculating the Ca/Mg molar ratio. If the ratio is close to 1, dissolution of dolomite should occur, whereas dissolution of calcite is the dominant reaction when the Ca2+/Mg2+ ratio is between 1 and 2 (Mayo and Loucks 1995). A higher Ca/Mg molar ratio indicates a non-carbonate mineral source, which may play a significant role in the groundwater chemistry.
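The ratio test can be sketched as follows, starting from concentrations in mg/L. The function names are hypothetical, and the tolerance that defines "close to 1" is an assumption, since the text gives no numerical band; ratios clearly below 1 are not discussed in the text and are flagged as such.

```python
MOLAR_MASS = {"Ca": 40.078, "Mg": 24.305}  # g/mol


def ca_mg_molar_ratio(calcium_mg_l, magnesium_mg_l):
    """Ca/Mg molar ratio from concentrations in mg/L."""
    return (calcium_mg_l / MOLAR_MASS["Ca"]) / (magnesium_mg_l / MOLAR_MASS["Mg"])


def carbonate_source(ratio, tol=0.1):
    """Interpret the ratio; tol is an assumed band for 'close to 1'."""
    if abs(ratio - 1.0) <= tol:
        return "dolomite dissolution"
    if ratio < 1.0:
        return "not classified in the text"
    if ratio <= 2.0:
        return "calcite dissolution"
    return "non-carbonate source"


# Hypothetical sample with equimolar Ca and Mg.
ratio_example = ca_mg_molar_ratio(40.078, 24.305)
```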

Figure 11a shows that most water samples (group 1 and group 2) have a ratio > 2, which probably indicates the influence of clay minerals (reverse cation exchange) and/or gypsum dissolution. All water samples of group 3 are characterized by dissolution of dolomite.

Fig. 11 Bivariate plots of a well number versus Ca/Mg, b Ca + Mg versus HCO3, c Ca + Mg versus SO4 + Cl, d Ca + Mg versus HCO3 + SO4, e Ca + Mg versus Cl, and f Na/Cl versus Cl

The majority of the water samples present an excess of (Ca + Mg) relative to HCO3 (Fig. 11b), which is why Ca and Mg should be balanced by SO4 and Cl (Fig. 11c).

Points on the plot of (Ca + Mg) versus (HCO3 + SO4) will lie close to the 1:1 line if the dissolution of calcite, dolomite, and gypsum are the dominant reactions in a system (Cerling et al. 1989; Fisher and Mulican 1997). If reverse ion exchange is the dominant process, the points fall on the left side of the 1:1 line owing to an excess of (Ca + Mg) over (HCO3 + SO4). In the opposite case, an excess of (HCO3 + SO4) over (Ca + Mg) indicates that ion exchange is the dominant process.
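The reading of this plot can be expressed as a small decision rule. The sketch assumes concentrations in meq/L and a hypothetical relative tolerance for deciding that a point lies "close to" the 1:1 line; neither the function name nor the tolerance comes from the source.

```python
def exchange_process(ca, mg, hco3, so4, tol=0.05):
    """Interpret (Ca + Mg) versus (HCO3 + SO4); inputs in meq/L."""
    cations = ca + mg
    anions = hco3 + so4
    # Points within tol (relative) of the 1:1 line count as balanced.
    if abs(cations - anions) <= tol * max(cations, anions):
        return "carbonate and gypsum dissolution"
    if cations > anions:
        return "reverse ion exchange"
    return "ion exchange"


# Hypothetical sample with a clear excess of Ca + Mg.
process_example = exchange_process(ca=8.0, mg=3.0, hco3=4.0, so4=2.0)
```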

Figure 11d indicates that most of the samples are distributed above the 1:1 line (R2 = 0.85), so that clay mineral weathering and, to a lesser degree, carbonate and gypsum weathering are considered the enriching factors for groundwater mineralization.

The plots of Ca + Mg versus Cl and Na/Cl versus Cl show that salinity increases with a decrease in Na/Cl and an increase in Ca + Mg, which may be due to reverse ion exchange in the clay/weathered layers (Sheikhy Narany et al. 2014). Clay minerals have a sheet structure with negatively charged edges and faces, onto which cations can be fixed and exchanged (Clark 2015) as follows:

$${\text{Ca}}\left( {\text{Mg}} \right) - {\text{Clay}}_{2} + 2{\text{Na}}^{ + } \to {\text{Ca}}^{2 + } \left( {{\text{Mg}}^{2 + } } \right) + {\text{Na}} - {\text{Clay}}_{2}$$
(11)

4.4.2 Geochemical modeling

Groundwater geochemistry is dominated by the interaction between water and the aquifer matrix. The saturation index was applied to predict the reactive mineralogy of the subsurface from the groundwater sample data without collecting the samples of the solid phase and analyzing the mineralogy (Appelo and Postma 1993).

The results of the saturation index calculations for the selected minerals (Calcite, Aragonite, Dolomite, Gypsum, Anhydrite, and Halite) are presented in Table 6.

Table 6 Statistical summary of saturation indexes of minerals in groundwater using PHREEQC

The saturation indices of minerals in the groundwater samples indicate that only carbonate minerals (calcite, aragonite, and dolomite) tend to precipitate in all groups (Fig. 12). Given the semi-arid climate of the study region, high evaporation and low rainfall (< 600 mm/year) might be responsible for the precipitation of aragonite, calcite, and dolomite (Kumar and Singh 2015). However, anhydrite, gypsum, and halite are undersaturated, indicating that the concentrations of their soluble components (Na, Cl, Ca, and SO4) are not limited by mineral equilibrium (Güler and Thyne 2004). Anhydrite and gypsum are approaching equilibrium. The precipitation of anhydrite and gypsum in well 3 can be explained by leaching of the Triassic formation that characterizes the Tellian domain and by the presence of gypsum in the clay and marl levels of the Mio-Pliocene formation. The low concentration of Na compared with Cl is probably due to Na being taken up by clay minerals through an ion-exchange process. Reverse ion exchange decreases the concentration of Na and increases that of Ca, which in turn reduces gypsum dissolution (Tarki et al. 2010).

Fig. 12 Saturation index of a carbonate minerals, b Gypsum–Anhydrite, and c Halite

Another approach to test the proposed hydrochemical evolution is the use of mineral stability diagrams (Drever 1988). Activity plots of \(\log (a_{{\text{Ca}}^{2 + }} /a_{{\text{H}}^{ + }}^{2})\) versus \(\log (a_{{\text{Na}}^{ + }} /a_{{\text{K}}^{ + }})\), \(\log (a_{{\text{Ca}}^{2 + }} /a_{{\text{H}}^{ + }}^{2})\) versus \(\log (a_{{\text{Mg}}^{2 + }} /a_{{\text{H}}^{ + }}^{2})\), and \(\log (a_{{\text{Mg}}^{2 + }} /a_{{\text{H}}^{ + }}^{2})\) versus \(\log (a_{{\text{Na}}^{ + }} /a_{{\text{K}}^{ + }})\) indicate four mineral stability fields for the CaO–Na2O–Al2O3–SiO2–H2O (Fig. 13a), CaO–MgO–Al2O3–SiO2–H2O (Fig. 13b), and MgO–Na2O–Al2O3–SiO2–H2O (Fig. 13c) systems at 25 °C and 1 bar. The three water groups plot essentially in the Ca-smectite and kaolinite stability fields. Equilibrium with Ca-smectite and kaolinite is therefore one of the main processes controlling water chemistry, and the major geochemical reactions controlling groundwater chemistry of Boumerzoug-El Khroub valley can be written as:

$$6{\text{Ca}}_{0.167} {\text{Al}}_{2.33} {\text{Si}}_{3.67} {\text{O}}_{10} ({\text{OH}})_{2} \left( {\text{Ca-smectite}} \right) + 2{\text{H}}^{ + } + 23{\text{H}}_{2} {\text{O}} \leftrightarrow 7{\text{Al}}_{2} {\text{Si}}_{2} {\text{O}}_{5} ({\text{OH}})_{4} \left( {\text{Kaolinite}} \right) + {\text{Ca}}^{2 + } + 8{\text{H}}_{4} {\text{SiO}}_{4}$$
(12)
$$6{\text{Mg}}_{0.167} {\text{Al}}_{2.33} {\text{Si}}_{3.67} {\text{O}}_{10} ({\text{OH}})_{2} \left( {\text{Mg-smectite}} \right) + 2{\text{H}}^{ + } + 23{\text{H}}_{2} {\text{O}} \leftrightarrow 7{\text{Al}}_{2} {\text{Si}}_{2} {\text{O}}_{5} ({\text{OH}})_{4} \left( {\text{Kaolinite}} \right) + {\text{Mg}}^{2 + } + 8{\text{H}}_{4} {\text{SiO}}_{4}$$
(13)

Fig. 13 Mineral stability diagrams

5 Conclusion

Multivariate analysis, geostatistical modeling, WQI, and geochemical modeling proved useful to define and clarify the origin of the factors controlling groundwater chemistry in the Boumerzoug-El Khroub valley, Northeast Algeria.

Q-mode cluster analysis identified three main water types based on groundwater quality data sets. Group 1 represents water samples with low salinity (EC = 887.27 μS/cm) and is mainly located in the recharge area. Group 2 represents wells located in transit and discharge areas; this group has moderate salinity (EC = 1651.54 μS/cm) and is dominated by Ca–SO4–Cl facies. The third group has high salinity (EC = 2780 μS/cm) and is dominated by Ca and SO4.

FA/PCA allowed extraction of three PCs that explain 76.80% of the total variance. PC1 and PC2 revealed that the hydrogeochemical composition of groundwater is affected by geogenic processes, which include the dissolution of carbonate and evaporite rocks, reverse ion exchange, and weathering processes. PC3 is related to the agricultural area, where the highest irrigation frequency coincides with this period of plant development.

Geostatistical analysis using ordinary Kriging demonstrated a strong spatial dependence for Mg, Na, Cl, and NO3, and a moderate spatial dependence for EC, Ca, K, HCO3, and SO4.

Mineralization tends to increase along Boumerzoug Wadi, around the Tellian domain, and towards the hydraulic discharge area. WQI values revealed that drinking water quality deteriorates from the recharge area to the discharge area and confirmed the suitability of groundwater for irrigation purposes.

Hydrogeochemical processes were dominated by reverse ion exchange, which controls the groundwater chemistry. Equilibrium between kaolinite and Ca-smectite is one of the processes responsible for hydrochemical evolution in the area. All water groups are undersaturated with respect to evaporite minerals. In contrast, carbonate minerals are supersaturated in all groups.