Introduction

More than 90 % of Iran’s area lies in arid and semiarid regions. Overpopulation in recent decades has diverted groundwater consumption from agricultural use to industrial and drinking purposes. At present, about 55 % of the water consumed in Iran is supplied from groundwater resources (Zehtabian et al. 2010). A high rate of evaporation combined with low rainfall in arid and semiarid areas can lead to groundwater salinity (Umar and Absar 2003), a phenomenon that has intensified in Iran in recent years. In the absence of major anthropogenic sources, water–rock interaction is the main process that affects the groundwater chemistry of an area (Corteel et al. 2005). These geochemical processes are responsible for the spatial and temporal variations in groundwater chemistry. Multiple studies have examined the impacts of geochemical processes on groundwater quality in different parts of Iran (e.g., Taheri Tizro and Voudouris 2007; Jalali 2009; Aghazadeh and Mogaddam 2011). Elsewhere, Panno et al. (2002) compared geochemical and isotopic techniques for identifying the natural and anthropogenic sources of Na+ and Cl− contamination in groundwater and surface water resources in the Midwestern USA. Graphing techniques were used to discriminate among multiple sources and between unaffected and affected surface water and groundwater. These graphical techniques proved useful for distinguishing the sources of Na+ and Cl− contamination in groundwater in Illinois and may be applicable elsewhere. Hydrogeochemical characteristics based on bivariate diagrams of major and minor ions were also used by Kim et al. (2003), who showed that changes in the chemical composition of groundwater are mainly controlled by salinization followed by cation exchange reactions.

Semnan Province, in the central part of Iran, stretches along the Alborz mountain range and borders the Dasht-e Kavir desert to the south (Mirhosseini et al. 2011). The main lithologic units of this area are an ophiolitic complex accompanied by Eocene–Oligocene volcaniclastic and basic rock units. The ophiolitic complex is dominated by serpentinized harzburgite and dunite, which are considered the main body of ultramafic rocks in ophiolitic zones (Hajizadeh et al. 2011). Moreover, Cretaceous carbonates are another dominant formation in the region (Bazargani-Guilani et al. 2010).

The main cities in the study area are Shahrood, Damghan and Byar. Regarding the ion chemistry of groundwater in the study area, high Ca, Mg and pH values and higher-than-recommended hardness were detected in the groundwater of Shahrood City and its surroundings, reflecting the influence of carbonate rock formations on groundwater composition (Kazemi 2004). Limestone and dolomite mainly influenced the hardness of the water, and the samples for which pH was measured were supersaturated with respect to calcite and undersaturated with respect to all other mineral phases, such as evaporites (Kazemi 2004). Moreover, in a study by Rahimi et al. (2013) on fluoride levels in the groundwater resources of Shahrood and Damghan, nine out of ninety-five drinking water samples (9.5 %) exceeded the standard level of 1.5 mg/l, with values ranging from 0.052 to 6.87 mg/l.

Because of the arid climate of this province and the low recharge of its groundwater, preserving the quality of the available groundwater resources is of great importance. This study was initiated to identify the principal geochemical processes controlling groundwater quality in the eastern part of Semnan Province. In addition, given the large number of samples collected, a second goal was to predict groundwater quality using a water quality index.

Materials and methods

Sample collection and field studies

A total of 257 groundwater samples were collected from wells and springs. Polyethylene bottles were used for sampling, and each bottle was rinsed with the sample beforehand to avoid contamination. Electrical conductivity (EC) and pH were recorded in situ using a portable EC meter and a pH meter. Samples were transported to the laboratory the same day; after filtering through 0.45-µm Millipore filter paper and acidification with nitric acid, they were analyzed for cations. For anion analyses, samples were stored below 4 °C. Major cations (Na+, Ca2+, Mg2+, K+) and anions (Cl−, \({\text{SO}}_{4}^{2 - }\), \({\text{HCO}}_{3}^{ - }\), F−, \({\text{NO}}_{3}^{ - }\)) were determined using the procedures given in APHA (1995). A factor analysis, based on principal component analysis with varimax rotation, was conducted to find the principal components responsible for the variation in the groundwater quality variables.

Characteristics of the area

The geology of the study area is mainly characterized by the Shemshak Formation (dark gray shale and sandstone), Lar Formation (bedded to massive limestone), Dorud Formation (shale with subordinate sandy limestone), Mobarak Formation (limestone with black shale), Ruteh limestone (bedded to massive limestone) and Karaj Formation (tuff and tuffaceous shale). In addition, there are outcrops of Cretaceous rocks and of marl with gypsiferous marl in the region. The main geological formations are illustrated in Fig. 1. As for industrial activity, there are only two industrial complexes in the area, the Shahrood and Damghan industrial complexes, located next to these cities; however, these activities are unlikely to affect the levels of the anions and cations analyzed in this study. The locations of these industrial complexes are shown in Fig. 6.

Fig. 1 Main geological formations in the study area

Water quality index calculation

To calculate the water quality index (WQI), the method proposed by Horton (1965) and followed by many researchers (e.g., Dwivedi and Pathak 2007) was applied. In short, a weight was assigned to each parameter according to its importance for the overall quality of water. In the next step, using a quality rating scale for each parameter, the final WQI was calculated from the product of the weight and the associated rating for each parameter. Details of this method can be found elsewhere in the literature (e.g., Rupal et al. 2012).
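The aggregation step described above can be sketched as follows. This is a minimal illustration, assuming that the weighted ratings are summed and normalized by the total weight, as is common for Horton-type indices; the function name, the weights and the ratings (on a 0 to 100 scale where higher means better quality) are hypothetical and are not the values used in this study.

```python
def water_quality_index(ratings, weights):
    """Weighted-average index: sum(w_i * q_i) / sum(w_i)."""
    if set(ratings) != set(weights):
        raise ValueError("ratings and weights must cover the same parameters")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(weights[name] * ratings[name] for name in ratings)
    return weighted_sum / total_weight


# Hypothetical weights (importance) and ratings (0-100, higher = better).
example_weights = {"NO3": 4, "F": 4, "Cl": 2, "pH": 1}
example_ratings = {"NO3": 60.0, "F": 40.0, "Cl": 80.0, "pH": 100.0}
example_wqi = water_quality_index(example_ratings, example_weights)
print(round(example_wqi, 2))
```

Normalizing by the total weight keeps the index on the same scale as the individual ratings, so adding or removing a parameter does not change the range of the result.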

Support vector regression (SVR)

Support vector regression (SVR) was one of the modeling procedures used to predict the water quality index, with 10 groundwater quality variables (pH, EC, Cl−, \({\text{NO}}_{3}^{ - }\), \({\text{SO}}_{4}^{2 - }\), \({\text{HCO}}_{3}^{ - }\), Na+, Ca2+, Mg2+, K+) as the features. A popular regression version of the support vector machine (SVM), ɛ-SVR, seeks a function that deviates by at most ɛ from the observed targets for all the training data and is as flat as possible (Smola and Scholkopf 2004).

To illustrate this modeling method, consider a set of training points, \(\left\{ {\left( {{\mathbf{x}}_{1} ,{\mathbf{z}}_{1} } \right), \ldots ,\left( {{\mathbf{x}}_{l} ,{\mathbf{z}}_{l} } \right)} \right\}\), where \({\mathbf{x}}_{i} \in R^{n}\) is a feature vector and \({\mathbf{z}}_{i} \in R^{1}\) is the target output. Given the regularization parameter (C > 0) and the parameter of the ɛ-insensitive loss function (ɛ > 0), the standard form of support vector regression is as follows (Vapnik 1999):

$$\begin{aligned} & \mathop {\min }\limits_{{\omega ,b,\xi ,\xi^{*} }} \; \frac{1}{2}\omega^{T} \omega + C\sum\limits_{i = 1}^{l} \xi_{i} + C\sum\limits_{i = 1}^{l} \xi_{i}^{*} \\ & {\text{subject to}}\quad \omega^{T} \phi \left( {{\mathbf{x}}_{i} } \right) + b - z_{i} \le \varepsilon + \xi_{i} , \\ & \phantom{{\text{subject to}}\quad} z_{i} - \omega^{T} \phi \left( {{\mathbf{x}}_{i} } \right) - b \le \varepsilon + \xi_{i}^{*} , \\ & \phantom{{\text{subject to}}\quad} \xi_{i} ,\xi_{i}^{*} \ge 0,\quad i = 1, \ldots ,l \end{aligned}$$
(1)

where \(\xi_{i}\) and \(\xi_{i}^{*}\) are slack variables and ɛ is the accuracy demanded for the approximation. The constant C > 0 determines the trade-off between the flatness of linear functions and the amount up to which deviations larger than ɛ are tolerated (Smola and Scholkopf 2004). Transforming this quadratic programming problem to its corresponding dual optimization problem and introducing the kernel function in order to achieve the nonlinearity, yields the optimal regression function as (Singh et al. 2011):

$$f\left( x \right) = \mathop \sum \limits_{i = 1}^{N} (\alpha_{i} - \alpha_{i}^{*} )K\left( {{\mathbf{x}}_{i} ,{\mathbf{x}}} \right) + b$$
(2)

where \(\alpha_{i}\) and \(\alpha_{i}^{*}\) (with \(0 \le \alpha_{i} ,\alpha_{i}^{*} \le C\), \(i = 1, \ldots ,N\)) are the Lagrange multipliers and \(K\left( {{\mathbf{x}}_{i} ,{\mathbf{x}}} \right)\) represents the kernel function.
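To make Eq. (2) concrete, the sketch below evaluates the regression function for a given set of support vectors. It assumes the usual radial basis function kernel, \(K({\mathbf{x}}_{i} ,{\mathbf{x}}) = \exp ( - \gamma \left\| {{\mathbf{x}}_{i} - {\mathbf{x}}} \right\|^{2} )\); the function names, support vectors, coefficients, bias and gamma value are hypothetical and are not taken from the fitted model of this study.

```python
import math


def rbf_kernel(x, y, gamma):
    """K(x, y) = exp(-gamma * ||x - y||^2) for two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def svr_predict(x, support_vectors, dual_coefs, bias, gamma):
    """Evaluate Eq. (2); dual_coefs[i] stands for (alpha_i - alpha_i^*)."""
    kernel_sum = sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    )
    return kernel_sum + bias


# Hypothetical two-feature model with two support vectors.
support_vectors = [(0.0, 0.0), (1.0, 1.0)]
dual_coefs = [2.0, -1.0]
prediction = svr_predict((0.0, 0.0), support_vectors, dual_coefs,
                         bias=0.5, gamma=0.5)
print(prediction)
```

Only training points with a nonzero coefficient contribute to the sum, which is why the fitted model can be stored as a list of support vectors rather than the full training set.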

The SVR forecasting model was built with the LIBSVM package of Chang and Lin (2001), using the radial basis function (RBF) kernel.

The performance of SVM for regression depends on several parameters, such as the capacity parameter (C), the ɛ of the ɛ-insensitive loss function and the variables associated with each kernel type (Aryafar et al. 2012). A trial-and-error procedure was followed to optimize each parameter. Leave-one-out cross-validation was applied with 80 % of the original data as the training set, and the out-of-sample generalization error on the test data set was used for model selection.

Artificial neural network with early stopping

Artificial neural network (ANN) was utilized as another modeling method for prediction of WQI. The linear transfer function and the following transfer function were used for the output and hidden layers, respectively:

$${\text{y}}_{\text{j}} = { \tanh }\left( {\mathop \sum \limits_{{{\text{i}} = 1}}^{\text{d}} {\text{w}}_{\text{ij}} {\text{x}}_{\text{i}} + {\text{b}}_{\text{j}} } \right)$$
(3)

where \({\text{w}}_{\text{ij}}\) and \({\text{b}}_{\text{j}}\) are the weight and bias parameters in which “i” and “j” subscripts refer to the input and neuron, respectively. In addition, Levenberg–Marquardt algorithm was used to update the weight and bias of the network. To avoid the risk of over-fitting which is a common problem during neural network training, early stopping was used. To keep within the scope of this paper, we limited our survey of ANN models to the feed-forward neural network with one hidden layer.
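The network structure described above (one tanh hidden layer as in Eq. (3), followed by a linear output) and the early-stopping rule can be sketched as follows. This is an illustration only: the weights are hypothetical, the Levenberg–Marquardt update itself is not implemented, and the patience-based stopping rule is one common variant assumed here rather than the exact criterion used in this study.

```python
import math


def forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """One hidden layer with tanh activation (Eq. 3), then a linear output."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(neuron_weights, x)) + bias)
        for neuron_weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sum(v * h for v, h in zip(output_weights, hidden)) + output_bias


def should_stop(validation_losses, patience=5):
    """True once the validation loss has not improved for `patience` epochs."""
    if patience < 1:
        raise ValueError("patience must be at least 1")
    if len(validation_losses) <= patience:
        return False
    best_before = min(validation_losses[:-patience])
    return min(validation_losses[-patience:]) >= best_before


# Hypothetical network: 2 inputs, 2 hidden neurons, 1 output.
y = forward([1.0, -1.0], [[0.5, 0.5], [1.0, 0.0]], [0.0, 0.0], [2.0, 1.0], 0.1)
print(y)
```

In a training loop, `should_stop` would be called after each epoch with the history of losses on a held-out validation set, and the weights from the best epoch would be kept.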

Results and discussion

Principal geochemical processes controlling groundwater quality

Descriptive statistics for each groundwater quality variable are given in Table 1. The order of abundance of the anions was Cl > SO4 > HCO3 > NO3 > F, with average values of 509.41, 404.24, 120.48, 14.51 and 0.59 mg/l, respectively. Among the cations, Na+ and Ca2+, with average values of 442.10 and 135.09 mg/l, were the most abundant. The results of the factor analysis of the water quality parameters, using principal component analysis and varimax rotation, are given in Tables 2 and 3. The Kaiser–Meyer–Olkin (KMO) test measures the sampling adequacy for factor analysis. There is no strict cutoff for this test; however, if the result is smaller than 0.5, factor analysis is not suitable (Wu and Kuo 2012). The KMO result was 0.788, indicating that factor analysis is suitable. In addition, the Chi-square statistic (χ²) of Bartlett’s test of sphericity was high (1669.46) and highly significant, implying the existence of a common factor among the relevant matrices of the parent population (Wu and Kuo 2012).

Table 1 Descriptive statistics for the groundwater quality variables of the samples
Table 2 Initial and rotated sum of squares associated with each component
Table 3 Rotated component matrix of groundwater quality variables

There are many criteria for deciding how many factors to retain. For instance, according to the Kaiser criterion (Kaiser 1960), only factors with eigenvalues greater than 1 are retained. However, Jolliffe (1972) argued that Kaiser’s cutoff was too high and suggested a cutoff of 0.7 on the eigenvalues instead. Based on Jolliffe’s criterion, eight components were kept, accounting for 90.97 % of the total data variance. According to the initial and rotated sums of squares associated with each component (Table 2), the ten initial water quality variables were reduced to five components accounting for 87.03 % of the total variance. It should be noted that, because of missing bicarbonate values at 17 stations, this parameter was excluded from the factor analysis. The first factor, with an eigenvalue of 4.47, accounts for 44.73 % of the total variance (Table 2) and is the most important component. It has high loadings for chloride (0.932), sulfate (0.766), potassium (0.814), sodium (0.942), magnesium (0.911) and EC (0.778) (Table 3). As explained by Kumar et al. (2006), chloride levels are higher in areas covered with sand dunes (as in the study area). In addition, owing to the morphology of the area, chloride released by weathering of ridge material can contribute to the elevated chloride levels (Kumar et al. 2006). The reaction of rainwater with evaporite deposits in the sand dunes can also raise Cl−, \({\text{HCO}}_{3}^{ - }\) and Na+ (Subramanian and Saxena 1983). In the urban parts of the region, wastewater is the most likely source of extra chloride in the groundwater (Kazemi 2011); in the rural parts, as explained by Jalali and Kolahchi (2008), the addition of salt to animal feed and the application of the resulting manure to agricultural fields is a possible source of chloride alongside the geological origin of this element.
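The effect of the two eigenvalue cutoffs discussed above can be sketched as follows. The eigenvalue list is purely illustrative (it is not the content of Table 2), and the function names are hypothetical. The second helper relies on the fact that, for a principal component analysis of standardized variables, the eigenvalues sum to the number of variables, so an eigenvalue of 4.47 among ten variables corresponds to roughly 44.7 % of the total variance, in line with the value reported for the first factor.

```python
def retained_components(eigenvalues, cutoff):
    """Keep the eigenvalues strictly above the chosen cutoff."""
    return [value for value in eigenvalues if value > cutoff]


def variance_share(eigenvalue, n_variables):
    """Percent of total variance for PCA on standardized variables."""
    return 100.0 * eigenvalue / n_variables


# Illustrative eigenvalues for ten variables (they sum to about 10).
illustrative_eigenvalues = [4.5, 1.1, 1.0, 1.0, 1.0, 0.8, 0.3, 0.2, 0.07, 0.03]

kaiser = retained_components(illustrative_eigenvalues, cutoff=1.0)
jolliffe = retained_components(illustrative_eigenvalues, cutoff=0.7)
print(len(kaiser), len(jolliffe))
print(variance_share(4.47, 10))
```

Components with eigenvalues close to 1 are sensitive to the choice of cutoff, which is why the two criteria can retain quite different numbers of components.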

Sulfate levels in this study ranged between 9.63 and 4753 mg/l, with an average of 404.24 mg/l. With regard to morphology, sulfate concentrations were found to be higher in sand dunes (665–2531 mg/l) than in quartzite (75–750 mg/l) and alluvium (200–831 mg/l) in a study conducted in Delhi, India (Kumar et al. 2006). Compared with other groundwater resources in arid and semiarid areas, the levels are in the same range as those of the Ejina Basin in China, where average values as high as 565.95 mg/l have been recorded (Wen et al. 2005), and are higher than the average sulfate level of 88.64 mg/l in Hail Province, Saudi Arabia (Zaidi et al. 2014). As explained by Doulati Ardejani et al. (2011), the high sulfate levels in some parts of the aquifer (as high as 4753 mg/l) may result from pyrite oxidation in the prevalent coal washing waste dumps.

Mg in the aquifer derives mainly from the dissolution of dolomite (Zaidi et al. 2014). The average concentrations of calcium and magnesium were 47.78 and 135.08 mg/l, respectively. As a whole, the average concentration of magnesium is higher than that of calcium. Since both species originate from geological formations and the solubility of CaCO3 is much lower than that of MgCO3, Ca2+ precipitates as CaCO3, resulting in lower values of this cation in the groundwater.

The fixation of K+ by clay minerals and the greater resistance of this cation to weathering result in its low levels in natural groundwater (Subba Rao 2002). The average value of this cation was 3.35 mg/l which is far less than that of the other cations due to the foregoing reasons.

The second factor, with an eigenvalue of 1.12, accounts for 11.23 % of the total variance and has a high loading with fluoride (0.966). Fluoride concentration in most natural groundwater resources is <1 mg/l (Hem 1985). The main contributing factor for the concentrations of fluoride in groundwater is the solubility of fluorite (CaF2) as the common fluoride mineral. In addition, through ion exchange reactions fluoride is adsorbed onto clay minerals including gibbsite, kaolinite and halloysite (Vengosh and Pankratov 1998). Another possible source for fluoride is evaporitic and crystalline rocks (both magmatic and metamorphic) (D’Alessandro et al. 2008) which are prevalent in the region. According to a recent study, Semnan Province is located in high-fluoride regions of Iran (Mesdaghinia et al. 2012). The mean of fluoride in this study area was 0.6 mg/l which is comparable with that of 0.64 mg/l reported from a study conducted on 78 sampling wells by Mesdaghinia et al. (2012).

The third factor accounts for 10.42 % of the total variance and has an eigenvalue of 1.04. It has a high loading for pH (0.986). The fourth factor, with an eigenvalue of 1.04, accounts for 10.35 % of the total variance and is highly loaded with nitrate. The average nitrate concentration in the groundwater of the study area was 14.51 mg/l, which is below the permissible level of 40 mg/l; however, values as high as 113.22 mg/l were also recorded, indicating gross pollution in some parts of the aquifer. The main sources of this anion are the application of fertilizer and manure in agricultural fields (Berenji 1998) and urban and rural absorbing wells (Kazemi 2011). For instance, the application rate of nitrogen fertilizers in two towns in the region (Mojen and Tash) is two to three times higher than the common rate in other parts of Iran, owing to the tendency of farmers to use these fertilizers (Berenji 1998). In the urban parts of the region, by contrast, disposal wells and deep cesspits have been identified as the main cause of higher-than-normal nitrate values in previous studies (e.g., Kazemi 2011).

The fifth factor has a high loading with calcium (0.982) and accounts for 10.31 % of the total variance as well. Because of the prevalence of limestone in the region, most of the calcium in groundwater has emanated from this mineral.

The correlation coefficients among the groundwater quality variables are given in Table 4. In general, most ions are positively correlated with Cl, and Na, K, Br, Mg and SO4 show a strong correlation with Cl, indicating that these ions derive from the same natural source, namely the geological formations of the study area. Given the correlation coefficient between Na+ and sulfate (r = 0.78), the excess sodium in these samples probably results mostly from the dissolution of sodium sulfate minerals. The low correlation coefficient between Ca2+ and \({\text{HCO}}_{3}^{ - }\) (r = 0.12) suggests that calcite may not be the source of Ca2+ in the study area. The aquifer of Shahrood, the biggest city in the study area, is an unconfined alluvial aquifer bordered by limestone/dolomitic mountains to the north of the city and by marly–gypsiferous outcrops to the south (Kazemi 2011). Dolomite is a sedimentary rock composed primarily of the mineral dolomite, CaMg(CO3)2. It is thought to form by the postdepositional alteration of lime mud and limestone by magnesium-rich groundwater. Gypsum is a soft sulfate mineral composed of calcium sulfate dihydrate, with the chemical formula CaSO4·2H2O (Klein et al. 1985). Therefore, the high levels of calcium and magnesium in the region can be attributed to the presence of these geological formations in the study area.

Table 4 Correlation coefficients among groundwater quality variables

High correlation between sulfate and Mg2+ (r = 0.66) may highlight the contribution of magnesium sulfate minerals to the levels of Mg2+ in the region. The sources of bicarbonate can be attributed to limestone and dolomite of Elica and Lar formations (Doulati Ardejani et al. 2010).

A Na/Cl molar ratio can be used to detect the source of Na+ in the groundwater. For instance, Na/Cl molar ratio greater than one indicates that the excess of sodium has probably originated from silicate-weathering reactions, whereas if halite dissolution is responsible for Na, the Na/Cl molar ratio is approximately equal to one (Meybeck 1987). If silicate weathering is a probable source of sodium, the water samples would have \({\text{HCO}}_{3}^{ - }\) as the most abundant anion (Rogers 1989). This is because of the reaction of the feldspar minerals with the carbonic acid in the presence of water, which releases \({\text{HCO}}_{3}^{ - }\) (Elango et al. 2003). Among groundwater samples, about 82 % have Na/Cl molar ratio greater than one indicating that Na release from silicate weathering is an important process in the study area. The plot of Na/Cl versus EC would give a horizontal line if evaporation is dominant in the area (Jankowski and Acworth 1997) meaning that no mineral species is precipitated.
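The molar ratio used above requires converting mass concentrations to molar concentrations. The sketch below does this with standard molar masses (22.99 g/mol for Na and 35.45 g/mol for Cl) and applies it to the average Na+ and Cl concentrations reported in this study (442.10 and 509.41 mg/l). The function names are hypothetical, and the ratio of the averages is shown only as an illustration; it is not the same as the per-sample ratios on which the 82 % figure is based.

```python
# Standard molar masses in g/mol.
MOLAR_MASS_G_PER_MOL = {"Na": 22.99, "Cl": 35.45}


def mg_per_l_to_mmol_per_l(concentration_mg_l, species):
    """Convert a mass concentration (mg/l) to a molar one (mmol/l)."""
    return concentration_mg_l / MOLAR_MASS_G_PER_MOL[species]


def na_cl_molar_ratio(na_mg_l, cl_mg_l):
    """Na/Cl molar ratio; about 1 suggests halite, above 1 excess Na."""
    na_mmol = mg_per_l_to_mmol_per_l(na_mg_l, "Na")
    cl_mmol = mg_per_l_to_mmol_per_l(cl_mg_l, "Cl")
    return na_mmol / cl_mmol


# Average concentrations reported in the text (mg/l).
ratio_of_means = na_cl_molar_ratio(442.10, 509.41)
print(round(ratio_of_means, 2))
```

Because sodium is lighter than chloride, a sample can have a lower Na+ mass concentration than Cl and still show a molar ratio above one.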

With respect to the plot of Na/Cl versus EC (Fig. 2a), it can be concluded that most of the samples lie above equiline suggesting silicate weathering as the dominant process for the excess Na in the study area. In addition, if silicate weathering is the dominant process, the plot of HCO3 versus Na should have samples falling above the equiline (Elango and Kannan 2007). Regarding Fig. 2b, it is obvious that the majority of points lie above the equiline thus confirming the results of Na/Cl versus EC plot.

Fig. 2 Plot of Na/Cl versus EC (a) and plot of bicarbonate versus sodium (b) for the sampling stations

If the dissolution of calcite, dolomite and gypsum is responsible for the production of Ca2+, Mg2+, \({\text{SO}}_{4}^{2 - }\) and \({\text{HCO}}_{3}^{ - }\), then a charge balance should exist between these cations and anions (Jalali 2006). In the absence of such a balance, ion exchange shifts the points to the right of the 1:1 line in the plot of Ca + Mg versus SO4 + HCO3, whereas reverse ion exchange shifts them to the left of this line (Kumar et al. 2006; Fisher and Mulican 1997). The clustering of points around and below the 1:1 line indicates the dominance of ion exchange, associated with excess bicarbonate; however, a few points lie above this line, indicating that reverse ion exchange also occurs in the study area, although to a lesser extent (Fig. 3).
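The comparison behind the Ca + Mg versus SO4 + HCO3 plot can be sketched as follows. Concentrations are converted to meq/l with standard equivalent weights (molar mass divided by charge), and each sample is labeled by its position relative to the 1:1 line, following the reading given above (below the line: ion exchange; above the line: reverse ion exchange). The function names, the 10 % tolerance band around the line and the sample values are assumptions made for illustration.

```python
# Equivalent weights in g/eq (standard molar mass divided by ionic charge).
EQUIVALENT_WEIGHT = {
    "Ca": 40.08 / 2,
    "Mg": 24.305 / 2,
    "SO4": 96.06 / 2,
    "HCO3": 61.02,
}


def to_meq_per_l(concentration_mg_l, ion):
    """Convert mg/l to meq/l."""
    return concentration_mg_l / EQUIVALENT_WEIGHT[ion]


def exchange_indicator(ca, mg, so4, hco3, tolerance=0.1):
    """Label a sample (inputs in mg/l) relative to the 1:1 line."""
    cations = to_meq_per_l(ca, "Ca") + to_meq_per_l(mg, "Mg")
    anions = to_meq_per_l(so4, "SO4") + to_meq_per_l(hco3, "HCO3")
    difference = cations - anions
    # Assumed band: within 10 % of the larger sum counts as "on the line".
    if abs(difference) <= tolerance * max(cations, anions):
        return "near 1:1 line"
    return "reverse ion exchange" if difference > 0 else "ion exchange"


# Hypothetical sample: 4 meq/l of Ca + Mg against 3 meq/l of SO4 + HCO3.
label = exchange_indicator(ca=40.08, mg=24.305, so4=96.06, hco3=61.02)
print(label)
```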

Fig. 3 Plot of Ca + Mg versus SO4 + HCO3 (a) and plot of Ca + Mg versus Cl (b) for the sampling stations

Moreover, the fact that Ca and Mg do not increase with increasing salinity (Fig. 3b) is an indication of reverse ion exchange in the clay/weathered layer (Kumar et al. 2006), which confirms the above-mentioned results.

The (Ca2++Mg2+)/\({\text{HCO}}_{3}^{ - }\) ratio is a good indicator of the sources of Ca and Mg in the groundwater. As explained by Sami (1992), if the dissolution of carbonates in the aquifer and the weathering of accessory pyroxene or amphibole minerals are responsible for the presence of Ca and Mg, this ratio would be about 0.5. As the plot of (Ca2++Mg2+)/\({\text{HCO}}_{3}^{ - }\) versus salinity (i.e., the concentration of Cl) shows, Ca and Mg are added to the solution at a greater rate than the increase in the salinity of the system (Fig. 4a). These high ratios cannot be attributed to HCO3 depletion, because under the existing alkaline conditions HCO3 does not form carbonic acid (H2CO3) (Spears 1986). High ratios, therefore, indicate other sources of Ca and Mg, such as reverse ion exchange (Rajmohan and Elango 2004). A ratio <0.5 may be due to the exchange of calcium and magnesium in the water for sodium bound in the clay. As a whole, the salinity level does not have a significant impact on this ratio.

Fig. 4 Plot of (Ca + Mg)/HCO3 versus Cl (a) and plot of Ca/Mg versus sample number (b) for the sampling stations

The Ca/Mg ratio carries valuable information about the source of these cations in the aquifer. A Ca/Mg ratio of 1 indicates the dissolution of dolomite, whereas a higher ratio indicates a greater calcite contribution (Maya and Loucks 1995). In addition, a higher Ca/Mg molar ratio (>2) indicates the dissolution of silicate minerals, which contribute calcium and magnesium to groundwater (Katz et al. 1998). As shown in Fig. 4b, most of the samples lie around or slightly below the Ca/Mg = 1 line, indicating the dissolution of dolomite. Some other samples have a ratio between 1 and 2, implying the dissolution of calcite. Those with greater ratios indicate the effect of silicate minerals on the aquifer.
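The thresholds described above can be written as a small classification rule. This is a sketch under stated assumptions: the inputs are molar concentrations, the tolerance that decides how close to 1 still counts as "around" the Ca/Mg = 1 line is hypothetical, and the function name is illustrative.

```python
def classify_ca_mg(ca_mmol_l, mg_mmol_l, tolerance=0.1):
    """Map a Ca/Mg molar ratio to the source suggested by the thresholds."""
    if mg_mmol_l <= 0:
        raise ValueError("Mg concentration must be positive")
    ratio = ca_mmol_l / mg_mmol_l
    # Around or below 1 (within an assumed tolerance): dolomite.
    if ratio <= 1.0 + tolerance:
        return "dolomite dissolution"
    # Between 1 and 2: calcite contributes more.
    if ratio <= 2.0:
        return "calcite contribution"
    # Above 2: silicate minerals.
    return "silicate mineral contribution"


for ca, mg in [(1.0, 1.0), (1.5, 1.0), (3.0, 1.0)]:
    print(ca / mg, classify_ca_mg(ca, mg))
```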

Prediction of WQI

The optimized values of regularization (C) and epsilon value (ɛ) parameters for modeling with SVMs were 270 and 5, respectively. Since the gamma parameter was the most sensitive parameter during SVM model building, the results for different values of this parameter are rendered in Table 5.

Table 5 Results of optimization of gamma parameter for RBF kernel

According to this table, the coefficient of determination (R²) for the out-of-sample data increased from 0.04 to 0.93, whereas the mean squared error (MSE) for the test data set decreased from 383.57 to 35.05. The predicted and observed values of WQI for the training and test data are compared in Fig. 5 to show the performance of SVM. The correlation coefficients for the training and test data sets were 0.97 and 0.96, respectively. The ANNs with early stopping showed roughly the same performance (Fig. 5), with correlation coefficients of 0.98 and 0.96 for the training and test data sets, respectively.
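The two performance measures mentioned above can be computed as in the sketch below. It uses the common definition of the coefficient of determination, one minus the ratio of the residual to the total sum of squares; whether this or the squared correlation coefficient was used in this study is not stated, so the choice is an assumption. The observed and predicted values are hypothetical.

```python
def mean_squared_error(observed, predicted):
    """Average of the squared differences between observed and predicted."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted WQI values for four test stations.
observed_wqi = [50.0, 60.0, 70.0, 80.0]
predicted_wqi = [52.0, 58.0, 71.0, 79.0]

mse = mean_squared_error(observed_wqi, predicted_wqi)
r2 = r_squared(observed_wqi, predicted_wqi)
print(mse, r2)
```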

Fig. 5 Comparison between the observed and predicted values of WQI for training (a) and test (b) data of SVM and training (c) and test (d) data of ANNs

The danger of over-fitting has been a problem during ANN training in many previous studies (e.g., Sakizadeh et al. 2015; Piotrowski and Napiorkowski 2013; Giustolisi and Simeone 2006), and SVMs have outperformed ANNs in many others (Aryafar et al. 2012; Yoon et al. 2011; Behzad et al. 2009; Lamorski et al. 2008; Gill et al. 2006). However, this study indicates that when the number of samples is high enough relative to the number of features (e.g., water quality variables), and an algorithm is used to avoid over-fitting, comparable results can be obtained with ANN modeling. Since the prediction of WQI was successful, the resulting WQI can be regarded as a good representative of the available groundwater quality parameters. Therefore, the map of WQI for the study area was produced in ArcMap version 10.1 using the inverse distance weighting (IDW) method (Gong et al. 2014).
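The idea behind the IDW interpolation used for the map can be sketched as follows: the value at an unsampled location is a weighted average of the station values, with weights that decrease with distance. This is a generic illustration and not the ArcMap implementation; the power of 2, the function name, the coordinates and the WQI values are assumptions.

```python
import math


def idw_estimate(target, stations, power=2.0):
    """Inverse-distance-weighted estimate at `target` = (x, y).

    `stations` is a list of (x, y, value) tuples.
    """
    numerator = 0.0
    denominator = 0.0
    for x, y, value in stations:
        distance = math.hypot(target[0] - x, target[1] - y)
        if distance == 0.0:
            # The target coincides with a station: return its value directly.
            return value
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Hypothetical stations: (x, y, WQI).
example_stations = [(0.0, 0.0, 40.0), (2.0, 0.0, 80.0)]
midpoint_wqi = idw_estimate((1.0, 0.0), example_stations)
print(midpoint_wqi)
```

A larger power gives nearby stations more influence and produces a map with sharper local features around each sampling point.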

As shown in Fig. 6, there are some areas with WQI values as low as 21 in the vicinity of Byar City. The nitrate and fluoride concentrations at this station, for instance, were 49.39 and 3.93 mg/l, respectively. The area around Shahrood City, however, had the best quality compared with other parts of the study area. Overall, with respect to the measured parameters, the WQI was <50 for 8 % (22 stations) of the sampling wells and springs and <60 for 17 % (44 stations), which is a serious concern.

Fig. 6 Overall quality of groundwater with respect to WQI

Conclusion

A geochemical investigation was conducted in the groundwater of eastern part of Semnan Province to identify the geochemical characteristics controlling groundwater quality. This study demonstrated that the overall groundwater chemistry of the region is controlled by the rock–water interaction. There are some special conclusions drawn by studying the molar ratio of cations and anions in this research which deserve attention. Among groundwater samples, about 82 % of samples have Na/Cl molar ratio greater than one indicating that Na release from silicate weathering is an important process in the study area. The clustering of points around and below 1:1 line indicates the dominance of ion exchange process with respect to the plot of Ca + Mg versus SO4 + HCO3, which is due to excess bicarbonate. Moreover, it can be concluded that most of the samples lie above equiline given the plot of Na/Cl versus EC, suggesting silicate weathering as the dominant process for the excess Na in the region. On the other hand, since Ca and Mg do not increase with increasing salinity, it is an indication of reverse ion exchange in the clay/weathered layer. Higher Ca/Mg molar ratio (>2) shows the dissolution of silicate minerals, which contribute calcium and magnesium to groundwater. In this study, most of the samples lie around or slightly below Ca/Mg = 1 line indicating the dissolution of dolomite. Factor analysis reduced the original water quality variables to five components accounting for 87.03 % of the total variance. The molar ratio between some groundwater quality parameters highlighted geological sources of the studied cations and anions in the region. With respect to the results of WQI calculation, the area around Shahrood City had the best quality compared with other parts of the study area. The prediction of WQI was implemented through SVM and ANN methods. 
The finding of this study in this regard was that when the number of samples is high enough relative to the number of features (e.g., water quality variables), SVM and ANN can yield comparable results.