Introduction

Karstic groundwater systems are among the most important, yet most vulnerable, freshwater resources. Uncontrolled population growth has a serious impact on the availability of fresh water. The major karstic groundwater systems in Turkey are mostly found in the Taurus Mountain range, which extends along the Mediterranean Sea. Owing to the well-developed karst, the aquifers discharge through large karstic springs at low altitudes, mostly as coastal and to a lesser extent as submarine springs (Ekmekci 2003). Coastal karstic springs are even more vulnerable to changes in groundwater systems (Arfib et al. 2007; Montety et al. 2008; Najib et al. 2017). Salinization due to seawater intrusion, induced by exploitation of the aquifer to meet the increasing demand for fresh water, is the primary threat to these water resources. Predicting the hydrogeological behavior of karstic springs in response to exploitation is crucial for effective management of the karst aquifer. This paper presents a complex coastal karstic spring zone where salinity is an issue. The complexity of the system is revealed by the irregular distribution of salinity among closely located spring outlets. Chemical characterization of these coastal karstic springs was the major objective of this study. This objective was achieved by applying multivariate analysis to a large volume of hydrochemical data collected in the field.

Multivariate statistical analysis techniques such as factor analysis (FA) and cluster analysis (CA) are commonly applied to hydrogeological data sets to characterize groundwater systems (Arslan 2013; Behera and Das 2018; Belkhiri et al. 2011; Bilgin and Konanc 2016; Fatoba et al. 2017; Li et al. 2011; Kazakis et al. 2017; Martinez-Tavera et al. 2017; Masoud et al. 2018; Siepak and Sojka 2017; Subbarao et al. 1996; Yidana et al. 2018). The main applications of FA are to reduce the number of variables and to detect structure in the relationships between variables, or to classify them (Davis 1973; Liu et al. 2003; Matiatos et al. 2014; Semar et al. 2013; Simeonov et al. 2003; Zhang et al. 2016). In addition to FA, CA can be used to classify groups of hydrochemical data according to their similarities to each other and to distinguish different groundwater systems (Filho et al. 2017; Suk and Lee 1999). These studies have shown that multivariate statistical analysis is very efficient in characterizing groundwater systems, particularly in terms of water quality.

The study area is located on the coast in the southwestern part of Turkey, in a semi-arid region (Fig. 1). The local people suffer from water scarcity, especially in the summer months, because the area is a tourist destination. The population is around three thousand in winter and increases to ten thousand in summer. The mean annual precipitation and temperature in the region are 1146 mm and 15 °C, respectively. The precipitation is higher than the regional average; however, owing to the high annual air temperature, about 70% of the precipitation is lost by evapotranspiration. This study focuses on the karst spring zone known as the “Azmak springs”, which discharges through about 150 outlets along 2 km. The spring outlets are aligned along a major fault line. The specific electrical conductivity (EC25), which is an indication of salinity, ranges between 1300 and 15,000 µS/cm. The springs flow into a natural channel which also collects groundwater flow from the surrounding karst and alluvial aquifers as well as the seasonal surface runoff. The total streamflow in the channel (named the Azmak Stream) is measured at the automatic Azmak Gauging Station, located close to the sea. The average total flow at the Azmak Gauging Station is about 11 m³/s and the average specific electrical conductivity is about 5000 µS/cm. This value represents the complete mixing of all flow components, including the karstic coastal springs, the alluvial groundwater and the surface runoff.

Fig. 1 Location map of the study area

The karst springs are highly saline and their water is not suitable for drinking. Multivariate analysis techniques were also employed to identify salinity sources and pollution indicators in the region. In addition, analysis of seasonal changes in the salinity and hydrochemical properties of the springs provides an important tool for a better understanding of the hydrogeological system. In this study, seasonal changes were therefore also considered in the hydrochemical characterization of the Azmak karst system.

Geology and hydrogeology

The watershed of the coastal spring zone, which covers an area of 460 km², is located on the western coast of Turkey. The area lies on the thrust zone between the allochthonous Bodrum Nappes and the autochthonous Menderes Massif. The geological formations consist of a sequence of autochthonous and allochthonous rock masses of Paleozoic to Cenozoic age (Fig. 2). The bedrock of the autochthonous Menderes Massif consists of Paleozoic schists, which are overlain by pervious, extensively karstified carbonate units of Mesozoic age. The allochthonous Bodrum Nappes, consisting of impervious non-carbonate rocks and karstic carbonate rocks, overthrust these two autochthonous units. The Plio–Miocene post-tectonic units, composed of conglomerates, overlie all older units. The major aquifers in the study area are the well-karstified autochthonous and allochthonous carbonate rocks (Acikel 2012). The alluvial deposits in the Gokova plain constitute the third extensive aquifer, which is partly under confined conditions. The karstic aquifers discharge mainly through the coastal and submarine springs, whereas the alluvial aquifer discharges into the Azmak Stream as baseflow.

Fig. 2 Geological map and a general cross-section of the study area (Acikel 2012)

Based on detailed analyses of hydrogeological, hydrochemical and isotopic data, Acikel (2012) distinguished three main circulation systems in the study area. The sampled water points were classified as belonging to a shallow circulation system, a deep circulation system, or as affected by seawater mixing (Fig. 3). Figure 3 is a plot of δ18O versus specific electrical conductivity (EC25). It depicts three clusters of waters, namely low EC25 with depleted δ18O (an indication of a high-altitude recharge area and deep circulation), low EC25 with enriched δ18O (an indication of local recharge and shallow circulation) and increasing δ18O with increasing EC25 (an indication of mixing with seawater). Figure 3 suggests that the wells tap shallow-circulating groundwater and that the majority of the springs originate from deep-circulating groundwater flow.

Fig. 3 Plot of EC25 and δ18O

Materials and methods

Sampling and analysis

During field studies, a total of 13 springs (SP-1, SP-2, SP-3, SP-4, SP-6, SP-7, SP-8, SP-9, SP-10, SP-11, SP-12, SP-13, SP-14) out of the 150 coastal springs were selected for sampling and chemical analysis. The springs are numbered according to their distance from the sea (SP-1 is the farthest from the sea and SP-14 is the closest). The Azmak Gauging Station (AGS) was also selected as a sampling point. In addition, four wells were sampled, representing the karst groundwater (BH-1 and BH-4), the confined plain aquifer (BH-2) and the unconfined plain aquifer (BH-3), together with seawater (SeaW). A map showing the sampled points is given in Fig. 4. The basic information on the sampling points is given in Table 1.

Fig. 4 Sampling map of the study area

Table 1 Basic information on the sampling points

These 19 water points were sampled in the dry and wet seasons in 2010 and 2011 to study the seasonal variability of the hydrochemical characteristics at the sampling points. Thus, a total of 38 samples were collected and analyzed for this study. During sampling, some parameters were also measured in the field. The specific electrical conductivity (EC25), pH and temperature (T) were measured with a multiprobe (YSI-56 MP5™) with an accuracy of ± 0.01 µS/cm, ± 0.01 units and ± 0.01 °C, respectively. Major cations (Ca2+, Mg2+, K+, Na+), anions (Cl−, SO42−, HCO3−) and trace elements (B, Br, Sr and Li) were analyzed at the ALS Laboratories (Sweden).

In situ measurements and hydrochemical analysis

Major ions (Ca, Mg, Na, K, Cl, SO4, HCO3) and trace elements (Br, B, Sr, Li) were analyzed, and in situ measurements (T, EC25, TDI and pH) were carried out, at the sampling points during the dry and wet seasons. The results given in Table 2 were evaluated using a semi-log (Schoeller) diagram, ionic ratios and statistical parameters such as the coefficient of variation.

Table 2 Values of the variables during dry and wet seasons

Multivariate statistical analysis

SPSS 11™ (Statistical Package for Social Science) was used to calculate the correlations among variables and to perform the multivariate analysis for the sampling points in the study area. Multivariate statistical methods are an ever-expanding set of techniques for the classification, modeling and interpretation of large datasets from environmental monitoring programs and hydrochemical quality evaluations (Simeonov et al. 2003). Cluster analysis (CA) and factor analysis (FA) are among the most commonly used multivariate statistical methods. In this study, these methods were applied to 38 samples and 8 variables.

For statistical analysis, the Shapiro–Wilk test was applied to check whether the data are normally distributed. The Shapiro–Wilk statistics were greater than 0.834 and 0.702 for all variables except Ca for the dry and wet seasons, respectively. For both the dry and wet seasons, Mg, Na, K, Cl, SO4, EC25 and B were log-transformed to bring them closer to a normal distribution. Then, all variables were standardized, to avoid misclassification caused by differences in data dimensionality, by calculating their standard scores (z-scores) as follows:

$${Z_i}=\frac{{X_i} - \bar{X}}{S},$$
(1)

where \({Z_i}\) is the standard score of sample i; \({X_i}\) is the value of sample i; \(\bar{X}\) is the mean; and S is the standard deviation (Güler et al. 2002).
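The transformation and standardization steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study used SPSS, the base of the logarithm is not given in the text (base 10 is assumed here), S is taken as the sample standard deviation, and the EC25 values are hypothetical rather than taken from Table 2.

```python
import math
import statistics


def log_transform(values):
    """Base-10 log transform (assumed base); requires strictly positive values."""
    return [math.log10(v) for v in values]


def z_scores(values):
    """Standard scores Z_i = (X_i - mean) / S, as in Eq. 1."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1), an assumption
    return [(v - mean) / s for v in values]


# Hypothetical EC25 readings in µS/cm, for illustration only
ec25 = [1300.0, 2500.0, 5000.0, 9000.0, 15000.0]
ec25_z = z_scores(log_transform(ec25))
```

After this step every variable has zero mean and unit standard deviation, so variables measured on very different scales (for example Cl and B) contribute comparably to the distance calculations used later.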

Correlation analysis

Correlation analysis is a common and useful statistical tool for indicating the strength of the association between variables. In the current study, correlation coefficients (r) were used to measure the relationships between variables. The correlation coefficients of all variables were calculated using Pearson’s correlation analysis and are given in Table 3. A value of + 1 indicates a perfect positive relationship and 0 indicates no relationship between the correlated parameters. All correlation coefficients in Table 3 are positive, meaning that the variables vary in the same direction.
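As a minimal sketch of how such a correlation matrix is obtained, the following Python code computes Pearson's r for every pair of variables. The function names and the concentrations are hypothetical and serve only to illustrate the calculation; they are not the data of Table 3.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Nested dict of r values for every pair of named variables."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical concentrations in mg/L (Cl is set to 1.8 x Na on purpose)
data = {
    "Na": [100.0, 400.0, 900.0, 2500.0],
    "Cl": [180.0, 720.0, 1620.0, 4500.0],
    "Ca": [80.0, 95.0, 90.0, 120.0],
}
r = correlation_matrix(data)
```

In this toy data set Na and Cl are exactly proportional, so their coefficient is 1 up to rounding, which is the kind of near-perfect Na–Cl relationship expected where seawater controls salinity.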

Table 3 Correlation matrix of variables

Principal component analysis (PCA)/factor analysis

PCA is a multivariate statistical technique that explains the variance of a large set of variables by transforming it into a smaller set of independent variables (Singh et al. 2004). The Kaiser–Meyer–Olkin (KMO) and Bartlett sphericity tests were applied to the correlation matrix of the variables to test their suitability for PCA. The KMO measure indicates the proportion of variance in the variables that might be caused by underlying factors. High values (close to 1.0) generally indicate that a factor analysis may be useful with these data. Bartlett’s test of sphericity tests the hypothesis that the correlation matrix is an identity matrix, which would indicate that the variables are unrelated and therefore unsuitable for structure detection. Small values (less than 0.05) of the significance level indicate that a factor analysis may be useful (IBM SPSS 11). In order to distinguish seasonal variations in the variables, the analysis was performed separately on the dry and wet season data. During the factor analysis, variances and eigenvalues were calculated and varimax rotation was used to obtain interpretable factors (Kalayci 2016).
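For reference, the two suitability measures are commonly written in the following standard textbook forms. They are given here for orientation only and are not taken from the paper; the symbols \(r_{ij}\), \(a_{ij}\), n, p and R are introduced for illustration.

$$\mathrm{KMO}=\frac{\sum_{i \ne j} r_{ij}^{2}}{\sum_{i \ne j} r_{ij}^{2}+\sum_{i \ne j} a_{ij}^{2}},$$

$$\chi^{2}=-\left[(n-1)-\frac{2p+5}{6}\right]\ln\left|R\right|,\qquad \mathrm{df}=\frac{p(p-1)}{2},$$

where \(r_{ij}\) is the Pearson correlation between variables i and j; \(a_{ij}\) is their partial correlation after controlling for all other variables; n is the number of samples; p is the number of variables; and \(\left|R\right|\) is the determinant of the correlation matrix. With the 8 variables used in this study, Bartlett's statistic would have 28 degrees of freedom.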

Cluster analysis

CA is a well-known classification tool that uses either the distance or the similarity between objects to group samples according to their similarities. The result is typically illustrated by a dendrogram, which provides a visual summary of the clustering process (Michalic 2008). The analysis was carried out on z-score standardized data. The classification is termed Q-mode classification, which is commonly applied in water-chemistry investigations to define groups of samples that have similar chemical and physical characteristics (Güler et al. 2002). In this study, CA was applied to the 19 sampling points using standardized Ca, Mg, Na, K, Cl, SO4, EC25 and B values, with Euclidean distance and Ward’s method.
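The agglomerative procedure behind such a dendrogram can be sketched as follows. This is a deliberately naive pure-Python illustration of Ward's criterion on Euclidean distances, not the SPSS routine used in the study; the function names and the four two-dimensional points are hypothetical.

```python
def centroid(points, members):
    """Mean position of the listed member points."""
    dim = len(points[0])
    return [sum(points[m][d] for m in members) / len(members) for d in range(dim)]


def ward_cost(points, a, b):
    """Increase in within-cluster sum of squares if clusters a and b are merged."""
    ca, cb = centroid(points, a), centroid(points, b)
    d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))  # squared Euclidean distance
    return len(a) * len(b) / (len(a) + len(b)) * d2


def ward_linkage(points):
    """Merge history as (members_a, members_b, cost) tuples, cheapest merge first."""
    clusters = {i: [i] for i in range(len(points))}
    next_id = len(points)
    merges = []
    while len(clusters) > 1:
        ids = sorted(clusters)
        best = None
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                cost = ward_cost(points, clusters[a], clusters[b])
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        cost, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), cost))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges


# Hypothetical standardized samples: two tight pairs that are far apart
samples = [[0.0, 0.0], [1.0, 0.0], [10.0, 10.0], [10.0, 12.0]]
merges = ward_linkage(samples)
```

Each merge joins the pair of clusters whose union increases the within-cluster sum of squares the least; the last entry of the merge history corresponds to the top split of the dendrogram, that is, the two main groups.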

Results and discussion

Evaluation of in situ measurements and hydrochemical analysis results

The hydrochemical analysis results, including the major ions (Ca, Mg, Na, K, Cl, SO4, HCO3), in situ measurements (T, EC25, TDI and pH) and trace elements (Br, B, Sr, Li), are summarized in Table 2 for the dry and wet seasons. Based on the field measurements, the temperatures of the Azmak springs and of the groundwater in the boreholes show very little variation (Cv = 4 for springs, Cv = 2 for boreholes) and are generally about 16 °C. The average EC25 value of the springs is 5341 µS/cm (Cv = 0.82) for the dry season and 5402 µS/cm (Cv = 0.84) for the wet season. For groundwater in the boreholes it is 712 µS/cm (Cv = 0.34) and 674 µS/cm (Cv = 0.38) for the dry and wet seasons, respectively. Among the major ions and trace elements in the springs, the coefficients of spatial variation of Na, K, Cl, Br, B and Li are higher than 90%. This high variation reflects the differing seawater mixing ratios of the spring discharges. For the boreholes, the spatial variations of Na, Cl and Br are high (around 90%) during both seasons because one of the boreholes (BH-3) is apparently affected by seawater intrusion.
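The coefficient of variation used above is the standard deviation divided by the mean. A minimal sketch follows, assuming the sample standard deviation (the text does not say whether the sample or population form was used) and using hypothetical values rather than those in Table 2.

```python
import statistics


def coefficient_of_variation(values):
    """Cv = standard deviation / mean (sample standard deviation assumed)."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical spring EC25 values in µS/cm, for illustration only
spring_ec25 = [1300.0, 2500.0, 5000.0, 9000.0, 15000.0]
cv_springs = coefficient_of_variation(spring_ec25)
```

Because Cv is dimensionless, it allows the spatial variability of parameters with very different magnitudes, such as Cl and Li, to be compared directly.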

The major ion content of the springs was evaluated using Schoeller diagrams. For the majority of the springs, the cations are ordered as Na + K > Mg > Ca and the anions as Cl > SO4 > HCO3 + CO3 in both the dry and wet seasons (Fig. 5). The spring waters are thus classified into two groups: very saline springs and moderately saline springs. The groundwater in the boreholes is classified as fresh water. The diagrams also imply that the majority of the springs originate from the same lithological units. Furthermore, seawater intrusion has a major impact on the chemistry of the sampling points, and the springs have different seawater mixing ratios. However, the seawater mixing ratio is not associated with distance from the sea.

Fig. 5 The Schoeller (1977) diagrams of sampling points for dry and wet seasons in the study area

Ionic ratios were used to determine the key factors that have influenced the composition of the groundwaters. In the current study, the plot of Mg/Ca versus SO4/Cl was used to assess the effects of seawater intrusion on groundwater quality (Fig. 6). The plot shows that most samples have high Mg/Ca and low SO4/Cl ratios, indicating the influence of seawater on water composition. The Mg/Ca ratio increases and the SO4/Cl ratio decreases with the proportion of seawater in the mixture. Figure 6 indicates that three main groups of samples can be distinguished on the basis of the mixing ratio, namely seawater (sea), mixed water (springs) and freshwater (groundwater from boreholes). A closer look at the spring data in Fig. 6 reveals that three further groups can be distinguished: more saline springs (9023 µS/cm < EC25 < 54,575 µS/cm), saline springs (4415 µS/cm < EC25 < 7225 µS/cm) and brackish springs (1236 µS/cm < EC25 < 4020 µS/cm). The pattern of the plots does not change significantly between the dry and wet seasons (Fig. 6).

Fig. 6 The graph of Mg/Ca versus SO4/Cl values of the sampling points
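The two ionic ratios can be computed as in the sketch below. It assumes that concentrations are reported in mg/L and that the ratios are taken in meq/L, which is common practice but is not stated in the text; the seawater-like and freshwater-like compositions are rounded illustrative values, not the SeaW or borehole samples of this study.

```python
# Equivalent weights in g/eq (molar mass divided by ionic charge)
EQ_WEIGHT = {"Ca": 20.039, "Mg": 12.1525, "SO4": 48.03, "Cl": 35.453}


def to_meq(ion, mg_per_l):
    """Convert a concentration from mg/L to meq/L."""
    return mg_per_l / EQ_WEIGHT[ion]


def ionic_ratios(sample):
    """Return (Mg/Ca, SO4/Cl) in meq/L terms for a dict of mg/L concentrations."""
    mg_ca = to_meq("Mg", sample["Mg"]) / to_meq("Ca", sample["Ca"])
    so4_cl = to_meq("SO4", sample["SO4"]) / to_meq("Cl", sample["Cl"])
    return mg_ca, so4_cl


# Hypothetical compositions in mg/L, for illustration only
seawater_like = {"Ca": 410.0, "Mg": 1290.0, "SO4": 2700.0, "Cl": 19350.0}
freshwater_like = {"Ca": 80.0, "Mg": 15.0, "SO4": 30.0, "Cl": 20.0}

sea_ratios = ionic_ratios(seawater_like)
fresh_ratios = ionic_ratios(freshwater_like)
```

With these illustrative inputs the seawater-like sample plots at a high Mg/Ca and a low SO4/Cl ratio and the freshwater-like sample at the opposite corner, so mixtures of the two fall along the trend described for the springs.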

Evaluation of multivariate statistical analysis results

As the first step of the statistical analysis, the values of all variables were presented in box and whisker plots based on minimum, maximum and quartile values for the dry and wet seasons (Fig. 7). The box and whisker plots primarily indicate that the groundwater is saline, based on the Na, Cl, Br, B, Sr, EC25 and TDI measurements, and that the variables do not show major seasonal changes. Ca, Mg, K and Li have the lowest values in both the dry and wet seasons.

Fig. 7 Box and whisker plots for Ca, Mg, Na, K, Cl, SO4, HCO3−, TDI, T, pH, EC25, Br, B, Li and Sr of water sampling points

Correlation analysis

The correlation matrix of all variables is given in Table 3 for both the dry and wet seasons. According to this table, most of the correlation coefficients are higher than 0.90 in both seasons. The significant positive correlations suggest that the specific electrical conductivity is a function of Mg, Na, K, Cl and B, which implies that the salinity is mainly controlled by these ions. In the dry season, the high positive correlations of Na with K (0.985), Cl (0.998) and B (0.962), of Ca with SO4 (0.919), of Mg with B (0.928), of K with B (0.981) and of Cl with B (0.955) suggest that seawater intrusion might be playing a major role in the hydrochemistry of the sampling points. In the wet season, the correlations of Mg with Na (0.883), K (0.891), Cl (0.874), SO4 (0.891) and B (0.925), of K with Cl (0.960) and SO4 (0.773), and of SO4 with Ca (0.907) and B (0.828) are still high and indicative of seawater intrusion. The correlation coefficients obtained for the dry and wet seasons differ, with some ions showing a decrease in the dry season; however, significance tests showed that these differences are not statistically significant.

Factor analysis

Ca, Mg, Na, K, Cl, SO4, EC25 and B were used as variables in a factor analysis of the water quality data. The factor analysis was undertaken using the varimax rotation procedure to maximize the sum of the variance of the factor coefficients, and a three-factor model was determined (Table 4). According to Table 4, three factors explain 98% and 91% of the total variance in the dry and wet seasons, respectively, and the scree plots for the sampling points show a distinct change of slope after the third factor (Fig. 8).

Table 4 Varimax-rotated factor matrix of all sampling points during the dry and wet seasons

Fig. 8 Scree plots of the eigenvalues for dry season and wet season

The terms ‘strong’, ‘moderate’ and ‘weak’, as applied to factor loadings, refer to absolute loading values of > 0.75, 0.75–0.50 and 0.50–0.30, respectively (Liu et al. 2003). In the dry and wet seasons, factor 1 accounts for 50% and 48% of the total variance, respectively, and has strong positive loadings on Na, Cl, K, B and EC25. This factor is regarded as an indication of seawater intrusion into the aquifer and reflects the general trend of the hydrochemical characteristics of the study area. Factor 2 accounts for 26% and 27% of the total variance in the dry and wet seasons, respectively, and its strong loadings on Ca suggest that this factor is related to mixing with freshwater. Factor 3 accounts for 23% of the total variance in the dry season, with moderate loadings on Mg, SO4 and EC25. By contrast, in the wet season it accounts for 15% of the total variance, with moderate loadings on Mg, which suggests that freshwater inputs influence water quality at this time of the year.
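The loading classes and the link between loadings and explained variance can be illustrated with a short sketch. The treatment of values exactly on a class boundary, the label "negligible" for loadings below 0.30 and the example loadings are assumptions made here for illustration; they are not taken from Table 4.

```python
def loading_strength(loading):
    """Classify a factor loading by its absolute value (after Liu et al. 2003)."""
    a = abs(loading)
    if a > 0.75:
        return "strong"
    if a >= 0.50:
        return "moderate"
    if a >= 0.30:
        return "weak"
    return "negligible"  # label assumed here; the source defines no class below 0.30


def variance_share(loadings):
    """Share of total variance explained by one factor of standardized variables:
    the sum of squared loadings divided by the number of variables."""
    return sum(value ** 2 for value in loadings) / len(loadings)


# Hypothetical loadings of one factor on 8 variables, for illustration only
factor = [0.98, 0.97, 0.95, 0.93, 0.90, 0.40, 0.20, 0.10]
labels = [loading_strength(value) for value in factor]
share = variance_share(factor)
```

For eight standardized variables the total variance equals 8, so a factor explaining 50% of the variance corresponds to a sum of squared loadings of 4.0.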

Cluster analysis

Cluster analysis was used to group the water samples based on their chemical and physical characteristics at each of the sampling points (Shrestha and Kazama 2007). For the hierarchical cluster analysis (HCA), all variables were standardized and Ward’s method with Euclidean distances was used to produce dendrograms. HCA was applied to the 19 sampling points using 8 variables (Ca, Mg, Na, K, Cl, SO4, EC25 and B) for the dry and wet seasons. The results of the analysis are presented as dendrograms in Fig. 9. According to the dendrograms, the sampling points were classified into two main groups: seawater dominated (A) and fresh water dominated (B). Each of these main groups was further divided into two subgroups: A1 and A2, and B1 and B2.

Fig. 9 Dendrograms of the hierarchical cluster analysis of sampling points during (a) dry and (b) wet seasons

Clusters A1 and A2, which have the highest values of Na, K, Cl, B and EC25, contain the most saline water points in the study area. Cluster A1 (SP4, SP6, SP7, AGS, SP13) contains the sampling points defined as seawater-dominated mixed water, where EC25 ranges from 4339 to 9330 µS/cm. Cluster A2 (SP8, SP12 and SeaW) contains the most saline sampling points, including the seawater sampling point (11,981 µS/cm < EC25 < 54,575 µS/cm), and is strongly affected by seawater intrusion. Clusters A1 and A2 thus represent springs affected by seawater intrusion. The springs included in these clusters are not necessarily close to the sea. In cluster A2, for instance, some of the springs are located close to the sea, while others are located much farther away. This is because the salinity of the springs does not follow a pattern determined by the distance to the sea.

The four wells, with specific electrical conductivity ranging from 492 to 1021 µS/cm, fall within cluster B1, which implies that they are relatively unaffected by seawater intrusion. The majority of the springs from the deep circulation system are clustered in subgroup B2 (SP1, SP2, SP3, SP9, SP10, SP11, SP14), which contains brackish water with seawater characteristics (1232 µS/cm < EC25 < 3682 µS/cm). SP1, SP2 and SP3, which are the farthest from the sea, are located on the upstream part of the Azmak Stream. Cluster B2 is moderately affected by seawater intrusion.

All sampling points were grouped in the same clusters for the dry and wet seasons except for SP3, SP10 and BH-3. These three sampling points were subject to seasonal effects: after the rainy season they showed different behavior and were grouped into different clusters.

Conclusions

In situ measurements, hydrochemical analysis and multivariate statistical methods (correlation analysis, factor analysis and cluster analysis) were applied to data from 19 sampling points during the dry and wet seasons in the Azmak spring zone, Mugla, Turkey. The assessment was carried out for Ca, Mg, Na, K, Cl, SO4, EC25, TDI, Br, B, Li and Sr for all of the sampling sites. According to the results of the hydrochemical analysis, the majority of the coastal springs are brackish owing to elevated concentrations of major ions (Na, K, Cl) and trace elements (Br, B, Li) caused by seawater intrusion. Correlation analysis shows significant positive correlations between the variables, which suggest seawater intrusion in the springs.

Three factors were determined in the factor analysis carried out on the water quality dataset produced in this study. The first factor is interpreted as being associated with the effects of seawater intrusion, the second factor appears to be associated with the effects of mixing with freshwater, and the third factor appears to be associated with seasonal variations in freshwater circulation patterns.

Cluster analysis using in situ field measurements and data from chemical analyses identified two main groups and four subgroups. Cluster A (A1 and A2) appears to be associated with samples that have a chemical composition similar to seawater and cluster B (B1 and B2) is associated with samples that are moderately affected by seawater intrusion. The cluster analysis has confirmed the classifications by conventional diagrams and plots given in Figs. 3 and 6. On the other hand, these conventional diagrams failed to show the effects of seasonal changes, whereas cluster analysis clearly indicated that some of the sample points changed their position on the dendrograms.

This study has indicated that multivariate statistical analysis techniques are useful methods for the description of groundwater flow mechanisms in the study area. The study has shown that seawater intrusion is a problem for the drinking water sources because almost all karst springs in the area are influenced by seawater. The salinity of the springs is not directly related to their distance from the sea which indicates that recharge and salinization mechanisms in the area are complicated.