Introduction

Groundwater is the major source of drinking water in most urban and rural areas. It is also an alternative source for irrigation, industrial and commercial purposes (Mishra and Bhatt 2008). Groundwater quality is increasingly deteriorating due to natural processes and human activities. Therefore, groundwater quality monitoring is given higher priority in environmental protection policies (Mahmood et al. 2011). Spatial interpolation methods such as inverse distance weighting, Kriging and co-Kriging have been used to predict the spatial distribution of groundwater characteristics and to interpolate groundwater quality (Mehrjardi et al. 2008). Multivariate techniques can be used effectively to demarcate the regional and geological factors controlling water chemistry (Vasanthavigar et al. 2009). The results of multivariate mapping techniques are useful for studies of hydrogeochemical spatial variability (Arabi et al. 2013). Statistical hydrochemical analysis shows that groundwater contamination is closely associated with land use classes and activities (Matiatos and Evelpidou 2013). Multivariate statistical techniques are useful for interpreting complex data and for identifying pollution sources and water quality variations within a lake (Kazi et al. 2008). Factor analysis can be used to study anthropogenic effects and the intrusion of salt water into groundwater (SenthilKumar et al. 2008). Principal component analysis, factor analysis and cluster analysis (CA) can be used to evaluate temporal and spatial variations and to interpret large water quality data sets (Yang et al. 2009). The possible sources that influence water systems can be identified by these techniques (Reghunath et al. 2002; Lambrakis et al. 2004; Nosrati and Van Den Eeckhaut 2012).

Davraz (2015) studied the groundwater contamination in southwestern Anatolia, Turkey, and concluded that Mn, Al, Fe, As and F contents of groundwater were above critical values. Edet (2009) used factor analysis to determine the source of ions and total hydrocarbon content in the groundwater of Niger delta, Nigeria. Five factors which cause contamination in the groundwater were determined.

In Coimbatore district, surface water sources (the river and reservoirs) and groundwater contribute to drinking water. The groundwater along the Noyyal River is not suitable for drinking and irrigation purposes (Santhosh and Revathi 2014). In most places along the Noyyal River in Coimbatore district, the groundwater is not suitable for irrigation and industrial purposes because of high electrical conductivity and total hardness (Sundar and Saseetharan 2008). The Noyyal River of the Cauvery basin, which feeds several tanks, is polluted by domestic and industrial sewage (Priya et al. 2011). The groundwater was unfit for drinking in some areas within the Singanallur sub-basin of Coimbatore district, scoring a water quality index greater than 100 (Priya et al. 2012). Seasonal variations in the groundwater quality of Coimbatore city were analyzed, and the changes in quality based on the water quality index were represented using ArcGIS (Jebastina and Prince Arulraj 2015). Evaporation, silicate weathering, ion exchange and dissolution of carbonate minerals are the major geochemical processes in the district (Sajil Kumar and James 2016).

The objective of the present study is to determine the contaminants in groundwater and the source of contaminants using statistical methods.

Study area

Coimbatore district is situated in the western part of Tamil Nadu, India, which lies between 10°10′N and 11°30′N latitude and 76°40′E and 77°30′E longitude. The total area of the district is 4850 km2. The average annual rainfall ranges between 550 and 900 mm. The main rivers in the district are Bhavani, Noyyal, Amaravathi and Ponnani. The temperature of the district ranges from 19.8 to 35.9 °C. The geological formations of the district are Archean crystallines, alluvial and colluvial formations (CGWB 2008). The district is divided into eight taluks, namely Annur, Coimbatore North, Coimbatore South, Kinathukadavu, Mettupalayam, Pollachi, Sulur and Valparai. The study area with the sampling points is shown in Fig. 1.

Fig. 1 Study area with sampling sites

Sample collection and analysis

Twenty-seven borewell samples were collected from the study area during the pre-monsoon periods of 2012 and 2013. The collected samples were analyzed for pH using a pH meter, EC using a conductivity meter and TDS by the gravimetric method. The samples were analyzed for major ions by adopting standard methods (APHA 1998). Chloride (Cl−) was determined by standard silver nitrate titration. Sodium (Na+) and potassium (K+) were determined by flame photometry, whereas calcium (Ca2+) and magnesium (Mg2+) were determined by titration with standard EDTA. Nitrate (NO3−) and fluoride (F−) were determined by ion-selective electrode. Sulfate (SO42−) was determined by spectrophotometer and bicarbonate (HCO3−) by titration.

Data analysis

Statistical computations were done using Microsoft Excel 2003 and SPSS 20 (Statistical Package for the Social Sciences). Deterministic interpolation techniques create surfaces from measured points, based either on the extent of similarity (inverse distance weighted) or on the degree of smoothing. The inverse distance weighted method was used to create the spatial distribution maps of electrical conductivity. The spatial distribution of factor scores was mapped using ArcGIS 10.1.
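The inverse distance weighted idea can be illustrated with a minimal sketch. It is not the ArcGIS implementation used in this study: the function name `idw_interpolate`, the power of 2, the site coordinates and the EC readings are all assumptions chosen for illustration.

```python
import math

def idw_interpolate(points, values, target, power=2.0):
    """Estimate a value at `target` as a distance-weighted mean of measured points."""
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in zip(points, values):
        distance = math.hypot(target[0] - x, target[1] - y)
        if distance == 0.0:
            return value  # target coincides with a sampling site
        weight = 1.0 / distance ** power  # nearer sites receive larger weights
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical sampling sites (arbitrary map units) and EC readings (µS/cm)
sites = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
ec = [500.0, 1500.0, 900.0, 3000.0]

print(idw_interpolate(sites, ec, (5.0, 5.0)))    # equidistant from all four sites
print(idw_interpolate(sites, ec, (10.0, 10.0)))  # lies exactly on a sampling site
```

A point equidistant from all sites receives the plain mean of the readings, whereas a point close to one site is dominated by that site's value. Larger powers make the surface more local.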

Results

The descriptive statistics of groundwater quality based on the twenty-seven water samples during 2012 and 2013 are summarized in Table 1. It gives the minimum, maximum, mean and standard deviation of the results for each of the twelve parameters.

Table 1 Groundwater chemistry statistics
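As a small illustration of how the four summary statistics reported for each parameter can be obtained, the sketch below uses the Python standard library. The readings are made up and are not the study data; the helper name `describe` and the use of the sample (n − 1) standard deviation are assumptions, since the text does not state which form was used.

```python
import statistics

def describe(values):
    """Return the minimum, maximum, mean and standard deviation of one parameter."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample SD (n - 1); an assumption
    }

# Hypothetical readings for a single parameter (not the study data)
readings = [120.0, 850.0, 1430.0, 2600.0, 5200.0]
summary = describe(readings)
print(summary)
```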

The box plot is a statistical tool used here to represent the parameters influencing water chemistry in the study area. The variations of the major groundwater ion concentrations are shown in box and whisker plots (Fig. 2a, b). Mg2+, Na+, Ca2+, K+, Cl−, HCO3−, SO42−, NO3− and F− ions influence the groundwater chemistry in the year 2012. Na+, Ca2+, Mg2+, K+, Cl−, HCO3−, SO42−, NO3− and F− ions influence it in the year 2013.

Fig. 2 a Box plot of year 2012. b Box plot of year 2013

Electrical conductivity is a good indicator of groundwater quality. Electrical conductivity depends on the basic rock type, the soil and the amount of rainfall received (Balachandar et al. 2010). The spatial distribution of electrical conductivity during the years 2012 and 2013 is shown in Fig. 3a, b.

Fig. 3 a EC during 2012. b EC during 2013

Statistical analysis including correlation analysis, principal component analysis (PCA) and CA was performed on the groundwater quality data to ascertain the relationship between variables, to determine the source of ions in groundwater and to group the sampling sites. A correlation analysis was performed to describe the degree of relation between two hydrochemical parameters. The interrelationships between the variables are determined using the correlation matrix. The correlation matrices for the years 2012 and 2013 are represented in Tables 2 and 3, respectively.

Table 2 Correlation matrix for the year 2012
Table 3 Correlation matrix for the year 2013

The number of principal components considered to understand the underlying data structure is given by the scree plot (Liu et al. 2003). From the scree plots (Fig. 4a, b), a drop in slope is noted after three components in the year 2012 and two components in the year 2013. Two principal components were therefore retained for each year in the present study, which together explain about 77% of the total variance in both years.

Fig. 4 a Scree plot for 2012. b Scree plot for 2013

The variability among observed variables is described by factor analysis in terms of factors. A factor is said to be significant if its eigenvalue is greater than unity. The factor analysis score is given in Table 4.

Table 4 Multivariate factor score for the years 2012 and 2013
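The scree-plot and eigenvalue criteria described above can be sketched in a few lines of NumPy. This is an illustration on synthetic data, not the SPSS procedure used in the study: the data matrix, the helper name `pca_eigenvalues` and the random seed are assumptions.

```python
import numpy as np

def pca_eigenvalues(data):
    """Eigenvalues of the correlation matrix (descending) and % variance explained."""
    corr = np.corrcoef(data, rowvar=False)        # variables are in columns
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # eigvalsh returns ascending order
    explained = 100.0 * eigenvalues / eigenvalues.sum()
    return eigenvalues, explained

rng = np.random.default_rng(0)
# Hypothetical data: 27 samples x 5 parameters; the first three share one latent process
latent = rng.normal(size=(27, 1))
data = np.hstack([latent + 0.3 * rng.normal(size=(27, 3)), rng.normal(size=(27, 2))])

eigenvalues, explained = pca_eigenvalues(data)
retained = int((eigenvalues > 1.0).sum())  # components with eigenvalue greater than unity
print(eigenvalues.round(2), explained.round(1), retained)
```

Plotting `eigenvalues` against the component number gives the scree plot. Because the eigenvalues of a correlation matrix sum to the number of variables, an eigenvalue above unity means the component explains more variance than any single standardized variable.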

During the year 2012, factor 1 contributes 61% of the total variance with high loadings of Ca2+, Mg2+, Cl−, SO42−, EC and TDS. Factor 2 contributes 16% of the total variance. During the year 2013, factor 1 contributes 66% of the total variance with high loadings of Na+, Mg2+, Cl−, HCO3−, SO42−, NO3−, K+, F−, EC and TDS. Factor 2 contributes 12% of the total variance with high loadings of pH, which indicates biogenic control of the pH value. To determine the spatial variation of each factor and to identify its zones of influence, factor scores were calculated. Factor scores are computed for each sample by matrix multiplication of the factor score coefficients with the standardized data. The value of each factor score represents the importance of a given factor at the sample site. A factor score >+1 indicates intense influence by the process. Strongly negative values (<−1) reflect areas virtually unaffected by the process, while near-zero scores reflect areas with only a moderate effect of the process (Prasanna et al. 2010). The regression method was used to determine the factor scores in the study area. The spatial variation of the factor scores was mapped using ArcGIS 10.1. The factor score values of each sampling point are given in Figs. 5a, b and 6a, b.
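The matrix multiplication behind the regression method can be sketched as follows. The sketch assumes unrotated principal-component loadings and synthetic data, neither of which is stated in the text, and the function names are hypothetical. The score coefficient matrix is taken as the inverse correlation matrix times the loadings, and the scores are the standardized data times those coefficients.

```python
import numpy as np

def regression_factor_scores(data, n_factors):
    """Factor scores = standardized data @ (R^-1 @ loadings)."""
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)  # standardized data
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1][:n_factors]          # largest eigenvalues first
    # Assumption: unrotated principal-component loadings
    loadings = eigenvectors[:, order] * np.sqrt(eigenvalues[order])
    coefficients = np.linalg.solve(corr, loadings)             # factor score coefficients
    return z @ coefficients

def classify_score(score):
    """Label a factor score with the thresholds quoted from Prasanna et al. (2010)."""
    if score > 1.0:
        return "intense influence"
    if score < -1.0:
        return "virtually unaffected"
    return "moderate influence"  # simplification: everything between -1 and +1

rng = np.random.default_rng(0)
# Hypothetical data: 27 samples x 5 parameters
latent = rng.normal(size=(27, 1))
data = np.hstack([latent + 0.3 * rng.normal(size=(27, 3)), rng.normal(size=(27, 2))])

scores = regression_factor_scores(data, n_factors=2)
labels = [classify_score(s) for s in scores[:, 0]]
print(scores.shape, labels[:5])
```

The resulting scores have zero mean and unit variance, which is why the ±1 thresholds can be read as one standard deviation from the average site. The sign of each factor is arbitrary in this sketch, so the direction must be interpreted from the loadings.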

Fig. 5 a Factor score I during 2012. b Factor score I during 2013

Fig. 6 a Factor score II during 2012. b Factor score II during 2013

CA classifies a set of samples into groups called clusters. Common CA methods are hierarchical clustering, two-way joining (block clustering) and k-means clustering. Hierarchical Q mode CA was performed to group the sampling sites. The data are represented by a horizontal hierarchical tree plot or dendrogram (Fig. 7), which shows three distinct groups in both the years.

Fig. 7 Dendrogram from CA

Discussion

Hydrogeochemical parameters

The values of the hydrogeochemical parameters presented in Table 1 show large fluctuations during both the years. The pH values show that the water is slightly alkaline in both the years. Maximum pH values are recorded in Pollachi and Kinathukadavu taluks, which are dominated by Archean crystalline rock. Electrical conductivity ranges from 120 to 5200 µS/cm in 2012 and from 170 to 5740 µS/cm in 2013. Higher EC values were noted in the northern taluks during both the years. The concentration of Ca2+ ranges from 8 to 140 mg/L in 2012 and from 2 to 152 mg/L in 2013. The average values of Mg2+ are 50.09 and 32.67 mg/L during the years 2012 and 2013, respectively. Higher concentrations of Ca2+ and Mg2+ are due to dissolution of carbonate minerals (Nosrati and Van Den Eeckhaut 2012). High amounts of Na+ were noted in the northeast and southeast regions. This is due to the direction of groundwater flow and the residence time (Sajil Kumar and James 2016). The concentration of K+ averages 20.07 mg/L in the year 2012 and 18.89 mg/L in the year 2013. Higher concentrations are observed in Pollachi, Kinathukadavu and Mettupalayam taluks, which is due to leaching from agricultural areas (Juned and Arjun 2011).

Natural geological processes contribute to the concentration of bicarbonates. The eastern taluks have low concentrations of bicarbonate, as groundwater in the district moves from west to east (Sajil Kumar and James 2016). Mettupalayam taluk, which is dominated by peninsular gneiss, has higher Cl− concentrations during both the years. A higher concentration of SO42− (76.58 mg/L) was observed during the year 2012, which can be due to oxidation of industrial sulfur gases entering the ground during precipitation (Saxena 2004). A higher concentration of NO3− was observed during the year 2013, which is due to leaching from agricultural areas (Nolan et al. 1998). Disposal of human and domestic wastes into the environment also contributes to the NO3− concentration in groundwater (Edet 2009). The increase of F− content in groundwater originates from the solution of apatite and, more commonly, from the solution of fluoride-bearing micas (Davraz 2015).

The box and whisker plot in Fig. 2a shows that Mg2+, Na+, Ca2+, K+, Cl−, HCO3−, SO42−, NO3− and F− ions influence the groundwater chemistry in the year 2012. The source of magnesium may be the dissolution of gneiss. The origin of Cl− is usually the dissolution of halite in groundwater (Mullaney et al. 2009). Figure 2b shows that Na+, Ca2+, Mg2+, K+, Cl−, HCO3−, SO42−, NO3− and F− ions influence the groundwater chemistry in the year 2013. Gneiss is the major geological formation in the study area and contributes to the higher concentrations of Na+, Ca2+, K+ and HCO3− (Sajil Kumar and James 2016). High concentrations of fluoride in groundwater are associated with rock types, especially granite and gneiss (Kim et al. 2011).

The spatial distribution of electrical conductivity in Fig. 3a, b shows that higher EC values (>5000 µS/cm) are observed in the northern taluks, indicating the polluted nature of the groundwater due to leaching of ions (Thilagavathi et al. 2012). Industrial disposal and domestic sewage also contribute to this (Uwidia and Ademoroti 2011). The southern taluks have lower EC values, indicating good groundwater quality.

Statistical analysis

Computation of the correlation matrix is the first step in multivariate statistical analysis. The correlation matrices for the years 2012 and 2013 are shown in Tables 2 and 3, respectively. The correlation coefficient gives the extent of the statistical relationship between the parameters. A correlation coefficient >0.7 represents excellent correlation, and a correlation coefficient <0.3 represents poor correlation. In the study area, during both the years there exists a good correlation between EC and TDS due to the linear relationship between dissolved ions and conductivity. In 2012, there is an excellent correlation between Mg2+ and Cl−. Good correlation exists between Ca2+–Mg2+, Na+–K+, Na+–HCO3−, K+–HCO3− and Ca2+–Cl−. Poor correlation exists between F− and other ions. Correlation of Cl− with Mg2+ and Ca2+ indicates dissolution and leaching of salts (Prasanna et al. 2010). Correlation of HCO3− with Na+ and K+ indicates chemical weathering (Prasanna et al. 2010). Significant correlation of SO42− with other ions and of NO3− with K+ indicates the influence of human activities on the system (Srinivasamoorthy et al. 2012). In 2013, there exists an excellent correlation between Na+ and Cl−, indicating silicate weathering and anthropogenic sources (Srinivasamoorthy et al. 2012). Good correlation exists between Mg2+–Na+, HCO3−–Mg2+ and HCO3−–Na+. There exists a good correlation of Cl− with Ca2+, Mg2+ and Na+, indicating dissolution of salts (Prasanna et al. 2010). There exists a good correlation of SO42− with other ions, indicating anthropogenic impact (Odat 2015). Poor correlation exists between F− and other ions.
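The thresholds quoted above can be applied mechanically once the Pearson coefficient has been computed. The sketch below is illustrative only: the concentration values are invented, the label "intermediate" for coefficients between 0.3 and 0.7 is an assumption (the text does not name that range), and taking the absolute value so that strong negative correlations are also flagged is likewise an assumption.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_strength(r):
    """Apply the >0.7 (excellent) and <0.3 (poor) thresholds to |r|."""
    magnitude = abs(r)  # assumption: the sign is ignored when grading strength
    if magnitude > 0.7:
        return "excellent"
    if magnitude < 0.3:
        return "poor"
    return "intermediate"  # assumed label for the unnamed middle range

# Hypothetical values: TDS proportional to EC, fluoride unrelated to EC
ec = [500.0, 1200.0, 2500.0, 4000.0]
tds = [0.64 * value for value in ec]
fluoride = [1.0, 2.0, 2.0, 1.0]

print(correlation_strength(pearson_r(ec, tds)))
print(correlation_strength(pearson_r(ec, fluoride)))
```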

PCA is used to identify patterns by compressing the data without much loss of information (Smith 2002). PCA converts the original variables into new uncorrelated variables called principal components, which are linear combinations of the original variables (Sarbu and Pop 2005). From the scree plots (Fig. 4a, b), two principal components are retained for each year in the present study. R mode factor analysis gives useful information concerning relationships between elements and is more widely used for the interpretation of geochemical data (Odokuma-Alonge and Adekoya 2013). The factor analysis score is given in Table 4. During the year 2012, factor 1 contributes 61% of the total variance with high loadings of Ca2+, Mg2+, Cl−, SO42−, EC and TDS. The influence of Ca2+ and Mg2+ may be due to carbonate dissolution (Prasanna et al. 2010). The influence of Cl− may be from secondary salt precipitation due to agricultural return flow (Kumar et al. 2009). The influence of SO42− is attributed to the dissolution of gypsum during weathering. Factor 2 contributes 16% of the total variance. During the year 2013, factor 1 contributes 66% of the total variance with high loadings of Na+, Mg2+, Cl−, HCO3−, SO42−, NO3−, K+, F−, EC and TDS. High loading of Na+ indicates natural weathering and ion exchange processes (Drever 1997). The influence of Cl− indicates leaching of industrial effluents (Kumar and Alappat 2005); Mg2+ and HCO3− are from silicate weathering of rocks; SO42− is attributed to the dissolution of gypsum during weathering; K+ and NO3− might be from leaching of fertilizers from agricultural land (Mahlknecht et al. 2004); and F− indicates leaching from fluoride-rich minerals. Factor 2 contributes 12% of the total variance with high loadings of pH, which indicates biogenic control of the pH value. The factor score values are shown in Figs. 5a, b and 6a, b. Factor I and factor II show that the northern taluks and the eastern side have higher factor scores than the western side.

Figure 7 shows the dendrogram of hierarchical Q mode CA. “Distance” in CA is used to represent how close each pair of objects is. Euclidean distance is the most common distance measurement in CA. The Euclidean distance d_ik between any two objects i and k is given by Eq. 1, where X_ij and X_kj are the values of variable j for objects i and k, and N is the number of variables.

$$d_{ik} = \sqrt{\sum_{j=1}^{N} \left( X_{ij} - X_{kj} \right)^{2}}$$
(1)

In CA, it is desirable that the distances between objects within a cluster are small and the distances between different clusters are large (Yang 2004). The distance between clusters is determined by a linkage method, which defines the relationship between clusters. There are several linkage methods, such as complete linkage, single linkage, average linkage and Ward’s linkage. Ward (1963) proposed a clustering procedure seeking to form the partitions P_n, P_{n−1}, …, P_1 in a manner that minimizes the loss associated with each grouping and to quantify that loss in a form that is readily interpretable. From Fig. 7, group I sites (1, 3, 7, 8, 9, 11, 13, 14, 15, 16, 18, 20, 26) are from the western side of the study area, which is less polluted. Group II sites (2, 4, 5, 10, 12, 17, 19, 21, 22, 23, 24, 25, 27) are from the eastern side of the study area, which has higher ion concentrations due to the flow of groundwater toward the east (Sajil Kumar and James 2016). Group III consists of a single site (6) with high EC (>5000 µS/cm).
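The grouping logic can be illustrated with a small pure-Python sketch of Euclidean distance and Ward-style merging. It is a naive illustration rather than the software routine used for Fig. 7: the sample coordinates are invented, the function names are hypothetical, and the "loss" is taken as the increase in within-cluster sum of squares, which is the usual reading of Ward's criterion.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two samples over all variables (Eq. 1)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(rows):
    """Mean of each variable over the samples in one cluster."""
    return [sum(column) / len(rows) for column in zip(*rows)]

def ward_clusters(samples, n_clusters):
    """Merge clusters pairwise, always choosing the merge with the smallest
    increase in within-cluster sum of squares, until n_clusters remain."""
    clusters = [[i] for i in range(len(samples))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ca = centroid([samples[i] for i in clusters[a]])
                cb = centroid([samples[i] for i in clusters[b]])
                na, nb = len(clusters[a]), len(clusters[b])
                cost = na * nb / (na + nb) * euclidean(ca, cb) ** 2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(cluster) for cluster in clusters]

# Hypothetical standardized samples (two variables each); index 5 is an outlier
samples = [(0.1, 0.2), (0.2, 0.1), (0.0, 0.0), (2.0, 2.1), (2.2, 1.9), (6.0, 6.0)]
groups = ward_clusters(samples, n_clusters=3)
print(groups)
```

In this toy data set the outlying sample ends up in a cluster of its own, which mirrors how a single site with very high EC can form a separate group in the dendrogram.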

Conclusions

Carbonate dissolution, silicate weathering, ion exchange, leaching of industrial effluents and leaching of fertilizers from agricultural lands alter the groundwater chemistry. Mg2+, Na+, Cl− and HCO3− ions dominate the quality of groundwater during the year 2012, and Na+, Ca2+, Cl− and HCO3− ions dominate the quality of groundwater during the year 2013. Ion exchange and dissolution of gypsum have taken place. The spatial distribution of EC shows that the northern taluks of the district have higher EC values. EC and TDS are highly correlated. Two factors were derived from factor analysis with 66% of the total variance, which shows carbonate and gypsum dissolution, anthropogenic activities and industrial discharge in the study area. Three clusters were formed by Q mode CA, from which the sampling sites on the eastern side of the study area have higher ion concentrations due to the groundwater flow direction and residence time.