Introduction

Water is a prime natural resource and plays a remarkable role in shaping the land and regulating the climate and life on Earth. Water present beneath the Earth’s surface in the saturated zone is called groundwater, and the upper surface of the saturated zone is known as the water table. Groundwater is retained in geological formations called aquifers. An aquifer can be defined as “a formation that contains saturated permeable material sufficient enough to yield significant quantities of water to springs and wells”. Groundwater quality analysis determines the potability of water used for public water supply, industrial applications, irrigation, etc. (Pranavam et al. 2011). Being a renewable resource, groundwater has high value as a source of freshwater. About 97.2% of the Earth’s water is salty and present in the oceans; only 2.8% is available as freshwater fit for human consumption, of which 20% is groundwater (Kumar and Kumar 2013). The quality of water is of great concern because it directly affects human interests. Groundwater is generally cleaner and less polluted than surface water (Akinbile and Yusoff 2011). With an extraction rate of 600–700 km³/year, groundwater has become the world’s most extracted resource. Nearly 70% of India’s agricultural food production depends on groundwater for irrigation (Oladipo et al. 2011). Moreover, although 85% of the urban population in India has access to drinking water, only 20% of the available drinking water meets the health and quality standards set by WHO (Singh 2000).

The mineral composition of the aquifer and the interaction of groundwater with it mainly control the chemistry of groundwater (Nabila et al. 2014). Variation in the chemical composition of groundwater is governed by hydrochemical processes such as dissolution, ion exchange, precipitation, adsorption and desorption, together with the residence time along the flow path (Apodaca et al. 2002). The expansion of agricultural land and the declining use of surface water accelerate the overexploitation of groundwater.

Population growth, industrialization, urbanization and intensive agriculture worldwide result in water deficits in arid and semiarid regions. Nearly 92% of the groundwater extracted is used for agriculture, with 5% and 3% going to the industrial and domestic sectors, respectively (Khurana and Sen 2008). Prolonged release of industrial effluents and sewage, the use of pesticides and fertilizers, and waste dumping lead to groundwater pollution and subsequently to health problems (Ramakrishnaiah et al. 2009). Groundwater quality deterioration is more acute in densely populated areas where industries are concentrated. Recent studies in Haryana (India) conclude that the decline in groundwater quality results from an exploitation rate that exceeds the recharge rate, improper disposal of liquid and solid wastes, lack of awareness, weak enforcement of the law and poor management. Improper waste disposal, overexploitation of resources and the rapid expansion of urban areas all degrade groundwater quality. Contamination of groundwater from anthropogenic sources has now become a serious problem.

Groundwater contamination and its management directly affect human populations. The water quality index (WQI) therefore has a vital role to play in understanding overall water quality. The WQI condenses multiple water quality parameters into a simple index, expressing water quality as a single numerical value. Several water quality classification models are based on such indices, and different ways of integrating and interpreting the parameter values give rise to different WQIs.

Focusing on the above aspects of groundwater contamination, this study aimed to investigate the groundwater quality of Beri block and nearby villages of Jhajjar district in Haryana (India). The water quality index was calculated from the physico-chemical analysis of the groundwater samples.

Materials and methods

Study area

Beri is a town, municipal committee and block in the Jhajjar district of Haryana, India; it has been part of Jhajjar district since 15 July 1997. It is one of the constituencies of the Haryana Vidhan Sabha. Seventy-seven villages come under the Beri tehsil, making it one of the biggest tehsils of Haryana. Beri is situated on State Highway No. 122 (Gurgaon-Jhajjar-Bhiwani-Hisar) and is the administrative headquarters of Beri sub-division (a tehsil before August 15, 2009). Its coordinates are 28.7° N, 76.5833° E.

Sample collection

Water samples were collected from Beri block and six nearby villages in December 2015. Samples from 24 different sources were collected randomly in 2 L polyethylene bottles fitted with stoppers. Table 1 lists the villages and the sources of groundwater. The bottles were filled completely, leaving no air space, and then sealed to prevent leakage. Each bottle was labelled with the name and point of sample collection. The samples were preserved and transported following standard methods (APHA 2005). Figure 1 shows the sampling sites in Beri block.

Table 1 Location and
Fig. 1 Map revealing the sampling sites of the studied area

Physico-chemical characteristics of groundwater samples

The groundwater samples were analysed for physico-chemical parameters including pH, total dissolved solids (TDS), electrical conductivity (EC), sulphate, phosphate, sodium, potassium, total alkalinity (TA), total hardness (TH), calcium, magnesium and chloride. The heavy metals lead, zinc, iron and chromium were also analysed. All analyses were carried out in accordance with standard analytical methods (APHA 2005).

Multivariate statistical techniques

Multivariate statistical techniques are useful for uncovering complex relationships within and between data sets. Principal component analysis (PCA) is “a multivariate method of data reduction which aims at finding the components that explains the major variation within the given data” (Danielsson et al. 1999). Each component is a weighted linear combination of the original variables.

All experiments were carried out in triplicate, and the mean and standard deviation (SD) were calculated. Results are expressed as mean ± SD. Pearson correlation was used to evaluate the relationships between all pairs of parameters.
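The summary statistics described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the sample (n − 1) standard deviation, and the function names and all numerical readings are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(replicates):
    """Return (mean, sample SD) for one parameter measured in replicate."""
    return mean(replicates), stdev(replicates)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical triplicate TDS readings (mg/l) for one sample, not study data.
tds_mean, tds_sd = summarize([510.0, 520.0, 530.0])
print(f"TDS = {tds_mean:.1f} ± {tds_sd:.1f} mg/l")

# Hypothetical per-sample means of two parameters across four samples.
ec = [0.5, 1.0, 2.0, 4.0]
tds = [300.0, 610.0, 1190.0, 2420.0]
print(f"r(EC, TDS) = {pearson_r(ec, tds):.3f}")
```

Applying `pearson_r` to every pair of parameter columns would fill a correlation matrix of the kind reported in Table 3.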

Multivariate statistical techniques help to simplify and organize large data sets to give meaningful results (Laaksoharju et al. 1999). In the present study, the groundwater samples were subjected to two multivariate statistical techniques, principal component analysis (PCA) and cluster analysis (CA), using the software STATIXL 2.0. Tools such as CA, correlation analysis and PCA have been used to identify sources of pollutants in environmental studies (Rahman et al. 2014; Mendiguchía et al. 2004; Han et al. 2006).

Principal component analysis

Principal component analysis (PCA) is widely used to reduce data and to extract a small number of latent factors for analysing relationships among the observed variables (Farnham et al. 2003; Gou et al. 2007). PCA was performed on the groundwater data from all sampling points to extract principal components (PCs) and to evaluate spatial variations and possible sources of pollutants in groundwater. Because the analysis uses the correlation matrix, each variable is normalized to unit variance and therefore contributes equally. As the correlation matrix is symmetric, its eigenvectors are mutually orthogonal. Typically, the data matrix is reduced to four or five principal components that account for the majority of the variance. The first PC explains the largest share of the variance, and each subsequent PC explains progressively less.
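The correlation-matrix form of PCA described above can be written compactly. The notation is introduced here for illustration and is not taken from the study: $x_{ij}$ is the value of variable $j$ in sample $i$, $\bar{x}_{j}$ and $s_{j}$ are the mean and standard deviation of variable $j$, $Z$ is the matrix of standardized values $z_{ij}$, $n$ is the number of samples and $p$ the number of variables.

$$z_{ij} = \frac{x_{ij} - \bar{x}_{j}}{s_{j}}, \qquad R = \frac{1}{n - 1} Z^{\mathsf{T}} Z$$
$$R\,v_{k} = \lambda_{k} v_{k}, \qquad v_{k}^{\mathsf{T}} v_{l} = 0 \;\; (k \ne l), \qquad \lambda_{1} \ge \lambda_{2} \ge \dots \ge \lambda_{p} \ge 0$$
$$\text{fraction of variance explained by PC}_{k} = \frac{\lambda_{k}}{\sum_{l=1}^{p} \lambda_{l}} = \frac{\lambda_{k}}{p}$$

The last equality holds because the diagonal of a correlation matrix consists of ones, so its eigenvalues sum to $p$. Under this convention, an eigenvalue greater than one marks a component that explains more variance than any single standardized variable.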

Cluster analysis (CA)

Cluster analysis classifies elements from different sources on the basis of similarities in their chemical properties (Rahman et al. 2014), so that similar objects are grouped in the same class (Danielsson et al. 1999). Hierarchical agglomerative clustering, the most common approach, gives intuitive similarity relationships between any one sample and the entire data set and is represented by a dendrogram (McKenna 2003). The dendrogram provides a visual summary of the clustering process, revealing the groups and their proximity with a reduced dimensionality of the original data. Cluster analysis was applied to the experimental data after standardization by Z-scale transformation (Liu et al. 2003).
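A minimal sketch of hierarchical agglomerative clustering on Z-scaled data is given below. It assumes Ward's minimum-variance merging criterion and the population standard deviation for the Z-scale step. The function names and the six two-parameter samples are hypothetical, and the naive pairwise search is meant for illustration rather than efficiency.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column to zero mean and unit SD (Z-scale).

    Assumes no column is constant (a zero SD would divide by zero).
    """
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def centroid(points):
    """Component-wise mean of a list of points."""
    return [mean(axis) for axis in zip(*points)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    d2 = sum((p - q) ** 2 for p, q in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * d2


def ward_clusters(points, k):
    """Merge the cheapest pair of clusters repeatedly until k clusters remain."""
    clusters = [[i] for i in range(len(points))]  # each cluster holds row indices
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost([points[m] for m in clusters[i]],
                                 [points[m] for m in clusters[j]])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical (TDS, chloride) values in mg/l for six samples, not study data.
data = [[300, 500], [320, 520], [2000, 1500],
        [2100, 1600], [5500, 3600], [5700, 3700]]
groups = ward_clusters(zscore_columns(data), k=3)
print(groups)
```

Recording the merge cost at each step, instead of stopping at `k` clusters, would give the heights needed to draw a dendrogram.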

Water quality index

The water quality index (WQI) expresses water quality as a single value. It is used for the detection and evaluation of water pollution and reflects the composite influence of different quality parameters on the overall quality of water. To compute the WQI, a sub-index (SIᵢ) was first calculated for each parameter, and the sub-indices were then summed as in the following equations (Kangabam et al. 2017):

$${\text{SI}}_{i} = {\text{RW}} \times Q_{i}$$
$${\text{WQI}} = \sum {\text{SI}}_{i}$$
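The two equations can be illustrated with a short sketch. The text does not define RW or Qᵢ, so the sketch assumes the common weighted arithmetic form: RW is the relative weight of a parameter (its assigned weight divided by the sum of all weights) and Qᵢ is the quality rating (measured concentration as a percentage of the standard). The function name, weights, standards and concentrations below are all hypothetical.

```python
def water_quality_index(concentrations, standards, weights):
    """Return WQI = sum(SI_i), with SI_i = RW * Q_i for each parameter.

    ASSUMED definitions (not given in the text):
      RW  = w_i / sum(w)        relative weight of parameter i
      Q_i = 100 * C_i / S_i     concentration C_i relative to standard S_i
    """
    total_weight = sum(weights[name] for name in concentrations)
    wqi = 0.0
    for name, conc in concentrations.items():
        relative_weight = weights[name] / total_weight    # RW (assumed form)
        quality_rating = 100.0 * conc / standards[name]   # Q_i (assumed form)
        wqi += relative_weight * quality_rating           # SI_i = RW * Q_i
    return wqi


# Illustrative standards (mg/l) and weights; hypothetical, not the study's values.
standards = {"TDS": 500.0, "chloride": 250.0, "iron": 0.3}
weights = {"TDS": 4, "chloride": 3, "iron": 3}

# Hypothetical concentrations (mg/l) for a single sample.
sample = {"TDS": 750.0, "chloride": 250.0, "iron": 0.6}
print(f"WQI = {water_quality_index(sample, standards, weights):.1f}")
```

Under these assumptions, a sample in which every parameter sits exactly at its standard scores 100, and higher values indicate poorer water.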

Results and discussion

All the collected groundwater samples were analysed for physico-chemical parameters and heavy metals; the results are summarized in Table 2. Table 3 shows the correlations between all the studied parameters. The results of the present study were compared with the standards of IS 10500:2012 and WHO (2002), as shown in Table 4.

Table 2 Statistics showing physico-chemical parameters and heavy metals
Table 3 Pearson correlation coefficient between physico-chemical parameters along with heavy metals
Table 4 Standards for drinking water prescribed by IS 10500:2012 and WHO

Physico-chemical characterization of groundwater samples

The pH measures the hydrogen ion concentration and hence the acidity or alkalinity of water. In water systems, chemical and biological reactions depend mainly on the pH (Rao 2006). pH has been reported to correlate positively with electrical conductance and total alkalinity (Gupta et al. 2009). pH has no direct adverse health effects, but water with a pH below 4.0 tastes sour and water with a pH above 8.5 tastes alkaline. In this study, pH ranged from 5.84 to 7.97. The pH of sample S19 was below the prescribed limit. pH shows a weak positive correlation with alkalinity.

Hardness itself is not known to have adverse health effects (WHO 2009), although some heart diseases have been reported in association with it. Hardness above 150–300 mg/l may cause kidney problems and kidney stone formation; it also gives water an unpleasant taste and reduces the ability of soap to produce lather (Patil and Patil 2010; Saravanakumar and Kumar 2011). Hard water is therefore unsuitable for domestic use. In the present study, total hardness ranged from 440 to 2120 mg/l. Total hardness had a strong positive correlation with calcium and magnesium (r values above 0.8), as shown in Table 3.

Alkalinity is the quantitative capacity of an aqueous solution to neutralize acids; its most important components are carbonates and bicarbonates. Alkaline water may decrease the solubility of metals. High alkalinity leads to white deposits on fruits and leaves and might also be associated with health hazards (Mondal et al. 2005). In the samples studied, alkalinity ranges from 40 to 1220 mg/l. Of the 24 samples, 12 have alkalinity higher than the limit prescribed by BIS. Alkalinity shows a weak positive correlation with Fe and Pb, as shown in Table 3.

Conductivity is an important criterion in determining sustainability of water (Acharya et al. 2008). Electrical conductivity (EC) is a valuable indicator of the amount of ionic material dissolved in water. It is also affected by temperature: the warmer the water, the higher the conductivity. In the present study, the conductivity of the water samples varies from 0.45 to 9.29 mS. Higher EC values are directly related to a higher amount of dissolved ions, which may lead to various problems. EC shows a very strong positive correlation with Ca²⁺, Mg²⁺, SO₄²⁻, Cl⁻, TH, TDS and K⁺.

High TDS was observed in the groundwater samples, which suggests downward transfer of leachate into the groundwater. A high concentration of TDS decreases the palatability of water, and TDS above 500 mg/l can also cause gastro-intestinal irritation in humans (BIS 2012). In the present study, TDS ranges from 265 to 5740 mg/l. Of the 24 samples, 10 had TDS values higher than the prescribed limits. TDS shows a very strong positive correlation with EC, TH, Ca²⁺ and Cl⁻ and a strong positive correlation with K⁺, SO₄²⁻ and Mg²⁺.

Anion chemistry

Sulphate, chloride and bicarbonate are the main anions present in groundwater (Younger 2007). Excess chloride in inland water is usually taken as an index of pollution; it arises from the presence of salts of sodium, potassium and calcium. The concentration of chloride in the samples varies from 539.6 to 3692 mg/l. All the samples have chloride values higher than the limits prescribed by WHO and BIS, which indicates that the water is not fit for drinking. High chloride concentrations result from domestic waste and its disposal by human activities (Jha and Verma 2000). Chloride ions are harmful to plants and are a good indicator of pollution (Chapolikar et al. 2009). High levels of chloride are often associated with kidney and heart illnesses (WHO 1997). Chloride shows a moderately positive correlation with chromium.

Sulphate compounds find their way into groundwater through the oxidation of sulphide ores, the presence of shales or industrial wastes. When sulphate is present together with magnesium and sodium and exceeds 400 mg/l, it may cause irritation of the gastro-intestinal tract (Shankar et al. 2008). In the present study, sulphate concentration ranges between 0.29 and 3.81 mg/l, so all the samples were within the permissible limits prescribed by BIS. Sulphate shows a moderate positive correlation with Cl⁻.

In natural water, phosphorus may occur in the form of phosphates (polyphosphates, orthophosphates and organically bound phosphates). Phosphorus is also present in animal and human waste and may be added by soil erosion, fertilizers and industrial wastes. Phosphate can also enter water from unprotected septic tanks, effluent containing excreta and detergents. The addition of phosphorus causes extensive growth of algae; the resulting algal blooms lead to cultural eutrophication and oxygen depletion. In the present study, the concentration of phosphate varies from 0.01 to 0.30 mg/l, and all the samples had phosphate content well within the limit. Phosphate shows a weak positive correlation with Zn and Cr.

Cation chemistry

Sodium, magnesium, potassium and calcium are the chief cations present in most groundwater (Younger 2007). In the present study, sodium concentration varies from 15 to 155 mg/l and potassium from 1 to 58 mg/l. Calcium ranges from 40 to 480 mg/l and magnesium from 356 to 1720 mg/l. Of the 24 samples, 17 had magnesium concentrations higher than the BIS limit. Sodium shows a moderate positive correlation with Cl⁻, and potassium shows a moderate positive correlation with SO₄²⁻ and Cl⁻. Total hardness shows a significant positive correlation with Mg²⁺, Ca²⁺, SO₄²⁻, Na⁺, Cl⁻, Fe, K⁺, Cr and Zn. Calcium shows a strong positive correlation with Cl⁻, Mg²⁺ and SO₄²⁻, and magnesium shows a strong positive correlation with K⁺, SO₄²⁻ and Cl⁻.

Metal ions

Generally, groundwater becomes contaminated with metal ions through rock and mineral weathering, waste-bearing effluents and sewage discharged in runoff and into inland water. Some heavy metals, such as Cu and Co, are essential for human beings, whereas others (Pb, Cr, Cd, etc.) are highly toxic even in trace amounts. Chromium is mainly used in paint pigments, metal alloys, paper, rubber and cement production. Even at low levels, chromium may cause ulceration and skin irritation, and long-term exposure may damage the liver and kidneys. The prescribed limit for chromium (total Cr) as per BIS is 0.05 mg/l. In the present study, 6 of the 24 samples contained negligible amounts of chromium, and the remaining samples had chromium concentrations higher than the prescribed limit. Chromium shows a moderate positive correlation with Zn and Cl⁻.

Zinc concentration varies from 0.11 to 2.41 mg/l. Zinc toxicity in humans may cause symptoms such as dehydration, vomiting, abdominal pain and electrolyte imbalance (Dohare et al. 2014). The concentration of zinc in all the samples was below the limit prescribed by BIS.

Lead-processing industries and lead-containing effluents are the main sources of lead in water. Lead pollution disrupts aquatic life and reduces water quality and wildlife diversity. Industrial runoff sometimes contains lead, which can find its way into the food chain. Consumption of lead may cause illness and death in humans as well as in fish and other organisms. In the present study, the concentration of lead ranges from 0.01 to 0.79 mg/l.

Iron occurs naturally in soil, sediments, rivers, lakes and groundwater. In water, iron is found in two forms: insoluble ferric iron and soluble ferrous iron. Natural deposits, the refining of iron ores, industrial effluents and the corrosion of iron-containing metals release iron into water. Excess consumption of iron may damage the kidneys, liver and blood vessels, cause bloody stools and vomiting, and can even cause death. The permissible limit of iron is 0.3 mg/l as per BIS. In the present study, iron concentration varies from 0.57 to 1.67 mg/l, so all the samples exceeded the prescribed limit. Iron shows a weak positive correlation with lead and zinc.

Water quality index

In the present study, the water quality index was calculated from all the analysed parameters and is summarized in Table 5. Table 6 presents the water quality of the samples according to their water quality index.

Table 5 Water quality index of the groundwater samples
Table 6 Water quality on the basis of WQI

The computed water quality index can be classified as.

Multivariate statistical analysis

Principal component analysis

Principal component analysis (PCA) is a multivariate statistical technique that forms a small number of uncorrelated variables from a large set of data. It aims to explain the maximum amount of variance with the fewest principal components (PCs) and is commonly used as one step in a series of analyses. Table 7 shows the component loadings together with the percentage of variance explained by each component and the cumulative percentage. Of the 16 PCs, those with eigenvalues greater than one were retained. These are the first five components, which account for 80% of the total variance: PC1, PC2, PC3, PC4 and PC5 explain 41.559%, 13.447%, 9.477%, 8.399% and 7.505%, respectively. The loadings marked in bold show the strongest correlations between the variables and the corresponding component.

The PCA solution can be ‘cleaned up’ by means of a varimax rotation (Knudson et al. 1977) of the retained components, which yields varivalues and varifactors (VFs) in which the original variables participate more clearly. When this transformation was carried out and the significant VFs were extracted using the same criterion as for the PCs, the variance (i.e. information) was spread among 8 VFs, of which only the first two have contributions from more than one of the original variables. Table 8 shows the first 5 VFs, which span 65.023% of the variance, as opposed to the 80% explained by the same number of PCs. VF1, with 33.6% of the variance, has TDS, EC, total hardness, calcium, magnesium, potassium, sulphate and chloride as its main contributors. VF2, with 9.958% of the variance, has total alkalinity, iron and lead as its main contributors. In VF2, the contributions of pH and chromium were cleaned up by the varimax rotation.
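As a quick arithmetic check of the reported figures, the five retained components sum to the stated cumulative variance. The second line assumes, as described in the methods, that the PCA was run on the correlation matrix of the 16 variables, so that the eigenvalues sum to 16; the symbol $\lambda_{k}$ for the eigenvalue of PC$k$ is introduced here for illustration.

$$41.559 + 13.447 + 9.477 + 8.399 + 7.505 = 80.387\,\%$$
$$\lambda_{k} > 1 \iff \frac{\lambda_{k}}{16} \times 100\,\% > 6.25\,\%, \qquad \lambda_{1} \approx 0.41559 \times 16 \approx 6.65, \qquad \lambda_{5} \approx 0.07505 \times 16 \approx 1.20$$

Under that assumption, all five reported percentages exceed the 6.25% threshold, which is consistent with retaining exactly the components whose eigenvalues are greater than one.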

Table 7 Loadings of physico-chemical variables on significant principal components for groundwater samples
Table 8 Loadings of physico-chemical variables on components rotated according to the varimax method for groundwater samples

Cluster analysis

The result of the hierarchical cluster analysis of the sampling sites using Ward’s method is given as a dendrogram in Fig. 2. Some samples from the same village clustered together. The sampling sites fall into three clusters depending on groundwater quality. Cluster 1 consists of 12 sampling locations: S1, S2, S4, S6, S7, S12, S13, S15, S16, S19, S21 and S23. Cluster 2 has 8 sampling locations: S3, S8, S9, S17, S18, S20, S22 and S24. Cluster 3 has 4 sampling locations: S5, S10, S11 and S14. A separate dendrogram for the cluster analysis of the parameters is shown in Fig. 3. pH, EC, Ca²⁺, Na⁺, K⁺, SO₄²⁻, PO₄³⁻, Fe, Cr, Pb and Zn form cluster 1, which might be explained by the leaching of ions into the aquifer. Cluster 2 consists of TDS and chloride, while cluster 3 consists of TA, TH and Mg²⁺. The clustering of samples may reflect aquifer similarity: some samples may come from the same aquifer and others from different aquifers. Groundwater flow also affects the clustering of samples. Because fertilizers and leaching also pollute groundwater, similar agricultural practices may likewise affect the clustering of samples.

Fig. 2 Dendrogram showing the hierarchical cluster analysis of sampling sites using Ward’s method

Fig. 3 Dendrogram showing the hierarchical cluster analysis of analysed parameters using Ward’s method

Conclusion

The present study reveals that the current groundwater quality of Beri block and its nearby villages is poor. Leaching and capillary action of water may be contributing factors. A few parameters, viz. hardness, alkalinity, chloride, sodium and TDS, were higher than the limits prescribed by BIS. Total hardness shows a significant positive correlation with Ca, Mg, Cl, SO₄²⁻, Na, K, Fe, Cr and Zn. TDS shows a significant positive correlation with EC, TH, Ca, Mg, SO₄²⁻ and Cl. EC shows a significant positive correlation with Ca, Mg, SO₄²⁻ and Cl. The WQI places all the water samples in the poor, very poor or unsuitable categories: one sample was of poor quality, eight were of very poor quality and the remaining fifteen were of unsuitable quality.

PCA reduced the 16 variables to 5 significant PCs that explain 80% of the information in the data set. PC1, which accounts for 41.56% of the variance, shows high positive loadings for TDS, EC, TH, Ca²⁺, Mg²⁺, Cl⁻ and SO₄²⁻. The five varimax factors together account for 65% of the total variance. The high concentrations of these parameters may be due to minerals that change under redox conditions, anthropogenic causes and the natural effect of percolation or infiltration during the monsoon season. The results obtained from cluster analysis are in agreement with the PCA outcome. This study provides information about the quality of drinking water in the studied area, which will help in improving public health. It also illustrates the usefulness of multivariate statistical techniques for analysing and interpreting complex data sets, assessing water quality and understanding complex interactions among variables.