1 Introduction

Groundwater plays a major role in water supply systems for drinking, irrigation and industrial purposes. Groundwater exploitation in India surpasses the combined use of the US and China, placing India first in the world [10]. According to the CGWB report, about 245 × 10⁹ m³ of groundwater is utilized to meet the demand of India’s agricultural sector. About 65% of the world’s groundwater is used for drinking purposes, about 20% for agriculture and 15% for industry, placing immense stress on this critical resource as demand rises [12]. Factors such as land use/land cover changes, infiltration from polluted crops, geological formation, changes in rainfall and reduced infiltration of precipitation influence the quantity and quality of groundwater and can result in its contamination [14]. Owing to the unsafe and improper disposal of sewage and industrial waste, groundwater pollution has also become a major environmental issue in recent times [7]. Pollution may affect the biological, physical and chemical characteristics of groundwater. The water quality index (WQI) for assessing drinking water quality was developed in the 1970s by the Oregon Department of Environmental Quality to evaluate and summarize water quality conditions [1, 3, 8, 16]. Monitoring water quality plays a crucial role in protecting the environment, human health and the integrity of marine ecosystems. In this context, the WQI methodology is an important tool that offers policymakers and interested individuals in the study area knowledge about water quality [14]. GIS is a commonly used tool for handling multivariate values, processing them and producing spatial maps, which helps in drawing conclusions about the environmental and geological scenario [6].
Several studies on groundwater quality and on pollution arising from geogenic and anthropogenic processes have been published in the literature [1, 3, 4, 5, 9, 11, 13, 16]. The present research applies a GIS-based groundwater quality index to determine the suitability of groundwater for human use and to communicate the findings to policy makers and local bodies/people for successful management. The aim of this paper is to understand the hydro-geochemical cycle of groundwater using physico-chemical parameters and to determine the groundwater quality of Noida for drinking and other human uses using the WQI and GIS techniques. The outcome of these investigations provides baseline data on the status of groundwater quality in the chosen study area, which benefits groundwater resources management, and indicates the limits within which the pH should lie in order to understand its effect on water quality, i.e. on the WQI. Machine learning techniques such as multivariable linear regression, support vector regression and decision tree regression were used to predict the pH of the water body based on its geographical location.

2 Study Area

Noida is located in the Gautam Buddh Nagar district of Uttar Pradesh state, near Delhi in the NCR. The geographical location of Noida is between 28° 26′ 39″ N–28° 38′ 10″ N and 77° 29′ 53″ E–77° 17′ 29″ E. It spans a geographical area of 203 km² and had a population of 642,381 according to the 2011 census. The city’s climate is sub-humid, marked by a hot summer and a bracing cold season. For most of the year Noida has a hot and humid climate. Annual rainfall averages 642.0 mm. The region typically has flat topography with a gradual slope (< 1°) from north-east to south-west. The maximum elevation in the study area is 204 m above mean sea level, near Parthala Khanjarpur village in the north-east, and the minimum elevation is 195 m above mean sea level, near Garhi village in the south-west. Noida city draws water from the Ganga Canal interception at Masoori Dasna, situated in Ghaziabad, and from naturally occurring groundwater. Noida falls under the Yamuna River catchment area; it is bounded by the Yamuna River to the west and south-west, and by the Hindon River to the south-east. In the south, the city is bounded by the meeting point of the two rivers, Yamuna and Hindon (Figs. 1 and 2).

Fig. 1 Study area

3 Materials and Methods

Fig. 2 Flow chart adopted for methodology

3.1 Methodology

The current study is based on data gathered in the field and produced through laboratory analysis. Groundwater samples taken from Noida were subjected to physico-chemical analysis at the environmental laboratory of the Civil Engineering Department, JSS Academy of Technical Education, Noida. The samples were acquired from fifty-one randomly distributed sites, such as public hand pumps and borewells, and were preserved and analysed according to the methods recommended in the American Public Health Association manual (APHA-2320 1999). The geospatial coordinates of the sample sites were determined using the Global Positioning System. The groundwater quality index was computed and the results were compared with the values specified in water quality standards such as the Bureau of Indian Standards (BIS), the Indian Standards (IS) [2] and the World Health Organization [15]. All chemical concentrations are measured in mg/l. In parallel, multivariable linear regression, support vector regression and decision tree regression are used to predict the pH at the sample locations. The BIS pH limits are then applied as a limiting feature, and the range of the WQI is examined for the locations whose pH lies within those limits.

3.2 Physico-Chemical Parameters

3.2.1 pH

The pH of the groundwater samples tested ranges from 7.1 to 8.9, i.e. from near-neutral to alkaline. The pH values of all water samples were within the BIS [2] allowable limits (6.5–8.5) except at eight locations (Fig. 3).

Fig. 3 pH distribution in different groundwater samples

3.2.2 Turbidity

Turbidity is a measure of a liquid’s relative clarity. It is an optical characteristic of water that expresses the amount of light scattered by material in the water when a light is shone through the water sample. Extreme turbidity is aesthetically unattractive in potable water and may also cause health issues. The GW8 sample shows the highest turbidity, with a value of 56 NTU, which exceeds the permissible limit for drinking water recommended in [2] (Fig. 4).

Fig. 4 Turbidity distribution in different groundwater samples

3.2.3 Total Hardness

The total hardness (TH) results indicate that 68.6% of the groundwater samples lie below the permissible limit (600 mg/l) prescribed in BIS [2] and 31.4% lie above it. TH concentrations above the permissible limit, between 600 and 1780 mg/l, were found at 16 locations in the study region (Fig. 5).

Fig. 5 Total hardness distribution in different groundwater samples

3.2.4 Chlorides

The chloride level at Baraula Village, Sector 49 is 3288.98 mg/l, against a limit of 1000 mg/l; a sewage treatment plant to which these values are attributed was identified by visual inspection of satellite imagery. High chloride levels in groundwater can cause indigestion, affect taste and palatability, and corrode pipes (Fig. 6).

Fig. 6 Chloride distribution in different groundwater samples

3.2.5 Total Alkalinity

The highest total alkalinity value, 695 mg/l, was found at GW45 and the lowest value, 200 mg/l, at GW31. The alkalinity was found to be higher than the acceptable level for drinking water recommended in BIS [2] (Fig. 7).

Fig. 7 Total Alkalinity distribution in different groundwater samples

Fig. 8 Carbonate distribution in different groundwater samples

3.2.6 Carbonates

The carbonate value at GW33, 270 mg/l, was above the allowable limit for drinking water given in BIS [2] (Fig. 8), and the lowest carbonate value, 45 mg/l, was found at GW1.

3.2.7 Bicarbonates

The GW49 sample has a bicarbonate value of 645 mg/l, higher than the limit prescribed by BIS [2] for drinking water (Fig. 9), while 10 mg/l of bicarbonate was found at GW15.

Fig. 9 Bicarbonate distribution in different groundwater samples

3.2.8 Water Quality Index Computing

The WQI was computed using the drinking water quality requirements recommended by the World Health Organization (WHO) and the Bureau of Indian Standards (BIS). The weighted arithmetic index method was used for the groundwater WQI calculation (Brown et al. 1972). The WQI is computed in three steps. In the first step, a weight (wi) is assigned to each of the seven parameters (pH, TH, Total Alkalinity, Chlorides, Turbidity, Carbonates and Bicarbonates) according to its relative significance in the overall quality of drinking water (Table 1). The second step involves calculating the relative weight (Wi):

$$W_{i} = \frac{{w_{i} }}{{\sum\nolimits_{i = 1}^{n} {w_{i} } }}$$
(1)
Table 1 Relative weight of physico-chemical parameters

where Wi is the relative weight, wi is the weight of each parameter and n is the total number of parameters.

The third step consists of calculating the quality rating scale (qi) for each parameter, determined by dividing its concentration in each water sample by its standard value as specified in the BIS guidelines and multiplying by 100:

$$q_{i} = (c_{i} /s_{i} ) \times 100$$
(2)

where qi is the quality rating scale, ci is the concentration of each chemical parameter in each water sample in mg/l, and si is the Indian standard value for that parameter in mg/l according to the BIS guidelines.

That is, for each physico-chemical parameter, the measured concentration is divided by the permissible limit (as given in BIS) and multiplied by 100.

$$q_{i} = (c_{i} /s_{i} ) \times 100$$
(3)

Next, the sub-index SIi is determined for each physico-chemical parameter, and the sub-indices are summed to give the water quality index:

$$SI_{i} = W_{i} \times q_{i}$$
(4)
$$WQI = \sum\limits_{i = 1}^{n} {SI_{i} }$$

where SIi is the sub-index of the ith parameter, qi is the quality rating of the ith parameter and n is the number of parameters. Table 2 lists the five WQI categories for drinking purposes, ranging from < 50 (excellent water) to > 300 (water unsuitable for drinking).

Table 2 Groundwater classification related to WQI range
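
The three steps can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual computation: the weights in `WEIGHTS` are hypothetical placeholders (the study's values are in Table 1), only three of the seven parameters are included, and the handling of values that fall exactly on a class boundary is an assumption. The limits in `STANDARDS` are the BIS values quoted in Section 3.2, and the class boundaries are those stated in the Conclusions.

```python
# Hypothetical weights w_i, for illustration only (the real ones are in Table 1).
WEIGHTS = {"pH": 4, "TH": 3, "Cl": 3}
# Permissible limits s_i quoted in Section 3.2 (pH upper limit; mg/l for TH, Cl).
STANDARDS = {"pH": 8.5, "TH": 600.0, "Cl": 1000.0}


def relative_weights(weights):
    """Eq. (1): W_i = w_i / sum of all w_i."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def water_quality_index(sample, weights, standards):
    """Eqs. (2)-(4): WQI = sum of W_i * (c_i / s_i) * 100 over all parameters."""
    rel = relative_weights(weights)
    wqi = 0.0
    for name, concentration in sample.items():
        q = concentration / standards[name] * 100.0  # quality rating q_i
        wqi += rel[name] * q                         # sub-index SI_i
    return wqi


def classify(wqi):
    """WQI classes; treatment of exact boundary values is an assumption."""
    if wqi < 50:
        return "excellent"
    if wqi < 100:
        return "good"
    if wqi < 200:
        return "poor"
    if wqi <= 300:
        return "very poor"
    return "unfit for drinking"


# Hypothetical sample, not one of the fifty-one measured sites.
sample = {"pH": 7.8, "TH": 420.0, "Cl": 250.0}
value = water_quality_index(sample, WEIGHTS, STANDARDS)
print(round(value, 2), classify(value))
```

A sample whose concentrations all equal their permissible limits scores exactly 100 under this scheme, because the relative weights sum to one.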

3.3 Spatial Analysis

Spatial analysis of the different physico-chemical parameters was performed in a GIS environment using the open-source QGIS program. Maps of pH, TH, Total Alkalinity, Chlorides, Turbidity, Carbonates and Bicarbonates were prepared for Noida using the inverse distance weighted (IDW) interpolation technique.

3.4 Machine Learning Techniques

Multivariable linear regression is an extension of simple linear regression that uses several input variables to predict the required result. The dataset is

\(\left\{ {y_{i} ,x_{i1} , \ldots ,x_{ip} } \right\}_{i = 1}^{n}\), consisting of n instances, and the model is

$$y = \beta_{0} + \beta_{1} x_{1} + \beta_{2} x_{2} + \cdots + \beta_{p} x_{p} + \epsilon$$
(5)

where

\(\beta_{i}\) = regression coefficients (\(\beta_{0}\) is the intercept, the others are slope constants).

\(\epsilon\) = error.

\(x_{i}\) = input variable.

y = output variable.

When the dataset is passed through the algorithm, it iterates over the data, computes the error between predicted and observed values and reduces it with each iteration, thus improving the accuracy; the coefficients obtained in the last iteration yield the best results.
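
The iterative error minimization described above can be sketched with plain gradient descent on the mean squared error. This is an illustrative sketch under stated assumptions, not the implementation used in the study: the two input features stand for hypothetical, rescaled coordinates of a sampling site, the pH-like targets are synthetic, and the learning rate and iteration count are arbitrary choices.

```python
def fit_linear(X, y, lr=0.1, n_iter=5000):
    """Fit y = b0 + b1*x1 + ... + bp*xp by gradient descent on the MSE."""
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)  # beta[0] is the intercept
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = predict(beta, xi) - yi  # residual for this instance
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        # Step against the gradient of the mean squared error.
        beta = [b - lr * 2.0 * g / n for b, g in zip(beta, grad)]
    return beta


def predict(beta, x):
    """Evaluate the linear model of Eq. (5) without the error term."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


# Synthetic data: hypothetical rescaled (latitude, longitude) -> pH-like value.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.2)]
y = [7.0 + 0.5 * a - 0.3 * b for a, b in X]

beta = fit_linear(X, y)
print([round(b, 4) for b in beta])
print(round(predict(beta, (0.25, 0.75)), 4))
```

Because the synthetic targets are exactly linear in the inputs, the fitted coefficients approach the generating values 7.0, 0.5 and −0.3; with real measurements a residual error would remain.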

3.4.1 Support Vector Regression

Support vector regression adapts the support vector machine, which separates classes by a hyperplane, to regression: instead of separating classes, it fits a hyperplane such that the training samples lie within a margin ε of it, while keeping the weight vector w as small as possible. The algorithm enforces this property and minimizes the error over subsequent iterations.

$${\text{minimize }}\frac{1}{2}\left\| w \right\|^{2}$$
(6)
$${\text{subject}}\;{\text{to}}\left| {y_{i} - \left\langle {w,x_{i} } \right\rangle - b} \right| \le \varepsilon$$
(7)
$$y = wx + b$$
(8)

where \(x_{i}\) = training sample.

\(\left\langle {w,x_{i} } \right\rangle + b\) = prediction for that sample.

\(\varepsilon\) = free parameter that serves as a threshold.

Prediction starts by feeding the training samples xi into Eq. (7) and finding the w that ensures Eq. (6) is minimized. The resulting hyperplane parameters are then used in Eq. (8) to predict the y value, and the predictions are evaluated with the R² score to measure accuracy.
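
Two pieces of this procedure are easy to illustrate: the ε-constraint of Eq. (7) for a single sample, and the R² score used to assess the predictions. The sketch below assumes a one-dimensional input and the standard definition of R² (one minus the ratio of residual to total sum of squares); the function names and the numbers are hypothetical and are not taken from Table 3.

```python
def within_epsilon(y_i, x_i, w, b, eps):
    """Check the constraint of Eq. (7) for one sample with a scalar input."""
    return abs(y_i - (w * x_i + b)) <= eps


def r2_score(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot (assumes y_true is not constant)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted pH values.
observed = [7.1, 7.5, 8.0, 8.9]
predicted = [7.2, 7.4, 8.1, 8.7]
print(round(r2_score(observed, predicted), 3))
print(within_epsilon(7.5, 1.0, 0.4, 7.0, 0.2))
```

An R² of 1 indicates perfect predictions, while a model that always predicts the mean of the observed values scores 0.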

3.4.2 Decision Tree Regression

Decision tree regression is a tree-structured regression algorithm that divides the data into subsets, which are treated as nodes of the tree. The outcomes are read from the leaf nodes, which are reached through the decision nodes. The algorithm prioritizes the nodes, and traversing the tree gives the prediction result.

A top-down tree construction method is used in this regression, based on the probabilities of the occurring classes.

$${\text{Gini}}\;{\text{Index}}\;(GI) = 1 - \sum\limits_{i} {P_{i}^{2} }$$

\(P_{i}\) = probability of occurrence of class i.

The tree is constructed to ensure that the next class is selected such that the Gini index remains as low as possible. The target is finding the lowest Gini impurity node as the leaf node. The outcome which is predicted is a real number.
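
As a small illustration of the Gini index formula, the sketch below computes the impurity of a set of class labels and the size-weighted impurity of a candidate split. The labels are hypothetical water quality classes, and the weighted-split helper is a common convention assumed here rather than a detail given in the text.

```python
from collections import Counter


def gini_index(labels):
    """GI = 1 - sum of P_i^2 over the classes present in `labels`."""
    n = len(labels)
    if n == 0:
        return 0.0  # convention assumed for an empty node
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_gini(left, right):
    """Size-weighted Gini index of a candidate two-way split."""
    n = len(left) + len(right)
    return (len(left) * gini_index(left) + len(right) * gini_index(right)) / n


# Hypothetical class labels at a node.
node = ["good", "good", "poor", "poor"]
print(gini_index(node))                                # evenly mixed node
print(split_gini(["good", "good"], ["poor", "poor"]))  # perfectly separating split
```

A node containing a single class has a Gini index of 0, so a split that separates the classes completely is preferred over one that leaves them mixed.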

4 GIS Statistical Model

The spatial distribution and spatial modelling in the present study were achieved using the inverse distance weighted (IDW) interpolation technique, and the groundwater quality index was determined according to BIS [2]. IDW is an interpolation method for representing continuous spatial attributes that vary across a region [17]. The value estimated by IDW interpolation at a location is a weighted average of the neighbouring ground sampling points. The weights are inversely related to the distance between each observation and the position of the predicted value [17].
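
A minimal sketch of IDW interpolation follows. It is not the QGIS implementation: the power parameter of 2 is a commonly used default assumed here, all sampling points are used rather than a limited neighbourhood, and the coordinates and WQI values are hypothetical.

```python
import math


def idw(points, x, y, power=2.0):
    """Estimate the value at (x, y) from (px, py, value) samples by IDW."""
    numerator = 0.0
    denominator = 0.0
    for px, py, value in points:
        distance = math.hypot(x - px, y - py)
        if distance == 0.0:
            return value  # exactly on a sampling point: return its value
        weight = 1.0 / distance ** power  # closer samples weigh more
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Hypothetical sampling points: (x, y, WQI).
samples = [(0.0, 0.0, 50.0), (2.0, 0.0, 100.0)]
print(idw(samples, 1.0, 0.0))  # midway between the two samples
print(idw(samples, 0.5, 0.0))  # closer to the first sample
```

Midway between the two samples the estimate is their plain average; moving toward one sample pulls the estimate toward that sample's value.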

5 Results and Discussions

Noida is an industrial area of the Gautam Buddh Nagar district, Uttar Pradesh. Without adequate treatment, industrial effluents pollute water bodies, rivers, etc. Untreated effluents from sewage treatment plants percolate into the groundwater, making it unfit for drinking and other uses.

In addition, the machine learning models gave results that help in understanding the relation between the pH and the WQI and the mapping between the two (Fig. 10; Table 3).

Fig. 10 Spatial distribution of physico-chemical parameters

Table 3 R2 scores of pH

Segmenting the WQI values shows that only 5 WQI values could be classified as “Excellent water”, and that these corresponded to 5 pH values within the limits prescribed by BIS [2].

6 Spatial Distribution of GIS-Based WQI

The WQI was calculated to assess Noida’s groundwater quality. The WQI values of the groundwater samples ranged from 47.10 to 192.10 (avg. 78.14). The highest value was observed in the groundwater sample collected at Gulavali, sector 162, and the lowest in the sample from Momnathal, sector 150. The variation in WQI among the samples could be due to natural and anthropogenic activities. The GIS-based WQI analysis classifies 9.9% of the groundwater samples as excellent water, 74.5% as good water and 15.6% as poor water. These results reveal that the samples collected in the study area are moderately contaminated and inappropriate for direct use as drinking water. Treatment is advised before the water is used (Fig. 11; Table 4).

Fig. 11 Spatial distribution of GIS-based WQI

Table 4 Water quality index for groundwater samples

7 Conclusions

The GIS-based water quality index analysis reveals that 84.4% of the collected groundwater samples fall in the excellent and good water categories and 15.6% fall in the poor category. The water quality index varies from 47.10 to 192.10. A water quality index of less than 50 is considered excellent water, 50–100 good water, 100–200 poor water and 200–300 very poor water, while a WQI greater than 300 is considered unfit for drinking. The highest WQI values were found in the GW19, GW27, GW31, GW34, GW36, GW43 and GW89 groundwater samples, which lie in the poor water category. The study recommends treatment of the water for drinking purposes, and locals in the area need to treat the water before use. The GIS-based analysis suggests that sewage treatment plants and industries should treat water before discharging it into water bodies. The study recommends continuous monitoring of groundwater quality and the implementation of methods and techniques for improving water quality. Further, water from bore wells and hand pumps should be treated before consumption to avoid unnecessary health disorders. Predicting the pH and mapping it against the WQI gave insight into the sensitivity of the WQI to the pH: although “good water” on the WQI scale is also fit for use, a pH within the prescribed limits contributes to the “Excellent water” classification.