1 Introduction

With the rapid growth of the human population, the quantity of waste produced is increasing day by day. To handle such large quantities, landfills are mostly used as the end solution for waste that cannot be processed further. Across the world, landfilling has become common practice for municipal solid waste disposal. Solid waste lying in landfills or open dumps can contaminate groundwater through infiltration, whether by precipitation, leachate percolation, pesticides washed off the surface into aquifers, or other pathways. The area near a landfill therefore has a greater probability of groundwater contamination. Although landfills are in many instances designed with liners to prevent leachate from reaching groundwater, faulty design often allows percolation, and the groundwater becomes contaminated (Nagarajan et al. 2012). Landfill leachate that is not properly collected, treated and disposed of contaminates both groundwater and surface water, because leachate percolating through soil may reach confined or unconfined aquifers (Bashir et al. 2009).

Sanitary landfills have been developed as a viable method of reclaiming derelict land through the disposal of solid waste. Properly designed landfills have helped to reduce the adverse environmental impacts of practices such as open-air burning and open-pit dumping. However, landfills that are not designed and managed properly may raise environmental concerns because of gas production and leachate formation (Abd El-Salam and Abu-Zuid 2015). Leachate that infiltrates the ground contaminates groundwater by adding dissolved organic matter, constituents such as ammonium, calcium, magnesium, sodium, potassium, iron, sulfates and chlorides, and heavy metals (Lee and Lee 1993; Christensen et al. 2001). This has been assessed both by experimental identification and through mathematical modeling (Moo-Young et al. 2004). Physico–chemical analysis of leachate from a landfill site, including heavy metals, showed high concentrations of chlorides, nitrates, sulfates, phenol and zinc and a high COD, which render the groundwater unfit for domestic use (Singh et al. 2008).

To understand the status of groundwater quality and how it changes, it is important to have models that can predict it. Groundwater quality is considered deteriorated when water quality parameters exceed their permissible limits. Groundwater becomes unfit for domestic and irrigation purposes if physico–chemical parameters such as color, conductivity, total solids, hardness, chlorides and NH3–N exceed the permissible limits specified by the EPA (Fatta et al. 1999). Models help in understanding how groundwater quality may change over time. They are particularly useful because monitoring is expensive and time-consuming, and in many instances samples cannot be drawn from certain points (Nas and Berktay 2010).

With the advancement and availability of soft computing resources, such models are increasingly adopted and coupled with regression models (Charulatha et al. 2017) to assess groundwater quality. They are helpful for scenarios involving many parameters whose relationships cannot easily be represented by mathematical equations (Kuo et al. 2004). Researchers have studied the contamination of groundwater by leachate percolation near an unlined landfill site (Mor et al. 2006). Dependency on groundwater for domestic purposes is high in many parts of India. Groundwater levels in different aquifers have been predicted using artificial neural networks (ANN) in India and abroad (Nayak et al. 2006; Mohanty et al. 2010; Daliakopoulosa et al. 2005). Few studies have been conducted on groundwater quality prediction using ANN (Yesilnacar et al. 2008), and the effect of space and time has not been quantified extensively within the model itself.

Neural models help in predicting groundwater quality well in advance, which supports the understanding of future scenarios. They also have various applications in environment and ecology, ranging from predicting water quality in river stretches (Singh et al. 2009) to modeling marine phytoplankton primary production (Mattei et al. 2018). In this study, an ANN was therefore developed to predict total hardness in groundwater at different locations of an Integrated Solid Waste Management (ISWM) site near the Mumbai coastline. The study also aims to map the variation in groundwater quality at this site from the baseline year to 2017 by modeling total hardness with neural networks and developing spatial interpolation maps that show the variation in space and time.

2 Study area

The coastline of Mumbai city is indented with numerous creeks and bays that are under the tidal influence of the Arabian Sea. The city is under the influence of south-west monsoon winds from June to September and receives peak rainfall in July. October and November form the post-monsoon season. The site of the ISWM facility, which includes a sanitary landfill, is located in the eastern suburbs of Mumbai near a mangrove swamp area, as depicted in Fig. 1, between latitudes 19°7′13.82″N and 19°7′40.20″N and longitudes 72°56′48.77″E and 72°57′26.34″E, at an elevation of 1 m above mean sea level. Groundwater samples from 10 monitoring wells were collected every month during 2016 and 2017, while the sanitary landfill was in operation. The collected samples were analyzed for pH, total dissolved solids, chlorides, nitrates, sulfates and total hardness. The baseline study mandated by the Municipal Solid Wastes (Management and Handling) Rules, 2000, was conducted in 2014 and reported high concentrations of dissolved solids and other parameters exceeding the desirable limits given in those rules. Groundwater quality at the ISWM site was therefore already exceeding the standards, and a neural model predicting hardness was used to understand any further deterioration. The model will help in assessing the performance of the sanitary landfill with respect to preventing leachate contamination and further degradation of groundwater quality.

Fig. 1 Groundwater sampling locations at ISWM site

3 Methodology

3.1 Using artificial neural network (ANN)

The capability of neural models to mimic complex relationships has put them in high demand. Generating data from machines, experiments or observations is relatively easy and well established, whereas interpreting the interlinking of the parameters and variables involved in a phenomenon or process is complex and difficult (Smith 1994). Neural networks are therefore used when data are available but the cross-relations are not known. To develop an optimized neural model for a designated purpose, the model first has to be trained and then tested. Learning algorithms are used to develop the neural architecture, and the algorithm giving the best results in testing is used for the final model. In this study, three learning algorithms, namely Levenberg–Marquardt (LM), gradient descent with momentum back propagation (GDM) and BFGS quasi-Newton back propagation (BFG), were used to analyze the performance of each algorithm in predicting groundwater quality near the landfill site. Equations (1), (2) and (3) present the approximations used to update the values when using the LM, GDM and BFG learning algorithms, respectively.

$$ y_{k + 1} = y_{k} - \left[ {J^{\text{T}} J + \mu I} \right]^{ - 1} J^{\text{T}} e $$
(1)
$$ {\text{d}}Y = \mu \,{\text{d}}Y_{\text{prev}} + \eta \left( {1 - \mu } \right)\frac{{\text{dperf}}}{{{\text{d}}Y}} $$
(2)
$$ y_{k + 1} = y_{k} - A_{k}^{ - 1} g_{k} $$
(3)
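The three update rules can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes the usual reading of the symbols (J as the Jacobian of the errors e with respect to the adjustable values, μ in Eq. (1) as a damping term, μ and η in Eq. (2) as momentum constant and learning rate, A as a Hessian approximation and g as the gradient), takes the derivative term in Eq. (2) as the negative gradient so that the change reduces the error, and uses hypothetical data for a two-parameter straight-line fit. The BFGS update of A itself is omitted; the exact Hessian is supplied instead.

```python
# Toy least-squares problem (hypothetical data generated from y = 2x + 1).
XS = [0.0, 1.0, 2.0, 3.0]
YS = [1.0, 3.0, 5.0, 7.0]
JAC = [[x, 1.0] for x in XS]  # d(error_i)/d(w0, w1) for the model w0*x + w1


def residuals(w):
    """Error vector e = prediction - target."""
    return [w[0] * x + w[1] - y for x, y in zip(XS, YS)]


def gradient(w):
    """Gradient J^T e of the performance 0.5 * sum(e^2)."""
    err = residuals(w)
    return [sum(row[k] * e for row, e in zip(JAC, err)) for k in range(2)]


def lm_step(w, mu):
    """Eq. (1): w - (J^T J + mu*I)^-1 J^T e, written out for two parameters."""
    a11 = sum(r[0] * r[0] for r in JAC) + mu
    a12 = sum(r[0] * r[1] for r in JAC)
    a22 = sum(r[1] * r[1] for r in JAC) + mu
    g1, g2 = gradient(w)
    det = a11 * a22 - a12 * a12
    # Closed-form inverse of the symmetric 2x2 matrix applied to (g1, g2).
    s1 = (a22 * g1 - a12 * g2) / det
    s2 = (a11 * g2 - a12 * g1) / det
    return [w[0] - s1, w[1] - s2]


def gdm_change(prev_change, grad, lr, mc):
    """Eq. (2): blend the previous change with a scaled descent direction."""
    return [mc * p + lr * (1.0 - mc) * (-g) for p, g in zip(prev_change, grad)]


def quasi_newton_step(w, a_inv, grad):
    """Eq. (3): w - A^-1 g, with A^-1 given as a 2x2 nested list."""
    return [w[i] - sum(a_inv[i][j] * grad[j] for j in range(2)) for i in range(2)]


# Levenberg-Marquardt: a few damped Gauss-Newton steps.
w_lm = [0.0, 0.0]
for _ in range(20):
    w_lm = lm_step(w_lm, mu=0.01)

# Gradient descent with momentum: many small steps.
w_gdm, change = [0.0, 0.0], [0.0, 0.0]
for _ in range(500):
    change = gdm_change(change, gradient(w_gdm), lr=0.05, mc=0.9)
    w_gdm = [w + c for w, c in zip(w_gdm, change)]

# Quasi-Newton step with the exact inverse of J^T J = [[14, 6], [6, 4]].
hess_inv = [[0.2, -0.3], [-0.3, 0.7]]
w_qn = quasi_newton_step([0.0, 0.0], hess_inv, gradient([0.0, 0.0]))
```

On this small problem all three rules approach the same parameters (slope 2, intercept 1); they differ in how many iterations are needed and how much work each iteration costs.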

3.1.1 Neural architecture

Artificial neural networks have two broad types of architecture: feedforward networks and feedback networks. Feedforward (multi-layer) networks have been used extensively in neural modeling. Such a model may consist of several hidden layers and one output layer. In most applications, one or two hidden layers are used. The number of neurons in each hidden layer can be fixed by trial and error, depending on the complexity of the process to be modeled (Dogan et al. 2009). Figure 2 shows the neural network architecture for this study, with all inputs and the output highlighted; different numbers of neurons in the hidden layer were tested (Palani et al. 2008).

Fig. 2 Neural network architecture with eight inputs and one output; the number of hidden-layer neurons was varied from 2 to 20

3.1.2 Data for neural model

In this study, data were derived from sampling and monitoring at ten locations near the ISWM site, as mentioned earlier, carried out continuously for 2 years (2016 and 2017) for groundwater quality testing. The data set contains values of pH, total dissolved solids (TDS), total hardness (TH), chlorides (Cl⁻), nitrates (NO₃⁻) and sulfates (SO₄²⁻) with time and location. All parameters were tested as per the standard protocol “Standard Methods for the Examination of Water and Wastewater (APHA)”, 21st Edition. The data set used for the neural model has 160 sampled results covering all locations over the 2 years. Once the input matrix is finalized, the data are processed, randomized and normalized, because they pass through an activation function whose output is bounded. Normalization can use a range such as [−1, 1] or [0, 1]; in this study the range 0.1 to 0.9 was used to avoid saturation (Basheer and Hajmeer 2000).
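The scaling to the range 0.1 to 0.9 can be sketched as a linear min-max transformation. This is a minimal illustration that assumes each variable is scaled independently by its own minimum and maximum; the function names are hypothetical and the sample values are invented for the example.

```python
def scale_to_range(values, low=0.1, high=0.9):
    """Linearly map values so that min -> low and max -> high."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant column carries no information; place it mid-range.
        return [(low + high) / 2.0 for _ in values]
    span = v_max - v_min
    return [low + (high - low) * (v - v_min) / span for v in values]


def unscale(scaled, v_min, v_max, low=0.1, high=0.9):
    """Invert scale_to_range to recover values in original units."""
    return [v_min + (s - low) * (v_max - v_min) / (high - low) for s in scaled]


# Hypothetical total hardness readings in mg/L.
hardness = [100.0, 300.0, 500.0]
scaled = scale_to_range(hardness)
restored = unscale(scaled, min(hardness), max(hardness))
```

Keeping the scaled values away from 0 and 1 leaves the log-sigmoid function in its responsive region, which is the saturation concern mentioned above. The inverse mapping is needed to report predictions in mg/L.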

3.1.3 Deciding input and output

In this study, eight parameters were selected as inputs and one as output; they are listed in Table 1. The inputs were chosen because it is known from theory that multivalent cations associated with certain anions impart hardness to water. The major hardness-causing cations are calcium, magnesium, strontium, ferrous and manganous ions. These divalent cations are associated with anions, and their occurrence in natural waters is shown in Table 2 (Sawyer et al. 1994, p. 487).

Table 1 Variables selected as input and output for neural model
Table 2 Major cations causing hardness and the anions associated with these divalent cations

In addition, the time and location of each data point were taken as inputs to the neural model. This was done to understand the effect of each parameter in space and time, since no equation exists that relates the effects of space and time together. The neural model was therefore developed by incorporating location and time. The relative effect of space and time on the hardness of water is worked out using sensitivity analysis in a later section. The statistical parameters for all variables used in the neural model are shown in Table 3.

Table 3 Statistical Parameters for all inputs and output

3.1.4 Training and cross-validation

In this study, three algorithms were used to train the network: Levenberg–Marquardt (LM), gradient descent with momentum back propagation (GDM) and BFGS quasi-Newton back propagation (BFG). They were compared to determine which works best for predicting total hardness at different points in the study area (Mohanty et al. 2010). To prevent the model from over-training, k-fold cross-validation was used to partition the data into training and testing sets. The value of k can vary; here it was taken as 10. Cross-validation was carried out to assess the performance of the model under all three learning algorithms. In each round, the original data set is partitioned into training data (k − 1 folds) and testing data (one fold), as shown in Fig. 3, and the process is repeated k times so that each fold is used exactly once for testing. After k repetitions, the average error is calculated and used for further analysis.
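The partitioning described above can be sketched as follows. The sketch assumes the 160 samples reported in Sect. 3.1.2 and a fixed shuffle seed; `k_fold_indices`, `mean_cv_error` and the `fit_and_score` callback are hypothetical names, and the callback stands in for training a network on the training indices and returning its testing error.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Return k (train, test) index pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # randomize before splitting
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    splits = []
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((train, test))
    return splits


def mean_cv_error(splits, fit_and_score):
    """Average the error returned by fit_and_score over all rounds."""
    errors = [fit_and_score(train, test) for train, test in splits]
    return sum(errors) / len(errors)


splits = k_fold_indices(160, k=10)
```

With 160 samples and k = 10, each round trains on 144 samples and tests on 16.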

Fig. 3 k-fold cross-validation, with a different fold selected as the testing set in each round

3.1.5 Activation function, learning rate and momentum constant

In this study, a log-sigmoid activation function was used between the input layer and the hidden layer, and a linear transfer function was used between the hidden layer and the output layer. The learning rate (η) and momentum constant (µ) are internal parameters (Maier and Dandy 1996) and are used in the LM and GDM algorithms (Moreira and Fiesler 1995). The default values in the GDM and LM algorithms are 0.1 for the learning rate and 0.09 for the momentum constant. During training, different values of the momentum constant and learning rate were tried with 9, 10, 11 and 12 neurons, as shown in Fig. 4.
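A forward pass through a network with a log-sigmoid hidden layer and a linear output can be sketched as below. The layer sizes (eight inputs, 12 hidden neurons, one output) follow the architecture reported later in the paper; the random weights, the bias terms and all names are assumptions made for illustration, not the trained values.

```python
import math
import random


def logsig(x):
    """Log-sigmoid 1 / (1 + exp(-x)), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Log-sigmoid hidden layer followed by a linear output neuron."""
    hidden = [
        logsig(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


N_IN, N_HID = 8, 12
rng = random.Random(0)  # placeholder weights, not the trained model
w_hidden = [[rng.uniform(-1, 1) for _ in range(N_IN)] for _ in range(N_HID)]
b_hidden = [rng.uniform(-1, 1) for _ in range(N_HID)]
w_out = [rng.uniform(-1, 1) for _ in range(N_HID)]
b_out = rng.uniform(-1, 1)

sample = [0.5] * N_IN  # one normalized input vector
prediction = forward(sample, w_hidden, b_hidden, w_out, b_out)
```

Because the output neuron is linear, the prediction is not confined to the interval (0, 1), while each hidden activation is.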

Fig. 4 Difference between training and testing errors at different neurons when (η) and (µ) were changed as per the cases from 1 to 4

3.2 Spatial interpolation for groundwater quality

Spatial interpolation is used to estimate physical and chemical constituents at locations where they cannot be measured (Murphy et al. 2010). It serves as a ready reference for understanding the impact of different parameters at different locations. The study area shown in Fig. 1 is subject to seawater ingress, and sampling points MW6 to MW10 lie toward the creek. This location and the mixing of seawater with groundwater make it difficult to determine the exact groundwater quality and the regions influenced by the mixing. Spatial interpolation maps of total hardness were therefore plotted for all 10 locations for the years 2014, 2016 and 2017 using the ArcGIS Geostatistical Analyst extension and the Inverse Distance Weighting (IDW) method.
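The principle behind IDW can be shown in a few lines. This is a generic sketch, not the ArcGIS implementation: it assumes planar coordinates, Euclidean distance, a power of 2 and the use of all sample points for every estimate, whereas the software offers further options such as search neighborhoods. The coordinates and values are hypothetical.

```python
import math


def idw(points, query, power=2.0):
    """Estimate the value at query=(x, y) from points given as (x, y, value)."""
    num = 0.0
    den = 0.0
    for x, y, value in points:
        d = math.hypot(query[0] - x, query[1] - y)
        if d == 0.0:
            return value  # the query coincides with a sample point
        w = 1.0 / d ** power  # nearer points receive larger weights
        num += w * value
        den += w
    return num / den


# Two hypothetical wells with total hardness in mg/L.
wells = [(0.0, 0.0, 100.0), (2.0, 0.0, 300.0)]
midpoint = idw(wells, (1.0, 0.0))
near_first = idw(wells, (0.5, 0.0))
```

An estimate always lies between the smallest and largest sampled values, so IDW maps cannot show hardness peaks beyond those measured at the wells.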

Figure 5 shows that in 2014 the groundwater in a very small area around monitoring wells MW2 and MW3 was within the desirable limit for total hardness (300 mg/L), whereas the groundwater around MW1 was within the permissible limit (600 mg/L). In the regions around MW5 and MW8, the values exceeded the permissible limit by 30% and 60%. The rest of the study area has very high values of total hardness, as can also be seen in the figure. This indicates that the area holds a large groundwater store that cannot be used for domestic or other purposes. Since points MW6 to MW10 lie toward the creek, they are expected to be influenced by seawater ingress.
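The limits quoted above translate into simple thresholds: exceeding the permissible limit of 600 mg/L by 30% and 60% corresponds to 780 mg/L and 960 mg/L. The helper below is only an illustrative sketch with hypothetical function names; the class labels are not taken from the map legends.

```python
DESIRABLE_LIMIT = 300.0    # mg/L, as quoted in the text
PERMISSIBLE_LIMIT = 600.0  # mg/L, as quoted in the text


def classify_hardness(th):
    """Place a total hardness value (mg/L) relative to the two limits."""
    if th <= DESIRABLE_LIMIT:
        return "within desirable limit"
    if th <= PERMISSIBLE_LIMIT:
        return "within permissible limit"
    return "above permissible limit"


def percent_over_permissible(th):
    """Percentage by which a value exceeds the permissible limit (0 if it does not)."""
    return max(0.0, 100.0 * (th - PERMISSIBLE_LIMIT) / PERMISSIBLE_LIMIT)
```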

Fig. 5 Map representing total hardness at 10 locations in year 2014

Comparing Figs. 5 and 6 shows that the region of very high total hardness has expanded significantly, which suggests that seawater ingress increased over the 2 years. The regions that were 60% above the permissible value in 2014 (Fig. 5) have concentrations in the highest range in 2016 (Fig. 6). The region with permissible hardness around MW1 and MW2 shrank from 2014 to 2016, and the extent of seawater ingress increased, affecting groundwater quality at other locations.

Fig. 6 Map representing total hardness at 10 locations in year 2016

Figure 7 shows that in 2017 the major part of the site had returned to permissible hardness concentrations. This may be because the engineered landfill has stopped seawater ingress into other regions.

Fig. 7 Map representing total hardness at 10 locations in year 2017

4 Results and discussion

4.1 Neural architecture

To obtain a neural model for total hardness in groundwater, the optimum number of neurons in the hidden layer was fixed by calculating the root mean square error (RMSE) during training and testing while varying the number of neurons from 2 to 20. The difference between the training and testing errors was calculated, and the number of neurons at which the testing error was lower than the training error was chosen for the final neural model. Figure 8 shows the RMSE of the model during training and testing for different numbers of neurons. The case with the lowest testing error was taken as optimal. The predictive power of the model was best with 12 neurons in the hidden layer.

Fig. 8 RMSE during training and testing

The final neural model for predicting total hardness in groundwater at a solid waste dumping site therefore has eight inputs, 12 neurons in the hidden layer and one output. This model was trained using the Levenberg–Marquardt (LM) algorithm. The statistical parameters used to assess its performance are presented in Table 4.

Table 4 Performance parameters of trained artificial neural model

As noted in Sect. 3.1.4, tenfold cross-validation was used for data division and training to prevent the model from overfitting. The model was run with different combinations of training and testing sets so that each fold was used once for validation. Figure 9 shows the difference between training and testing error for each fold selected during k-fold cross-validation with k = 10.

Fig. 9 Difference between training and testing error with different folds

4.2 Comparison of performance of model

The neural model with 8-12-1 architecture was also trained and tested with two other learning algorithms, gradient descent with momentum back propagation (GDM) and BFGS quasi-Newton back propagation (BFG). The purpose was to understand how the predictive power of the model changes with the learning algorithm. Performance was evaluated using RMSE and r for all three learning algorithms, as shown in Table 5. The values of RMSE and r are calculated using Eqs. (4) and (5).

$$ {\text{RMSE}} = \left[ {\frac{{\mathop \sum \nolimits_{i = 1}^{N} \left( {{\text{Value}}_{m} - {\text{Value}}_{p} } \right)^{2} }}{N}} \right]^{1/2}$$
(4)
$$ r = \frac{{\mathop \sum \nolimits_{i = 1}^{N} \left( {{\text{Value}}_{p} - \overline{{{\text{Value}}_{p} }} } \right)\left( {{\text{Value}}_{m} - \overline{{{\text{Value}}_{m} }} } \right)}}{{\sqrt {\mathop \sum \nolimits_{i = 1}^{N} \left( {{\text{Value}}_{p} - \overline{{{\text{Value}}_{p} }} } \right)^{2} } \sqrt {\mathop \sum \nolimits_{i = 1}^{N} \left( {{\text{Value}}_{m} - \overline{{{\text{Value}}_{m} }} } \right)^{2} } }} $$
(5)

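where ${\text{Value}}_{m}$ is the result obtained from experiments, ${\text{Value}}_{p}$ is the result predicted by the model, $\overline{{{\text{Value}}_{m}}}$ is the mean of the experimental results, $\overline{{{\text{Value}}_{p}}}$ is the mean of the model results and N is the number of data points.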
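Equations (4) and (5) can be computed directly, as in the minimal sketch below. The function names and the sample values are hypothetical and serve only to show the calculation.

```python
import math


def rmse(measured, predicted):
    """Eq. (4): root mean square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def correlation(measured, predicted):
    """Eq. (5): coefficient of correlation r between predicted and measured values."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((p - mean_p) * (m - mean_m) for m, p in zip(measured, predicted))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return cov / (math.sqrt(var_p) * math.sqrt(var_m))


# Hypothetical values: predictions are exactly twice the measurements.
measured = [1.0, 2.0, 3.0]
predicted = [2.0, 4.0, 6.0]
error = rmse(measured, predicted)
r = correlation(measured, predicted)
```

The example shows why both measures are reported: predictions that are perfectly correlated with the measurements (r = 1) can still carry a large RMSE if they are biased or wrongly scaled.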

Table 5 Performance of all algorithms tested for neural model

Table 5 shows, through the coefficient of correlation (r), that the predictive power of the model changes with the learning algorithm even when the architecture is the same. The predictive power of LM is better than that of GDM and BFG, while BFG is better than GDM. Changing the algorithm for the same problem therefore requires optimizing all internal adjustable parameters again, even when all the algorithms used are meant for regression.

Figures 10 and 11 show that, with the LM algorithm, almost all predicted values match the experimental values during training, and the predictions in the testing phase are also close to the experimental values. The LM-trained neural model therefore has better predictive potential in this study and can be used for prediction.

Fig. 10 Training performance of LM algorithm with 8-12-1 architecture

Fig. 11 Testing performance of LM algorithm with 8-12-1 architecture

4.3 Momentum constant and learning rate

A backpropagation algorithm minimizes an error function, and the minimization is done using the gradient descent technique. In each iteration, the weights are corrected using the learning rate (η) and the momentum constant (µ); the latter was introduced by Rumelhart et al. (1986) to incorporate the effect of the previous iteration on the present one. Figure 4 shows that when η and µ are kept very low, as in case 2, the difference between the training and testing RMSE is larger than in the other three cases, except at one instance. Cases 1, 3 and 4 all give their minimum at 11 neurons, but at 12 neurons case 4 gives the smallest difference between the errors, so case 4 was selected for the final model. As noted earlier, the predictive power was better with 12 neurons, so η and µ were selected with the number of neurons fixed at 12.

4.4 Sensitivity analysis

To understand the effect of each input parameter on total hardness (TH), a sensitivity analysis was performed on the weights of the trained model, following the methodology given by Garson (1991). The final weights of the LM-trained model with 12 neurons in the hidden layer were used. Table 6 shows the final connection weights between the inputs and the hidden neurons and between the hidden neurons and the output. This analysis helps in understanding the effect of each individual parameter on the output.
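A weight-partitioning calculation of this kind can be sketched as follows. The sketch uses one common statement of Garson's method, in which the absolute input weights of each hidden neuron are converted to shares, scaled by the absolute output weight of that neuron and summed per input; whether the authors used exactly this variant is an assumption. The small weight matrices are hypothetical and are not the values of Table 6.

```python
def garson_importance(w_hidden, w_out):
    """Relative importance (%) of each input from absolute connection weights.

    w_hidden[j][i] is the weight from input i to hidden neuron j;
    w_out[j] is the weight from hidden neuron j to the output.
    """
    n_inputs = len(w_hidden[0])
    contrib = [0.0] * n_inputs
    for row, v in zip(w_hidden, w_out):
        row_sum = sum(abs(w) for w in row)
        if row_sum == 0.0:
            continue  # a hidden neuron with no input weights contributes nothing
        for i, w in enumerate(row):
            # Share of input i in this neuron, scaled by the neuron's output weight.
            contrib[i] += abs(w) / row_sum * abs(v)
    total = sum(contrib)
    return [100.0 * c / total for c in contrib]


# Hypothetical network with two inputs and two hidden neurons.
importance = garson_importance([[1.0, 1.0], [1.0, -3.0]], [1.0, -1.0])
```

Because only absolute values are used, the result ranks the inputs by the magnitude of their influence and says nothing about its direction.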

Table 6 Weight matrix for inputs and hidden layer (Wi) and hidden layer and output (Wo)

Table 6 is used to determine the relative importance of each input variable, and Fig. 12 shows the contribution of each toward total hardness. Nitrate concentration contributes most to total hardness, at 22.4%, followed by sulfate concentration at 17.9%. This is useful for understanding the combination of cations and anions present in the water. The analysis also allows the importance of space and time to be assessed, which cannot be done through any mathematical equation. The effect of time is 12.5%, while the effect of space is 14.75% (latitude and longitude together).

Fig. 12 Relative importance of each variable

5 Conclusions

1. Adjusting the internal parameters (such as μ and η) while building a neural model is of utmost importance. It is also inferred that, with a small data set, the internal parameters must be changed carefully and step by step while optimizing the model.

2. Of the three algorithms tested on the same neural model, the LM algorithm performed best.

3. The sensitivity analysis showed that space and time are also very important for understanding the variability of total hardness concentrations in groundwater.

4. The results of the sensitivity analysis and the interpolation maps show that space and time play major roles in determining the regions that fall under desirable and permissible total hardness concentrations.

5. The interpolation maps help in understanding the probable movement of the seawater that contributes to hardness. In future, this will help in estimating the regions that are more prone to variation.

6. As the testing performance of all three algorithms is not below 70%, the number of neurons can be fixed first when building a neural model. The training algorithm may then be changed for the same architecture depending on the available computational time, as the LM algorithm needs more time than the other two, BFG and GDM.

This study quantifies the combined effect of space and time on the total hardness of groundwater. The neural model developed here was able to predict total hardness concentrations for all the sampling points. The interpolation maps of total hardness presented in this study help in understanding the regions with variable groundwater quality. Further study, including heavy metal analysis, is required to determine whether any leachate components are present in the groundwater.