Abstract
Water quality experiments are difficult, costly, and time-consuming. Therefore, different modeling methods can be used as an alternative for these experiments. To achieve the research objective, geospatial artificial intelligence approaches such as the self-organizing map (SOM), artificial neural network (ANN), and co-active neuro-fuzzy inference system (CANFIS) were used to simulate groundwater quality in the Mazandaran plain in the north of Iran. Geographical information system (GIS) techniques were used as a pre-processer and post-processer. Data from 85 drinking water wells was used as secondary data and were separated into two splits of (a) 70 percent for training (60% for training and 10% for cross-validation), and (b) 30 percent for the test stage. The groundwater quality index (GWQI) and the effective water quality factors (distance from industries, groundwater depth, and transmissivity of aquifer formations) were implemented as output and input variables, respectively. Statistical indices (i.e., R squared (R-sqr) and the mean squared error (MSE)) were utilized to compare the performance of three methods. The results demonstrate the high performance of the three methods in groundwater quality simulation. However, in the test stage, CANFIS (R-sqr = 0.89) had a higher performance than the SOM (R-sqr = 0.8) and ANN (R-sqr = 0.73) methods. The tested CANFIS model was used to estimate GWQI values on the area of the plain. Finally, the groundwater quality was mapped in a GIS environment associated with CANFIS simulation. The results can be used to manage groundwater quality as well as support and contribute to the sustainable development goal (SDG)-6, SDG-11, and SDG-13.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Avoid common mistakes on your manuscript.
1 Introduction
In arid and semi-arid areas such as Iran, due to water scarcity, studies of groundwater are very important for water resources protection and planning. More than 90% of the water consumed in Iran consists of groundwater. Since Groundwater is less susceptible to bacterial pollution and evaporation than surface water, it is more important. In recent decades, human activities such as agriculture, manufacture, and urban development have affected groundwater quality negatively. Therefore to optimally manage water resources, it is necessary to study water quality. Water qualitative measurements and experiments are difficult, costly, and time-consuming, therefore, researchers applied various models to simulate groundwater quality (Haddad et al. 2013; Millar et al. 2019; Wu et al. 2020). Application of artificial intelligence (AI) in groundwater modeling was initiated in the early 1990s and so far, many studies have been conducted on the successful application of this method (Nikoo and Mahjouri 2013; Tongal and Booij 2018; Li et al. 2020; Maliqi et al. 2020; Pal and Chakrabarty 2020; Mosaffa et al. 2021).
Artificial neural networks (ANN) are one of the most widely used AI methods in hydrological modeling. One of the distinguishing features of the ANN method is to provide an applied solution to simulate in a water resources system (Hsu et al. 1995; Wang et al. 2010; Gholami et al. 2021; Huang et al. 2021). Later ANNs were combined with fuzzy logic methods and this resulted in the advent of neuro-fuzzy systems such as the co-active neuro-fuzzy inference system (CANFIS) (Pramanik and Panda 2010; Varvani and Khaleghi 2019). CANFIS as a multi-layer feed-forward network uses Gaussian functions for fuzzy sets and also can combine the benefits of both neural and fuzzy networls (Ullah and Choudhury 2013; Choubin et al. 2020a, b; Fang et al. 2020). Further, the self-organizing map (SOM) method or a Kohonen map is another common method in the field of AI (Kohonen 1982, 1984, 2005; Laaksonen and Honkela 2011; Mosavi et al. 2020a). It uses unsupervised, competitive learning, and then combines the goals of the projection and clustering algorithms (Klobucar and Subasic 2012). The SOM method builds a two-dimensional map (lattice) of feature space, and it is suitable to deal with noisy, non-stationary, and non-Gaussian data (Iwashita et al. 2011). A SOM is a type of ANN that is trained by using unsupervised learning to produce an output. It shows in the input space of the training samples, called a map. A SOM requires a neighborhood (i.e., the adjacent nodes) function to preserve the topological properties of the input space (Aneetha and Bose 2012). The SOM algorithm is a powerful tool to analyze the data (Aneetha and Bose 2012; Toor and Singh 2013; Singh 2014; Jassar and Dhindsa 2015).
SOM has been developed for a wide range of water resources issues and modeling for water quality assessment (Ehsani and Quiel 2008; Iwashita et al. 2011; Haider et al. 2015; Mosavi et al. 2020b). Numerous studies showed that one of the most superior classification accuracy results have been achieved by using SOM as well-justified machine learning techniques (Muller and Van Niekerk 2016; Gholami et al. 2020). The advantage of SOM compared to other methods is that in this method, the error due to the complexity of the model as well as the error related to forecasting is minimized and this model can work with less training data and less variables. However, it is sensitive to variations depending on the training data. Further, its run-time is shorter than other methods (Lin et al. 2016). Although SOM has excellent features, there are still limited studies using SOM in modeling groundwater quality in arid regions with limited daily climatic data because of its short history. Artificial intelligence-based methods can also earn better than relationships between input and output parameters (Besalatpour et al. 2014; Mosavi et al. 2020c). Each of the three methods of ANN, neural fuzzy (CANFIS) and SOM showed a high performance in hydrological modeling processes. Previous studies on water quality using ANNs (Ghose et al. 2010; Pal and Chakrabarty 2020), CANFIS (Memarian et al. 2016; Allawi et al. 2018) and SOM (Han et al. 2011; Li et al. 2020) have shown their performance in hydrological modeling.
In water quality studies, it is necessary to use an appropriate index. In recent decades, several studies have used various indices for groundwater quality investigations (Giljanovic 1999). Numerous studies have been conducted to measure the groundwater quality index (GWQI). The GWQI was introduced by Brown et al. (1970) and later improved by the Scottish Development Department (1975), following the suggestion in Horton (1965) that the various water quality data could be aggregated into one overall index (House and Ellis 1987). Gholami et al. (2015a, b) utilized statistical and GIS techniques to classify groundwater quality in Mazanmdaran plain (north of Iran). Singh et al. (2011) applied a GIS-based multi-criteria analysis by assigning weights to different water quality parameters (Pirasteh et al. 2006; Farjad et al. 2012). They grouped water quality into six classes ranging from very mild to unfit for drinking. They found that in most of the study areas, water quality varies from moderate to good except in some areas where the groundwater quality is classified as ‘poor to unfit’.
In recent years, much progress in artificial intelligence techniques has been seen and applied in many cases, including environmental and water quality. The large-scale availability of high-quality and geospatial data and the advances of both hardware and software have contributed to the GIS modeling efficiency and creating smart maps. Processing the data and transforming them into computer vision and programming language incorporated with machine learning enabled the us to create applied maps (Li and Hsu 2020; Xie et al. 2020).
Three (SOM, ANN, CANFIS) have been used in several studies on modeling the quantity and quality of water. A review of previous studies showed that each of these methods has high performance in water quality modeling. We intend to evaluate their performance with similar data in order to compare their performance and determine the most efficient method, and finally to produce groundwater quality maps by automatically integrating the optimal method in a GIS. This study aims to test the use of three AI methods of and in combination with GIS to model groundwater quality. The results present the capabilities of a co-active neuro-fuzzy inference system (CANFIS), ANN, SOM, and GIS rapidly and accurately in simulate groundwater quality.
2 Materials and Method
2.1 Study Area
The study area is a plain, in the north of Iran (Mazandaran province), with an area of approximately 10,000 km2 located between 35° 55′ to 36° 45′ N latitude and 50° 30′ to 53° 50′ E longitude (Fig. 1). The southern coasts of the Caspian Sea include plains made of Quaternary sediments. The study plain is located between the Caspian Sea in the north and the Alborz highlands in the south. This plain has a very low slope and its elevation varies from −27 m on the coast to 100 m in the south of the plain. Various land uses can be found, including paddy lands, residential areas, water resources, gardens and limited forest lands. Its surface water resources include rivers and wetlands, and its groundwater resources include unconfined and confined aquifers. In recent decades, due to the favorable climatic conditions and proximity to the sea, the tourism industry in this area has grown significantly. Population growth, a limited number of industries, agricultural development (use of pesticides and fertilizers) without observing environmental standards have increased pollution and reduced the quality of water resources in the study plain. Therefore, it is important to take the necessary measures for the protection of water resources.
2.2 Determination of Groundwater Quality Index
In this study, eight water quality parameters including cations and anions (K+, Na+, Ca2+, Cl−, Mg2+, and So42−), pH, and total dissolved salt (TDS) were selected. These parameters were used for estimating the groundwater quality index. Because of the lack of microbial pollution measurements in the region, there is a limitation in defining the water quality index. Out of 200 drinking water wells in the study area, 85 wells were selected based on their existing qualitative secondary data (water quality experiments). The selected wells had adequate water quality data over a 6-year period from 2012 to 2017 compared to shorter periods for other wells. (ABFAR 2017). Estimating the groundwater quality index (GWQI) for 85 wells was done using quantitative secondary data from (i.e., 6-year data with four samples per year). The location of these drinking water wells in the study area is shown in Fig. 1. Before looking into the groundwater quality index, it is necessary to select a standard criterion to decide the parameters’ greatest values. National standards of Iran for potable water quality including the eight parameters are given in Table 1.
Equation (1) was applied to estimate the groundwater quality index (GWQI) based on the standard values given in Table 1:
where Ci is the concentration of the chemical parameter (mg/L), CSi is the Iranian drinking water standard for each chemical parameter and wi is the relative weight of each chemical parameter. Each parameter has a different weight in terms of its contribution to water quality. Then the corresponding weight values of the parameters were aggregated using some types of sum or mean (e.g., arithmetic, harmonic, geometric), often including individual weighing factors (Horton 1965; Melloul and Collin 1998). To estimate the final index, by aggregating all the normalized parameters, the weights of parameters in the final GWQI are defined based on their participation in the water quality determination. The weight of participation of each parameter in the final groundwater quality index depicts in Table 2.
The GWQI values were classified based on Saeedi et al. (2010) into three classes; high (GWQI > 0.15); low (GWQI < 0.04) and suitable (0.04 < GWQI < 0.15). GIS was applied for data collecting and processing. Different base maps were used in the GIS environment with a resolution 50 by 50 m: a digital elevation model (DEM), the transmissivity of aquifer formations, the water table depth (TAMAB 2017), residential and industrial areas using topographic maps of the region, and GWQI values using the secondary qualitative data (ABFAR 2017). Table 3 shows the factors affecting groundwater quality and the GWQI index in some drinking water wells.
2.3 Geospatial Artificial Intelligence (GeoAI) for Groundwater Quality Simulation
In the modeling process, ANN, SOM, and CANFIS methods were used to simulate groundwater quality. Moreover, the same data were used for the training and test processes for each method. First, all data were divided into two splits: training data (60% for modeling and 10% for cross-validation), and test data (30% of the data). Data were normalized and the training process was performed. Then, the testing or validation was carried out and further the results were compared with the same statistical indices (R-sqr, MAE, and MSE). A sensitivity analysis of the model inputs was performed to select the main parameters of groundwater quality. Studying the effect of inputs on the outputs and determine the importance of each input. The results showed that the transmissivity of aquifer formations, groundwater depth, and distance from industrial areas were significant factors in determining the groundwater quality and are the best inputs in groundwater quality simulation. In the following, the methodology of each of the three methods is described.
2.3.1 Artificial Neural Network (ANN)
An ANN consists of three layers, an input layer, a hidden layer, and an output layer. A network can have more than one hidden layer. In this study, multi-layer perception (MLP) was used to simulate groundwater quality. The MLP was generated by adding one or more hidden layers to a one-layer perception, and this proposed topology can solve complex problems (Tokar and Marcos 2000). The determination of the network’s best structure and the number of neurons is important in network planning. MLP is the most realistic neural network architecture for classification or regression problems based on the literature (Lim et al. 2000; McGarry et al. 2001; Cohen and Intrator 2002; Gholami et al. 2015a, b). A three-layer (input, hidden, and output) feed-forward neural network was used with different learning techniques to simulate the groundwater quality index (GWQI). The feed-forward neural network is the first and simplest of artificial neural networks devised. In the feed-forward network, the information moves in only one direction forward, from the input nodes, through the hidden nodes, and the output nodes.
The MLP network was trained and optimized by a trial- and- error method. The trail-and error approach is the most applied method to find the optimum structure and the learning technique (Isik et al. 2013). The goal of network training is to determine how the network can simulate the relation between inputs and output variables. The modeling process was executed by NeuroSolutions software. The numbers of hidden neurons was varied from 1 to 10. The Levenberg–Marquardt (LM) training technique was found to be the best learning technique in groundwater quality simulation. Network training is part of the main stages in modeling using ANN. In the training process, the weight coefficients were calculated in intermediate and output layers (Tokar and Marcos 2000; May et al. 2008; Castelletti et al. 2012). The ANN model structure's determination involves defining the number of layers, the number of nodes in each layer, and how they are linked (Isik et al. 2013). The trial–error method and sensitivity analysis selected the appropriate input parameters. Eight input patterns were examined, and their performance was compared (Eq. 2 to Eq. 9):
where \(GWQI\) is the groundwater quality index, \(T\) is the transmissivity of aquifer formations (m2/day), \(Gw\) is the groundwater depth (m), and \(L_{c}\) is the distance of a well from contaminant and residential centers (m). P and F are populations, and the number of households in the area 1 km2 and H is the elevation (m).
2.3.2 Co-Active Neuro-Fuzzy Inference System (CANFIS)
In this study, a neuro-fuzzy hybrid model was utilized as well. The neural–fuzzy network is a feed-forward network supervised by a neural network learning algorithm through back-propagation during network training. Here various input vectors and an output vector were utilized. In the design of the neuro-fuzzy hybrid model, a trial-and-error approach determined the structure of the optimized input. The difference between the rate of observed and simulated water quality index is the objective function and the goal is to equalize the simulated and observed values. The rate of instantaneous error will be equal to zero according to Eq. (10):
where Ji (n) is the network moment error and represents the total error for neuron i in the output layer. ti (n) represents the desired target output of the ith network in the nth iteration, and ai (n) represents the predicted output from the system and is the actual output at each iteration. The model's selected weight was modified by estimating the output error and application of the back-propagation process. Weight correction was done using a gradient descent method and according to Eq. (11):
where Wij(n + 1) is the synaptic weight to the ith neuron in the output layer from the jth neuron in the previous layer. wij(n) is the rate of the weight in nth iteration, n denotes the steps of the iteration. g is the extent of step size or the learning rate coefficient because of controls the speed (Loganathan and Girija 2013), di (n) is the local error of the modeling. It has been estimated from Ji(n) in nth iteration, xi (n) is the regressor vector, and di(n)xi(n) is the gradient vector of the performance surface at iteration (n) for the ith input node.
The modeling process included the separation of training data and test data, how to optimize, select the optimal network structure and test the network in all three methods were the same.
2.3.3 Self-Organizing Map (SOM)
SOMs, which are nonlinear time-series models, are among those supervised learning methods that are used for classification and regression (Vapnik 1998; Bahrami et al. 2016). SOMs find an optimized decision-making plan to separate two mentioned classes so that two classes have the highest margin of separation in binary classification. In the SOM algorithm (Kohonen map), the dimensions of the data are cut through the self-organizing neural network application. This technique aims to reduce the dimensions of the data to one or two dimensions. In this method, the only known parameter is the input neurons, while the weight and output neurons are the unknown parameters that must be found. Further, each input vector x € Rn can be evaluated with the mi: Rn in any metric, usually the Euclidean distance (the length of a line segment between the two points). In fact, the SOM defines a mapping from the input data space Rn onto a regular two-dimensional array of nodes.The winning node c is then calculated by (Eqs. 12 & 13):
where x is mapped onto c relative to the parameter values mi.
The updated formula is as follows (Eq. 14):
where t is the discrete-time coordinate and hc,i is a function defining the neighborhood. The initial values of the mi can be random (Kohonen et al. 1996).
2.4 Evaluation of ANN, SOM, and CANFIS Performances
In the training phase, the best inputs were determined by changing the pattern of data inputs and analyzing the ANN, CANFIS, and SOM sensitivity to input data. In the next step, each of the three models’ optimum structure was selected by utilizing a trial-and-error approach and testing their performance and error criteria. The models' performance was evaluated by using the mean squared error (MSE) and the coefficient of determination (R-sqr), defined as (Eqs. 15 & 16):
where Qi is the observed value, \(\mathop {Qi}\limits^{ \wedge }\) is the simulated value, and \(\overline{Qi}\) is the mean of the observed data and \(\tilde{Q}{\text{i}}\) is the mean of the simulated data and n is the number of data points.
Finally, the observed GWQI values and the simulated GWQI values were compared in the test stages of the three used methods. The results of the test stages and the error criteria values were used to evaluate the methods' performance and to select the best method for simulation of the groundwater quality.
2.5 Groundwater Quality Mapping
Based on the results, the fuzzy neural network has the best results in simulating groundwater quality, and needs three inputs (Gw, Ta, and Dindustry). Therefore, the CANFIS network was coupled with GIS for the generation of a groundwater quality map. Raster layers corresponding to the three mentioned factors were prepared and were combined at a resolution of 50 by 50 m. One can use a higher spatial resolution which will lead to more accurate input values, e.g. the distance from the pollutant centers. In the CANFIS network environment using the tested optimal network, GWQI was estimated for ten thousands pixels. GWQI values of 85 studied drinking water wells were overlapped on the simulated raster layer of groundwater quality. The results’ accuracy and precision were evaluated by comparing the simulated values of GWQI and the observed values of GWQI. The layer of groundwater quality was presented as a groundwater quality map after classification.
3 Results
3.1 Groundwater Quality Modeling
The groundwater quality index (GWQI) in 85 studied wells based on observational data ranged from 0.05 to 0.34 and the mean GWQI was estimated at 0.21. The mean groundwater depth varied from one meter in the coasts to 40 m in the south of the plain. The transmissivity of aquifer formations has a range between 75 m2/day in fine-grained formations to 3000 m2/day in light-textured structures. Distance from industries is an important and influential factor in groundwater quality, and ranged from 10 m to several kilometers in study wells. The correlation coefficient between the GWQI and the factors of distance from industries, the groundwater depth and transmissivity of aquifer formations was estimated to be −0.56, −0.68 and 0.55, respectively. Therefore these factors are the most important ones in the groundwater quality assessment in the study plain. There is an inverse relationship between groundwater quality and distance from industry, and the closer we get to industry, the larger the pollution (Gholami et al. 2015a, b). The transmissivity of aquifers is also directly related to the emission of pollutants (Shi et al. 2016). Moreover, the higher the groundwater depth, the greater the risk of contamination (Adamowski and Chan 2011; Gholami et al. 2015a, b). Other factors such as precipitation, population and site elevation did not have a strong significant relationship with the GWQI in the study plain.
The modeling process was performed using the same data and three methods to simulate the groundwater quality. Table 4 illustrates the results of the training stage (R-sqr and MSE). The cross-validation stage was performed for suitable training, and MSE values in the three applied methods were estimated from 0.008 to 0.02.
Table 5 shows the test stage results to evaluate the performance of three models in GWQI estimation. In the training stage of the ANN method, the mean square error (MSE) and the coefficient of determination (R-sqr) were 0.006 and 0.95, respectively. While, in the test stage, they were 0.005 and 0.73, respectively. The optimal ANN was a MLP network with two hidden neurons, the LM learning technique, and the tangent hyperbolic transfer function. In the training stage of the CANFIS method, the MSE and R-sqr were 0.0004 and 0.98, respectively. While, in the test stage, they were 0.004 and 0.89, respectively. In the training stage of the SOM method, it was revealed that the MSE and R-sqr are 0.0008 and 0.96, respectively. In the test stage, they were 0.008 and 0.8, respectively.
According to the results, the CANFIS method has the best results in groundwater quality modeling among the three applied methods. Figure 2 shows a comparison between the simulated values and the observed values of GWQI by using the three methods.
3.2 Groundwater Quality Mapping
Groundwater quality of the study plain using the predicted GWQI values by the tested CANFIS network and the GIS tool was simulated. After classification of the simulated GWQI, the groundwater quality map was presented in Fig. 3. Moreover, GWQI values of 85 drinking water wells were overlapped on the groundwater quality map to assess the accuracy and precision of the results by comparing the predicted values and the observed values. The classified GWQI map was used for evaluating the groundwater quality in the Mazandaran plain. According to the produced groundwater quality map, most of the study plain area has moderate groundwater quality and a limited part has very good groundwater quality or polluted groundwater quality.
4 Discussion
In the training stage, it was found that the best inputs include three inputs: groundwater depth, distance from the pollutant centers, and transmissivity of aquifer formations. The mentioned factors are the most important factor in groundwater quality in the study area (Gholami et al. 2015a, b; Sahour et al. 2020). According to the statistical analysis, these factors have the highest correlation with groundwater quality. The main issue in the modeling process is the selection and accuracy of inputs and output. The groundwater depth and transmissivity of aquifer formations were measured by Mazandaran regional water company (MRWC). The distance from the pollutant centers can be measured using satellite images and GIS. Therefore, the models inputs are available.
The results showed that the ANN has a high performance in groundwater quality simulation. The ANN can be applied to simulate hydrologic variables in an extensive area. The results also showed that the MLP network produces the best performance for the LM learning technique and the tangent hyperbolic transfer function. Using ANN for hydrologic simulation followed good results in the past, and, in most cases, there has been a high correlation between simulated and observed values (Crawford and Linsley 1966; Chang et al. 2002).
According to the results of Table 4 the results of the three models and associated GeoAI are acceptable and satisfactory in the training phase. The results showed that the SOM has an acceptable accuracy in estimating the GWQI index in the test stage.
The results of this study also indicated the acceptable ability of the CANFIS network in the simulation of groundwater quality. Previous research results showed high performance of the neuro-fuzzy network with the structure of the Takagi–Sugeno–Kang (TSK) model in estimation and hydrologic simulations as well (Jang et al. 1997; Talei et al. 2010; Heydari and Talaee 2011). The results of present study showed that the neuro-fuzzy network has a higher performance in estimating the groundwater quality index (R-sqr = 0.89 in the test stage). Such results are consistent with the results of other researchers (Heydari and Talaee 2011; Allawi et al. 2018).
Comparing the observed and the estimated values of GWQI showed the performance of the used methods, and the high performance of the procedure of incorporating the applied methods to GIS (Krishna et al. 2008). All three methods of ANN, SOM, and CANFIS had acceptable accuracy in simulating the groundwater quality. But, the CANFIS method has the higher performance than the SOM and ANN methods. The maximum error in the simulation was observed in the minimum and maximum values of the GWQI. Models are generally more efficient in medium values and the maximum error in the modeling process is observed in the maximum and minimum data (Sahour et al. 2020). In the discussion of water quality, accurate estimation of maximum and minimum water quality index values is important to identify contaminated zones or zones with very good drinking water quality. Comparison of the results of the test phase has shown that the neural fuzzy method has the least error in modeling the maximum and minimum values compared to the other two methods, which is a potential for identifying contaminated zones with high accuracy. Finally, the method used to classify water quality in general or to achieve water quality classes have performed very well.
It is possible to estimate groundwater quality indices using artificial intelligence in a short period for sites without qualitative measurements. Therefore, the coupling of artificial intelligence and GIS is a promising way forward to support groundwater quality assessments and management. The groundwater quality map was generated by the coupling of CANFIS and GIS tools. Fortunately, the quantitative data of network inputs are available in the study area. Therefore, one can use the present methodology to model groundwater quality on the study plan's surface.
The produced groundwater quality map can be a tool for groundwater quality classification, planning water resources management, and their quality protection. Further, in the discussion of land management and the establishment of different industries and landuses, the mentioned map can be used, for example, in places with high pollution, the establishment of new industries should be prevented. The artificial intelligence models used in the present study have their own advantages and disadvantages. These methods are more accurate than traditional methods such as regression method. It is also possible to define different management scenarios in the form of changing inputs for the model and evaluate their effect on the groundwater quality. For example, changes in industrial distances from a specific location or fluctuations in groundwater depth can be used for the tested model and their effect on water quality in the model can be evaluated. They also have the ability to provide multiple models with a variable number of inputs appropriate to the region (existing data) but their performance will be different. In addition to all the advantages mentioned, these models are black boxes and their use requires expertise in the field of artificial intelligence. Moreover, their combined use with a GIS is not as easy as the traditional methods such as regression methods. Considering all the above issues, their use is recommended for high performance and quick modelling in water resources studies.
5 Conclusion
The results show that all models have acceptable accuracy in predicting the groundwater quality, but the fuzzy neural network has the highest performance. This study’s findings show that the groundwater quality assessed using the CANFIS model was in good agreement with experimental data. Hence, it can be concluded that CANFIS has successfully assessed the groundwater quality. All of the applied methods can be used in hydrological or environmental modeling. Further, the coupled artificial intelligence methods and GIS can be used for zoning groundwater quality and as a tool for planning and managing water resources. Coupling artificial intelligence with GIS tools can provide practitioners with an easily interpretable water quality production map to manage these resources. The best groundwater modeling results were found by using the fuzzy neural network in Mazandaran Plain, but the other methods such as ANN or SOM may have better results in the other places. Therefore, it can be recommend to apply this proposed approach of combining three models in other geographical locations. It can be also concluded that GeoAI mapping performance depends on inputs and output data and the modeling process. Therefore, using different artificial intelligence machine learning algorithms for groundwater quality assessment in future studies was suggested. These models can be used to predict and map the groundwater quality in the other regions, but they need to be optimized and tested. However, It can be concluded that the result of this study has good potential to contribute to SDG-6 (clean water and sanitation), SDG-11 (sustainable city and communities), and SDG-13 (climate action).
Data Availability
All data and materials as well as software application or custom code support our published claims and comply with field standards.
Change history
06 January 2022
A Correction to this paper has been published: https://doi.org/10.1007/s11269-021-03049-1
References
Adamowski J, Chan HF (2011) A wavelet neural network conjunction model for groundwater level forecasting. J Hydrol 407(1–4):28–40. https://doi.org/10.1016/j.jhydrol.2011.06.013
Allawi MF, Jaafar O, Mohamad Hamzah F, Mohd NS, Deo RC, El-Shafie A (2018) Reservoir inflow forecasting with a modified coactive neuro-fuzzy inference system: a case study for a semi-arid region. Theor Appl Climatol 134(1–2):545–563
Aneetha AS, Bose S (2012) The combined approach for anomaly detection using neural networks and clustering techniques. Comput Sci Eng Int J 2(4):37–46
Bahrami S, Ardejani FD, Baafi E (2016) Application of artificial neural network coupled with genetic algorithm and simulated annealing to solve groundwater inflow problem to an advancing open-pit mine. J Hydrol 536:471–484
Besalatpour AA, Ayoubi S, Hajabbasi MA, Gharipour A, Yousefian Jazi A (2014) Feature selection using the parallel genetic algorithm for the prediction of the geometric mean diameter of soil aggregates by machine learning methods. Arid Land Res Manage 28(4):383–394
Brown RM, McClelland NI, Deininger RA, Tozer RG (1970) A water quality index: do we dare? Water Sewage Works 117:339–343
Castelletti A, Galelli S, Restelli M, Soncini-Sessa R (2012) Data-driven dynamic emulation modelling for the optimal management of environmental systems. Environ Model Softw 34:30–43
Chang FJ, Chang LC, Huang HL (2002) Real-time recurrent learning neural network for stream-flow forecasting. Hydrol Processes 16(13):2577–2588
Choubin B, Borji M, Sajedi Hosseini F, Mosavi AH, Dineva AA (2020a) Mass wasting susceptibility assessment of snow avalanches using machine learning models. Sci Rep 10(1):18363. https://doi.org/10.1038/s41598-020-75476-w
Choubin B, Sajedi Hosseini F, Fried Z, Mosavi AH (2020b) Application of bayesian regularized neural networks for groundwater level modeling, 2020, CANDO-EPE 2020—Proceedings, IEEE 3rd International Conference and Workshop in Obuda on Electrical and Power Engineering, 9337753:209–212
Cohen S, Intrator N (2002) Automatic model selection in a hybrid perceptron radial network. Inf Fusion 3(4):259–266
Crawford NH, Linsley RK (1966) Digital simulation in hydrology: Stanford watershed model IV, technical report 10-department of civil engineering. Stanford University, Stanford
Ehsani AH, Quiel F (2008) Geomorphometric feature analysis using morphometric parametrization and artificial neural networks. Geomorphology 99(1–4):1–12
Fang Z, Wang Y, Peng L, Hong H (2020) A comparative study of heterogeneous ensemble-learning techniques for landslide susceptibility mapping. Int J Geog Inf Sci 35(2):321–347
Farjad B, Helmi ZMS, Thamer AM, Pirasteh S (2012) Groundwater Intrinsic vulnerability and risk mapping. Water Manage 165(8):441–450
Gholami V, Darvari Z, Mohseni Saravi M (2015a) Artificial neural network technique for rainfall temporal distribu-tion simulation (case study: Kechik region). Caspian J Environ Sci (CJES) 13(1):53–60
Gholami V, Aghagoli H, Kalteh AM (2015b) Modeling sanitary boundaries of drinking water wells on the Caspian Sea southern coasts. Iran Environ Earth Sci 74(4):2981–2990
Gholami V, Khaleghi MR, Taghvaei E (2020) Groundwater quality modeling using self-organizing map (SOM) and geographic information system (GIS) on the Caspian southern coasts. J Mountain Sci 17:1724–1734. https://doi.org/10.1007/s11629-019-5483-y
Gholami V, Sahour H, Amri MAH (2021) Soil erosion modeling using erosion pins and artificial neural networks. Catena 196:104902. https://doi.org/10.1016/j.catena.2020.104902
Ghose DK, Panda SS, Swain PC (2010) Prediction of water table depth in western region, Orissa using BPNN and RBF neural networks. J Hydrol 394(3–4):296–304
Giljanovic NS (1999) Water quality evaluation by index in Dalmata. Water Res 33(16):3423–3440
Haddad OB, Tabari MMR, Fallah-Mehdipour E, Mariño MA (2013) Groundwater model calibration by meta-heuristic algorithms. Water Resour Manag 27:2515–2529
Haider H, Singh P, Ali W, Tesfamariam S, Sadiq R (2015) Sustainability evaluation of surface water quality management options in developing countries: multicriteria analysis using fuzzy UTASTAR method. Water Resour Manag 29:2987–3013. https://doi.org/10.1007/s11269-015-0982-2
Han HG, Chen QL, Qiao JF (2011) An efficient self-organizing RBF neural network for water quality prediction. Neural Netw 24(7):717–725
Heydari M, Talaee PH (2011) Prediction of flow through rockfill dams using a neuro-fuzzy computing technique. J Math Comput Sci 2(3):515–528
Horton RK (1965) An index number system for rating water quality. J Water Pollut Control Fed 37(3):300–306
House MA, Ellis JB (1987) The development of water quality indices for operational management. Water Sci Technol 19(9):145–154
Hsu KL, Gupta HV, Sorooshian S (1995) Artificial neural network modeling of the rainfall-runoff process. Water Resour Res 31(10):2517–2530
Huang PC, Hsu KL, Lee KT (2021) Improvement of two-dimensional flow-depth prediction based on neural network models by preprocessing hydrological and geomorphological data. Water Resour Manage 35:1079–1100
Isik S, Kalin L, Schoonover J, Srivastava P, Lockaby BG (2013) Modeling effects of changing land use/cover on daily streamflow: an artificial neural network and curve number based hybrid approach. J Hydrol 485(2):103–112
Iwashita F, Friedel MJ, Roberto C, Filho S (2011) Using self-organizing maps to analyze high-dimensional geochemistry data across Paraná, Brazil. Conference: 15th Simpósio Brasileiro de Sensoriamento Remoto, April 2011
Jang JSR, Sun CT, Mizutani E (1997) Neuro-fuzzy and soft computing: a computational approach to learning and machine intelligence. Prentice-Hall, Hoboken
Jassar KK, Dhindsa KS (2015) Comparative study and performance analysis of clustering algorithms. Int J Comput Appl 975:8887
Klobucar D, Subasic M (2012) Using self-organizing maps in the visualization and analysis of forest inventory. iForest Biogeosc For 5(5):216–223
Kohonen T (1982) Self-organized formation of topologically correct feature maps. Biol Cybern 43(1):59–69
Kohonen T (1984) Self-organization and associative memory. Springer, Berlin
Kohonen T (2005) “Intro to SOM”. SOM Toolbox. Retrieved July 18 2006
Kohonen T, Hynninen J, Kangas J, Laaksonen J (1996) SOMPAK: The self-organizing map program package. Report A31. Helsinki University of Technology, Laboratory of Computer and Information Science, Espoo, Finland. http://www.cis.hut.fi/research/som_lvq_paks.html
Krishna B, Satyajit Rao YR, Vijaya T (2008) Modeling groundwater levels in an urban coastal aquifer using artificial neural networks. Hydrol Processes 22(8):1180–1188
Laaksonen J, Honkela T (eds) (2011) Advances in self-organizing maps, WSOM 2011. Springer, Berlin
Li W, Hsu CY (2020) Automated terrain feature identification from remote sensing imagery: a deep learning approach. Int J Geogr Inf Sci 34(4):637–660
Li J, Shi Z, Liu F (2020) Evaluating spatiotemporal variations of groundwater quality in northeast Beijing by self-organizing map. Water 12(5):1382. https://doi.org/10.3390/w12051382
Lim TS, Loh WY, Shih YS (2000) A comparison of prediction accuracy, complexity, and training time of thirty-three old and new classification algorithm. Mach Learn 40(3):203–238
Lin GF, Wang TC, Chen LH (2016) A forecasting approach combining self-organizing map with support vector regression for reservoir inflow during Typhoon periods. Adv Meteorol. https://doi.org/10.1155/2016/7575126
Loganathan C, Girija KV (2013) Hybrid learning for adaptive neuro fuzzy inference system. Int J Eng Sci 2(11):6–13
Maliqi E, Jusufi K, Singh SK (2020) Assessment and spatial mapping of groundwater quality parameters using metal pollution indices, graphical methods and geoinformatics. Anal Chem Lett 10(2):152–180
May RJ, Dandy GC, Maier HR, Nixon JB (2008) Application of partial mutual information variable selection to ANN forecasting of water quality in water distribution systems. Environ Modell Softw 23(10–11):1289–1299
Mazandaran Rural Water and Sewer Company (ABFAR) (2017) Department of Researches and exploitation. The qualitative experiments for drinking water wells
McGarry KJ, Werner S, MacIntyre J (2001) Knowledge extraction from radial basis function networks and multi-layer perceptrons. Int J Comput Intell Appl 1(3):369–382
Melloul AJ, Collins M (1998) A proposed index for aquifer water-quality assessment: the case of Israel’s Sharon region. J Environ Manage 54(2):131–142
Memarian H, Pourreza Bilondi M, Rezaei M (2016) Drought prediction using co-active neuro-fuzzy inference system, validation, and uncertainty analysis (case study: Birjand, Iran). Theor Appl Climatol 125(3–4):541–554
Millar EE, Hazell EC, Melles SJ (2019) The ‘cottage effect’ in citizen science? spatial bias in aquatic monitoring programs. Int J Geogrl Inf Sci 33(8):1612–1632
Mosaffa M, Nazif S, Amirhosseini YK, Balderer W, Meiman MH (2021) An investigation of the source of salinity in groundwater using stable isotope tracers and GIS: a case study of the Urmia Lake basin, Iran. Groundwater Sustainable Dev 12:100513
Mosavi AH, Sajedi Hosseini F, Choubin B, Abdolshahnejad M, Gharechaee HR, Lahijanzadeh A, Dineva AA (2020a) Susceptibility prediction of groundwater hardness using ensemble machine learning models. Water (Switzerland) 12(10):2770. https://doi.org/10.3390/w12102770
Mosavi AH, Sajedi-Hosseini F, Choubin B, Taromideh F, Rahi GR, Dineva AA (2020b) Susceptibility mapping of soil water erosion using machine learning models. Water (Switzerland) 12(7):1995. https://doi.org/10.3390/w12071995
Mosavi AH, Sajedi Hosseini F, Choubin B, Goodarzi M, Dineva AA (2020c) Groundwater salinity susceptibility mapping using classifier ensemble and bayesian machine learning models. IEEE Access 8(9162111):145564–145576. https://doi.org/10.1109/access.2020.3014908
Muller SJ, Van Niekerk A (2016) An evaluation of supervised classifiers for indirectly detecting salt-affected areas at irrigation scheme level. Int J Appl Earth Obs Geoinf 49:138–150
Nikoo MR, Mahjouri N (2013) Water quality zoning using probabilistic support vector machines and self-organizing maps. Water Resour Manage 27:2577–2594
Pal J, Chakrabarty D (2020) Assessment of artificial neural network models based on the simulation of groundwater contaminant transport. Hydrogeol J 28(1–2):1–17. https://doi.org/10.1007/s10040-020-02180-4
Pirasteh S, Tripathi NK, Ayazi MH (2006) Localizing groundwater potential zones in parts of karst Pabdeh Anticline, Zagros Mountain, south-west Iran using geospatial techniques. Int J Geoinf 2(2):35–42
Pramanik N, Panda RK (2010) Application of neural network and adaptive neuro-fuzzy inference systems for river flow prediction. Hydrol Sci J 55(8):1455–1456
Saeedi M, Abessi O, Sharifi F, Meraji H (2010) Development of groundwater quality index. J Environ Monit Assess 163:327–335
Sahour H, Gholami V, Vazifedan M (2020) A comparative analysis of statistical and machine learning techniques for mapping the spatial distribution of groundwater salinity in a coastal aquifer. J Hydrol 591:125321. https://doi.org/10.1016/j.jhydrol.2020.125321
Scottish Development Department (1975) Towards cleaner water. Edinburgh: HMSO, Report of a River Pollution Survey of Scotland
Shi X, Jiang S, Xu H, Jiang F, He Z, Wu J (2016) The effects of artificial recharge of groundwater on controlling land subsidence and its influence on groundwater quality and aquifer energy storage in Shanghai. China Environ Earth Sci 75:195. https://doi.org/10.1007/s12665-015-5019-x
Singh P (2014) An efficient concept-based mining model for analysis partitioning clustering. Int J Recent Technol Eng 2(6):1–3
Singh K, Shashtri S, Mukherjee M, Kumari R, Avatar R, Singh A, Prakash Singh R (2011) Application of GWQI to assess the effect of land-use change on groundwater quality in lower Shiwaliks of Punjab: remote sensing and GIS-based approach. J Water Resour Manag 25(7):1881–1898
Talei A, Chua LHC, Quek C (2010) A novel application of a neuro-fuzzy computational technique in event-based rainfall-runoff modeling. Expert Syst Appl 37(12):7456–7468
TAMAB (Water Resources Research Organization of Iran) (2017) Mazandaran regional water company, hydrogeology studies of Mazandaran Plain. Atlas report
Tokar AS, Marcus M (2000) Precipitation-runoff modeling using artificial neural networks and conceptual models. J Hydrol Engine ASCE 45:156–161
Tongal H, Booij MJ (2018) Simulation and forecasting of streamflows using machine learning models coupled with base flow separation. J Hydrol 564:266–282
Toor AK, Singh A (2013) Analysis of clustering algorithms based on number of clusters, error rate, computation time and map topology on large data set. Int J Emerging Trends Technol Comput Sci 2(6):94–98
Ullah N, Choudhury P (2013) Flood flow modeling in a river system using adaptive neuro-fuzzy inference system. Environ Manag Sustainable Dev 2(2):54–68. https://doi.org/10.5296/emsd.v2i2.3738
Vapnik V (1998) Statistical learning theory. Wiley, New York
Varvani J, Khaleghi MR (2019) A performance evaluation of neuro-fuzzy and regression methods in estimation of sediment load of selective rivers. Acta Geophys 67(1):205–214
Wang YM, Chang JX, Huang Q (2010) Simulation with RBF neural network model for reservoir operation rules. Water Resour Manag 24:2597–2610
Wu S, Wang Z, Du Z, Huang B, Zhang F, Liu R (2020) Geographically and temporally neural network weighted regression for modeling spatiotemporal non-stationary relationships. Int J Geog Inf Sci 35(3):582–608
Xie Y, Cai J, Bhojwani R, Shekhar S, Knight J (2020) A locally-constrained yolo framework for detecting small and densely-distributed building footprints. Int J Geog Inf Sci 34(4):777–801
Acknowledgements
The authors would like to thank ABFAR (Mazandaran Rural Water and Sewer Company) to provide the groundwater quality secondary data and help us with data preprocessing.
Funding
The authors did not receive support from any organization for the submitted work.
Author information
Authors and Affiliations
Contributions
All authors contributed to the study’s conception and design. Material preparation, data collection, and analysis were performed by [VG], [MRK], and [SP]. The first draft of the manuscript was written by [MRK] and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethical Approval
Not applicable.
Consent to Participate
All authors consent to participate in this manuscript.
Consent to Publish
All authors consent to publish this manuscript.
Conflict of Interest
There is no conflict of interest among authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The original online version of this article was revised: In this article, the city of the third affiliation was mistakenly given.
Rights and permissions
About this article
Cite this article
Gholami, V., Khaleghi, M.R., Pirasteh, S. et al. Comparison of Self-Organizing Map, Artificial Neural Network, and Co-Active Neuro-Fuzzy Inference System Methods in Simulating Groundwater Quality: Geospatial Artificial Intelligence. Water Resour Manage 36, 451–469 (2022). https://doi.org/10.1007/s11269-021-02969-2
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11269-021-02969-2