Introduction

Groundwater is the major source of freshwater supply worldwide. It is currently used to meet nearly half of the world's drinking water needs, including the requirements of about two billion people (WWAP 2009; Gleeson et al. 2010). In addition, groundwater provides around 43% of the water consumed in irrigation (Siebert et al. 2010). As a result of this heavy use, the major groundwater systems of the world are no longer in dynamic equilibrium and instead show significantly declining groundwater level trends (WWAP 2012). It has been estimated that about 700–800 km³ of groundwater was depleted from aquifers in the USA during the twentieth century (Konikow and Kendy 2005). Likewise, a World Bank report (World Bank 2010) states that India is the largest consumer of groundwater in the world, with an estimated annual groundwater use of 230 km³. The fast depletion of groundwater resources, as reflected in declining groundwater levels, has caused groundwater quality to deteriorate in many parts of the world. Groundwater quality may be further degraded by the pressure that climate change and variability place on hydrologic and hydrogeologic systems (e.g., Gurdak et al. 2012; Bondu et al. 2016).

The degradation of groundwater quality has also led to the reduction of the exploitable quantities. There are two major sources of the groundwater quality degradation, i.e., natural (geogenic) processes and anthropogenic activities. For example, in agricultural areas, excessive use of fertilizers has resulted in nitrate contaminations in groundwater well above the water quality guidelines (e.g., Machiwal et al. 2011; Paradis et al. 2016). In coastal areas, the overexploitation of aquifers via numerous wells and boreholes has established a negative water balance triggering seawater intrusion and salinization of groundwater (Ferguson and Gleeson 2012). Anthropogenic activities that pollute the natural environment with potential toxic elements (such as hexavalent chromium, arsenic, and antimony) include paint manufacturing, tannery industry, mining activities, phosphate fertilizer manufacturing and the combustion of coal and fly ash deposits (Molina et al. 2009; Jacobs and Testa 2004). Natural processes also influence groundwater quality at both local and regional scales depending on the geological, hydrochemical and hydrogeological regimes. The release of arsenic (As) in groundwater is mainly controlled by the oxidation of organic carbon coupled with the reductive dissolution of As-bearing iron oxides (Postma et al. 2012), while As concentrations exceeding the WHO provisional guideline value of 10 µg/L (WHO 2017) were measured in groundwater from fractured bedrock aquifer of the Canadian Shield, primarily derived from the weathering of As-bearing sulfides in the oxic/suboxic zone of the aquifer (Bondu et al. 2017). Additionally, geothermal fluids can also enhance As concentrations in shallow aquifers through deep fractures (Pique et al. 2010; Iskandar et al. 2012). 
Iskandar and Koike (2011) identified the deep-seated hydrothermal system as the major source of As contamination along the fault zone in North Sulawesi, Sulawesi Island, Indonesia, by applying geostatistical and numerical simulation models. The geogenic origin of hexavalent chromium is attributed to ophiolitic rocks and specifically their serpentinized derivatives (Nriagu and Nieboer 1988), while the concentrations of hexavalent chromium are influenced by the prevailing hydrogeological conditions (Kazakis et al. 2015). Sustainable management of this vital but invisible resource therefore requires that its quality be evaluated on a sound scientific basis at scales ranging from local to regional. Throughout the world, increasing demands for safe drinking water and for agricultural and industrial water, together with the need to maintain healthy ecosystems, are leading stakeholders and scientists to develop strategies and methods, ranging from simple to complex, for rational groundwater resource management and protection (e.g., Council of Canadian Academies 2009).

Several conventional tools and techniques, ranging from graphical to statistical, have been used by various researchers for interpreting groundwater quality (Freeze and Cherry 1979; Karanth 1987; Sara and Gibbons 1991; Güler et al. 2002; Machiwal and Jha 2010). In recent times, researchers have recognized the need to apply modern techniques such as time series modeling (e.g., trend identification), multivariate statistics, and geostatistical modeling, among others, to better interpret and more precisely characterize groundwater quality for the efficient management and protection of groundwater resources (e.g., Güler et al. 2002; Jha et al. 2007; Cloutier et al. 2008; Steube et al. 2009; Machiwal and Jha 2010, 2015). These modern techniques also help to distinguish between the anthropogenic and natural processes and/or factors influencing groundwater quality. Salient popular methods used for groundwater quality evaluation and protection are classified into distinct groups and subgroups as shown in Fig. 1.

Fig. 1 Classification of salient methods for the groundwater quality evaluation and protection

With the advent of geographic information system (GIS) technology, especially after the 1990s, the visualization, interpretation and presentation of groundwater quality evaluations over large spatial scales have improved drastically. A GIS is capable of capturing, storing, analyzing, manipulating, retrieving and displaying a large volume of spatial data for swift organization, quantification and interpretation for decision-making in areas including engineering and environmental sciences (e.g., Stafford 1991; Goodchild et al. 1993; Burrough and McDonnell 1998; Lo and Yeung 2003). It has proved to be a powerful tool for analyzing and mapping hydrologic/hydrogeologic data over spatial and temporal scales, providing information about spatio-temporal variability that ultimately helps in decision-making (Burrough and McDonnell 1998; Gurnell and Montgomery 2000; Chang 2002; Chen et al. 2004; Güler and Thyne 2004a; Machiwal and Jha 2014). GIS applications are advantageous in groundwater quality evaluation studies, particularly for mapping spatial variations of water quality, for subsurface flow and pollution modeling, and for groundwater-quality monitoring network design (Jha et al. 2007). In addition, GIS-based water quality mapping is imperative for pollution-hazard modeling, assessment and protection planning, and detection of environmental changes (Goodchild et al. 1993; Skidmore et al. 1997; Chen et al. 2004; Jha et al. 2007).

Recently, researchers have successfully integrated advanced statistical tools with GIS to illustrate the spatial distribution of the chemical composition of groundwater over large areas, sometimes up to the regional scale. In addition, many studies have used GIS-integrated statistical techniques and approaches to determine the hydrochemical regime and to establish strategies for managing groundwater resources under the combined influence of natural processes and anthropogenic practices on groundwater quality. The purpose of this paper is to explore the literature in order to evaluate the current standing of the GIS-integrated statistical analyses used for groundwater quality evaluation, and to identify future research directions. To the authors’ knowledge, this kind of review does not currently exist in the literature. The paper first reviews past studies that applied time series modeling and multivariate statistical/geostatistical techniques for groundwater evaluation and artificial intelligence techniques for groundwater vulnerability assessment. It then outlines distinct GIS-based water quality indices developed and applied for groundwater quality assessment worldwide. Finally, it highlights the limitations and research gaps of past studies and emphasizes the future research needed for a better evaluation of groundwater quality within the framework of GIS-integrated statistical techniques.

Time series modeling of groundwater quality variables

Components, steps and assumptions of time series analysis

A “time series” may be defined as “a sequence of values collected over time on a particular variable” (Haan 1977). Analogous to time-variable data series, there exist “spatial data series” in hydrogeology, in which the data are location-dependent rather than time-dependent. Most time series analysis techniques are equally applicable to spatial data series (Shahin et al. 1993), and hence, a spatial data series is sometimes referred to as a time series. In general, a hydrologic or hydrogeologic time series is composed of deterministic and stochastic components (Haan 1977; Shahin et al. 1993). The deterministic component presents a systematic pattern in the time series and can be classified as a trend, a shift or jump, a periodic component, or a combination of these (Haan 1977). Time series analysis aims at the detection and quantitative description of each of the generating processes underlying a given sequence of observations (Shahin et al. 1993). There are four major steps involved in time series modeling (McCuen 2003): (i) detection, (ii) analysis, (iii) synthesis, and (iv) verification. In the detection step, systematic components of the time series such as trends and periodicity are identified. In the analysis step, the systematic components are analyzed to identify their characteristics, including their magnitude, form and the duration over which their effects exist. In the synthesis step, information from the analysis step is accumulated to develop a time series model and to evaluate the goodness-of-fit of the developed model. Finally, in the verification step, the developed time series model is evaluated using independent sets of data. For further details of time series analysis, the reader is referred to specialized books such as Yevjevich (1972), Salas et al. (1980), Bras and Rodriguez-Iturbe (1985), Cryer (1986), Clarke (1998), and Machiwal and Jha (2012).
Most statistical analyses of hydrologic time series rest on fundamental assumptions about the characteristics of the series: that it is homogeneous, normally distributed, stationary, free from trends and shifts, and non-periodic with no persistence (Adeloye and Montaseri 2002).

About five decades ago, hydrologic applications of time series modeling were confined to surface water problems, especially the analysis of hydrologic extremes such as floods and droughts (McCuen 2003). However, as the domain of statistical hydrology has expanded over the past few decades, time series analyses now encompass the problems of surface water as well as groundwater systems (Shahin et al. 1993; Machiwal and Jha 2006). With such a broad domain, time series analysis has emerged as a powerful tool for analyzing surface and subsurface hydrologic time series data. An extensive review of the applications of time series analysis in surface water hydrology, climatology and groundwater hydrology has been presented by Machiwal and Jha (2006). That review revealed that, although several studies deal with the application of time series analysis in surface water hydrology, its application in groundwater hydrology is highly limited. Salient studies analyzing the characteristics of hydrogeochemistry time series are reviewed in the following sub-sections.

Normality of groundwater quality variables

The assumption of presence of normality in time series of groundwater quality variables is very crucial in obtaining reliable results of the parametric statistical tests (USEPA 1996). In the past studies dealing with geochemistry data, normality tests are applied to the point data of the groundwater quality variables, and not to the spatial data/maps on GIS platform. For example, Mouser et al. (2005) tested pH, electrical conductivity (EC) and calcium concentration data from the Molly Bog peatland located between Stowe and Morristown in Vermont, USA, for presence of normality using normal probability plot and Shapiro–Wilk test. Results suggested that calcium concentration did not follow a normal distribution, which was then subjected to logarithmic-transformation. Chou (2006) suggested use of normal probability plot for testing normality in environmental data, and transforming the non-normal data by applying logarithmic or Box–Cox transformations (Box and Cox 1964). Aguilar et al. (2007) applied three tests (i.e., Shapiro–Wilk, Shapiro–Francia, and D’Agostino tests) to identify presence of normality in nitrate concentrations of 24 sites located in Hesbaye chalk aquifer of the Geer basin of Belgium. The Shapiro–Wilk test was employed for a sample size < 50, and the Shapiro–Francia test for a sample size ≥ 50. In another study, Kolmogorov–Smirnov test was used to examine normality of chemical (i.e., ammonium, nitrate, nitrite, soluble reactive phosphorus and total phosphorus) and microbiological (i.e., bacterial abundance, cell biomass and bacterial biomass) variables of groundwater samples collected in Doñana aquifer system of southwest Spain (Ayuso et al. 2009). The non-normal variables identified by the Kolmogorov–Smirnov test were then transformed to make them normal. 
Nas and Berktay (2010) used frequency plot and quantile–quantile plot to check normality of urban groundwater quality data (i.e., pH, EC, chloride, sulfate, hardness and nitrate) for 177 sites in Konya City (Turkey). Results revealed non-normality of EC, chloride, sulfate, hardness and nitrate, which were then normalized by log-transformation. Hosseini and Mahjouri (2014) employed Anderson–Darling test to assess normality of nitrate concentrations in Karaj aquifer of Iran. The non-normal data were transformed by using logarithmic transformations. Jovein and Hosseini (2017) examined normality of EC of the groundwater samples collected from Mahvelat Plain located in the northeastern part of Iran by normal quantile–quantile and frequency plots, and transformed the data by applying Box–Cox transformation. Recently, Leite et al. (2018) evaluated multivariate normality using Shapiro–Wilk generalized test of 14 water quality parameters for 12 sites comprising three distinct micro-regions of Santa Catarina State (four sites per region) in the municipality of Ponte Alta do Norte and São Cristóvão do Sul, Brunópolis, and Curitibanos located in Marombas River basin of southern Brazil.

In addition to statistical tests, skewness and kurtosis values were also computed to test normality of trace elements (i.e., arsenic, lead, cadmium, and aluminum) present in groundwater of Dhemaji district of Assam, India (Buragohain et al. 2010). All trace elements followed non-uniform distribution in the area. Similarly, skewness and kurtosis values besides Kolmogorov–Smirnov test were used to evaluate normality of groundwater quality parameters in Amol-Babol Plain (Narany et al. 2014) and Torbat-Zaveh Plain, Khorasan Razavi (Nematollahi et al. 2016) of Iran, Modena Plain of central Italy (Barca and Passarella 2008), Antonio-El Triunfo mining district, Baja California Sur of Mexico (Wurl et al. 2014) and Pingtung Plain of Taiwan (Jang et al. 2016). The normality of trace elements present in groundwater of Greece was evaluated by computing skewness and kurtosis values, and Box–Cox transformations were used to normalize data (Dokou et al. 2015). Using Shapiro–Wilk test, Noshadi and Ghafourian (2016) checked normality of groundwater quality parameters (i.e., calcium, chloride, bicarbonate, magnesium, sodium, nitrite, nitrate, pH, sulfate, total dissolved solids (TDS), hardness, EC, and sodium adsorption ratio) at 298 sites in Fars province of Iran. Results indicated presence of normality in all the parameters except magnesium that was later on considered normal looking at value of skewness (< 2). In addition, Gan et al. (2018) log-transformed chloride, sulfate and arsenic with substantial skewness and kurtosis values to improve the normality of distribution in groundwater of Jianghan Plain, located in central Yangtze River Basin of central China.

It is clear from the above discussion that a variety of techniques have been used in the literature to examine normality in time series of groundwater quality variables. Normality is mainly evaluated by graphical methods (histograms, box–whisker plots, normal probability plots, quantile–quantile plots, etc.) and statistical tests (skewness and kurtosis, Chi-square (χ²) test, Kolmogorov–Smirnov test, Lilliefors test, Anderson–Darling test, Cramér–von Mises test, Shapiro–Wilk test, probability plot correlation coefficient, Jarque–Bera test, D’Agostino–Pearson omnibus test, etc.). Furthermore, normality is tested for point data of individual sites without GIS integration. In fact, a methodology for examining the normality of spatially distributed values of the point estimates is currently lacking, which prevents researchers from testing normality, and from presenting normality test results, in a spatially distributed manner.
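
Two of the checks listed above, skewness/kurtosis and the Jarque–Bera test, are simple enough to sketch without any statistical library. The example below is a minimal illustration, not a procedure taken from any of the cited studies: the function names and the nitrate values are hypothetical, population (biased) moments are assumed, and the p-value relies on the asymptotic chi-square distribution with two degrees of freedom, which is only indicative for small samples.

```python
import math

def skewness_kurtosis(values):
    """Return (skewness, excess kurtosis) from population (biased) moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

def jarque_bera(values):
    """Return (JB statistic, asymptotic p-value)."""
    n = len(values)
    skew, excess_kurt = skewness_kurtosis(values)
    jb = n / 6.0 * (skew ** 2 + excess_kurt ** 2 / 4.0)
    # For a chi-square variable with 2 degrees of freedom, P(X > x) = exp(-x / 2).
    return jb, math.exp(-jb / 2.0)

# Hypothetical nitrate concentrations (mg/L) at 12 sites, for illustration only.
nitrate = [0.8, 1.2, 1.9, 2.5, 3.6, 5.1, 7.4, 10.8, 15.5, 22.9, 33.0, 48.2]
log_nitrate = [math.log(v) for v in nitrate]

raw_skew, raw_kurt = skewness_kurtosis(nitrate)
log_skew, log_kurt = skewness_kurtosis(log_nitrate)
jb_raw, p_raw = jarque_bera(nitrate)
jb_log, p_log = jarque_bera(log_nitrate)
```

Comparing `raw_skew` with `log_skew` shows how a logarithmic transformation can pull a right-skewed variable towards symmetry, which is the rationale behind the transformations reported in the studies above.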

Trends in groundwater quality variables

Identifying trends and understanding their underlying mechanisms can help make appropriate decisions for groundwater quality management (McBride 2005). Loftis (1996) presented the first review of trends in groundwater quality, discussing case studies from different parts of the world at scales ranging from regional to local. Based on that review, defining the exact meaning of “trend” was emphasized as a critical step for groundwater quality studies in both temporal and spatial contexts. In the literature, studies dealing with trend identification in groundwater quality are far fewer than those exploring trends in surface water quality (Taylor and Loftis 1989; Loftis 1996). Over the last decade, the application of statistical trend tests in groundwater quality studies has received increasing attention (Visser et al. 2009; Kaown et al. 2012; Machiwal and Jha 2015; Yazdanpanah 2016; Koh et al. 2017).

Since the inception of the GIS technology in groundwater studies during 1990s, in studies involving trend identification of groundwater quality variables, researchers adopted the GIS mainly to present point-wise results of trend tests geographically for depicting spatial distribution of presence/absence of the trends (e.g., Mendizabal et al. 2012). Review of the literature revealed that most of the trend detection studies have dealt with individual parameters of the groundwater quality, such as sulfate (Malapati et al. 2011), chloride (Scanlon et al. 2010), hardness (Hudak 2001), etc. Few researchers explored trends in multiple groundwater quality parameters (e.g., Machiwal and Jha 2015; Masoud et al. 2016). Machiwal and Jha (2015) detected trends in 15 groundwater quality parameters of 53 sites located in Udaipur district of Rajasthan, India, by applying three tests, i.e., Kendall’s rank correlation test, Spearman rank order correlation test and Mann–Kendall test. The presence of serial correlation in all groundwater quality parameters was also tested before applying the Mann–Kendall test as the outcome of this test gets affected under the presence of serial correlation (Yue et al. 2002). Similarly, trends in multiple groundwater quality parameters (23 quality variables) were assessed for 20 sites located in Tanta district of Egypt by applying Mann–Kendall test (Masoud et al. 2016). Results indicated that the statistically significant trends at 5% significance level were remarkable for the total hardness, total alkalinity, TDS, iron, manganese, nitrite, ammonium, phosphate and silica. However, the major focus of the past studies has been on identifying trends in nitrate (Hudak 2000a, b; Scanlon et al. 2008; Bronson et al. 2009; Enwright and Hudak 2009; Chaudhuri et al. 2012). Since 2000, researchers have been investigating how best to regionalize the nitrate concentration trends in the groundwater.

Statistical trends are generally detected by two approaches: parametric and nonparametric (Machiwal and Jha 2015). The most widely used parametric method is regression test, which is more powerful than the commonly employed nonparametric Mann–Kendall test, but the former approach requires the data be independent and normally distributed (Gilbert 1987; Bethea and Rhinehart 1991), and the latter approach is free from such normality assumption. Helsel and Frans (2006) developed a regional-Kendall method based on the principle of the seasonal-Kendall test to determine regional trends in the groundwater quality. The performance of the regional-Kendall test was found satisfactory by researchers such as Frans (2008), Sprague and Lorenz (2009), and Kaown et al. (2012). Recently, Lopez et al. (2015) developed a methodology for application of the regional-Kendall test in GIS platform by using geostatistics. In addition, Yazdanpanah (2016) made an effort to integrate results of linear trend analysis with GIS using geostatistical modeling of slope values of the linear trend model.
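
To make the nonparametric approach concrete, the following is a minimal sketch of the Mann–Kendall test using only the Python standard library. It assumes a serially independent series and uses the normal approximation with a tie-corrected variance; the function name and the nitrate values are hypothetical and are not taken from any cited study.

```python
import math

def mann_kendall(series):
    """Return (S, Z, two-sided p-value) for the Mann-Kendall trend test.

    Assumes no serial correlation; ties are handled in the variance of S.
    """
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of the pairwise difference
    # Count tied groups for the variance correction.
    tie_counts = {}
    for value in series:
        tie_counts[value] = tie_counts.get(value, 0) + 1
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in tie_counts.values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    # Continuity-corrected standard normal statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided
    return s, z, p_value

# Hypothetical annual mean nitrate concentrations (mg/L) at one well.
nitrate = [12.1, 12.8, 13.0, 13.9, 14.2, 14.1, 15.3, 15.9, 16.4, 17.0]
s_stat, z_stat, p_value = mann_kendall(nitrate)
trend_detected = p_value < 0.05  # 5% significance level
```

Because the statistic depends only on the signs of pairwise differences, no normality assumption is needed. The independence assumption remains, however, which is why serial correlation is usually checked before the test is applied.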

As with normality testing of point-wise groundwater quality variables, the literature shows that trends have been identified in a variety of groundwater quality parameters at individual sites in different parts of the world. However, trend detection studies using spatially distributed GIS maps of groundwater quality variables are not found in the literature. The major reason why spatial maps have not been considered for trend assessment is the lack of a methodology for coupling statistical trend tests with GIS to identify temporal trends directly within the GIS framework. In contrast to normality testing, a large number of studies used GIS to present the results of trend tests for groundwater quality time series over space. Recently, Kumar et al. (2017) developed a standard methodology to identify spatial trends using spatial raster datasets in a GIS framework by coupling three statistical tests (i.e., Kendall rank correlation test, Spearman rank order correlation test, and Mann–Kendall test) with GIS. However, that methodology was demonstrated through a case study identifying trends in the rainfall of Gujarat state, India, using satellite datasets. There is a need to employ such a methodology for trend identification in time series of groundwater quality variables.
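
The general idea of coupling a trend test with raster data can be illustrated by applying the test cell by cell to a time-ordered stack of grids. The sketch below is only a schematic illustration under simplifying assumptions and is not the methodology of Kumar et al. (2017): the rasters are plain nested lists, `None` marks a nodata cell, and the function names, grid values and the 1.96 threshold (5% two-sided level) are assumptions chosen for the example.

```python
import math

def mk_z(series):
    """Mann-Kendall Z statistic with tie-corrected variance (0.0 if S is zero)."""
    n = len(series)
    s = sum((series[j] > series[i]) - (series[j] < series[i])
            for i in range(n - 1) for j in range(i + 1, n))
    counts = {}
    for value in series:
        counts[value] = counts.get(value, 0) + 1
    var_s = (n * (n - 1) * (2 * n + 5)
             - sum(t * (t - 1) * (2 * t + 5) for t in counts.values())) / 18.0
    if s == 0 or var_s == 0:
        return 0.0
    return (s - 1) / math.sqrt(var_s) if s > 0 else (s + 1) / math.sqrt(var_s)

def trend_raster(stack, z_crit=1.96):
    """Classify each cell of a time-ordered stack of equally shaped 2-D grids.

    Returns a grid with +1 (rising), -1 (falling), 0 (no significant trend),
    or None where any time step is nodata.
    """
    rows, cols = len(stack[0]), len(stack[0][0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            series = [grid[r][c] for grid in stack]
            if any(v is None for v in series):
                continue  # leave nodata cells unclassified
            z = mk_z(series)
            out[r][c] = 1 if z > z_crit else -1 if z < -z_crit else 0
    return out

# Hypothetical 2 x 2 interpolated concentration grids for six consecutive years.
cell_a = [10, 11, 12, 13, 14, 15]          # steadily rising
cell_b = [30, 28, 27, 25, 22, 20]          # steadily falling
cell_c = [5, 7, 4, 6, 5.5, 6.5]            # no clear trend
cell_d = [3, None, 3.2, 3.1, 3.3, 3.0]     # contains a nodata year
stack = [[[cell_a[t], cell_b[t]], [cell_c[t], cell_d[t]]] for t in range(6)]
trend_map = trend_raster(stack)
```

The resulting `trend_map` has the same shape as the input grids, so it could be written back to a raster layer and displayed alongside the concentration maps.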

Persistence

Persistence of a time series is the tendency of successive data to “remember” their antecedent data and to be influenced by them (Giles and Flocas 1984). In other words, it is defined as the correlational dependency of order or time lag “k” between each ith element and the (i − k)th element of the time series (Kendall 1973), and is measured by autocorrelation (i.e., correlation between two terms of the same time series). In hydrogeochemistry, persistence testing is reported in only a few studies. For example, Jones and Smart (2005) investigated the internal structure of long-term nitrate concentration records for five karst springs in the Mendip Hills, England (UK), by using stochastic autoregressive modeling. The results indicated significant short-term persistence of 1–2 months in three of the five springs.
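
As a minimal sketch of how persistence can be quantified, the lag-k autocorrelation coefficient can be computed directly from its definition. The function name and the monthly nitrate values below are hypothetical, and the ±1.96/√n band is the usual large-sample approximation for an uncorrelated series, used here only as a rough screening threshold.

```python
import math

def autocorrelation(series, lag):
    """Lag-k autocorrelation: each value is paired with the value `lag` steps away."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum((series[i] - mean) * (series[i + lag] - mean)
                for i in range(n - lag))
    return numer / denom

# Hypothetical monthly nitrate concentrations (mg/L) at a spring.
nitrate_monthly = [20, 22, 25, 27, 26, 24, 21, 19, 18,
                   20, 23, 26, 28, 27, 24, 21, 19, 18]

r1 = autocorrelation(nitrate_monthly, 1)
r2 = autocorrelation(nitrate_monthly, 2)
# Approximate 95% band for an uncorrelated series of the same length.
bound = 1.96 / math.sqrt(len(nitrate_monthly))
persistent_at_lag_1 = abs(r1) > bound
```

A coefficient outside the band at short lags suggests short-term persistence, in which case trend tests that assume independence should be applied with care.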

Apart from normality and trends, the remaining characteristics of time series, such as homogeneity, stationarity, periodicity, and persistence, are generally ignored in studies dealing with groundwater quality over spatial and temporal scales.

Evaluation of groundwater quality data using GIS-integrated multivariate statistical methods

Application of multivariate statistical methods in groundwater quality studies

As J.D. Hem put it in his seminal work (Hem 1985), water chemistry (i.e., hydrochemistry), the field of study mainly concerned with the chemical and physical properties of natural waters, “hardly qualified as a scientific discipline” until the late 1950s. Interest in groundwater chemistry (i.e., hydrogeochemistry) arose even later and was not extensive until 1972 (Niu et al. 2014). Furthermore, at that time, water quality testing was mostly an expensive and arduous endeavor requiring a variety of volumetric, gravimetric, colorimetric, turbidimetric, complexometric, and potentiometric procedures (Rainwater and Thatcher 1960). Today, with a wide array of modern analytical instruments and technologies at our disposal (APHA-AWWA-WEF 2017), it is possible to identify and quantify a great number of chemical constituents (inorganic and organic) and to measure various physical parameters in water samples of different matrix complexity (ranging from freshwater to brine) with better measurement accuracy/precision and at a lower cost than ever before. Statistical analysis of such multidimensional datasets, acquired at different spatial and temporal scales, requires computationally efficient and sophisticated techniques to close the widening gap between our data-generating and data-analyzing capabilities. Especially after the 1980s, a dramatic increase in the processing and storage capacities of computer hardware, coupled with the emergence of powerful multi-tasking (GIS-based) software packages integrating relational database management (RDBM), statistical/spatial analysis, and two- or three-dimensional (2-D/3-D) visualization tools, has made it possible to analyze such large datasets and to extract and summarize the relevant quantitative information hidden in the data. This type of insight cannot be gained solely from the conventional graphical plots (Hem 1985; Zaporozec 1972; Güler et al. 2002; Machiwal and Jha 2010) that are used in the visualization and interpretation of water quality data.

The chemical composition of natural water is derived from many different complex processes and sources (natural and/or anthropogenic), all of which imprint a unique physicochemical signature on the water constantly recycling through the Earth’s spheres (atmosphere, hydrosphere, geosphere, and biosphere). Therefore, statistical analysis of hydrochemical data entails the simultaneous evaluation of all the chemical and physical parameters (or variables) measured, since water quality is a function of these properties (Williams 1982). One of the ways to accomplish such an evaluation is through multi-variate statistical analysis (MVSA). Generally speaking, the main objective of the MVSA is to simplify the data matrix (composed of p variables and n cases) by finding associations among dataset variables (called R-mode analysis) and/or cases (called Q-mode analysis) (Dalton and Upchurch 1978). Ideally, the information extracted from the data matrix should be easily understood and potentially useful, providing new insights about data. More often, MVSA is a stepwise procedure with the obvious first step getting acquainted with data at hand. Therefore, prior to conducting any formal statistical analysis (i.e., univariate, bivariate, and multivariate), all variables in the dataset should be carefully scrutinized for data quality, measurement/entry errors, missing/censored values, and outliers in order to verify data integrity/consistency and identify the variables/cases violating certain rules and/or method assumptions (Güler et al. 2002). Nonetheless, most of these data quality issues can be minimized, if not resolved completely, by developing and implementing reliable quality assurance/quality control (QA/QC) protocols for collection, handling, and analysis of water samples, both in the field and in laboratory. 
While most quantitative analytical data, especially the ones related to naturally occurring trace elements, do not lend themselves directly to inferential MVSA due to problems related to non-normality and heteroscedasticity (i.e., heterogeneity of variances), there are techniques available (e.g., Box–Cox transformations (Box and Cox 1964; Box et al. 1978) and z score scaling, i.e., standardization) (Johnson and Wichern 1992) to improve the overall data distribution for elucidating latent associations among data variables and/or cases. Although such statistical associations do not directly establish cause-and-effect relationships, they can assist in creating hypotheses to make viable predictions about the underlying complex processes and phenomena responsible for the data variance and noise (Güler et al. 2002, 2017). However, the challenge is to decide which MVSA methods are best suited for the problem at hand, and understanding their theoretical assumptions and inherent strengths/weaknesses.
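
The two preprocessing steps mentioned above, Box–Cox transformation and z score scaling, can be sketched as follows. This is a minimal illustration with hypothetical function names and arsenic values; the transformation parameter λ is simply supplied by the user here, whereas in practice it is usually estimated from the data (for example by maximum likelihood).

```python
import math

def box_cox(values, lam):
    """Box-Cox transform for strictly positive data with a user-supplied lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]  # limiting case: natural log
    return [(v ** lam - 1.0) / lam for v in values]

def z_scores(values):
    """Standardize to zero mean and unit sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

# Hypothetical arsenic concentrations (ug/L) at seven wells.
arsenic = [1.2, 0.8, 15.0, 3.4, 0.5, 42.0, 2.2]

transformed = box_cox(arsenic, 0.0)    # lambda = 0, i.e. log-transformation
standardized = z_scores(transformed)   # comparable scale across variables
```

Transforming first and standardizing second keeps the strongly skewed raw values from dominating the mean and standard deviation used for scaling.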

Up until the late 1960s, statistical treatment of water quality variables was mainly limited to univariate and bivariate numerical analysis (e.g., calculation of ionic ratios, mean/extreme values, and correlation coefficients) and graphical displays (e.g., frequency distributions and scatter plots) (Hem 1970). Most of the MVSA methods in common use today for tackling Earth Science problems have been adopted from other scientific disciplines (e.g., physics, astronomy, biology, and social/behavioral sciences), where these methods were in widespread use long before the computer era, thanks to Hollerith’s electromechanical punch-card tabulator and dedicated human computers. Pioneering applications of the MVSA methods in water-related fields occurred much later and were not extensive until the 1990s. A simple bibliometric analysis (this study) of peer-reviewed journals (published from 1980 onward) listed in the Institute for Scientific Information (ISI) Web of Science online database revealed that factor analysis (FA) is by far the most frequently used MVSA method in groundwater studies, followed by principal component analysis (PCA), cluster analysis (CA), and discriminant analysis (DA) (Fig. 2). In contrast, applications of canonical correlation analysis (CCA) and correspondence analysis to groundwater studies are extremely rare or non-existent. It is further evident that the number of studies using MVSA techniques in hydrogeology shows an overall increasing trend over the years, with a marked increase during the period 2011–2016. In the following sections, we briefly introduce salient exploratory MVSA methods, along with examples from the scientific literature that focus on their different applications, for extraction of relevant information concealed in data.

Fig. 2 Bar charts depicting usage and growth of multivariate statistical analysis methods in SCI-expanded publications related to hydrogeochemical studies (as of July 4, 2017). CCA canonical correlation analysis, DA discriminant analysis, CA cluster analysis, PCA principal component analysis, FA factor analysis

Eigenvector methods: factor analysis vs. principal component analysis

FA and PCA can be described as “eigenvector” methods for finding lines and planes of closest fit to systems of points in multidimensional space, the mathematical foundation of which was first established by Pearson as early as 1901 (Thurstone 1931). Both FA and PCA have been extensively applied in many disciplines (especially in the social and behavioral sciences), mainly for data reduction purposes (e.g., reduction of variables or cases). Data reduction with these methods is achieved by finding the directions of maximum variance (i.e., the eigenvectors, whose associated variances are the eigenvalues) in a multivariate dataset and representing the data in a much lower dimension (usually 2–5) than the original dataset. In both methods, the variance analysis involves decomposition of the matrix of correlations, which presents interrelations among all pairs of the original variables. However, PCA is often preferred as a method for data reduction, while FA is often used when the goal of the analysis is to detect the structure (i.e., a few underlying, but unobservable, latent constructs or factors) in a dataset (Suk and Lee 1999). Indeed, combining two or more correlated variables (or vectors) into one “factor” or “principal component (PC)” exemplifies the basic idea of FA and PCA. The new factors and PCs extracted by FA and PCA (respectively) are uncorrelated and ordered so that each successively extracted factor (e.g., F1, F2, and so on) or PC (e.g., PC1, PC2, and so on) accounts for a lesser amount of variance of the original dataset than the previous one (Davis 1986; Brown 1998). FA and PCA are occasionally mistaken for the same MVSA method, probably because of the apparent similarities in the terminology and methodology used for both. Despite these similarities, there are distinct differences between the methodologies of FA and PCA. 
However, in most cases, FA and PCA usually yield very similar results, if communalities (i.e., proportion of variance that each variable or case has in common with other variables or cases) are close to unity.
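The eigenvector decomposition described above can be illustrated with a minimal sketch, assuming a small synthetic dataset (the variable layout and the function name `pca_correlation` are hypothetical and not taken from any of the cited studies). The correlation matrix of the standardized variables is decomposed, the eigenvalues are sorted in descending order, and the PC scores are obtained by projecting the standardized data onto the eigenvectors.

```python
import numpy as np

def pca_correlation(X):
    """PCA by eigen-decomposition of the correlation matrix.

    X is an (n cases x p variables) array. Returns eigenvalues (descending),
    eigenvectors (columns), PC scores and the fraction of variance explained.
    """
    X = np.asarray(X, dtype=float)
    # Standardize each variable so that the covariance of Z equals the correlation of X
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # reorder so PC1 carries the most variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Z @ eigvecs                      # uncorrelated PC scores
    explained = eigvals / eigvals.sum()
    return eigvals, eigvecs, scores, explained

# Synthetic example (assumption): two strongly correlated variables plus two noise variables
rng = np.random.default_rng(0)
base = rng.normal(size=(60, 1))
X = np.hstack([
    base + 0.1 * rng.normal(size=(60, 1)),
    base + 0.1 * rng.normal(size=(60, 1)),
    rng.normal(size=(60, 2)),
])
eigvals, eigvecs, scores, explained = pca_correlation(X)
print(np.round(explained, 3))
```

In this sketch the eigenvalues sum to the number of variables, and the two correlated variables collapse largely onto PC1. A factor analysis would additionally model variable-specific (unique) variance, which is not attempted here.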

Assumptions of both FA and PCA include that: (i) each original variable follows normal distribution; (ii) original variables display linear relationships; (iii) there are no outliers in data; (iv) sample size is adequate (n ≥ 50, or n ≥ 100 for more stable estimates) and balanced (i.e., case to variable ratio is at least 5). These multi-step MVSA methods (FA and PCA) have been used extensively in hydrogeochemical studies (both in R- and Q-modes) to extract the relevant information hidden in data matrices, e.g., to: (i) extract and ordinate the most important and influential parameters (i.e., physical and chemical variables) responsible for the spatial and/or temporal variations in water quality (Ashley and Lloyd 1978; Melloul and Collin 1992; Ribeiro and Macedo 1995; Reghunath et al. 2002; Thyne et al. 2004; Cloutier et al. 2008); (ii) ascertain the similarities/dissimilarities or continuity/overlap in spatially and/or temporally distributed groundwater samples (i.e., cases) (Güler et al. 2002; Dalton and Upchurch 1978; Usunoff and Guzman–Guzman 1989; Farnham et al. 2002); (iii) reveal underlying latent factors (e.g., key processes, phenomena, sources, and end-members) that account for the structure of the hydrochemical data (Dawdy and Feth 1967; Melloul and Collin 1992; Suk and Lee 1999; Meng and Maynard 2001; Lambrakis et al. 2004; Güler et al. 2017; Kazakis et al. 2017; Busico et al. 2018), and (iv) produce data for further investigation or for other methods (e.g., factor score mapping, multiple regression, cluster analysis, GIS analysis, etc.) (Dalton and Upchurch 1978; Subbarao et al. 1996; Suk and Lee 1999; Güler et al. 2012).

Many past studies applied FA or PCA in combination with GIS techniques to identify the anthropogenic and natural hydrogeologic processes functioning in aquifer systems (e.g., Thyne et al. 2004; Dragon 2006; Güler et al. 2012; Petrişor et al. 2012). However, rigorous spatial analyses integrating PC scores with GIS-based geostatistical modeling are rarely carried out (e.g., Güler et al. 2012; Narany et al. 2014; Machiwal and Jha 2015).

Cluster analysis

The term “Cluster Analysis”, first introduced by Tryon (1939), encompasses a wide variety of classification algorithms applied in many fields (including hydrogeochemistry) to organize data variables and/or cases into homogeneous and non-overlapping subsets or groups, called clusters (Hartigan 1975). Using this method, the original data matrix, composed of p variables and n cases, is partitioned into k subsets, where k is generally much smaller than p (in R-mode) and/or n (in Q-mode); hence, data reduction is achieved. In general, the members of each cluster share similar characteristics (e.g., in terms of chemical composition) compared to non-members (members belonging to other clusters). In this method, grouping of individual variables and/or cases is generally achieved through an iterative process, where the number of clusters (k) may or may not be known a priori. The cluster centroids (or means) obtained from the resulting partition can be used as representative members (a.k.a. prototypes) of their respective groups. In hydrochemical studies, commonly used partitioning algorithms include hierarchical clustering (joining or tree clustering) and K-means clustering.

Hierarchical cluster analysis (HCA) employs various types of distance (similarity/dissimilarity) measures and linkage methods (i.e., amalgamation rules) (Sneath and Sokal 1973; Hartigan 1975). The choice of which combination to use has no easy answer and greatly affects the outcome (Güler et al. 2002). One of the most widely used combinations in HCA is the Euclidean distance (as the distance measure) with Ward’s method (for linkage), which forms distinct and easily interpretable clusters that may be significant in the hydrochemical, hydrologic, and geologic contexts (Gong and Richman 1995; Güler et al. 2002). Since the results of HCA are mostly presented in a tree-like 2-D diagram called a dendrogram (Davis 1986), which becomes difficult to read as the number of samples grows, the method is generally appropriate for partitioning small datasets (Güler et al. 2002). However, this apparent shortcoming can be overcome by employing a multi-step clustering approach (e.g., pre-clustering and then re-clustering) (Güler and Thyne 2004a) or by first using another MVSA method (e.g., PCA) for data reduction and noise filtering (Pirkle et al. 1984). Dendrograms can also be spatially projected in 3-D, in a map form using color-coded clusters (Forina et al. 2002), but to our knowledge this technique has not been used in hydrochemical and hydrogeochemical studies. HCA is generally accepted as a semi-objective procedure that does not require a priori specification of the number of clusters (k); instead, the number of clusters is usually defined after the analysis (somewhat subjectively) by drawing a line (i.e., the phenon line) that cuts through the dendrogram branches at a certain distance value (Güler et al. 2002). Unlike HCA, K-means clustering (KMC) is a nonhierarchical method that allows classification of a substantially larger number of samples (Gong and Richman 1995; Pacheco 1998). KMC follows a simple iterative procedure that starts from exactly k random cluster centers (MacQueen 1967). If k is unknown a priori, a subjective bias may be introduced into the results. KMC tries to define exactly k different cluster centroids (one for each cluster) with the greatest possible distinction. From the computational point of view, KMC can be thought of as analysis of variance (ANOVA) in reverse (Güler et al. 2002): starting from the k random clusters, the objects to be clustered are iteratively relocated between clusters with the aim of (i) reducing the within-cluster variance and (ii) increasing the between-cluster variance (Pacheco 1998). However, the resulting KMC partition is highly sensitive to the initial randomly selected cluster centers. Executing multiple KMC runs on the dataset can help to minimize this effect. The results from KMC are typically presented in matrix form, showing the members of each cluster and their distances from the respective cluster centers (Güler et al. 2002).
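The iterative relocation procedure and the sensitivity to the initial cluster centers can be illustrated with a minimal sketch. It assumes synthetic two-variable data with two well-separated groups; the function name `kmeans` and the parameter `n_init` are illustrative choices, not part of any cited study. The run with the smallest within-cluster sum of squares among several random starts is retained.

```python
import numpy as np

def kmeans(X, k, n_init=10, max_iter=100, seed=0):
    """Basic K-means with several random starts.

    Returns (labels, centers, wss) for the start that gives the smallest
    within-cluster sum of squares (wss).
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_init):
        # Random initial centers drawn from the samples themselves
        centers = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(max_iter):
            dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
            labels = dist.argmin(axis=1)
            # Recompute each centroid; keep the old center if a cluster is empty
            new_centers = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        wss = float((dist[np.arange(len(X)), labels] ** 2).sum())
        if best is None or wss < best[2]:
            best = (labels, centers, wss)
    return best

# Synthetic example (assumption): two groups of 30 samples each
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(30, 2)),
    rng.normal(loc=10.0, scale=0.5, size=(30, 2)),
])
labels, centers, wss = kmeans(X, k=2)
print(centers.round(2), round(wss, 2))
```

Variables measured on different scales would normally be standardized before clustering; this step is omitted here because both synthetic variables share the same scale.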

The CA techniques mentioned here rely on assumptions such as normal distribution and equal variance (homoscedasticity) of the water chemistry data variables, which are continuous in nature (Alther 1979). In addition, the use of variables having specific relationships or displaying a high intercorrelation among themselves (i.e., multicollinearity) may cause unwanted redundancies in the clustering process (Güler et al. 2002). However, multicollinearity can be reduced by performing CA on the factor scores obtained from FA (Suk and Lee 1999). Outliers should also be treated with caution, since they tend to strongly distort the results. Another important and essentially unsettled issue in cluster analysis is the “cluster validity problem” (Hardy 1996), which mainly involves determination of the “true” number of groups (k) in a dataset (mostly unknown a priori). In hydrogeochemical studies, the spatial coherence of the statistically defined groups (e.g., similarity/proximity of geographical locations, altitudes, and distances of within and between group members) can be verified using GIS-based spatial analysis techniques for cluster validation purposes, which may also provide insights into aquifer heterogeneity/connectivity and the processes governing water quality (Güler and Thyne 2004a). As a general rule of thumb, distinctly different hydrogeochemical groups should be well separated both statistically and spatially (in a geographical sense), due to increasing water–rock interactions along hydrological flow-paths (Thyne et al. 2004).

Since the late 1970s, CA has been successfully applied to water-chemistry data in many groundwater studies to: (i) classify samples into distinct hydrogeochemical groups (Ashley and Lloyd 1978; Riley et al. 1990; Johnson and Wichern 1992; Suk and Lee 1999), (ii) identify hydraulic connections between surface and deep zones (Williams 1982), (iii) interpret groundwater flow (Ochsenkühn et al. 1997), (iv) find optimal number of natural clusters (Pacheco 1998), (v) classify samples from different aquifers (Steinhorst and Williams 1985; Saleh et al. 1999), and (vi) evaluate temporal changes in groundwater composition (Ribeiro and Macedo 1995; Suk and Lee 1999; Berzas et al. 2000). A number of researchers (Farnham et al. 2000; Meng and Maynard 2001; Güler et al. 2002; Güler and Thyne 2004a; Thyne et al. 2004; Helstrup et al. 2007; Cloutier et al. 2008) used R- and Q-mode cluster analyses, in conjunction with other MVSA, geochemical (modeling), and spatial analysis techniques (e.g., GIS) for hydrogeological and hydrogeochemical site characterization in groundwater studies with scales ranging from local to regional.

Discriminant analysis

The main purpose of discriminant (function) analysis (DA) is to determine a set of characteristics (i.e., variables, p) that permits the best prediction (discrimination) between two or more naturally occurring, a priori defined groups (k ≥ 2) within the dataset, and to assign new objects (i.e., cases, n) accurately to these homogeneous groups on this basis (Izenman 2013). The basic notion underlying DA is to decide whether groups differ with regard to the mean of a predictor variable, and then to use that variable to predict group membership of new cases. The prediction is achieved by linear discriminant functions (Johnson and Wichern 1992; Wunderlin et al. 2001), which are vectors (linear combinations of the selected independent variables) in the directions of optimal separation between the groups. DA initially requires a reference set of (representative) samples for each group, for “training” purposes. DA is generally a stepwise procedure, with forward and backward modes, where variables are added or removed one by one in a sequential manner to improve the separation between groups (Machiwal and Jha 2010). The DA algorithm tries to maximize the between-group variance–covariance and minimize the within-group variance–covariance under simultaneous consideration of all analyzed features. The impact of each variable on the discriminant function can be assessed by comparing the partial F values: the higher the value, the greater the impact of the variable on the discriminant function. In addition, Wilks’ lambda (λ) is used as a measure of the statistical significance of the discriminatory power of the model (λ = 0, perfect discriminatory power; λ = 1, no discriminatory power), while the Mahalanobis distance statistic (D2) is used to assess the separation of groups. Computationally, DA is analogous to the one-way/multivariate analysis of variance methods (ANOVA/MANOVA). When three or more groups (k) are present, the method is referred to as multiple discriminant analysis (MDA), which has close associations with other MVSA methods, including multiple regression analysis, FA, and canonical correlation. DA or MDA can be used to classify and, thus, to confirm the groups found by means of CA. DA relies on the same assumptions that are required for CA (e.g., normal distribution, homogeneity of variances/covariances, absence of multicollinearity, and no outliers) (Izenman 2013). In addition, the variables that are used to discriminate between groups should not be completely redundant with the other variables. In other words, if a variable (e.g., total dissolved solids) is the sum of a number of other variables (e.g., ionic constituents) that are also being evaluated, then the “ill-conditioned matrix” problem may occur. DA can be used successfully when the dependent variable is “categorical” and the independent variables are “metric” and normally distributed.
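The interpretation of Wilks’ lambda given above (values near 0 for strong discrimination, values near 1 for none) can be checked with a minimal sketch, assuming the standard definition of λ as the ratio of the determinants of the within-group and total sums-of-squares-and-cross-products matrices. The data are synthetic and the function name `wilks_lambda` is illustrative.

```python
import numpy as np

def wilks_lambda(X, groups):
    """Wilks' lambda = det(W) / det(T).

    W is the pooled within-group and T the total
    sums-of-squares-and-cross-products (SSCP) matrix.
    """
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    centered = X - X.mean(axis=0)
    T = centered.T @ centered
    W = np.zeros_like(T)
    for g in np.unique(groups):
        Xg = X[groups == g]
        Dg = Xg - Xg.mean(axis=0)
        W += Dg.T @ Dg
    return float(np.linalg.det(W) / np.linalg.det(T))

# Synthetic example (assumption): two groups of 50 cases, two variables
rng = np.random.default_rng(1)
groups = np.repeat([0, 1], 50)
separated = np.vstack([
    rng.normal(loc=0.0, size=(50, 2)),
    rng.normal(loc=10.0, size=(50, 2)),
])
overlapping = rng.normal(loc=0.0, size=(100, 2))

lam_separated = wilks_lambda(separated, groups)
lam_overlapping = wilks_lambda(overlapping, groups)
print(round(lam_separated, 3), round(lam_overlapping, 3))
```

The well-separated groups give a value close to 0, whereas groups drawn from the same distribution give a value close to 1. Significance testing of λ (e.g., through an F approximation) is not included in this sketch.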

In hydrogeochemical studies, DA is infrequently used. It has been applied to assess spatio-temporal variations in datasets (Steinhorst and Williams 1985; Wunderlin et al. 2001), where site (spatial) and season (temporal) can be coded as grouping variables, while the measured physical and/or chemical parameters constitute the independent variables (Machiwal and Jha 2010). Further details on the procedure can be found in Cooley and Lohnes (1971) and Johnson and Wichern (1992).

Canonical correlation analysis

Canonical correlation analysis (CCA) is considered one of the correlation techniques (Hotelling 1936). However, it differs from FA and PCA in spite of certain similarities in concept and terminology. In general, it is used to investigate the intercorrelation between two sets of variables, whereas FA, PCA or empirical orthogonal functions detect a pattern of relationships within one dataset (Clark 1975). CCA can be used to examine whether a similar pattern occurs simultaneously in two different datasets and, if it does, to calculate the correlation between the associated patterns. No application of CCA in hydrogeochemical studies could be found in the literature. However, for measuring the trophic status of reservoirs and lakes, Cairns et al. (1997) applied CCA to water parameters (namely chlorophyll, total suspended and dissolved solids, and turbidity) and to the digital values of three bands and numerous band ratios of SPOT (Système Pour l’Observation de la Terre) satellite data. The results indicated that turbidity and chlorophyll contributed 0.91 and 0.76, respectively, to the first canonical water variable, showing a good relationship. It is worth mentioning that DA is a special case of CCA.

Application of geostatistical modeling in groundwater quality evaluation

Geostatistical modeling techniques were originally developed and applied in geological studies for estimating mineral concentrations in ore bodies and recoverable reserves (David 1977; Journel 1974; Journel and Huijbregts 1978). It is seen from the literature that the first attempts of geostatistical-modeling application to geochemical data were made by David and Dagbert (1975) and David (1977). During the 1940s, an important contribution of geostatistics in meteorology was made by the Soviet School of Meteorology (Drozdov and Shepelevskii 1946). Later on, Gandin (1965) and Kagan (1967) emphasized the need of recognizing spatial variability along with quantification of estimation error. Kriging is the widely used geostatistical technique developed by Matheron (1965, 1973). In hydrogeology, Delhomme (1978) paved the way for the geostatistical-modeling applications. Application of the geostatistical-modeling techniques in groundwater quality studies was very limited up to the end of 1990s. However, with integration of GIS, use of geostatistical-modeling techniques in groundwater quality evaluation significantly increased. After 2000, studies involving GIS-integrated geostatistical-modeling techniques mushroomed in literature. Cooper and Istok (1988a, b) made an excellent effort by developing a comprehensive methodology for applying geostatistics to the problems of groundwater contamination, and demonstrated its application through a case study at the Chem-Dyne Superfund site in Ohio, USA. Istok and Cooper (1988) developed techniques for combining local estimates obtained by kriging to obtain global estimates and estimation errors for the expected contaminant concentration in any specified portion of the contaminant plume. An overview of the basic concepts of the geostatistics and its proposed linear and nonlinear kriging estimation techniques is provided by the ASCE Task Committee (1990a). 
The ASCE Task Committee (1990b) reviewed the applications of the geostatistical-modeling techniques in groundwater hydrology under the five major sections: (i) mapping, (ii) simulation of hydrological variables, (iii) estimation using the flow equations, (iv) sampling design, and (v) geostatistical-modeling applications in groundwater system management.

In the 1980s, a few researchers applied geostatistics in hydrogeochemical studies. Myers et al. (1982) developed four variogram models (i.e., linear, constant-linear, concave and convex) for 12 variables (U, B, Ba, Ca, Li, Mg, Mo, As, V, SO4, specific conductance, and total alkalinity) in the Ogallala Formation and Permian geologic units in Texas, USA. The results of the geostatistical modeling were compared with the inverse distance weighting (IDW) technique, which revealed a better performance of the IDW technique in spatial interpolation of the variables. Rouhani and Hall (1988) used geostatistical techniques for the design of a regional shallow groundwater quality monitoring network in the Dougherty Plain, located in southwest Georgia, USA. In Spain, the reliability of a groundwater-quality monitoring network in controlling saltwater intrusion was assessed by using lognormal kriging for mapping the chloride distribution in the Llobregat delta confined aquifer of Barcelona (Candela et al. 1988). Bárdossy and Kundzewicz (1990) applied two geostatistical methods (i.e., point kriging and intrinsic random functions of order ‘k’) for the detection of outliers in chloride and total hardness data of groundwater from the Upper Rhine Valley, extending across three countries: France, Germany, and Switzerland. Bjerg and Christensen (1992) evaluated horizontal variations in groundwater-quality parameters (i.e., pH, alkalinity, Cl, NO3, Ca and K) in a shallow sandy aquifer located in the western part of Denmark. The results indicated substantial variations in all parameters even over short distances. Istok et al. (1993) presented a case history of the alluvial aquifer underlying the Malheur River Basin, Oregon, USA, where isotropic and spherical geostatistical models were applied for estimating pesticide concentrations from the measured nitrate and pesticide concentrations under limited sampling of the pesticides. Rautman and Istok (1996) presented a geostatistical framework for probabilistic assessment of groundwater contamination and illustrated the approach using synthetic data of a hypothetical site. The approach was further demonstrated through a case study in an agricultural area in the Lower Malheur River Basin and the Western Snake River Plain in eastern Oregon, USA (Istok and Rautman 1996). Pebesma and de Kwaadsteniet (1997) prepared spatial maps of 25 groundwater-quality variables based on median measurements at 425 sites in 4 × 4 km blocks in the Netherlands using block kriging. Their study quantified the effect of monitoring network density, and evaluated changes in groundwater quality over a span of 20 years.

Ordinary kriging and cokriging were compared for studying the spatial distribution of nitrate in the Lucca Plain aquifer of Central Italy (D’Agostino et al. 1998). The results indicated that cokriging improved the estimation and reduced the uncertainty in terms of estimation variance. In multivariate geostatistical problems of this kind, a secondary variable that is correlated with the primary variable is used to improve the estimation of the primary variable. Ordinary kriging and cokriging have a smoothing effect, so that large sample values tend to be underestimated and small sample values overestimated in cross-validation. This smoothing effect was reduced by applying a Gaussian random-process based principle in simulating kriged and cokriged estimates using chloride (primary variable) and resistivity (secondary variable) data in the Horonobe area of northern Japan (Lu et al. 2016). Sánchez-Martos et al. (2001) first applied the PCA technique to identify three factors (i.e., sulfate, thermal and marine influences) that affect groundwater processes in the detrital aquifer of the Bajo Andarax (Almeria, Spain). The three identified factors were then analyzed using ordinary block kriging to obtain their spatial distribution. The geostatistical approach has been used with Bayesian analysis for contaminant source identification by developing a methodology to estimate the release history of a conservative solute (Snodgrass and Kitanidis 1997). This approach was subsequently extended to the estimation of the antecedent distribution of a contaminant at a given point back in time (Michalak and Kitanidis 2004). Later on, Sun (2007) further extended it into a robust geostatistical approach to contaminant source identification by solving the linear estimation problems. Empirical Bayesian kriging is another technique of the kriging family, which differs from other kriging methods in that it uses an intrinsic random function for spatial interpolation (Gupta et al. 2017). This technique is rarely used for spatial interpolation of groundwater quality variables (e.g., Mirzaei and Sakizadeh 2016).
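As an illustration of the ordinary kriging estimator that underlies most of the studies cited in this section, the following minimal sketch solves the ordinary kriging system in its variogram form for a single target location. It assumes a spherical variogram with user-supplied parameters (in practice these are fitted to an experimental variogram), and the sample coordinates and chloride values are hypothetical.

```python
import numpy as np

def spherical_variogram(h, nugget, sill, range_):
    """Spherical variogram model; gamma(0) is 0 by definition."""
    h = np.asarray(h, dtype=float)
    inside = nugget + (sill - nugget) * (1.5 * h / range_ - 0.5 * (h / range_) ** 3)
    gamma = np.where(h < range_, inside, sill)
    return np.where(h == 0.0, 0.0, gamma)

def ordinary_kriging(coords, values, target, nugget=0.0, sill=1.0, range_=15.0):
    """Ordinary kriging estimate, kriging variance and weights at one target point."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(coords)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    # Kriging matrix: variogram values bordered by the unbiasedness constraint
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = spherical_variogram(d, nugget, sill, range_)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    d0 = np.linalg.norm(coords - np.asarray(target, dtype=float), axis=1)
    b[:n] = spherical_variogram(d0, nugget, sill, range_)
    solution = np.linalg.solve(A, b)
    weights, mu = solution[:n], solution[n]   # mu is the Lagrange multiplier
    estimate = float(weights @ values)
    variance = float(weights @ b[:n] + mu)
    return estimate, variance, weights

# Hypothetical sample locations (km) and chloride concentrations (mg/L)
coords = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0], [5.0, 4.0]])
chloride = np.array([120.0, 95.0, 140.0, 80.0, 110.0])

est, var, w = ordinary_kriging(coords, chloride, target=[5.0, 5.0])
est_at_sample, var_at_sample, _ = ordinary_kriging(coords, chloride, target=coords[4])
print(round(est, 2), round(var, 3))
```

The weights sum to one, and with a zero nugget the estimator reproduces the measured value with zero kriging variance at a sampled location. Cokriging extends the same system with cross-variograms of a secondary variable, which is not attempted here.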

After the year 2000, several studies have applied geostatistical-modeling techniques for mapping the spatial variability of chemical concentrations for groundwater quality assessment/evaluation (e.g., Goovaerts 1999; Castrignanò et al. 2000; Yu et al. 2003; Mouser et al. 2005; Schaefer and Mayor 2007; Machiwal and Jha 2015). At present, abundant studies on this aspect exist in the literature, and such studies have appeared in research journals with increasing frequency since the year 2000. Salient studies, reported after 2000, dealing with the use of geostatistical-modeling techniques for groundwater quality assessment/evaluation are listed in Table 1. The table shows that ordinary kriging is the most widely used geostatistical-modeling technique in groundwater quality studies. The spatial distribution of almost all kinds of groundwater-quality parameters, along with the scores of the principal components obtained through PCA, has been determined in different parts of the world. It is also observed that the estimation error of the geostatistical modeling is computed in a large number of studies by using cross-validation criteria. However, some of these studies ignored the important step of validation while applying the geostatistical modeling.

Table 1 Salient studies that utilized geostatistical techniques for mapping spatial distribution of groundwater quality after the year 2000

Application of hybrid methods for groundwater vulnerability assessment in GIS platform

The concept of groundwater vulnerability, first introduced by Margat (1968), is based on the assumption that the physical environment may provide some degree of protection to groundwater against human activities. Groundwater vulnerability is classified into two types: specific vulnerability and intrinsic vulnerability (National Research Council 1993). The intrinsic vulnerability of an aquifer can be defined as the ease with which a contaminant introduced onto the ground surface can reach and diffuse in groundwater (Vrba and Zaporozec 1994). On the other hand, specific vulnerability is used to define the vulnerability of groundwater to a particular contaminant or a group of contaminants by taking into account the physicochemical properties of the contaminants and their relationships (Gogu and Dassargues 2000). Once the groundwater vulnerability of an aquifer has been mapped, the map can be used as an assessment tool against groundwater pollution. Since 1968, numerous groundwater vulnerability assessment methods have been developed and applied worldwide, many of them coupled with GIS. The vulnerability assessment methods can be classified into: (i) index-based methods, (ii) quantitative or simulation models, (iii) statistical and artificial intelligence methods, and (iv) hybrid methods, which combine the first three types, mainly by integrating index-based methods with statistical and artificial intelligence methods. A tree diagram illustrating the classification of methods for groundwater vulnerability assessment is presented in Fig. 3. Shirazi et al. (2012) presented a review of the application of the GIS-based DRASTIC method for groundwater vulnerability assessment. Later on, Wachniew et al. (2016) reviewed intrinsic methods of groundwater vulnerability assessment. These earlier reviews emphasized either a particular vulnerability assessment method or only index-based methods. Recently, Machiwal et al. (2018) presented a comprehensive review of groundwater vulnerability highlighting the current status and challenges of index-based, quantitative and statistical methods, including methods for source protection. In this paper, however, we focus mainly on hybrid methods developed by combining advanced statistical and artificial intelligence techniques with index-based methods. The index-based methods are parameter weighting and rating methods, which, apart from classifying the various parameters, also introduce relative weight coefficients for each factor. Such methods are usually coupled with statistical methods in order to overcome the subjectivity of the weights and ratings assigned to each parameter.

Fig. 3
figure 3

Tree diagram illustrating classification of methods for groundwater vulnerability assessment

Rupert (2001) introduced, perhaps for the first time, a hybrid approach for groundwater vulnerability assessment by using a calibration procedure. The groundwater vulnerability map, initially developed using the DRASTIC method, was modified according to its correlation with nitrate concentrations in the Snake River Basin in the USA. Similarly, Panagopoulos et al. (2006) used Spearman’s \(\rho\) and Kendall’s \(\tau\) correlation coefficients to modify both the weights and ratings of the DRASTIC parameters. The use of qualitative parameters is a major concern in the assessment of groundwater vulnerability to nitrate. Kazakis and Voudouris (2015) replaced the qualitative parameters of the DRASTIC method with quantitative ones, and proposed a new method to estimate groundwater vulnerability to nitrate. Additionally, nitrate concentration was correlated with grading methods in order to determine the most suitable classes of the proposed method. The grading methods of natural breaks, equal interval, quantile and geometrical intervals were used to define the class ranges of the final index of vulnerability to nitrate, whilst sensitivity analysis and ANOVA F test statistics were used to verify the results. Other, more complex hybrid methods integrate index-based methods with artificial intelligence (AI) techniques such as fuzzy logic and artificial neural networks (ANNs).

Regression analysis has been widely used in environmental studies. In the region of Osona (NE Spain), Boy-Roura et al. (2013) used multiple linear regression and isotopes for the assessment of groundwater vulnerability to nitrates. In another study, logistic regression and weights of evidence statistical procedures were coupled with the DRASTIC method for the development of two hybrid methods, which were applied in Korinthia prefecture in South Greece (Antonakos and Lambrakis 2007). In the Pearl Harbor-Honolulu aquifer in the USA, stepwise logistic regression and capture zones of wells were coupled for the assessment of groundwater vulnerability (Mair and El-Kadi 2013). Among other techniques, Pacheco et al. (2015) applied the analytic hierarchy process (AHP) for the factor weighting of DRASTIC parameters and the estimation of groundwater vulnerability in different aquifers of Portugal. Fuzzy logic and ANNs have been successfully utilized to assess groundwater vulnerability (Dixon 2005a, b). Fijani et al. (2013) coupled Sugeno fuzzy logic (SFL), Mamdani fuzzy logic (MFL), ANNs, and Neuro-Fuzzy (NF) techniques in order to estimate groundwater vulnerability in the Maragheh-Bonab basin of Iran. Likewise, Larsen fuzzy logic (LFL) was applied in conjunction with SFL and MFL for the assessment of groundwater vulnerability in the Varzeqan Plain, in northwestern Iran (Nadiri et al. 2017). In addition to the above techniques, multivariate statistical analysis methods (e.g., PCA and CA) have also been used in groundwater vulnerability assessment studies. CA has been used to determine the most influential chemical factors in determining aquifer vulnerability in the Visakhapatnam district of India (Rao et al. 2013). Additionally, CA was coupled with PCA for the modification of the DRASTIC method in the Qazvin aquifer, in northern Iran (Javadi et al. 2017).

Thirumalaivasan et al. (2003) made a first attempt to apply AHP for the modification of the DRASTIC method, which was then employed for vulnerability assessment in Tamil Nadu, India. The AHP is a structured multi-criteria analysis technique for analyzing complex problems, and it is therefore widely used for the calibration of parameter weights in vulnerability assessment methods. In a vulnerability assessment study of the Eğirdir Lake basin of Turkey, the DRASTIC method was modified using the AHP technique (Sener and Davraz 2013). Furthermore, overlay and index-based techniques were coupled with the AHP in a GIS platform for estimating groundwater vulnerability in northern India (Gangadharan et al. 2016). Decision support systems (DSSs) constitute a valuable and flexible tool for groundwater resource management, and they have also been integrated with groundwater vulnerability assessment methods. For instance, DSSs have been coupled with vulnerability maps in intensively irrigated areas of Italy and Greece, providing an integrated tool for sustainable management of groundwater and optimal use of fertilizers (Voudouris et al. 2010). Gemitzi et al. (2006) developed a groundwater vulnerability index based on decision-making techniques, such as fuzzy logic and GIS. Stumpp et al. (2016) established an intrinsic vulnerability index in a decision tree form, which leads the user through the stages of vulnerability assessment. Kazakis et al. (2018a) modified the GALDIT method using fuzzy sets in order to estimate the vulnerability of a coastal aquifer to seawater intrusion.
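The way AHP derives parameter weights can be illustrated with a minimal sketch, assuming the common principal-eigenvector formulation with a consistency ratio. The three-parameter pairwise comparison matrix is hypothetical and the random index values are the commonly tabulated ones; none of this is taken from the cited vulnerability studies.

```python
import numpy as np

# Commonly tabulated random consistency index values by matrix size
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32}

def ahp_weights(pairwise):
    """Weights from the principal eigenvector of a pairwise comparison matrix.

    Returns (weights summing to 1, consistency ratio).
    """
    A = np.asarray(pairwise, dtype=float)
    n = A.shape[0]
    eigvals, eigvecs = np.linalg.eig(A)
    i = int(np.argmax(eigvals.real))          # principal eigenvalue
    w = np.abs(eigvecs[:, i].real)
    w = w / w.sum()
    lambda_max = eigvals[i].real
    consistency_index = (lambda_max - n) / (n - 1)
    return w, float(consistency_index / RANDOM_INDEX[n])

# Hypothetical judgments for three parameters: entry [i, j] states how much
# more important parameter i is than parameter j
pairwise = np.array([
    [1.0, 2.0, 6.0],
    [1 / 2, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
])
weights, cr = ahp_weights(pairwise)
print(weights.round(3), round(cr, 4))
```

The judgments in this example are perfectly consistent, so the weights are 0.6, 0.3 and 0.1 and the consistency ratio is zero. A ratio below about 0.1 is usually regarded as acceptable for real judgments.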

The weights of evidence (WoE) method has also been used for the modification of index-based methods of groundwater vulnerability assessment. This method can evaluate the importance of each single factor class, thus allowing the range of values that influences the nitrate concentration to be determined. Abbasi et al. (2013) modified the DRASTIC method using WoE in the Charmahal-Bakhtyari Province in southwest Iran. In the Po Plain area of Northern Italy, Sorichetta et al. (2012) used positive and negative WoE in order to assess groundwater vulnerability to nitrate. Genetic algorithms have also been used for the site selection of groundwater production wells considering groundwater vulnerability to pollution. Elçi and Ayvaz (2014) applied this approach in the Tahtalı watershed in İzmir, Turkey. The literature shows that genetic algorithms have not been widely used in studies involving the assessment of groundwater vulnerability; therefore, this promising AI technique has vast scope in future studies. It is worth mentioning that the applications of the aforementioned hybrid methods in a GIS platform highlight the importance of statistical and AI methods in the development and application of new hybrid methods. Salient hybrid methods developed by modifying the original groundwater vulnerability assessment methods are summarized in Table 2.

Table 2 Methods used for groundwater vulnerability assessment and their modifications in literature

GIS-based groundwater quality index

The previous sections reviewed the application of time series modeling, multivariate statistical/geostatistical and artificial intelligence techniques in hydrogeochemistry for groundwater quality evaluation and vulnerability assessment. Combined with more conventional methods (e.g., Piper and Durov diagrams, Wilcox chart, descriptive statistics), these techniques are powerful tools for improving the understanding of the geochemical processes associated with groundwater chemical evolution, both in space and time. On the other hand, a challenge remains to properly communicate the relevant geochemical knowledge to groundwater managers so that groundwater quality issues are integrated within the groundwater sustainability action framework. To do so, there was a need to develop indices that could be applied for groundwater quality assessment. Combined with GIS, these water quality indices can be integrated with other spatial data related to natural resources and human geography, thus contributing to proper development and management of groundwater resources.

Water quality index (WQI)

In their book on water quality indices (WQIs), Abbasi and Abbasi (2012) pointed out that expressing water quality brings numerous challenges. The quality of water can be defined for different uses (e.g., drinking, agricultural irrigation, livestock, and industrial), may vary in time and space, and can be characterized by a number of parameters (chemical, physical, microbiological, and radiological), with some parameters being more problematic than others with regard to health issues.

To address the complexity of describing water quality and to provide water resources managers with a representation of water quality that allows comparisons between samples, regulatory agencies of different countries and international agencies have developed different types of WQIs, including those of the US Geological Survey (Stoner 1978), the CCME-WQI (Canadian Council of Ministers of the Environment 2001), and the Global Drinking (GD)-WQI (United Nations Environment Programme 2007). Lumb et al. (2011) published an extensive review of the evolution of the WQI concept, including the CCME-WQI and the GD-WQI. Later, Sutadian et al. (2016) reviewed 30 WQIs developed for evaluating river water quality, based on the selection of parameters, generation of sub-indices, generation of parameter weights, and the aggregation procedure used to compute the index. Abbasi and Abbasi (2012) defined the objective of a WQI as translating the concentrations of the measured parameters (variables) of a water sample into a single value (the index value). The index value of each sample can then be used to compare water quality between samples (observations). When WQI computation is combined with GIS, numerous applications leading to proper and sustainable management of water resources can be implemented. Abbasi and Abbasi (2012) provide detailed information on WQI development and common generation steps, starting with Horton’s WQI (Horton 1965). Figure 4 illustrates the basic steps generally followed to develop a WQI (Abbasi and Abbasi 2012). The selection of parameters (Step 1) from the water quality dataset and the attribution of a weight to each parameter (Step 3) introduce subjectivity into the technique. The transformation of parameters (Step 2) is needed to bring parameters of different units or ranges to a single scale, producing sub-indices. It is during this step that the parameters can be indexed numerically to a water quality guideline, such as the World Health Organization standards (WHO 2017). Finally, the aggregation of the sub-indices (Step 4) allows the determination of the final index score of the WQI.

Fig. 4 Flowchart illustrating basic steps generally involved in water quality index (WQI) determination as presented by Abbasi and Abbasi (2012)
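The four steps described above can be sketched in a few lines. The example below is only one possible realization: it assumes a sub-index defined as the concentration expressed as a percentage of its guideline value and a weighted arithmetic aggregation, which is one of several aggregation forms in use. The function name, parameter names, limits and weights are hypothetical.

```python
def water_quality_index(sample, guidelines, weights):
    """Weighted arithmetic WQI for one water sample (illustrative sketch)."""
    # Step 1: parameter selection (parameters present in all three inputs)
    params = [p for p in weights if p in sample and p in guidelines]
    total_weight = sum(weights[p] for p in params)
    index = 0.0
    for p in params:
        # Step 2: transformation to a common scale (100 = at the guideline)
        sub_index = 100.0 * sample[p] / guidelines[p]
        # Step 3: relative weight, normalized so that the weights sum to 1
        rel_weight = weights[p] / total_weight
        # Step 4: aggregation of the weighted sub-indices
        index += rel_weight * sub_index
    return index

# Placeholder limits in mg/L, chosen for illustration only
guidelines = {"nitrate": 50.0, "chloride": 250.0, "tds": 1000.0}
weights = {"nitrate": 5, "chloride": 3, "tds": 4}  # hypothetical importance
sample = {"nitrate": 25.0, "chloride": 125.0, "tds": 500.0}
wqi = water_quality_index(sample, guidelines, weights)
```

Re-running the function on the same sample with a different set of weights or a different parameter list generally changes the score, which is the subjectivity associated with Steps 1 and 3.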

Several indices were developed for the assessment of surface water quality (e.g., Prati et al. 1971; Smith 1990; Dojlido et al. 1994; Nasiri et al. 2007; Thi Minh Hanh et al. 2011; Şener et al. 2017). The following sub-section focuses on the application of WQIs in hydrogeology, where specific indices, including the groundwater quality index (GWQI; e.g., Machiwal et al. 2011), contamination index (Cd; e.g., Backman et al. 1998), metal pollution index (MPI; e.g., Giri et al. 2010) and index of aquifer water quality (IAWQ; e.g., Melloul and Collin 1998), were developed to define the quality of groundwater. With advances in computing facilities, WQIs are now integrated with GIS to provide quantitative groundwater quality maps for different geographical regions and scales (e.g., Machiwal et al. 2011; Ketata et al. 2012; Sadat-Noori et al. 2014).

Groundwater quality index (GWQI) and GIS-based GWQI mapping

For the purpose of this review, GWQI is used as the general term to describe indices developed to address groundwater quality, predominantly based on physicochemical parameters (e.g., GWQI, Cd, MPI, and IAWQ). GWQI studies based on groundwater geochemical data have been reported from many countries, as shown in Table 3. Such studies have increased since the pioneering work of Backman et al. (1998) and Melloul and Collin (1998), with a considerable number of publications from semi-arid and arid regions of the world, including parts of India, where several states are facing severe water scarcity (Machiwal et al. 2011).

Table 3 Salient studies that developed and applied various types of groundwater quality indexes (GWQIs)

With the objective of providing a general view of the degree of groundwater contamination in a region, Backman et al. (1998) tested the applicability of mapping a groundwater contamination index (Cd) in Finland and Slovakia. The Cd takes into account the number of parameters exceeding the guideline values, as well as the concentrations exceeding these limit values. As shown by Backman et al. (1998), the degree of groundwater contamination calculated for each sample can then be represented on maps for aesthetic and health-risk parameters, distinguishing between groundwater contamination of geogenic origin and that from anthropogenic sources. Drawing parallels with the DRASTIC model (Aller et al. 1985), Melloul and Collin (1998) developed the IAWQ, a GWQI that allowed the delineation of areas where land uses are already affecting groundwater quality. Stigter et al. (2006) used multivariate analysis to develop a GWQI and a groundwater composition index (GWCI) as monitoring tools for groundwater contamination from agricultural activities and as a communication tool within agro-environmental policies in Portugal.
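A minimal sketch of a contamination index of this kind is given below. It assumes the commonly quoted form in which every parameter above its limit contributes a contamination factor equal to (measured concentration / limit) minus 1, and the index is the sum of these factors. The function name, the parameters and the limit values are illustrative placeholders and should not be read as the exact procedure of the original study.

```python
def contamination_index(sample, limits):
    """Contamination degree of one sample (illustrative sketch).

    Returns the index and the number of parameters above their limits.
    """
    cd = 0.0
    n_exceeding = 0
    for param, limit in limits.items():
        factor = sample[param] / limit - 1.0  # contamination factor
        if factor > 0:  # only parameters above their limit contribute
            cd += factor
            n_exceeding += 1
    return cd, n_exceeding

# Placeholder limits and a hypothetical sample, all in mg/L
limits = {"nitrate": 50.0, "arsenic": 0.01, "chloride": 250.0}
sample = {"nitrate": 75.0, "arsenic": 0.03, "chloride": 100.0}
cd, n_exceeding = contamination_index(sample, limits)
```

In this hypothetical sample two parameters exceed their limits, and the index grows both with the number of exceeding parameters and with the size of each exceedance.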

In the last decade, several studies integrated the GWQI concept with GIS to support efficient strategies for assessing groundwater quality, as well as for properly managing and monitoring aquifers and groundwater resources (Table 3). Babiker et al. (2007) proposed a GIS-based GWQI with the objective of summarizing available water quality data into easily understood maps. They used GIS to implement the proposed GWQI and to test the sensitivity of the model. In another GIS-based spatio-temporal GWQI study, Machiwal et al. (2011) first produced a GWQI map and then used an optimum index factor (OIF) to generate a potential GWQI (P-GWQI) map in western India. They summarized the entire process of developing GWQI and P-GWQI maps in a flowchart. Following Babiker et al. (2007), Machiwal et al. (2011) also performed a map removal sensitivity analysis to identify the most influential water quality parameters, that is, the parameters that should be monitored with higher accuracy. Khan et al. (2011) used a GIS-based GWQI to assess the impact of land use changes on groundwater quality in a rapidly urbanizing region of South India. Sadat-Noori et al. (2014) combined GIS and GWQI to assess the groundwater quality of the Saveh-Nobaran aquifer in Iran.
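The idea behind a map removal sensitivity analysis can be illustrated with a short sketch. It assumes an additive index built from rated parameter layers and a variation index of the form S = |V/N - V'/n| / V x 100, where V and V' are the index values with all N layers and with the n remaining layers; this form is widely used for index-overlay methods, but its use here is an assumption rather than the exact procedure of the cited studies. The function name, layer names and ratings are hypothetical.

```python
def map_removal_sensitivity(layers):
    """Mean variation index (%) obtained by removing each layer in turn.

    layers: dict mapping parameter name -> list of rated cell values
            (same length for every layer, one entry per map cell).
    Needs at least two layers and a positive index in every cell.
    """
    names = list(layers)
    n_layers = len(names)
    n_cells = len(layers[names[0]])
    result = {}
    for removed in names:
        total = 0.0
        for c in range(n_cells):
            v_full = sum(layers[name][c] for name in names)
            v_reduced = v_full - layers[removed][c]
            # Change of the per-layer average, relative to the full index
            s = abs(v_full / n_layers - v_reduced / (n_layers - 1)) / v_full
            total += 100.0 * s
        result[removed] = total / n_cells
    return result

# Hypothetical ratings for three parameter layers over two map cells
layers = {
    "nitrate": [2.0, 4.0],
    "chloride": [4.0, 4.0],
    "tds": [6.0, 4.0],
}
sensitivity = map_removal_sensitivity(layers)
```

Layers with a larger mean variation index influence the final map more strongly, which is why they are candidates for more accurate monitoring.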

GWQI has also been used in various applications in groundwater quality and hydrogeochemical studies. As an example, Ramos Leal et al. (2004) used a GWQI and the Cd of Backman et al. (1998), combined with aquifer vulnerability evaluation, to support the design of a water quality monitoring network for Mexico. Nobre et al. (2007) calculated a GWQI, based on the IAWQ of Melloul and Collin (1998), in conjunction with vulnerability, contamination and well capture indices, to develop GIS-based groundwater vulnerability and risk mapping. The approach of Nobre et al. (2007), in which several elements are integrated within a GIS environment, successfully assessed groundwater pollution risks and identified areas to be prioritized for groundwater monitoring and land-use restriction. The concept of GWQI is still evolving, as shown by the work of Li et al. (2014) and Vadiati et al. (2016). With the objective of minimizing uncertainties associated with traditional WQI calculations, Vadiati et al. (2016) investigated the potential of a hybrid fuzzy-based GWQI (FGWQI) to assess groundwater quality in the Sarab Plain of Iran. They found that the hybrid FGWQI produces significantly more accurate assessments of groundwater quality than the traditional WQI. Even though the application of FGWQI is promising, Vadiati et al. (2016) concluded that more research is needed to test the approach and compare it with deterministic WQI techniques in different contexts.

Limitations and research gaps

Has time series analysis in hydrogeochemistry been comprehensively applied?

This review highlighted that studies are generally lacking in which multiple characteristics of a hydrogeochemical time series, i.e., normality, homogeneity, stationarity, trends, persistence, periodicity and stochasticity, are examined for the same series. Each time series characteristic has its own importance; however, in studies dealing with hydrogeochemical data series, the major emphasis has been only on testing normality and the presence/absence of trends.
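Examining more than one characteristic of the same series need not be demanding. As a minimal sketch, the code below applies two checks to one hypothetical nitrate series: the Mann-Kendall test for trend (without tie correction) and the lag-1 serial correlation coefficient as a simple indicator of persistence. The choice of these two tests, the function names and the data are assumptions made for illustration only.

```python
import math

def mann_kendall(series):
    """Mann-Kendall trend test without tie correction (illustrative sketch)."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of each pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided, normal approximation
    return s, z, p_value

def lag1_autocorrelation(series):
    """Lag-1 serial correlation coefficient as a simple persistence check."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t + 1] - mean) for t in range(n - 1))
    den = sum((x - mean) ** 2 for x in series)
    return num / den

# Hypothetical annual nitrate concentrations (mg/L) at one well
nitrate = [12.1, 12.8, 13.0, 13.9, 14.2, 15.1, 15.0, 16.3, 16.8, 17.5]
s_stat, z_stat, p_value = mann_kendall(nitrate)
r1 = lag1_autocorrelation(nitrate)
```

For short or strongly autocorrelated series the normal approximation and the independence assumption of the trend test become questionable, which is one reason why persistence deserves to be examined alongside trend.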

Has time series modeling been adequately integrated with GIS?

The literature reviewed in this study clearly indicates that studies dealing with time series of hydrogeochemical variables have not adequately integrated time series modeling with GIS technology. This is mainly due to the unavailability in GIS software of the time series analysis modules required for analyzing time series characteristics of hydrogeochemical variables in a spatially distributed manner.

Can all multivariate statistical techniques be adequately coupled with GIS?

It is evident from this review that FA, PCA, and CA techniques have been extensively used to analyze multivariate hydrogeochemical datasets. However, other MVSA techniques such as DA and CCA have received little attention from researchers. In addition, options for analyzing hydrogeochemical variables with a GIS-coupled PCA technique are available in many GIS software packages, whereas procedures for coupling other multivariate techniques with GIS are generally not available.

Although multivariate statistical techniques provide a powerful means for the analysis of hydrogeochemical data series (allowing simultaneous evaluation of all physicochemical variables), they rely on the analyst's judgment at several steps, and there are no unified methodologies on how to conduct such analyses. For instance, decisions have to be made at certain steps of these analyses (e.g., variables to be included or excluded, use of raw or transformed/standardized variables, selection of methods, algorithms, cut-off levels, and criteria for evaluation), which may introduce a subjective bias into the process, with the potential to greatly affect outcomes. Even with the same dataset, different results can be obtained depending on the options chosen to conduct the analyses and during the interpretation stage.
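The effect of one such decision, the use of raw versus standardized variables, can be shown with a small example. The sketch below computes the share of total variance carried by the first principal component of two variables, using the closed-form eigenvalues of a 2 x 2 matrix, once from the covariance matrix (raw data) and once from the correlation matrix (standardized data). The function name and the concentrations are hypothetical.

```python
import math

def pc1_share(x, y, standardize):
    """Share of total variance explained by the first principal component
    of two variables (closed-form 2 x 2 eigenvalues; illustrative sketch)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    if standardize:
        # Correlation matrix: unit variances, off-diagonal term becomes r
        sxy = sxy / math.sqrt(sxx * syy)
        sxx = syy = 1.0
    # Largest eigenvalue of the symmetric matrix [[sxx, sxy], [sxy, syy]]
    half_trace = (sxx + syy) / 2.0
    lam1 = half_trace + math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    return lam1 / (sxx + syy)

# Hypothetical concentrations (mg/L) in six samples
tds = [350.0, 900.0, 620.0, 1500.0, 480.0, 1100.0]
fluoride = [0.9, 0.4, 1.2, 0.7, 0.3, 1.0]
share_raw = pc1_share(tds, fluoride, standardize=False)
share_std = pc1_share(tds, fluoride, standardize=True)
```

With the raw data the first component is dominated almost entirely by the variable with the larger numerical range, whereas after standardization the two nearly uncorrelated variables share the variance about equally, so the interpretation of the same dataset changes with this single choice.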

Have geostatistical-modeling techniques been advanced in mapping groundwater quality variations?

In groundwater quality mapping studies, it has been customary practice to apply the ordinary kriging technique. There have been fewer efforts to explore the applicability and efficacy of other kriging techniques, e.g., cokriging, indicator kriging, empirical Bayesian kriging, etc. Only a few studies comparatively evaluated multiple kriging techniques to obtain the best results. Validation of the mapped variable is an essential step in applying geostatistical-modeling techniques to estimate spatial distributions. However, uncertainty in the spatial estimates of groundwater quality variables remains unaddressed in a few studies because the error variance is not computed.
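To make the role of the error variance and of validation concrete, the following sketch implements ordinary kriging at a single location together with a leave-one-out cross-validation. It assumes an exponential semivariogram without nugget whose sill and range are fixed in advance; in practice these would be fitted to the experimental variogram. All names, coordinates and concentrations are hypothetical.

```python
import math

def variogram(h, sill=1.0, rng=500.0):
    """Exponential semivariogram without nugget (assumed model and parameters)."""
    return sill * (1.0 - math.exp(-3.0 * h / rng))

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ordinary_kriging(points, values, target):
    """Return (estimate, kriging variance) at target from scattered data."""
    n = len(points)
    # Semivariances between data points, bordered by the unbiasedness constraint
    a = [[variogram(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(p, target)) for p in points] + [1.0]
    sol = solve(a, b)
    w, mu = sol[:n], sol[n]  # kriging weights and Lagrange multiplier
    estimate = sum(wi * zi for wi, zi in zip(w, values))
    variance = sum(wi * gi for wi, gi in zip(w, b[:n])) + mu
    return estimate, variance

def loo_cross_validation(points, values):
    """Leave-one-out errors: re-estimate each well from all the others."""
    errors = []
    for i in range(len(points)):
        est, _ = ordinary_kriging(points[:i] + points[i + 1:],
                                  values[:i] + values[i + 1:], points[i])
        errors.append(est - values[i])
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return errors, rmse

# Hypothetical well coordinates (m) and nitrate concentrations (mg/L)
wells = [(0.0, 0.0), (400.0, 100.0), (150.0, 450.0), (600.0, 500.0), (300.0, 250.0)]
nitrate = [22.0, 35.0, 18.0, 41.0, 28.0]
estimate, variance = ordinary_kriging(wells, nitrate, (350.0, 300.0))
errors, rmse = loo_cross_validation(wells, nitrate)
```

The kriging variance is obtained from the same linear system as the estimate, so reporting it alongside each map adds little computational cost, and the cross-validation errors provide a common basis for comparing alternative interpolation techniques.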

Is quantitative assessment of groundwater vulnerability more comprehensive than qualitative?

Modern and advanced approaches to groundwater vulnerability assessment rely both on hybrid methods that couple qualitative and quantitative methods and on the integration of statistical and artificial intelligence techniques with index-based qualitative methods. However, these approaches demand large amounts of field data, which are difficult to collect or measure. Additionally, the tendency to establish a single method for all hydrogeological regimes might neglect the specific conditions of each aquifer. Although the qualitative approaches are characterized by subjectivity and a strong dependence on the researchers, they might be more flexible, adaptable and cost-effective. It is clear that vulnerability assessment using artificial intelligence techniques should be carefully applied and evaluated. In the future, a deeper discussion and comparison of quantitative and qualitative approaches will be needed to determine the most suitable approach.

Does GIS-based groundwater quality index provide consistent evaluation?

It is apparent from the results of past GWQI studies that the index score remains highly dependent on the set of parameters selected, as well as on the weights assigned to each parameter, making this technique subjective. Of the several GWQIs developed, hardly any remains consistent and comparable when applied over different areas. Thus, a universally accepted and consistent GIS-based GWQI is currently not available in the literature.

Are different water quality indices steady and comparable over spatial and temporal scales?

As the literature review revealed, most WQIs depend on water quality guidelines that vary among jurisdictions/institutions, which leads to difficulties in interpreting water quality when comparing WQI maps at the regional and sometimes the country level. Furthermore, the water quality guideline for some parameters may change over time, in which case the WQI map needs to be updated prior to interpretation or use.

Perspectives on future research needs

This review clearly showed that the application of techniques such as time series modeling, multivariate statistical/geostatistical techniques, artificial intelligence techniques and water quality indices is gaining popularity in the field of hydrogeochemistry. Integration of statistical techniques with GIS has enhanced the capability to interpret precisely the hydrogeologic processes occurring within aquifer systems. Still, researchers face some research gaps and limitations in integrating the advanced statistical techniques within the GIS platform, as explained in the previous section, which may be a challenge for future studies undertaking water quality evaluation of hydrogeologic systems. A few of the major needs for future research are pointed out below.

  • In hydrogeochemistry, the first two steps of time series modeling, i.e., detection and analysis, have been widely adopted. Future research will need to focus on the last two steps, i.e., synthesis and verification, including stochastic time series modeling of groundwater quality variables.

  • Time series characteristics other than normality and the presence of trends, such as homogeneity, stationarity, periodicity and persistence, need to be considered equally important in hydrogeochemistry, and statistical methods for their detection need to be applied widely.

  • Due to the inherent complexities of, and spatial continuities observed in, the chemical and physical properties in water chemistry data, multivariate statistics may sometimes not produce the expected results. This is mostly because most multivariate statistical methods are based on binary logic (i.e., Aristotelian logic), which imposes sharp boundaries. According to this logic, a water sample can only be a member of one group and no overlapping groups are allowed, e.g., in cluster analysis. However, methods using a multi-valued logic (e.g., fuzzy c-means clustering), in which partial memberships can be evaluated (a water sample can be a partial member of several groups), can be used to overcome such limitations (Güler and Thyne 2004b; Güler et al. 2012).

  • In future studies, the uncertainty associated with the spatial estimates of groundwater quality variables predicted by geostatistical-modeling techniques will have to be properly addressed by validating the mapped variables using cross-validation criteria. Also, the accuracy of the customary ordinary kriging and of a few advanced techniques such as empirical Bayesian kriging will need to be comparatively evaluated in order to find the best interpolation technique under a given set of hydrogeologic conditions.

  • For groundwater vulnerability assessment studies, a protocol for monitoring groundwater quality needs to be developed so that the degree of aquifer vulnerability can be compared across different parts of the world.

  • In studies using a groundwater quality index (GWQI) for groundwater quality assessment, a robust methodology will have to be developed to reduce subjectivity in parameter selection and weight attribution.

  • Development of a universal GWQI for assessing groundwater quality for different purposes will be a challenging task for future researchers. Attempts should be made to develop a framework for generating a unique yet versatile index that allows comparisons of groundwater quality across spatial scales ranging from local to regional (Lumb et al. 2011).

  • The applicability of hybrid GWQIs that couple artificial intelligence techniques, such as fuzzy or neuro-fuzzy techniques, with index-based weights should be investigated, along with the integration of additional parameters such as physicochemical parameters, organic matter, microbiological parameters, major anions/cations and heavy metals (Vadiati et al. 2016).

  • One of the major future research needs will be the adequate integration of all kinds of statistical methods in the GIS platform. Procedures for different time series modeling tests, multivariate statistical and artificial intelligence techniques and water quality indices need to be adequately incorporated in GIS software.

  • Finally, close cooperation, sharing of experience and exchange of ideas among hydrogeologists working in different regions of the world with diverse economic, social and political settings will be needed to ensure thorough investigations and reliable outcomes.
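The contrast between crisp and partial group membership raised in the list above can be illustrated with a compact fuzzy c-means sketch. It follows the standard alternating update of memberships and cluster centers with a fuzzifier m = 2; the initial centers, the number of iterations and the rescaled concentrations are hypothetical, and the code is meant as an illustration rather than as the procedure of any cited study.

```python
import math

def fuzzy_c_means(samples, centers, m=2.0, n_iter=50):
    """Fuzzy c-means on a list of equal-length tuples (illustrative sketch).

    centers: initial cluster centers (a hypothetical starting guess)
    m:       fuzzifier (> 1); larger values give softer memberships
    Returns (centers, memberships); memberships[k][i] is the degree to which
    sample k belongs to cluster i, as computed in the last iteration.
    """
    centers = [list(c) for c in centers]
    n_clusters = len(centers)
    n_dims = len(samples[0])
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(n_iter):
        # Membership update from relative distances to all centers
        memberships = []
        for x in samples:
            d = [max(math.dist(x, c), 1e-12) for c in centers]
            row = []
            for i in range(n_clusters):
                ratio_sum = sum((d[i] / d[j]) ** exponent for j in range(n_clusters))
                row.append(1.0 / ratio_sum)
            memberships.append(row)
        # Center update: means weighted by memberships raised to the power m
        for i in range(n_clusters):
            w = [row[i] ** m for row in memberships]
            total = sum(w)
            centers[i] = [sum(wk * x[dim] for wk, x in zip(w, samples)) / total
                          for dim in range(n_dims)]
    return centers, memberships

# Hypothetical samples described by two variables rescaled to 0-1:
# three fresh-water samples, three saline samples and one mixed sample
samples = [(0.10, 0.20), (0.15, 0.10), (0.20, 0.15),
           (0.90, 0.80), (0.85, 0.90), (0.80, 0.85),
           (0.50, 0.50)]
centers, memberships = fuzzy_c_means(samples, [(0.0, 0.0), (1.0, 1.0)])
```

In this hypothetical dataset the fresh and saline samples receive memberships close to 1 in their own cluster, whereas the mixed sample is shared about equally between the two clusters, information that a crisp cluster analysis would discard.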

Concluding remarks

The application of modern techniques such as time series modeling, multivariate statistical/geostatistical and artificial intelligence techniques to characterize groundwater quality for efficient management and protection of groundwater resources has increasingly attracted researchers over the past five decades. After the advent of Geographic Information Systems (GIS) in the 1990s, advanced statistical and artificial intelligence techniques integrated with GIS have emerged as more powerful tools than the traditional methods for evaluating groundwater quality. The literature reveals that time series modeling has not been extensively utilized in hydrogeochemistry studies, and there is considerable scope for its comprehensive application in future. Groundwater quality time series are mainly examined for normality and trend, whereas other important time series properties such as homogeneity, stationarity, periodicity, persistence and stochasticity are generally ignored. Also, adequate GIS integration of time series modeling techniques is generally lacking. Past studies have mostly adopted Factor Analysis, Principal Component Analysis (PCA) and Cluster Analysis (CA) for evaluating groundwater quality, whereas other multivariate statistical analysis techniques have not been widely employed. Recently, factor loadings/scores of the PCA have been integrated with GIS-coupled geostatistical modeling, although such studies are quite rare in the literature. The accuracy of advanced geostatistical-modeling techniques, e.g., Empirical Bayesian Kriging and geostatistical simulation, in spatial mapping of groundwater quality variables needs to be determined and compared spatially with that of traditional techniques, e.g., Ordinary Kriging.
It is evident from the literature that the concept of groundwater vulnerability has developed widely over the past five decades since its inception in 1968, and a very large number of artificial intelligence techniques have now seen GIS-integrated applications for the protection of groundwater resources worldwide. In recent times, a hybrid approach amalgamating index-based rating methods with statistical methods has been gaining wide attention among researchers across the globe for groundwater vulnerability assessments. Furthermore, the review of past studies clearly highlighted that studies dealing with index-based water quality assessment are relatively fewer for groundwater than for surface water. Over the last decade, there has been an increase in the number of studies either developing a groundwater quality index (GWQI) or applying an existing GWQI for groundwater quality appraisals.

This review emphasizes the importance of salient time series characteristics in GIS-based hydrogeochemistry studies. In future groundwater quality studies, time series modeling should be employed in a comprehensive manner by including the synthesis and verification steps. Similarly, the potential of multivariate statistical techniques other than PCA and CA needs to be explored for groundwater quality evaluation and protection. Also, it will be imperative to minimize the uncertainty of aquifer vulnerability assessments by developing hybrid methods with a proper balance of qualitative and quantitative methods. In addition, a robust and global GWQI will have to be developed in order to obtain consistent water quality evaluations of hydrogeologic systems that are comparable over different spatial scales across the globe. Finally, one of the major future challenges will be to develop a variety of modules for implementing advanced statistical and artificial intelligence methods in the GIS platform so that these techniques can be applied in a spatially distributed manner.