Introduction

Groundwater is a dynamic renewable resource and a valued source of good-quality water for human consumption and other purposes in many regions of the world (Li et al. 2016; Hasanuzzaman et al. 2017). Extensive contamination, however, now poses a greater threat to groundwater than depletion does (Macdonald et al. 2016). In most developing countries, groundwater quality has become a serious problem because freshwater sources are scarce; the evaluation of groundwater quality is therefore essential for sustainable groundwater resource management. In general, groundwater quality depends on precipitation quantity, recharge and discharge water qualities, water–rock interaction, and the residence time of water (Wagh et al. 2017). Two main groups of processes control groundwater quality: natural processes, including dissolution, weathering of rocks, leaching, and ion exchange, and anthropogenic sources such as intensive agricultural activities, unplanned urbanization, and industrial activities (Panaskar et al. 2016).

Bangladesh is endowed with plentiful but finite groundwater resources. Climatic variability, shifting river courses, dewatering of aquifers for irrigation, population increase, rapid urbanization, extensive agriculture, and domestic and industrial activities all directly affect the quality and quantity of groundwater resources in Bangladesh (Biswas et al. 2014; Bodrud-Doza et al. 2016; Islam et al. 2017a). The occurrence of arsenic in the groundwater of Bangladesh is regarded as a major calamity worldwide (Karim 2000). Elevated concentrations of arsenic (> 50 μg/L) have been found in the shallow aquifer of northeastern Bangladesh, where they degrade groundwater quality (BGS and DPHE 2001; Ahmed et al. 2004; Halim et al. 2010). Trace metals such as arsenic (As), iron (Fe), and manganese (Mn) are common in groundwater, and their elevated concentrations have long been a deep concern in the Sylhet district of northeastern Bangladesh because of their potential adverse effects on human health and the aesthetic or nuisance problems that some of them cause. Groundwater quality in the Sylhet district continues to degrade because of environmental changes, climatic variations, and human activities. Continuous monitoring and characterization of groundwater quality ranks are therefore important to establish the groundwater quality status and to prevent further contamination in the area.

Groundwater quality is closely associated with human health; finding an appropriate technique to characterize groundwater quality for drinking use has therefore been an important research topic in the recent decade. Various techniques for groundwater quality assessment and decision-making have been reported from many parts of the world (Ishaku 2011; Rubio-Arias et al. 2012; Bhuiyan et al. 2016). Over the past several decades, the groundwater quality index (GWQI), a numerical tool for assessing suitability for drinking purposes, has been one of the most extensively used techniques because of its practicality and effectiveness (Yan et al. 2014). Although this index has been successfully applied by various researchers, such as Rubio-Arias et al. (2012), Sahooa et al. (2015), Bodrud-Doza et al. (2016), and Bhuiyan et al. (2016), the weights of the parameters in the GWQI calculations are commonly assigned by expert judgment, so that much related and valuable information is lost (Amiri et al. 2014). Another limitation is that the index is subject to ambiguity and bias at various steps of decision-making on environmental issues. For example, some parameters in the index equations can significantly affect the final GWQI score without any valid scientific reason, so that a sound decision cannot be made (Duque et al. 2006). Because of these limitations of the GWQI technique, an innovative classification approach is needed that can provide accurate information for decision-making on groundwater quality appraisal.

At present, numerous methods have been applied to assess the suitability of water quality, including the fuzzy logic model (Kamrani et al. 2016), set pair analysis model (Li et al. 2011; Feng et al. 2014), multivariate approach (Wu et al. 2014), matter–element extension method (Li et al. 2016), blind number method (Yan and Zou 2014), and analytic hierarchy procedure (Hosseinimarandi et al. 2014). However, these methods have some critical drawbacks, such as the large number of factors that must be considered when groundwater quality is evaluated. Another problem is that these techniques cannot properly delineate water pollutant ranks, nor can they express whether the variables involved in the evaluation meet the decision-making requirements of functional areas (Varnosfaderany et al. 2009). To solve this problem, the entropy-based weighting technique is applied to estimate the weights of groundwater hydro-chemical parameters; it avoids artificial weight assignment and can clearly delineate water quality categories. The entropy-weighted water quality index (EWQI) can compute groundwater quality correctly and makes it simple to compare two samples in the same rank (Li et al. 2010). Hence, comparing the EWQI and GWQI classifications is an appropriate approach for scientific justification and decision-making, and it allows this study to present reasonable results of water quality evaluation (Li et al. 2010; Wu et al. 2011; Fagbote et al. 2014; Amiri et al. 2014; Su et al. 2017; Gorgij et al. 2017).

Over the last three decades, geostatistical models such as semivariogram and kriging interpolation models have been effectively employed to examine the spatial patterns of continuously varying hydro-chemical parameters of groundwater and to incorporate this information into mapping procedures (Burrough and McDonnell 1998). Such information is essential for assessing contaminant sources and movement and the spatial variation of contaminants at various locations. In addition, geostatistics and other spatial analyses such as the spatial autocorrelation index can identify and define the sites, magnitudes, and shapes of statistically significant spatial patterns in the investigated region. Spatial autocorrelation and geostatistical analyses have been adopted as decision-making tools by many researchers for groundwater quality studies. Detailed descriptions of geostatistical models are well documented in the literature (Goovaerts 1997; Webster and Oliver 2001; Kumari et al. 2013; Islam et al. 2017b). In geostatistics, kriging is considered the best interpolation method and is the most extensively applied technique. Cross-validation of each kriging interpolation is used to select the best-fitted model. Kriging accounts for the spatial autocorrelation among the sampling sites and has been extensively applied for mapping the spatial distribution of groundwater quality parameters. Among the various chemometric techniques, multivariate statistical techniques have been extensively used to identify possible factors/sources that affect groundwater quality in various regions of the world, including Bangladesh (Halim et al. 2010; Ahada and Suthar 2017; Islam et al. 2017a). Combining multivariate statistics and geostatistics can provide a reliable outcome on the detailed source history of contaminants.

A thorough understanding of the factors influencing groundwater quality and its spatial variability is vital for decision-making in any particular region. Nevertheless, very little research has been undertaken on the status of groundwater quality for drinking purposes in the Sylhet district, Bangladesh (Munna et al. 2015). The characterization of groundwater quality ranks and their spatial variation by using the integrated entropy method, spatial autocorrelation index, and multivariate and geostatistical approaches is yet to be conducted in Bangladesh. This study has therefore been designed to outline the drinking water quality ranks and their spatial distribution in the Sylhet district of Bangladesh by using the above-mentioned approaches.

Material and methods

Study area description

The study area, Sylhet district, is located in the northeastern part of Bangladesh. Geographically, it lies from 24.36° to 25.11° N latitude and from 91.38° to 92.30° E longitude and encompasses an area of 3090.40 km² (Fig. 1). The population of the area is approximately 2.6 million, and most of the people engage in agricultural activities (BBS 2014). The soil is less fertile as the district is positioned in the floodplain of the Surma and Kushiyara rivers; also, the physiographic setting belongs to the class of the Tertiary hilly region. The land surface has an irregular geomorphic pattern, falling from piedmont hills near India across a gently sloping region with an elevation of about 33.5 m above mean sea level (MSL), but in some places, it consists of slightly higher ridges and shallow depressions. Land use in the area consists mainly of depressions, agricultural lands, and settlements. Climate is an important control on water movement and occurrence. The area has a subtropical humid climate with a hot and rainy summer season and a distinctly cooler dry season. The average annual temperature is 25.0 °C. The highest temperature of 39 °C mainly occurs in April and the lowest temperature of 4 °C in January. The study area experiences the highest precipitation in Bangladesh, receiving average rainfall ranging from 3000 to 5000 mm per year. More than 80% of the annual rainfall of about 3334 mm occurs in June–September during the monsoon period, while less than 5% occurs in November–March during the dry period (Munna et al. 2015). The relative humidity varies between 60% in the dry season and 88% in the monsoon season. Approximately 90% of abstracted groundwater is used for agricultural purposes and the remainder for human consumption. The primary source of water is groundwater, which is extracted from the shallow and deep aquifers through shallow and deep wells. However, groundwater use is uncontrolled, and regular monitoring of water quality has yet to be performed. Thus, the suitability of groundwater for drinking purposes in the study area should be assessed effectively.

Fig. 1 Location of the study area and sampling sites

Geology and hydrogeology settings

Geologically, the investigated area is a part of the Surma basin, a sub-basin of the Bengal basin of Bangladesh (Fig. 2). This basin is also an extension of the Bengal foredeep, where tectonic activities occurred after the formation of the Bengal basin (Islam et al. 2013a, b). The basin comprises a thick succession of sediments (± 16 km) deposited during repeated marine transgression and regression events that formed huge economic deposits (Islam et al. 2017c). It might have originated during the late orogenic stage of the Bengal basin because of continuous subsidence along the great Dauki fault system and the Plio-Pleistocene uplift of the Shillong Plateau (Islam and Habib 2015). An immense marine transgression occurred in the Surma basin during the Holocene period, surrounding a major region of the Ganges–Brahmaputra–Meghna (GBM) river delta complex with supra-tidal sediments (Khan et al. 2000). During that period, a well-defined long depression known as the Sylhet depression was confined by the great Dauki fault in the meandering alluvial sediments of paleo-channels, which serve as compartments of groundwater occurrence. The lithology of the area consists of alluvial sand in the northern part and deltaic sand, silt, and clay in the southern part. Tertiary sediments are folded, faulted, uplifted, and deeply dissected by the river. Generally, the hill/hillock ranges and river valleys are longitudinally aligned in the study region. Recent to Plio-Pleistocene sediments have been deposited on the eroded surface of late Tertiary rocks. The sediments are gray to yellowish gray, loosely compacted, medium- to fine-grained sand and gray clay or clayey silt, largely composed of quartz, plagioclase feldspar, potassium feldspar, mica, and clay minerals. A weathered Sylhet limestone deposit is found in the northern part of the study area. The overall lithology of the area is highly discontinuous, with less extensive to relatively broad layers forming aquifers of poor to moderate yielding capacity.

Fig. 2 The geology of the study area (after Alam et al. 2003)

In the study area, the primary Holocene aquifers mostly consist of gray fine-grained sands, usually with plentiful organic matter, peat, and gravel, which are overlain by impervious silt and clay (Halim et al. 2010). Based on subsurface geological information, the thickness of a single aquifer varies from 20 to 98 m in some areas, and it may be considered a moderately good aquifer. Figure 3 shows the hydrogeological cross-section of the study area, indicating the configuration of the multi-layered aquifers. Hydrogeology and aquifer layers are structurally controlled; substantial lateral variations in lithology occur over short distances, which cause changes in the aquifer geometry even within 100-m depth. The transmissivity of the aquifer, which consists of Holocene to Late Pleistocene alluvial sands, ranges from 146.90 to 825.45 m²/day and the storage coefficient from 1.5 × 10⁻³ to 3.48 × 10⁻² (MOA 1997). The hydraulic conductivity of the aquifer is comparatively low (30–50 m/day; BGS and DPHE 2001), which is attributed to secondary weathering of the aquifer materials and the presence of well-sorted fine sands. Although the aquifer is comparatively homogeneous with respect to hydraulic properties and resources, the Surma basin is hydro-structurally complex as it is confined by tectonically active structures such as the Dauki fault. The alluvial aquifer is semi-confined to unconfined in nature. The inverse distance-weighted (IDW) interpolation method is used to prepare the groundwater table contour map of the Sylhet area (Fig. 4). As observed from the equipotential lines (m, PWD) of the piezometric map, the groundwater flow direction is usually from north to south, but in the northeastern part, groundwater flows from west to east. This erratic pattern of groundwater flow might be attributed to single or coupled geological processes, such as topographically driven flow, free convection, and tectonically driven flow. Holocene alluvial aquifers are prevalent in the study region and are found at shallow depths of up to 70 m. This region is affected by heavy pumping and over-exploitation of groundwater for irrigation (BGS and DPHE 2001). Groundwater recharge comes mainly from heavy rainfall and floodwaters during the monsoon season, resulting in a rise of the groundwater level. Recharge also occurs by lateral flow from the hilly region in the north. After the monsoon season, part of the water is recharged from rivers, streams, ponds, and low-lying areas. The piezometric level of groundwater drops during the dry period due to over-exploitation, and all exploited water is replenished completely during the monsoon season.

Fig. 3 Hydrogeological cross-section of the study area showing the configurations of the aquifer

Fig. 4 Groundwater flow direction map in the Sylhet district of Bangladesh

Sampling analytical procedures

For this study, a total of 91 groundwater samples (n = 91) were collected from different locations in the Sylhet district, Bangladesh, during the dry and wet seasons of 2014–2015. Information about well depths was obtained from the local well owners and the Department of Public Health Engineering (DPHE) office in the study area. Samples were collected from three types of tube wells, namely (1) shallow wells (15–70-m depth), (2) intermediate wells (71–145-m depth), and (3) deep wells (146–300-m depth), depending on availability at the study sites. The sampling locations, recorded using a portable GPS (global positioning system) device, are displayed in Fig. 1. The wells were pumped for several minutes prior to sampling to purge any stagnant water. All samples were collected and preserved in prewashed high-density polyethylene (HDPE) bottles. At each sampling location, the sample was stored in three HDPE bottles following the standard method recommended by the APHA (2005). The 91 samples were put in a cooler box, transferred to the laboratory, and kept in a refrigerator at 4 °C until analysis. The precision of the analyses was checked by running duplicate analyses of selected samples. Cation and anion charge balances (< 10%) were also verified as a check on data quality. The relative standard deviation of the measured variables for groundwater samples was within ± 2%. All analytical procedures followed those recommended by the APHA (2005) (Table 1).
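The charge-balance check mentioned above can be illustrated with a minimal sketch. The text gives only the acceptance threshold (< 10%); the formula used here, (Σcations − Σanions)/(Σcations + Σanions) × 100 with all ions in meq/L, is the commonly used convention and is an assumption about how the check was done. The function name and the sample values are hypothetical.

```python
# Equivalent weights (g/eq) = molar mass / |ionic charge|.
EQ_WEIGHT = {
    "Na": 22.99, "K": 39.10, "Ca": 40.08 / 2, "Mg": 24.305 / 2,
    "HCO3": 61.02, "Cl": 35.45, "SO4": 96.06 / 2, "NO3": 62.00,
}
CATIONS = ("Na", "K", "Ca", "Mg")
ANIONS = ("HCO3", "Cl", "SO4", "NO3")

def charge_balance_error(sample_mg_per_l):
    """Return the charge-balance error (%) for ion concentrations in mg/L.

    Assumed convention: (cations - anions) / (cations + anions) * 100,
    with both sums expressed in meq/L. Missing ions are treated as zero.
    """
    cations = sum(sample_mg_per_l.get(ion, 0.0) / EQ_WEIGHT[ion] for ion in CATIONS)
    anions = sum(sample_mg_per_l.get(ion, 0.0) / EQ_WEIGHT[ion] for ion in ANIONS)
    return (cations - anions) / (cations + anions) * 100.0

# Hypothetical sample (mg/L), not taken from the study's dataset.
sample = {"Na": 45.0, "K": 2.5, "Ca": 8.0, "Mg": 4.0,
          "HCO3": 150.0, "Cl": 12.0, "SO4": 5.0, "NO3": 3.0}
cbe = charge_balance_error(sample)
acceptable = abs(cbe) < 10.0  # threshold quoted in the text
```

A sample whose absolute error exceeds the threshold would normally be re-analyzed or flagged rather than silently dropped.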

Table 1 The analytical method for selected hydro-chemical parameters of groundwater

The determination of water quality indices

Entropy water quality index

Information entropy theory was introduced by Shannon (1948), in which entropy is considered a measure of uncertainty or information. Shannon entropy quantifies the degree of uncertainty in predicting the outcome of an event (Shyn et al. 2011). It principally shows how stochastic a probabilistic event can be (Gorgij et al. 2017); in other words, it describes the randomness of an event mathematically (Shyn et al. 2011). The justification for using entropy theory is that it can fully express the information of groundwater quality and overcome the problems that arise in other techniques for groundwater quality appraisal. The determination of the EWQI has been introduced and analyzed in the past literature by Li et al. (2010) and Wu et al. (2011). In this study, the EWQI is applied to characterize groundwater quality due to its simplicity, accuracy, and consistency (Li et al. 2010; Wu et al. 2011; Amiri et al. 2014; Gorgij et al. 2017).

Entropy weight is a popular method to determine the parameter weights, and three steps are used to compute the EWQI as follows (Li et al. 2010):

The first step is the calculation of the eigenvalue matrix, X, which contains the values of the n hydro-chemical parameters for each of the m groundwater samples (Eq. 1).

$$ \mathrm{X}=\left[\begin{array}{cccc}{x}_{11}& {x}_{12}& \cdots & {x}_{1n}\\ {x}_{21}& {x}_{22}& \cdots & {x}_{2n}\\ \vdots & \vdots & \ddots & \vdots \\ {x}_{m1}& {x}_{m2}& \cdots & {x}_{mn}\end{array}\right] $$
(1)

The eigenvalue matrix, X, is then converted into a standard-grade matrix, Y, to remove the effect of different units and quantity grades of groundwater quality parameters. The standard-grade matrix is defined in Eq. 2.

$$ \mathrm{Y}=\left[\begin{array}{cccc}{y}_{11}& {y}_{12}& \cdots & {y}_{1n}\\ {y}_{21}& {y}_{22}& \cdots & {y}_{2n}\\ \vdots & \vdots & \ddots & \vdots \\ {y}_{m1}& {y}_{m2}& \cdots & {y}_{mn}\end{array}\right] $$
(2)

The ratio, \( P_{ij} \), of the index value of parameter j in sample i to the sum over all samples is calculated by Eq. 3

$$ {P}_{ij}=\frac{y_{ij}}{\sum_{i=1}^m{y}_{ij}} $$
(3)

Then, the information entropy, \( e_j \), is computed by Eq. 4

$$ {e}_j=-\frac{1}{\ln m}{\sum}_{i=1}^m{P}_{ij}\ln {P}_{ij} $$
(4)

The smaller the entropy, the greater the effectiveness of parameter j. Equation 5 is used to calculate the entropy weight, \( \omega_j \), of each parameter

$$ {\omega}_j=\frac{1-{e}_j}{\sum_{j=1}^n\left(1-{e}_j\right)} $$
(5)

The second step is the determination of the quantitative rating scale, \( q_j \), for each parameter using Eq. 6

$$ {q}_j=\frac{C_j}{S_j}\times 100 $$
(6)

where \( C_j \) denotes the concentration of parameter j (mg/L) and \( S_j \) indicates the permissible limit of parameter j in the Bangladesh drinking water quality standards (mg/L).

The third step is the computation of EWQI by using Eq. 7

$$ \mathrm{EWQI}={\sum}_{j=1}^n{\omega}_j{q}_j $$
(7)

According to Wu et al. (2011), groundwater quality can be categorized into five ranks, varying from “excellent water” to “extremely poor water” quality domains for drinking purposes in the study area.
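The three steps (Eqs. 1–7) can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the study's own implementation: the text does not give the formula that converts X into the standard-grade matrix Y, so column-wise min-max scaling is assumed here, and terms with \( P_{ij}=0 \) are treated as contributing zero to the entropy. The function names and all numbers are hypothetical.

```python
import numpy as np

def standardize(x):
    """Min-max scale each column to [0, 1] (an ASSUMED form of matrix Y, Eq. 2)."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(axis=0), x.max(axis=0)
    return (x - lo) / (hi - lo)  # assumes no parameter is constant across samples

def entropy_weights(y):
    """Entropy weights (Eqs. 3-5) for an m-by-n standardized matrix."""
    m = y.shape[0]
    p = y / y.sum(axis=0)                             # Eq. 3
    safe = np.where(p > 0, p, 1.0)                    # ln(1) = 0, so zero entries add 0
    e = -(p * np.log(safe)).sum(axis=0) / np.log(m)   # Eq. 4
    return (1.0 - e) / (1.0 - e).sum()                # Eq. 5

def ewqi(conc, limits):
    """EWQI per sample (Eqs. 6-7); conc is m-by-n, limits has length n."""
    conc = np.asarray(conc, dtype=float)
    w = entropy_weights(standardize(conc))
    q = conc / np.asarray(limits, dtype=float) * 100.0  # Eq. 6
    return q @ w                                        # Eq. 7

# Hypothetical data: 3 samples x 2 parameters with made-up permissible limits.
conc = [[10.0, 1.0], [20.0, 2.0], [40.0, 8.0]]
limits = [20.0, 2.0]
weights = entropy_weights(standardize(conc))
scores = ewqi(conc, limits)
```

Because the weights sum to 1, a sample whose concentrations all sit exactly at their permissible limits scores 100 regardless of how the weights are distributed, which is a convenient sanity check on any implementation.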

Groundwater quality index

The GWQI technique reveals the composite effect of various water quality parameters on the suitability for human consumption (Bodrud-Doza et al. 2016). In the present study, the “standards” (permissible concentrations of hydro-chemical parameters) for drinking water are those recommended by the Department of Environment, Bangladesh (DoE 1997). The GWQI is computed by using Eq. 8, according to Vasanthavigar et al. (2010):

$$ \mathrm{GWQI}=\sum {\mathrm{SI}}_i=\sum \left({W}_i\times {q}_i\right)=\sum \left(\left(\frac{w_i}{\sum_{i=1}^n{w}_i}\right)\times \left(\frac{C_i}{S_i}\times 100\right)\right) $$
(8)

where \( q_i \) is the water quality rating, \( W_i \) is the relative weight, \( w_i \) is the weight assigned to the ith parameter, \( C_i \) is the concentration of each parameter, \( S_i \) is the Bangladesh standard, and \( \mathrm{SI}_i \) is the subindex of the ith parameter. According to Kamrani et al. (2016), groundwater quality in the study region can then be ranked into five classes based on the GWQI values (Table 2).

Table 2 Standard classification of groundwater quality for drinking purpose according to GWQIs
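As a rough illustration of Eq. 8, the sketch below computes a GWQI score for one sample. The function name, the assigned weights, the concentrations, and the standards are all hypothetical placeholders; in practice the weights come from expert judgment and the standards from DoE (1997).

```python
def gwqi(conc, standards, assigned_weights):
    """GWQI for one sample following Eq. 8.

    conc, standards and assigned_weights are equal-length sequences, one
    entry per parameter. The relative weight W_i is w_i / sum(w) and the
    rating q_i is C_i / S_i * 100.
    """
    total_weight = sum(assigned_weights)
    return sum(
        (w / total_weight) * (c / s * 100.0)
        for c, s, w in zip(conc, standards, assigned_weights)
    )

# Hypothetical two-parameter sample with equal expert weights.
score = gwqi(conc=[25.0, 0.3], standards=[50.0, 0.3], assigned_weights=[1, 1])
```

With equal weights, the first parameter at half its standard (rating 50) and the second exactly at its standard (rating 100) average to a score of 75. Changing the assigned weights shifts the score toward whichever parameter is weighted more heavily, which is the subjectivity that the entropy-based weighting is meant to avoid.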

Spatial autocorrelation and geostatistical analyses

Conventional statistical techniques cannot properly evaluate the spatial tendency of groundwater data at the sampling locations because of data discontinuity and spatial heterogeneity. An efficient technique that includes spatial pattern analysis is therefore required. Spatial autocorrelation is a geographic model that allows geographically distributed phenomena to be evaluated through spatial analysis. It distinguishes three types of spatial pattern: clustered, random, and dispersed. Spatial autocorrelation measures the relationship between the value of a parameter at one location and its values at neighboring locations (Griffith 2003). Moran’s I (Anselin 1995) appears to be the most popular model for spatial clustering analysis in recent studies in comparison with other spatial autocorrelation indices (Fu et al. 2014; Gorgij et al. 2017). This model has therefore been employed in this study and is expressed in Eq. 9:

$$ I=\frac{n{\sum}_i{\sum}_j{w}_{ij}\left({x}_i-\overline{x}\right)\left({x}_j-\overline{x}\right)}{W{\sum}_i{\left({x}_i-\overline{x}\right)}^2} $$
(9)

where \( \overline{x} \) is the mean value of \( x_i \) over the n samples, \( x_i \) is the value of the parameter at site i, \( x_j \) is the value at another site j, \( w_{ij} \) is the distance weighting between \( x_i \) and \( x_j \), which can be expressed as the inverse of the distance, and \( W \) is the sum of all weights \( w_{ij} \). The Moran’s I values vary from − 1 to 1, where values approaching 1 indicate positive spatial autocorrelation (a more clustered parameter) and values approaching − 1 indicate negative spatial autocorrelation (a more dispersed parameter). A Moran’s I value of 0 indicates spatial randomness (Fu et al. 2014). The Moran’s I index can then be standardized so that its significance level is tested under the assumption of a normal distribution (Fig. 5). When the spatial pattern is investigated using the Moran’s I index, the outcome may be influenced by the choice of weight function, data transformation, and the existence of extreme values. This study has taken these factors into consideration to obtain more reliable results.
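A minimal sketch of Eq. 9 with inverse-distance weights is given below, taking the denominator weight term as the sum of all pairwise weights. It is illustrative only: the study used GIS software for this step, the function name is hypothetical, and the choice to give coincident points a weight of zero is an assumption. The standardization to a Z score is not included.

```python
import numpy as np

def morans_i(coords, values):
    """Global Moran's I (Eq. 9) with inverse-distance weights.

    coords is an n-by-2 array of site coordinates and values holds the
    parameter value at each site. Self-pairs and coincident sites get a
    weight of zero (an assumption made for this sketch).
    """
    coords = np.asarray(coords, dtype=float)
    x = np.asarray(values, dtype=float)
    n = x.size
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    w = np.zeros_like(dist)
    positive = dist > 0
    w[positive] = 1.0 / dist[positive]
    dev = x - x.mean()
    numerator = n * (w * np.outer(dev, dev)).sum()
    denominator = w.sum() * (dev ** 2).sum()
    return numerator / denominator

# Hypothetical transect of six equally spaced sites.
sites = [(float(i), 0.0) for i in range(6)]
clustered = morans_i(sites, [1, 1, 1, 5, 5, 5])    # similar values side by side
alternating = morans_i(sites, [1, 5, 1, 5, 1, 5])  # neighbors always differ
```

On this toy transect the clustered arrangement gives a positive index and the alternating arrangement a negative one, matching the interpretation of the sign described above.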

Fig. 5 The significance levels and their corresponding Z scores of the spatial autocorrelation index (modified after ESRI 2009)

The geostatistical model describes the spatial variability of a phenomenon and estimates the unknown values of continuously distributed variables. Nowadays, numerous spatial interpolation models are used to measure the spatial variation of groundwater quality at a particular site. Kriging interpolation models estimate unknown values from information collected in the neighborhood. Different types of kriging, such as simple, ordinary, and universal kriging, are available among geostatistical interpolation techniques. The universal kriging model is one of the optimal geostatistical techniques for mapping groundwater quality, where observations near each point are more similar than those far away. Universal kriging and ordinary kriging yield similar interpolated estimates; the only difference lies in the extrapolated estimates, which are based on a trend fitted to the last few data values. For extrapolation, universal kriging has an advantage where an attribute suggests a particular functional form for extrapolating a trend fitted to the sampled data values. Furthermore, groundwater quality indices show spatial heterogeneity, and index values are seldom available for every possible site in a region. Measuring these index values at each site is not always possible in view of the time and cost involved in data collection and processing. Consequently, predicting the index values at other sites from selectively estimated index values is one alternative. In such a case, geostatistics such as the universal kriging method can be employed to predict the index values at unmeasured sites. The justification for using this model is that it represents the probability of the interpolated estimation error of the regionalized variable where there are no preliminary extents, and its simplicity and prediction accuracy make it preferable to other models such as the IDW interpolation technique (Islam et al. 2017b). The spatial distribution can be calculated by Eq. 10 (Delhomme 1978):

$$ \widehat{z}\left({x}_0\right)={\sum}_{i=1}^n{\lambda}_i\,z\left({x}_i\right) $$
(10)

where \( \widehat{z} \) is the estimated value of an attribute at the point of interest \( x_0 \), z is the observed value at the sampled point \( x_i \), \( \lambda_i \) is the weight assigned to the sampled point, and n represents the number of sampled points used for the estimation (Webster and Oliver 2001). The attribute is usually called the primary input parameter, particularly in the geostatistical model. Spatial variability is measured by a semivariogram. The semivariogram model is obtained from the sampling data by Eq. 11:

$$ \gamma (h)=\frac{1}{2n}{\sum}_{i=1}^n{\left[z\left({x}_i\right)-z\left({x}_i+h\right)\right]}^2 $$
(11)

where n is the number of pairs of sample points separated by the lag distance h (Burrough and McDonnell 1998) and \( z(x_i) \) is the value of the variable z at the location \( x_i \). The ordinary kriging model uses the above equations to determine the spatial variability of the groundwater quality indices of the investigated samples. The predictive performance of the fitted model is tested on the basis of cross-validation results. The mean error (ME), mean square error (MSE), root mean square error (RMSE), average standard error (ASE), and root mean square standardized error (RMSSE) values are evaluated to determine the best model performance. The best-fit model is the one with the minimum ME, MSE, and RMSE values and with ASE and RMSSE values close to 1 (ESRI 2009). In particular, the model with the lowest RMSE reproduces the observations most closely. After the cross-validation test, kriged maps are generated to provide a graphical representation of the spatial distribution of the groundwater quality indices. For this study, ArcGIS (version 10.2) is used to produce these kriged maps.
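The empirical semivariogram of Eq. 11 can be sketched as follows. Since field samples rarely sit at exactly the same separation, pairs are grouped into distance bins; the binning scheme, the function name, and the data are assumptions made for illustration and do not reproduce the ArcGIS workflow used in the study.

```python
import numpy as np

def empirical_semivariogram(coords, values, lag_edges):
    """Empirical semivariogram (Eq. 11) evaluated on distance bins.

    For each bin (lag_edges[k], lag_edges[k + 1]] the result is half the
    mean squared difference over all point pairs whose separation falls
    in that bin. Bins without pairs are returned as NaN.
    """
    coords = np.asarray(coords, dtype=float)
    z = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(z), k=1)          # each unordered pair once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq_diff = (z[i] - z[j]) ** 2
    gamma = np.full(len(lag_edges) - 1, np.nan)
    for k in range(len(lag_edges) - 1):
        in_bin = (dist > lag_edges[k]) & (dist <= lag_edges[k + 1])
        if in_bin.any():
            gamma[k] = sq_diff[in_bin].mean() / 2.0
    return gamma

# Hypothetical transect with a linear trend in the index value.
points = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
index_values = [0.0, 1.0, 2.0, 3.0]
gamma = empirical_semivariogram(points, index_values, lag_edges=[0.0, 1.5, 2.5, 3.5])
```

A theoretical model (for example spherical or exponential) would then be fitted to these binned values before kriging; that fitting step is outside the scope of this sketch.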

Pearson’s correlation coefficient is employed to identify the relationships between pairs of physico-chemical parameters. The terms “strong/significant,” “moderate,” and “weak/insignificant” are used in the correlation analysis (Liu et al. 2003) for absolute values of more than 0.75, 0.75–0.50, and 0.50–0.30, respectively.
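The classification above can be expressed as a small helper. The function name and the data are hypothetical, the handling of values exactly on a class boundary is an assumption, and the label for coefficients below 0.30 is a placeholder because the text does not name that range.

```python
import numpy as np

def correlation_strength(r):
    """Label a Pearson coefficient using the thresholds quoted in the text."""
    magnitude = abs(r)
    if magnitude > 0.75:
        return "strong"
    if magnitude >= 0.50:
        return "moderate"
    if magnitude >= 0.30:
        return "weak"
    return "unclassified"  # below 0.30: range not named in the text

# Hypothetical paired concentrations (mg/L) for two ions.
na = np.array([10.0, 20.0, 30.0, 40.0])
hco3 = np.array([35.0, 62.0, 88.0, 121.0])
r = np.corrcoef(na, hco3)[0, 1]
label = correlation_strength(r)
```

The same helper can be mapped over every off-diagonal entry of a full correlation matrix from `np.corrcoef` to summarize all parameter pairs at once.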

The principal component analysis (PCA) is a widely applied multivariate statistical technique to analyze the inter-relationship among different sets of groundwater hydrochemical data to extract the most significant factors and to reduce the data with minimum loss of information. Its goals are to make the data easy to explore and visualize and to infer the underlying geogenic and/or anthropogenic factor influencing groundwater quality. The PCA methodology is applied where the corresponding eigenvalues (> 1) for the Euclidean matrix are computed by using the SPSS version 22.0 software.
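The eigenvalue criterion (> 1) can be illustrated with a short NumPy sketch. The study ran the PCA in SPSS; the version below works on the correlation matrix of the variables, which is an assumption about the matrix involved, and it omits any factor rotation. The function name and the data are hypothetical.

```python
import numpy as np

def pca_kaiser(data):
    """PCA on the correlation matrix, keeping components with eigenvalue > 1.

    data is a samples-by-variables array. Returns the retained eigenvalues,
    the matching eigenvectors (one column per component), and the share of
    total variance explained by each retained component.
    """
    data = np.asarray(data, dtype=float)
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0
    explained = eigvals / eigvals.sum()
    return eigvals[keep], eigvecs[:, keep], explained[keep]

# Hypothetical data: 5 samples x 3 parameters; the first two are nearly collinear.
table = [[1.0, 2.0, 5.0], [2.0, 4.1, 3.0], [3.0, 5.9, 6.0],
         [4.0, 8.2, 2.0], [5.0, 9.9, 4.0]]
vals, vecs, share = pca_kaiser(table)
```

Because the eigenvalues of a correlation matrix sum to the number of variables, a retained eigenvalue above 1 means the component carries more variance than any single standardized variable would on its own.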

Results and discussion

General groundwater characteristics

General physico-chemical parameters such as total dissolved solids (TDS), electrical conductivity (EC), Na⁺, K⁺, Ca²⁺, Mg²⁺, HCO₃⁻, Cl⁻, SO₄²⁻, and NO₃⁻, and trace elements including As, Fe, and Mn are statistically analyzed. Table 3 reveals that the mean concentrations of these parameters do not exceed the Bangladesh (DoE 1997) and WHO (1998) drinking water guideline values except in some sampling locations. The preliminary evaluation of the analyzed parameters shows that HCO₃⁻ and Na⁺ ions have the highest mean concentrations among all physico-chemical parameters, followed by Cl⁻ with the third highest concentration (Table 3). The Na⁺ and HCO₃⁻ ions represent the base ion exchange reaction in the study area. In addition, variations in the concentrations of the trace metals As (range 0.03–148 μg/L with a mean value of 27.20 ± 32.75 μg/L), Fe (range 0.01–22.7 mg/L with a mean value of 6.83 ± 6.04 mg/L), and Mn (range 0.01–0.83 mg/L with a mean value of 0.28 ± 0.21 mg/L) are observed in the studied samples. The higher concentrations of Fe and Mn in groundwater samples may be due to the reduction of their oxy-hydroxides through the biodegradation of organic particles (Halim et al. 2010). Elevated levels of As (> 50 μg/L), found in 18 sampling sites (S2, S5, S8, S10, S18–19, S21, S27–28, S31, S33, S40, S49–50, S86, and S88–89) distributed over the central, northeastern, and southern regions of the study area (Fig. 1), may have some effects on groundwater quality. Out of 91 wells, 48 tube wells exceed the WHO guideline value of 10 μg/L for drinking purposes. The concentrations of As, Fe, and Mn in most of the sampling locations are higher than the water quality guideline values set by Bangladesh (DoE 1997) and international organizations (WHO 2011). The high contents of As, Fe, and Mn may be due to the toxic to sub-toxic conditions of the shallow aquifer system (Rahman et al. 2016). Reza et al. (2010) reported high As and low Fe and Mn concentrations in the groundwater of the Meghna floodplain in southeastern Bangladesh; their findings differ from the present observation in the Surma floodplain of northeastern Bangladesh, where elevated concentrations of As, Fe, and Mn all occur.

Table 3 Descriptive statistics of physico-chemical parameters and trace metals in the study area (n = 91)

All the water samples are classified as freshwater type in the study area, because the mean TDS concentration (188.67 ± 93.58 mg/L) is less than 1000 mg/L (Freeze and Cherry 1979). Groundwater flows from the relatively high areas in the northeast toward the southwest (Fig. 2). This may decrease the residence time for rock–water interactions and also decrease the concentrations of groundwater ions. The Na+ concentration (mean 47.16 ± 30.96 mg/L) exceeds the Cl− concentration (mean 12.33 ± 15.73 mg/L), which is likely to indicate an origin from a freshwater source (Hounslow 1995). This is because the study area is located in the highest-precipitation region in Bangladesh. The high concentration of Na+ in groundwater may be due to cation exchange and the dissolution of silicate rock (such as albite) in the study sites (Oinam et al. 2012). Although K+ ion (mean 2.39 ± 1.78 mg/L) occurs naturally in groundwater, industrial and domestic wastes are also responsible for adding it to groundwater (Garg et al. 2009). Furthermore, the concentration of Ca2+ (mean 7.18 ± 6.35 mg/L) exceeds that of Mg2+ (mean 4.15 ± 3.08 mg/L), indicating that the dissolution of calcite may be a dominant factor governing the groundwater chemistry. The HCO3− concentration (mean 147.65 ± 72.06 mg/L) is commonly the outcome of carbonate dissolution. In general, water in the aquifer is of the freshwater category characterized by a high concentration of HCO3− (Wu et al. 2017). The Cl− in groundwater originates from geogenic (chloride-rich minerals) or anthropogenic sources, such as domestic sewage, chemical fertilizers, and septic tanks (Ahada and Suthar 2017). The NO3− concentration (mean 3.18 ± 7.32 mg/L) in groundwater arises from non-point sources including leaching of fertilizers, pesticides, domestic sewage, and animal waste disposal. The elevated contents of NO3− (> 10 mg/L) are found in six sampling sites (S3–5, S14, and S61). 
The SO42− concentration (mean 2.95 ± 5.21 mg/L) in groundwater mostly occurs due to sulfate-bearing mineral dissolution and oxidation (Krouse and Mayer 1999). The skewness of the measured parameters is expected to lie within the standard range of ± 2; otherwise, the distribution is regarded as extreme (Islam et al. 2017b). Here, the Cl−, NO3−, and SO42− ions show highly positively skewed datasets and are considered to be extremes that strongly influence the aquifer system. Likewise, in the case of kurtosis, the K+, Cl−, NO3−, and SO42− ions are leptokurtic (values greater than 3), while the remaining ions are platykurtic, since their values are less than 3 (Table 3).

Factors affecting the groundwater quality

After the initial evaluation of the studied parameters, Pearson correlation analysis is applied to these 13 parameters to identify the parameter pairs that influence the groundwater quality. As shown in Table 4, several physico-chemical parameter pairs of the groundwater samples have a significant positive correlation, namely EC-TDS (r = 0.994), EC-Na+ (r = 0.753), EC-HCO3− (r = 0.882), EC-As (r = 0.617), Na+-HCO3− (r = 0.681), Ca2+-Mg2+ (r = 0.840), Mg2+-NO3− (r = 0.563), and As-HCO3− (r = 0.541), at the 99% confidence level. A significant positive correlation can indicate common sources, which can be natural or anthropogenic, and similar mobility (Haloi and Sarma 2012). For example, the significant positive correlation between EC and Na+ points to a geogenic origin from similar sources in the study area. In addition, the moderate significant correlation between As and HCO3− in the groundwater further indicates the mobilization of As in the groundwater under an anoxic environment (Halim et al. 2010). The trace metals Fe and Mn, with a moderately significant positive correlation [r (Fe-Mn) = 0.502] at the 99% significance level, probably originated from similar point sources. Similar findings have been reported by Islam et al. (2017b) for the shallow groundwater in Rangpur district, Bangladesh, where occurrences of trace elements are affected by redox levels and the nature of the underlying rock of the aquifer. In addition, some pairs have shown an insignificant negative correlation, such as EC-Mn (r = −0.355), Na+-Fe (r = −0.324), and Mn-HCO3− (r = −0.287). Insignificant correlations are also observed among the other parameters. This indicates that the sources of these parameters are independent of each other (Kamrani et al. 2016).
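The correlation step can be illustrated with a short sketch. The function names and the toy EC/TDS values below are hypothetical and are not taken from Table 4; only the coefficient r = 0.541 (As-HCO3−) and the sample size n = 91 come from the text. The t statistic shown is the standard test of a Pearson coefficient against zero, which is assumed here to be the kind of test behind the reported significance levels.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length (at least 3)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic for testing r against zero (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical toy values for illustration only (not from Table 4).
ec = [120.0, 250.0, 310.0, 400.0, 520.0]
tds = [80.0, 160.0, 205.0, 262.0, 340.0]
r_toy = pearson_r(ec, tds)

# Reported coefficient for the As-HCO3 pair (r = 0.541) with n = 91 samples.
t_as_hco3 = t_statistic(0.541, 91)
```

The resulting t value would then be compared against the critical value of Student's t distribution for the chosen significance level.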

Table 4 Correlation coefficient results of physico-chemical parameters in the study area

In order to understand the factors contributing to the physico-chemical parameters that affect the groundwater quality, the hydro-chemical data are subjected to PCA, which allows grouping them based on their inherent characteristics. The PCA is performed with an orthogonal Kaiser’s varimax rotation to make the factors more interpretable without changing the original data structure (Mertler and Vannatta 2005). Thus, the PCA of the current dataset can successfully reduce the contribution of less important parameters in the assessment of groundwater quality. The scree plot is used to determine the number of principal components (PCs, or groups) to be retained to represent the original physico-chemical parameters (Fig. 6). Based on the scree plot, three PCs (three groups) with eigenvalues greater than 1 are extracted, which together contribute about 65% of the total variance (Table 5).
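The retention rule and the cumulative variance can be checked with a few lines of arithmetic. The sketch below uses the variance percentages reported for the three PCs (31.28, 22.09, and 11.19%) and the 13 analyzed parameters. It assumes the data were standardized, so that the total variance equals the number of parameters, and it treats the reported percentages as eigenvalue shares, which is only an approximation once a varimax rotation has redistributed the variance. The names are illustrative.

```python
# Number of standardized parameters entering the PCA (from the text).
N_PARAMETERS = 13

# Percentage of total variance reported for each retained component.
reported_variance_pct = {"PC1": 31.28, "PC2": 22.09, "PC3": 11.19}


def implied_eigenvalue(variance_pct, n_variables):
    """Eigenvalue implied by a variance share when total variance = n_variables."""
    return variance_pct / 100.0 * n_variables


eigenvalues = {
    pc: implied_eigenvalue(pct, N_PARAMETERS)
    for pc, pct in reported_variance_pct.items()
}

# Kaiser criterion: keep components whose eigenvalue exceeds 1.
retained = [pc for pc, ev in eigenvalues.items() if ev > 1.0]

# Cumulative variance explained by the retained components (about 65%).
cumulative_pct = sum(reported_variance_pct[pc] for pc in retained)
```

Under these assumptions all three components pass the eigenvalue threshold, and their shares add up to 64.56%, in line with the rounded figure of 65% quoted in the text.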

Fig. 6 Scree plot for the components of groundwater samples

Table 5 The rotated common factors for loadings, the percentage of variance, and the total cumulative percentage of variance

It is evident that the groundwater sample data can be clustered into three PCs (groups). The first PC (PC1), explaining 31.28% of the total variance, is positively loaded with EC, TDS, Na+, HCO3−, and As. This factor defines the shallow to intermediate groundwater geochemistries and explains the weathering of silicate rock. The Na+ ion may be attributed to cation exchange and the dissolution of the less soluble minerals rather than to halite dissolution. Among the analyzed parameters, Na+ and HCO3− contribute the most and are grouped together in the first PC. Both Na+ and HCO3− are very strongly correlated with EC [r (Na-EC) = 0.753 and r (HCO3-EC) = 0.882] as well as TDS [r (Na-TDS) = 0.743 and r (HCO3-TDS) = 0.881] at the 95% significance level. The aqueous As and HCO3− may originate from the dissolution of carbonate rock, such as limestone, in the study area, accompanied by the release of As due to groundwater–rock interaction (Harvey et al. 2002), which is also supported by their significant mutual correlation [r (As-HCO3) = 0.541]. The second PC (PC2), which explains 22.09% of the total variance, is positively loaded with Ca2+, Mg2+, NO3−, and Fe. The Ca2+ and Mg2+ loadings explain the natural origin of the ions through the dissolution of calcite by carbonic acid. Water recharge in the area is mostly from the Holocene alluvial aquifer containing carbonate rock rich in Ca2+ and Mg2+, resulting in higher concentrations of Ca2+ and Mg2+ in the groundwater. Also, the Sylhet limestone formation is located in the study area and controls the groundwater geochemistry. Water samples in the aquifer may be under-saturated with respect to calcite (Bhuiyan et al. 2016), so that continued dissolution increases Ca2+ and Mg2+ ions along the groundwater flow path. 
Additionally, geogenic factors like oxidation of Fe (Rahman and Gagnon 2014) and rain water that infiltrated into the aquifer after dissolving secondary salts might be another possible factor contributing to groundwater quality. However, the NO3− is derived from leaching with the percolating water due to various agricultural activities such as extensive use of chemical fertilizers and agrochemicals (pesticides) in the study locations, thus representing the anthropogenic factor (Amiri et al. 2014). The third PC (PC3) is positively loaded with SO42−, Mn, and Fe and accounts for 11.19% of the total variance. This factor explains the geogenic origin of groundwater through rock–water interaction with ionic exchange (Omo-Irabor et al. 2008). In addition to natural rock–water interaction such as gypsum dissolution, the excessive use of sulfate fertilizers in agricultural activities can be regarded as another factor enhancing the SO42− concentration in the aquifer. The movement of NO3− and SO42− between PCs (factors) 2 and 3 may be caused by climatic variability. In the study region, Mn and Fe are moderately loaded and may originate from natural factors, being released by chemical weathering of the parent rock (Bodrud-Doza et al. 2016). Bacterial activities are also related to the release of Mn and Fe into the groundwater system (Islam et al. 2013a, b).

An interesting finding in comparing the PCA results is the difference in the PCA weight of concentrations of NO3 , SO4 2−, As, Na+, and HCO3 . The NO3 concentration shows high positive loading with PC2, while As concentrations indicate high positive loading with EC, TDS, Na+, and HCO3 in PC1 and moderate positive loading with SO4 2− in PC3. It implies various sources of contaminants contributing to groundwater quality. The analyzed results indicate that PCA can help as a significant means to recognize the major factors affecting groundwater quality in the study area.

Inter-relation of parameters with entropy and spatial autocorrelation index

Before the EWQI and GWQI classifications of each sample, it is essential to know the relationship of the information entropy value and the entropy weight with the physico-chemical parameters. It is generally accepted that the physico-chemical parameters with the highest entropy weight and the lowest information entropy value have the maximum effects on overall groundwater quality (Shyu et al. 2011; Wu et al. 2011; Gorgij et al. 2017). Accordingly, a higher entropy value indicates a more stable groundwater quality. Entropy weighting reduces the relative error introduced by artificially assigned weights (Amiri et al. 2014): it replaces subjective weight assignment with an efficient and rational weighting method that depends on the degree of disorder of each parameter, as measured by its information entropy. The results of the calculated entropy weight and groundwater quality weight for the 13 analyzed parameters are given in Table 6. It is observed that NO3− has the highest effect on water quality in the study region, because this parameter has the minimum entropy value and the maximum entropy weight among the physico-chemical parameters that were evaluated. The effects of the other parameters on overall groundwater quality decrease in the following order: SO42− > As > Cl− > Fe > K+ > Ca2+ > Mn > Mg2+ > Na+ > EC > TDS > HCO3−. It is also observed that HCO3− has a minimal impact on the overall groundwater quality of the study region. Similar findings are reported by Amiri et al. (2014) for the Lenjanat plain aquifer, Iran, but this observation differs from the previous results of Gorgij et al. (2017), where HCO3− has the highest influence on groundwater quality in the Azarshahr plain aquifer, Iran.
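The link between a low information entropy and a high entropy weight can be made concrete with a minimal sketch. The equations of the method are not reproduced in this section, so the formulation below is an assumption: it follows one commonly used variant in which each parameter is min-max normalized, shifted by 1 to avoid the logarithm of zero, and converted to an entropy e_j, with the weight proportional to 1 − e_j. The function name and the toy data are hypothetical.

```python
import math


def entropy_weights(data):
    """Return (entropies, weights) for a samples-by-parameters table.

    data: list of samples, each a list of parameter values in the same order.
    Assumed variant: min-max normalization, P = (1 + y) / sum(1 + y).
    """
    m = len(data)        # number of samples
    n = len(data[0])     # number of parameters
    entropies = []
    for j in range(n):
        column = [row[j] for row in data]
        lo, hi = min(column), max(column)
        span = hi - lo
        # A constant parameter carries no information: all y are set to 0.
        y = [(v - lo) / span if span > 0 else 0.0 for v in column]
        total = sum(1.0 + v for v in y)
        p = [(1.0 + v) / total for v in y]
        e = -sum(pi * math.log(pi) for pi in p) / math.log(m)
        entropies.append(e)
    redundancy = [1.0 - e for e in entropies]
    total_redundancy = sum(redundancy)
    weights = [d / total_redundancy for d in redundancy]
    return entropies, weights


# Hypothetical toy table: three samples, two parameters.
# Parameter 0 is constant, parameter 1 varies strongly between samples.
toy = [[1.0, 5.0], [1.0, 10.0], [1.0, 50.0]]
toy_entropies, toy_weights = entropy_weights(toy)
```

In the toy table the constant parameter reaches the maximum entropy and receives essentially no weight, while the variable parameter has the lower entropy and takes almost all of the weight, mirroring the ordering logic described for Table 6.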

Table 6 The entropy weight and groundwater weight of physico-chemical parameters

On the other hand, in the case of groundwater weight, as given in Table 6, the weights of most parameters are 4, based on expert evaluation. They reflect the relative significance of the parameters with regard to water quality criteria (Vasanthavigar et al. 2010; Bodrud-Doza et al. 2016; Islam et al. 2017a). As seen from Table 6, the three parameters NO3−, SO42−, and As contribute the most and thus cause the groundwater quality deterioration in the Sylhet district of Bangladesh. The PCA results also confirm that the NO3− and SO42− ions are among the major contaminants influencing groundwater quality. However, EC, TDS, Na+, Ca2+, and Mg2+ have higher factor loadings in F1 and F2, respectively, and are therefore more important than other parameters in the study area. It is considered that the weathered Sylhet limestone formation comprises gypsum as the key source of SO42− and that the higher concentration of NO3− is derived from anthropogenic inputs which cannot change over a short time; the factors affecting these parameters are mainly related to regional climate change. For instance, changes in temperature, precipitation amount, and atmospheric CO2 concentrations have affected the agricultural NO3− source through changes in both soil processes and agricultural productivity (Stuart et al. 2011). However, the major physico-chemical parameters with a higher effect on GWQI computation are NO3−, SO42−, EC, TDS, Na+, As, Mn, and Fe. Therefore, special attention should be paid to NO3−, SO42−, and As, which contaminate the groundwater, because these three parameters have the largest entropy weights and exert strong effects on overall groundwater quality. Once the groundwater is contaminated by NO3−, SO42−, and As, it is quite difficult and expensive to treat and restore the groundwater system of the study area.

The spatial autocorrelation of physico-chemical parameters has been investigated using Moran’s I values to show how neighboring values are related to each other, which affects the overall groundwater quality. Table 7 shows the Moran’s I value and spatial pattern of the analyzed hydro-chemical parameters. Na+ has the highest Moran’s I value with statistical significance (P < 0.01) and thus shows the most clustered pattern of positive spatial autocorrelation, while NO3− has the lowest Moran’s I value, which is statistically non-significant (P > 0.10), and a random spatial pattern of negative spatial autocorrelation. About half of the parameters (e.g., EC, TDS, HCO3−, As, Fe, and Mn) exhibited a clustered spatial distribution with a high positive spatial autocorrelation in the study area.
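For readers unfamiliar with the statistic, the following sketch computes global Moran’s I for a handful of points. The spatial weight matrix used in the study is not stated in this section, so a binary contiguity matrix for four sites arranged along a line is assumed here purely for illustration; the values are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for values observed at n sites.

    weights: n x n spatial weight matrix (weights[i][j] links site i to j).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    sum_sq = sum(d * d for d in dev)
    return (n / w_sum) * (cross / sum_sq)


# Assumed binary contiguity for four sites on a line: 0-1, 1-2, 2-3.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

# Similar values sit next to each other: positive autocorrelation.
i_clustered = morans_i([1.0, 1.0, 5.0, 5.0], chain)

# High and low values alternate: negative autocorrelation.
i_alternating = morans_i([1.0, 5.0, 1.0, 5.0], chain)
```

Positive values of I correspond to the clustered patterns described above, values near zero to a random pattern, and negative values to a dispersed pattern.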

Table 7 Results of spatial autocorrelation index and spatial distribution of physico-chemical parameters

It is hypothesized that a decrease in the entropy value is accompanied by a weaker spatial pattern of the hydro-chemical parameters and a lower spatial autocorrelation. The comparison between the information entropy value and Moran’s I value of each parameter is displayed in Fig. 7. Na+ has the highest spatial autocorrelation and the highest entropy value, whereas NO3− has the lowest spatial autocorrelation and the minimum entropy value. Figure 7 reveals that the spatial autocorrelation values and their corresponding information entropy values agree well, indicating a significant proportional relationship with each other. The results of the spatial autocorrelation analysis and the entropy values of each hydro-chemical parameter confirm that a higher information entropy value indicates a higher degree of spatial autocorrelation, and vice versa. This finding differs from the observation of Gorgij et al. (2017), where a significant inverse relationship is found between the degree of spatial autocorrelation and the entropy of physico-chemical parameters in the Azarshahr plain in Iran.

Fig. 7 The information entropy value and Moran’s I spatial autocorrelation compared with physico-chemical parameters

Characterization of groundwater quality ranks

After the preliminary investigation, the analysis of the linear relationships among the parameters, the identification of possible factors that contribute to the groundwater quality through PCA, and the inter-relationship of parameters with information entropy and the spatial autocorrelation index, the EWQI value and, subsequently, the rank of each sample for drinking purposes are computed by using Eqs. 1–7. The calculation of GWQI values for each sample is also carried out using Eq. 8. These index classifications give the overall status of water quality as a single number, which is easy to interpret. The suitability of the groundwater quality for drinking purposes has been determined using the Bangladesh standard (DoE 1997) values, and the results are shown in Tables 8 and 9, respectively. Table 9 reveals that the range and mean value of EWQIs of the samples are 4.36–331.31 and 86.10, respectively. The critical limit for EWQIs is 100, and 40.66% of the samples fall above this limit. Only 19 groundwater samples (20.87% of all samples) are categorized as good-quality water (rank 2), which is fit for drinking purposes, while 35 groundwater samples (39.56% of all collected samples) are categorized as excellent-quality water (rank 1), showing that the groundwater is fit for drinking uses (Table 8). In addition, 20 samples (21.98%) are marginally suitable for drinking purposes. Thirteen groundwater samples (S4, S11, S15, S18, S20, S21, S30, S44, S53, S58, S61, S62, and S74) are classified as poor-quality water (rank 4), which suggests that these samples are unsuitable for drinking purposes. Only three samples (S5, S31, and S48) are extremely poor-quality water (rank 5). Fe, As, Mn, SO42−, and NO3− are the most common contaminants in samples S5, S31, and S48. This is a common indicator of anthropogenic effects on groundwater quality. 
The extremely poor quality is due to the generally higher concentrations of these parameters than the acceptable limit in these samples.
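The step from concentrations to an index value and a rank can be sketched as follows. Both indices are assumed here to take the usual weighted-sum form, in which each parameter receives a quality rating equal to its concentration divided by the permissible limit and multiplied by 100, and the ratings are combined with normalized weights (entropy weights for EWQI, expert-assigned weights for GWQI). The rank boundaries of Table 8 are not reproduced in this section; the values 50, 100, 150, and 200 below are an assumption based on a widely used five-class scheme and should be replaced by the Table 8 limits. All names and the toy sample are hypothetical.

```python
def weighted_quality_index(concentrations, standards, weights):
    """Weighted sum of quality ratings (concentration / standard * 100)."""
    total_weight = sum(weights[name] for name in concentrations)
    index = 0.0
    for name, conc in concentrations.items():
        rating = conc / standards[name] * 100.0
        index += weights[name] / total_weight * rating
    return index


# ASSUMED upper bounds for ranks 1-4 (not taken from Table 8).
ASSUMED_RANK_UPPER_BOUNDS = [(50.0, 1), (100.0, 2), (150.0, 3), (200.0, 4)]


def quality_rank(index_value):
    """Map an index value to a rank from 1 (excellent) to 5 (extremely poor)."""
    for upper, rank in ASSUMED_RANK_UPPER_BOUNDS:
        if index_value < upper:
            return rank
    return 5


# Hypothetical sample with two parameters and equal weights.
sample = {"param_a": 50.0, "param_b": 10.0}
limits = {"param_a": 100.0, "param_b": 10.0}
equal_weights = {"param_a": 1.0, "param_b": 1.0}
toy_index = weighted_quality_index(sample, limits, equal_weights)

# Ranks of the reported minimum, mean, and maximum EWQI under the assumed bounds.
rank_of_min = quality_rank(4.36)
rank_of_mean = quality_rank(86.10)
rank_of_max = quality_rank(331.31)
```

Because the same function serves both indices, the only difference between an EWQI and a GWQI value in this sketch lies in the weights that are passed in.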

Table 8 Classification of groundwater quality indices for drinking purpose compared with other studies
Table 9 Assessment results of the EWQI and GWQI value classifications with the quality rank of 91 total samples based on the Bangladesh standard

However, the results show that the GWQI values vary from 6.23 to 307.21 with a mean of 108.85 (Table 9). Vasanthavigar et al. (2010) have proposed the critical limit (100) for drinking purposes. The results indicate that 49.45% of water samples exceed the critical limit (100) of the GWQI classification (Table 9). Interestingly, at most of the sampling sites the same locations receive less favorable quality ranks in terms of GWQI values than in terms of EWQI values. Based on the GWQI classification, 53.84% of the groundwater samples (49 samples) fall in the group of excellent- to good-quality water ranks, whereas 18.68% of samples (S7, S15, S16, S19, S21, S28, S32, S37, S42, S44, S58, S60, S61, S73, S78, S80, S82) belong to the group of poor quality (rank 4) and 12.08% of water samples (S4, S5, S11, S18, S20, S30, S31, S48, S53, S62, and S74) are identified in the extremely poor category (rank 5), which is not fit for drinking purposes. Out of 91 samples, 14 samples (15.38%) are classified as of medium water quality (rank 3), which can be suitable for drinking uses after pre-treatment (Table 8). These findings are echoed by the similar works of Bodrud-Doza et al. (2016) and Bhuiyan et al. (2016), who have characterized the groundwater quality based on the GWQI classification for drinking uses around the central and southeastern parts of Bangladesh, respectively. According to Table 9, 11 samples (S4, S5, S11, S18, S20, S30, S31, S48, S53, S62, and S74) have trace element contents above the prescribed values. Therefore, these 11 samples show extremely poor-quality water (rank 5) for drinking uses based on the GWQI calculation. The EWQI classification, however, assigns two quality ranks to these 11 samples, because the parameters in this technique carry different weights. 
For instance, based on EWQI calculation, only three samples (S5, S31, and S48) have extremely poor water quality (rank 5) while another eight samples (S4, S11, S18, S20, S30, S53, S62, and S74) are of poor water quality (rank 4). It is also found that 46 samples have the same rank between the EWQI and GWQI classifications.

In fact, there are some paradoxes in ranking water quality for the EWQI and GWQI classifications. The main reason is that the GWQI classification needs to assign artificial weight for calculation of water quality and the concentrations of some parameters are higher than their corresponding acceptable limits. It can be seen as a weakness in the accuracy of a tool like GWQI classification. As a consequence, in these locations where groundwater chemistry demonstrates the presence of some toxic trace metals like arsenic in the groundwater system, the EWQI calculation of groundwater for drinking uses provides more reliable results than GWQIs and is also consistent with the outcomes of field investigation. The results of EWQIs are consistent with previous observations of Amiri et al. (2014) where they have shown that comparing the two index classifications, the entropy weight made EWQIs perfect and the assessment results are more reasonable than other index methods. These findings are in disagreement with the results of Kamrani et al. (2016) who have reported that among three index classifications, the fuzzy-based water quality index method acts more accurately than the two other methods GWQIs and EWQIs.

Furthermore, Pearson’s correlation coefficient is also applied to explain the association between the pairs of water quality indices and physico-parameters and how the parameters affect the two index methods (Table 10). For example, the EWQI values in groundwater have significantly positive correlation to Fe (r = − 0.957). But they are moderately positively correlated to Mn (r = − 0.530), NO3 (r = − 0.552), and Mg2+ (r = 0.500) suggesting the moderate impact from NO3 , Mn, and Fe on the overall groundwater quality. The EWQI values have insignificantly negative correlation to Na+ (r = − 0.226) and HCO3 (r = − 0.033) in the study area. Likewise, the GWQI values have significantly positive correlation to Fe (r = − 0.972) and Mn (r = − 0.639). They are also positively correlated to NO3 (r = − 0.401) and Mg2+ (r = 0.418), whereas GWQIs have insignificantly negative correlation to Na+ (r = − 0.268) and HCO3 (r = − 0.114) in the studied samples. It is evident that both water quality index classifications show a moderate positive correlation with NO3 in the study locations, suggesting that anthropogenic inputs, such as the discharges of domestic and small industrial wastewater, are the major reason that contributes to high NO3 concentration in underground water (Su et al. 2017). This finding is in good agreement with the results of Fogbate et al. (2014) in western Nigeria, where they have found a significant correlation between water quality index and NO3 . Both index values are highly correlated with each other (r = 0.983 at the 99% significance level), indicating the consistency of the results in the analyzed samples. Hence, these correlation coefficient results provide a useful numerical insight into the similarity between the pairs of two parameters that affect the groundwater quality at northeastern Bangladesh.

Table 10 Pearson correlation coefficient analysis of the EWQIs and GWQIs with physico-chemical parameters

The results of the EWQI and GWQI methods exhibit broadly similar trends for most of the groundwater samples (Fig. 8). The GWQI values have demonstrated higher spatial variability, whereas the EWQI values have shown lower spatial variability in the study area. Overall, groundwater quality in most parts of Sylhet district shows a large spatial variation (Fig. 8). Most of the sampling locations with excellent to good qualities are located in urban, semi-urban, municipal, and rural areas along the study sites (Fig. 1). Increasing concentrations of trace elements such as As, Mn, and Fe and of ions including NO3−, SO42−, and Cl− are mostly responsible for contaminating the groundwater, owing to excessive use of fertilizers in agricultural fields, gypsum dissolution, and ionic exchange in the urban and rural wastewaters infiltrating into the aquifer. Under these settings, the concentration of solutes and groundwater contaminants could be influenced by meteorological causes, with rainfall as the most dynamic factor (through changes in the dissolution rate and dilution of contaminants), which leads to oscillations in groundwater quality.

Fig. 8 Spatial variations in groundwater quality index values in the study area

Figure 9 shows that the overall water quality in a deep well (146–300 m) is better than that in a shallow well (15–70 m) and intermediate well (71–145 m), which may be attributed to the direct infiltration from the rainfall during the monsoon season (July to October), which is the dominant recharge source of the groundwater. It can be said that the deep aquifer can be considered as suitable for human consumption and less contamination occurs here currently.

Fig. 9 Depth-wise distributions of the groundwater quality index values for water samples

Semivariogram model and spatial distribution analysis

The semivariogram γ(h) model is computed, and the scatter plot of γ(h) vs. h (distance) is made after standardizing the groundwater dataset. Among the available interpolation methods, the universal kriging technique, which is regarded as a robust tool to support initial policy-making on groundwater quality assessment, is employed in this study in northeastern Bangladesh. Various theoretical semivariogram models are fitted to the measured values, and the model with the best fit and the minimum nugget value is chosen (Goovaerts 1997). The nugget, sill, lag size, nugget/sill ratio, and range of the best fit semivariogram model are given in Table 11. In Fig. 10, the binned symbols denote the experimental semivariogram values around the omnidirectional model shown as the blue line, and the plus signs indicate the averaged semivariogram values. In this study, the best fit model is identified from the prediction errors: ME and MSE values that are nearly zero, an RMSSE value that is close to 1, and the lowest values of RMSE and ASE indicate the most suitable model for the datasets (Islam et al. 2017b). The Gaussian model has been chosen as the best fit experimental semivariogram model for the EWQI and GWQI values according to the minimum values of RMSE and ASE and RMSSE values which are close to 1. This finding is echoed by the related research of Munna et al. (2015), who have analyzed the spatial structure of groundwater physico-chemical parameters in the Sylhet city corporation area, Bangladesh. Moreover, the ranges vary with topographic and geometric factors in any region where the large distance and variation of groundwater quality parameters may be affected by meteorological factors such as rainfall, runoff, and wind speed, and by agronomic practices. 
The range of the model in this study for the two index methods varies from 5.78 km for EWQIs to 5.69 km for GWQIs, which is enough to examine the spatial distribution of groundwater quality indices in Sylhet district, Bangladesh. Figure 11 displays the scatter plots for the predicted and measured groundwater quality index values. After fitting the best model, the cross-validation tests are applied to investigate how well the index model predicts at unsampled sites. Results show that the comparisons between predicted and measured values are reasonably good (Fig. 11).
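The model-selection criteria mentioned above can be computed from cross-validation output with a short routine. The acronyms are read here in their usual geostatistical sense (ME: mean error, RMSE: root mean square error, ASE: average standard error, MSE: mean standardized error, RMSSE: root mean square standardized error), which is an assumption because they are not expanded in this section. The function name and the toy numbers are hypothetical.

```python
import math


def cross_validation_stats(measured, predicted, std_errors):
    """Prediction-error statistics for a kriging cross-validation run.

    std_errors: kriging standard error reported for each prediction.
    """
    n = len(measured)
    errors = [p - m for p, m in zip(predicted, measured)]
    standardized = [e / s for e, s in zip(errors, std_errors)]
    return {
        "ME": sum(errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "ASE": math.sqrt(sum(s * s for s in std_errors) / n),
        "MSE": sum(standardized) / n,
        "RMSSE": math.sqrt(sum(z * z for z in standardized) / n),
    }


# Hypothetical index values at three sites (not from Table 11).
toy_stats = cross_validation_stats(
    measured=[10.0, 20.0, 30.0],
    predicted=[12.0, 18.0, 30.0],
    std_errors=[2.0, 2.0, 2.0],
)
```

A well-calibrated model would show ME and MSE near zero, RMSE close to ASE, and RMSSE near 1; an RMSSE below 1, as in the toy numbers, suggests that the kriging standard errors overstate the actual prediction error.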

Table 11 The best fit semivariogram models of groundwater quality indices and their variance using universal kriging interpolation technique
Fig. 10 The best fit semivariogram models of groundwater quality indices in the study area. a EWQI. b GWQI

Fig. 11 Scatter plot diagrams show the best fit model performance of groundwater quality indices. a EWQI. b GWQI

The spatial dependence of groundwater quality index values depends on the nugget/sill ratio. The nugget/sill values of both indices are 0.885 and 0.836, respectively. As a standard classification, the variable is regarded to be a strong spatial dependence if the nugget/sill ratio is less than 0.25, a moderate spatial dependence if the ratio is between 0.25 and 0.75, and a weak spatial dependence if the ratio is more than 0.75 (Shi et al. 2007). Both indices exhibit a weak spatial dependence in the semivariogram model (Fig. 10a, b); it suggests that the moderate natural and anthropogenic influences spatially affect the groundwater quality. Similar findings are obtained from the most recent study in the southeastern part of Bangladesh by Bhuiyan et al. (2016), but these findings differ from the results of Bodrud-Doza et al. (2016) and Islam et al. (2017b) where the moderate to strong spatial dependencies have been identified in the semivariogram model in the northern and central parts of Bangladesh.
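The classification rule quoted from Shi et al. (2007) is simple enough to state as code. The sketch below applies it to the two reported nugget/sill ratios (0.885 and 0.836) and also includes a Gaussian semivariogram function for reference. The Gaussian form with a factor of 3 in the exponent (the practical-range convention) is an assumption, since conventions differ between software packages; the function names are illustrative.

```python
import math


def spatial_dependence(nugget_sill_ratio):
    """Classify spatial dependence from the nugget/sill ratio."""
    if nugget_sill_ratio < 0.25:
        return "strong"
    if nugget_sill_ratio <= 0.75:
        return "moderate"
    return "weak"


def gaussian_semivariogram(h, nugget, partial_sill, practical_range):
    """Gaussian model value at lag h (assumed practical-range convention)."""
    return nugget + partial_sill * (1.0 - math.exp(-3.0 * (h / practical_range) ** 2))


# Nugget/sill ratios reported for the two indices.
class_first = spatial_dependence(0.885)
class_second = spatial_dependence(0.836)

# With zero nugget and unit sill, the model reaches about 95% of the sill
# at the practical range (5.78 km is the range reported for EWQIs).
share_at_range = gaussian_semivariogram(5.78, 0.0, 1.0, 5.78)
```

Both reported ratios exceed 0.75, so the rule places both indices in the weak spatial dependence class.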

The universal kriging method is used to generate spatial maps of groundwater datasets (n = 91) for two groundwater quality indices (EWQIs and GWQIs). The spatial distribution maps demonstrate an increasing trend from the southwest to the northeast in the study area (Fig. 12). The high EWQI values (rank 5) are found in the north and northeastern parts of Gowainghat, Jaintapur, and Kanaighat upazila of the sampling locations, while low EWQI values (rank 1) are observed in the southwestern and central parts of Biswanath, Fenchuganj, Sylhet Sadar, and Gopalganj upazila of Sylhet district, Bangladesh, indicating a combination of point-source factors that affect groundwater quality (Fig. 12a). Medium EWQI values are widely identified in southwestern, northwestern, eastern, and southeastern parts including Balaganj, Companiganj, Zakiganj, and Beanibazar upazila of Sylhet district, indicating the existence of heterogeneous factors that contribute to the groundwater quality contamination (Fig. 12a). However, the relatively high GWQI values (rank 4) are found in the northern, northeastern, and southeastern parts such as Jaintapur, Gowainghat, Kanaighat, and Beanibazar upazila of Sylhet district, which may be attributed to the combined effects of contamination from escape of ions, over-exploitation of groundwater, release of effluents, and runoff from agronomic fields (Dash et al. 2010; Islam et al. 2017b). Islam et al. (2015) have found a poor water quality type (rank 4) in several sampling sites in the northwestern part of Bangladesh, where detrimental health impacts occur due to consumption of drinking water. This finding is supported by conditions at rural sites in the northern part (Gowainghat upazila) such as S31, which lies along a highway boundary and shows evidence of waste dumping and excessive use of NO3− fertilizer in agricultural areas (Fig. 1). This is an alarming condition from a human health perspective due to the extremely poor water quality (rank 5). 
In fact, the high values that both indices exhibit in the northern part of Gowainghat and Jaintapur upazila of the sampling locations are alarming, as most of the households rely on groundwater for domestic purposes. The irregularities observed in the northeastern part of Kanaighat upazila of the GWQI map correspond to the highest GWQI values (rank 5 in sampling site S48), restricting its use for human consumption (Fig. 12b). The GWQI distribution pattern demonstrates a spatially heterogeneous trend in various directions. It shows a decreasing trend from the center to the southwest and an increasing trend from the center toward the northern part (Gowainghat upazila) of the study region. However, the spatial distribution of GWQI maps shows that the medium values are widespread, not localized in any particular part of the study site. It is advisable that proper treatment measures be taken and that the groundwater quality status for drinking purposes be checked as a matter of priority for the Sylhet district, Bangladesh. Otherwise, long-term drinking of contaminated groundwater will pose a great threat to human health. Therefore, groundwater should be pre-treated before consumption by local inhabitants, particularly in the Gowainghat, Jaintapur, and Kanaighat upazila study areas. Future studies will concentrate on more groundwater sampling locations to gain better insights into the geogenic and human factors that contaminate the aquifer system.

Fig. 12 The spatial distribution maps of two index values obtained by groundwater quality indices in the study area. a EWQI. b GWQI

Conclusions

In this research, an attempt is made to characterize the groundwater quality for drinking purposes and the spatial variability of the groundwater quality indices in the Sylhet district of Bangladesh, using an integrated approach that combines the entropy method, multivariate statistics, a spatial autocorrelation index, and geostatistics. The dominance of the major cations is in the order Na+ > Ca2+ > Mg2+ > K+, whereas the major anions follow the order HCO3− > Cl− > NO3− > SO42−. The results show that NO3−, SO42−, and then As are the parameters that contribute most to groundwater quality degradation, and special attention should be paid to these parameters to prevent further groundwater contamination in the study region. The outcomes of PCA and correlation coefficient analysis are consistent with these findings. In addition, a significant proportional relationship exists between the spatial autocorrelation index and the entropy of a physico-chemical parameter. This relationship illustrates the combined application of the entropy method and the spatial autocorrelation index, which provides useful insight into groundwater quality concerns.
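
As an illustration of the entropy method referred to above, the following Python sketch computes entropy weights and an entropy-weighted index for a handful of samples. It is a minimal sketch under stated assumptions, not the exact procedure of this study: the min-max normalisation, the convention that terms with zero probability contribute nothing to the entropy, the rank thresholds (50, 100, 150, 200), the permissible limits, and all sample concentrations are hypothetical placeholders chosen for illustration.

```python
import math


def entropy_weights(samples):
    """Return one entropy weight per parameter (column).

    samples: list of rows (one per sample, at least two), each row holding
    the concentrations of the same parameters in the same order.
    """
    m = len(samples)
    n = len(samples[0])
    information = []
    for j in range(n):
        column = [row[j] for row in samples]
        lo, hi = min(column), max(column)
        span = hi - lo
        # Min-max normalisation (assumed); a constant column is treated as uniform.
        y = [(v - lo) / span if span > 0 else 1.0 for v in column]
        total = sum(y)
        p = [v / total for v in y]
        # Shannon entropy scaled to [0, 1]; terms with p = 0 contribute 0.
        e = -sum(pi * math.log(pi) for pi in p if pi > 0) / math.log(m)
        information.append(max(0.0, 1.0 - e))
    norm = sum(information)
    return [d / norm for d in information]


def ewqi(sample, limits, weights):
    """Entropy-weighted index: sum of weight * (concentration / limit) * 100."""
    return sum(w * (c / s) * 100.0 for c, s, w in zip(sample, limits, weights))


def quality_rank(value):
    """Map an index value to rank 1..5 using assumed thresholds."""
    for rank, upper in enumerate((50.0, 100.0, 150.0, 200.0), start=1):
        if value < upper:
            return rank
    return 5


# Hypothetical data: columns are NO3-, SO4 2-, As (all in mg/L).
samples = [
    [12.0, 8.0, 0.004],
    [35.0, 15.0, 0.012],
    [60.0, 22.0, 0.030],
    [5.0, 4.0, 0.002],
]
limits = [50.0, 250.0, 0.01]  # placeholder permissible limits (assumed)

weights = entropy_weights(samples)
scores = [ewqi(row, limits, weights) for row in samples]
ranks = [quality_rank(s) for s in scores]
```

In this sketch, parameters whose normalised values are spread more unevenly across the samples have lower entropy and therefore receive larger weights, so they dominate the resulting index.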

The results reveal that 39.55 and 46.14% of groundwater samples (n = 36 and 44 locations) fall in the “medium” to “extremely poor” quality classes under the EWQI and GWQI classifications, respectively, and are not fit for drinking purposes. Groundwater quality is severely degraded along the main groundwater flow path, because the concentrations of solutes and contaminants can be altered by meteorological factors, with rainfall as the most dynamic agent. Furthermore, a moderate spatial dependency has been observed in the semivariogram model, which indicates weak natural and anthropogenic influences on groundwater quality. Spatial distribution maps demonstrate that high index values are mainly concentrated in the northern and northeastern parts of the sampled area, in Gowainghat, Jaintapur, and Kanaighat upazilas. These areas need to adopt a long-term groundwater quality monitoring policy. On the other hand, low index values are distributed across the southwestern and central parts, in Biswanath, Fenchuganj, and Sylhet Sadar upazilas of Sylhet district, Bangladesh, showing less degraded groundwater quality. This research provides not only comprehensive information on drinking water quality for policy-making, but also a meaningful reference for regional water managers and practitioners.
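
The degree of spatial dependency read from a semivariogram is often expressed through the nugget-to-sill ratio. The sketch below assumes the commonly used thresholds of 25% and 75% (strong below 25%, moderate from 25 to 75%, weak above 75%). The function name, the thresholds, and the example values are assumptions for illustration and are not taken from the fitted models of this study.

```python
def spatial_dependency(nugget, sill):
    """Classify spatial dependency from the nugget-to-sill ratio.

    nugget: nugget variance of the fitted semivariogram model.
    sill:   total sill (nugget plus partial sill); must be positive.
    """
    if sill <= 0:
        raise ValueError("sill must be positive")
    ratio = nugget / sill
    # Assumed thresholds: < 0.25 strong, 0.25 to 0.75 moderate, > 0.75 weak.
    if ratio < 0.25:
        return "strong"
    if ratio <= 0.75:
        return "moderate"
    return "weak"


# Hypothetical semivariogram parameters, for illustration only.
example_class = spatial_dependency(nugget=0.42, sill=1.0)
```

A ratio in the middle band is usually read as a mix of structured spatial variation and unexplained short-range variation, which is one way to interpret a "moderate" label.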