Introduction

Groundwater contamination with heavy metals is one of the most important environmental issues in the world (Kumar et al. 2012). Groundwater also plays a major role in the livelihoods and economy of Bangladesh. However, rapid urbanization, agricultural activities, and natural geochemical processes are directly or indirectly altering the chemical composition of groundwater. Rangpur Sadar Upazila (a subdistrict, a smaller administrative unit) in Rangpur district of Bangladesh currently faces problems of garbage and solid waste disposal. Contaminants from this garbage and solid waste, from ceramics and small-scale industries, and from other sources in the study area enter the groundwater. High trace element contents in groundwater can be hazardous to the people who consume it. The problem has become more acute because the contamination comes from a variety of anthropogenic and natural sources. Anthropogenic sources such as agriculture, transportation, and industry are responsible for mobilizing contaminants into groundwater at local and regional scales. Natural (geogenic) sources such as chemical weathering of rocks and minerals can also lead to higher concentrations of trace metals, which raise human health and toxicity concerns for terrestrial and aquatic life (Singh 2005). A wide range of human health problems such as cancer, hypertension, vascular disease, and lung disease are associated with the consumption of contaminated groundwater (Smith et al. 2000). Hence, continuous monitoring and assessment of groundwater trace metal quality for drinking and irrigation purposes help to protect lives and the environment.

The spatial distribution of trace elements in groundwater can assist in understanding possible sources and identifying extents of the contaminated area. The trace element contamination is mainly controlled by the geological and geochemical heterogeneity in the groundwater aquifer. Geostatistical modeling is used to estimate the trace element concentration at unmeasured points and non-sampled locations to illustrate groundwater contamination (Webster and Oliver 2001). A detailed explanation of the geostatistical model has been well reported in various literatures (Isaaks and Srivastava 1989; Goovaerts 1997; Webster and Oliver 2001). This model has been successfully applied to explore the spatial distribution of trace metals in groundwater in many parts of the world (Yalcin et al. 2007; Kumari et al. 2014). The interpolation model is classified into geostatistical (e.g., simple kriging, ordinary kriging) and deterministic (e.g., inverse distance weighted) models (Burrough and McDonnell 1998). These two models depend on the similarity of sampling points. Among various interpolation models, the best model for a particular data set is obtained by comparing accuracy of their results (Ghanbarpour et al. 2013). The BGS-DPHE (2001) assessed the spatial variation and prediction of arsenic (As) contamination through kriging geostatistical modeling at non-sampled sites of Bangladesh. A number of researchers have reported the spatial variation of arsenic contaminations in groundwater for the different areas of Bangladesh (Gaus et al. 2003; Hossain et al. 2006; Hossain et al. 2007). This study has been designed to apply various interpolation models to compare spatial distribution of trace elements in groundwater samples.

Multivariate approaches such as principal component analysis and cluster analysis are useful tools for identifying common patterns in data distribution, reducing the initial dimension of data sets and facilitating their interpretation (Franco-Uría et al. 2009). However, multivariate statistical approaches are quite difficult to interpret and apply. Because these approaches often require a large amount of data, they can add unnecessary complexity and obscure the understanding of the data. Simple statistical approaches offer more attractive options in earth sciences, though the results may deviate from real situations (Gajendran and Thamarai 2008). A correlation study provides an excellent tool for predicting parametric values with a reasonable degree of accuracy (Mathur et al. 2010). Correlation and regression models of groundwater quality parameters not only help in assessing water quality but also provide important steps for implementing water quality management practices (Kumar et al. 2011). The reliability of trace metal contents is predicted from the values of coefficient of variance in the correlation matrix study. Groundwater trace element assessment using a number of statistical approaches together with geostatistical modeling has received little attention in Rangpur district, Bangladesh.

A few studies have assessed heavy metals in the coal mine area of northwest Bangladesh using indices and multivariate statistical techniques; for instance, Bhuiyan et al. (2010) evaluated heavy metal pollution in the groundwater system of a coal mine area of northwest Bangladesh using multivariate approaches. That study showed that the mean concentrations of Co, Ni, Cu, and Pb in the water samples exceeded the WHO and Bangladesh standards and that the groundwater system is highly contaminated by anthropogenic sources. Unfortunately, very little information is available about the status of trace element concentrations in groundwater and their spatial distribution in Rangpur district, Bangladesh. Elevated concentrations of trace elements such as Fe, Mn, and Ba in the groundwater system are of great public concern in the study area owing to their toxicity, persistence, and extensive bioaccumulation. There is a growing need for a detailed, systematic, and careful study that identifies all possible sources, supports monitoring, and creates a reliable trace element database. The main objective of the study is to assess trace elements in groundwater and their spatial distribution in Rangpur district, Bangladesh. Possible sources of trace elements and the areas of elevated concentration are delineated based on statistical and geostatistical modeling of the element distribution.

Material and methods

Description of the study area

Rangpur district (a small administrative unit), located in the northern part of Bangladesh, has been selected for the study. Geographically, it is positioned between 25°18′N and 25°56′N and between 88°58′E and 89°30′E. It is bounded by Nilphamari district on the northeast, Gaibandha district on the northwest, Kurigram district on the east, and Dinajpur district on the west (Fig. 1). The study area is chosen primarily for its proximity to garbage pollution sources and its environmental significance. Samples have been collected from eight Upazilas of the district. The Upazilas (the smallest administrative unit, a subdistrict) are Rangpur Sadar, Pirgachha, Mithapukur, Pirganj, Badarganj, Taraganj, Gangachara, and Kaunia. Physiographically, the study area falls in the Old Himalayan piedmont plain land (Islam et al. 2014). Overall, about 80% of the district consists of alluvial soil of the Tista Basin and 20% of Barind land. Groundwater is mainly recharged by the monsoon rainfall. The climate of this region is characterized by irregular monsoons, high temperature, high humidity, and heavy rainfall. The highest average temperatures, recorded in May, June, July, and August, are about 32 to 36 °C, while the lowest average temperature, observed in January, is about 7 to 16 °C (Banglapedia 2006). The annual rainfall in the study area is about 1448 mm. The groundwater quality at Gangachara and Pirgacha Upazila (north and eastern parts of the area) is mostly dominated by the Tista River.

Fig. 1 Location map showing sampling sites of Rangpur district, Bangladesh

Sampling and analytical procedure

A total of 47 groundwater samples have been collected for chemical analysis from randomly selected sampling points at depths ranging from 10 to 53 m in the study area during the dry season (April–June) of 2014 (Fig. 1). The locations of the wells are recorded using a global positioning system (GPS) (Kansas, USA). The information about well depths is collected from the records kept by the well owners and the Local Government Engineering Department (LGED). All groundwater samples are collected in prewashed high-density polypropylene (HDPP) bottles following the standard method of APHA-AWWA-WEF (2005). The 47 samples are put in a cooler box, transferred to the laboratory, and kept in a freezer for subsequent use. The trace metal concentrations (i.e., As, Fe, Mn, Zn, Ba, and Al) in water samples are measured by inductively coupled plasma mass spectrometry (Thermo Scientific XSERIES 2 ICP-MS) (Jarvis et al. 1996), which is linearly calibrated from 10 to 100 μg/l with custom multi-element standards (SPEX CertiPrep, Inc., NJ, USA) before running the test samples. The accuracy and precision of the analyses are tested by running duplicate analyses on selected samples. The relative standard deviation for the measured elements is found to be within ±2%.

Statistical and geostatistical analyses

Descriptive statistics are used to derive the standard statistical parameters (maximum, minimum, mean, standard deviation, and variance) for groundwater trace metal quality. The correlation matrix measures how well the variance of each variable can be explained by its relationships with the others (Liu et al. 2003). The terms “strong,” “moderate,” and “weak/insignificant” are applied to the correlation matrix analysis following the approach of Liu et al. (2003) and refer to absolute values of >0.75, 0.75–0.50, and 0.50–0.30, respectively. When more than two variables are considered simultaneously, a multiple linear regression model is used to evaluate their interdependency (Adhikari et al. 2009). In the multiple regression model, the coefficient of determination, r², is more easily interpretable than the correlation coefficient, r, as a measure of the degree of association, because r² is equal to the proportion of the total variability in the dependent variable that may be attributed to the effects of the independent variables. The Pearson correlation matrix and multiple linear regression analyses have been carried out using SPSS software (version 22). Pearson’s correlation matrix is applied to identify the relationships among pairs of elements.
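
The correlation classes and the r² interpretation described above can be illustrated with a minimal Python sketch. The study itself used SPSS; the function names below (`pearson_matrix`, `classify_r`, `r_squared`) are hypothetical, and the label returned for |r| below 0.30 is an assumption, since the text defines no class for that range.

```python
import numpy as np

def pearson_matrix(data):
    """Pearson correlation matrix; each column of `data` is one variable."""
    return np.corrcoef(np.asarray(data, dtype=float), rowvar=False)

def classify_r(r):
    """Label |r| with the classes quoted in the text (Liu et al. 2003)."""
    a = abs(r)
    if a > 0.75:
        return "strong"
    if a >= 0.50:
        return "moderate"
    if a >= 0.30:
        return "weak"
    return "below weak range"  # assumption: no class is defined under 0.30

def r_squared(X, y):
    """Coefficient of determination of a least-squares fit of y on X
    (with intercept): the share of variability in y explained by X."""
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
```

For example, `classify_r(0.543)` returns `"moderate"`, the class of the Fe and Ba pair reported later in the correlation matrix.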

Geostatistical modeling is applied to the spatial distribution of trace elements in the groundwater system, as in applied hydrological studies. This model is well reported in the most recent literature (e.g., Kumari et al. 2014; Ağca et al. 2014; Bhuiyan et al. 2016). In the present study, the spatial variability structure of the groundwater trace element data sets is assessed by interpolation techniques, namely ordinary kriging (OK), simple kriging (SK), and inverse distance weighting (IDW). These interpolation models are compared with each other to identify the best fit model for describing the spatial variability of trace elements in the study.

Kriging is a spatial interpolation model used to estimate the value of discrete variables at an unsampled location based on the data sets of discrete variables and structural features of a semivariogram model. Kriging estimates accuracy and reliability of the spatial distribution of the data sets. In the first step, the spatial variation is measured by a semivariogram model (Burgess and Webster 1980). The semivariogram model γ(h) is calculated by Eq. 1, by Journel and Huijbregts (1978):

$$ \gamma (h)=\frac{1}{2 n}{\sum}_{i=1}^n\left[ z\left({x}_i\right)- z\left({x}_i+ h\right)\right]{}^2 $$
(1)

where n is the number of pairs of sample points separated by the lag distance h (Burrough and McDonnell 1998), and z(x_i) is the value of variable z at location x_i.
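
As an illustration of Eq. 1, the following Python sketch computes an experimental semivariogram by grouping point pairs into distance bins. It is not the ArcGIS procedure used in the study; the function name, the equal-width binning, and the parameters `lag` and `n_lags` are assumptions made for the example.

```python
import numpy as np

def empirical_semivariogram(coords, values, lag, n_lags):
    """Experimental semivariogram following Eq. 1.

    Pairs whose separation falls in [k*lag, (k+1)*lag) are pooled in bin k;
    gamma[k] is the sum of squared differences divided by twice the number
    of pairs in that bin. Bins without pairs are returned as NaN.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)          # each pair once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    bins = np.floor(dist / lag).astype(int)

    gamma = np.full(n_lags, np.nan)
    counts = np.zeros(n_lags, dtype=int)
    for k in range(n_lags):
        mask = bins == k
        counts[k] = mask.sum()
        if counts[k] > 0:
            gamma[k] = sqdiff[mask].sum() / (2.0 * counts[k])
    centers = (np.arange(n_lags) + 0.5) * lag          # bin midpoints
    return centers, gamma, counts
```

The returned `counts` array is worth inspecting in practice, because bins with very few pairs give unstable semivariance estimates.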

The semivariogram model γ(h) is tested as the best fit model and is described by the nugget (C0), the sill (C), and the range (A0) parameters. In the second step, an appropriate semivariogram model (e.g., spherical, exponential, Gaussian, or circular) is chosen on a trial-and-error basis. In the present study, spherical, exponential, Gaussian, and circular models have been applied to estimate the spatial dependence/autocorrelation of the groundwater trace elements. The level of random variation within the data set is represented by the nugget, and the sill is equal to the variance of the random variable (Webster and Oliver 2001).
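
For orientation, three of the theoretical models named above can be written as short Python functions. This is a sketch under stated assumptions: `psill` is taken to be the partial sill (total sill minus nugget), the exponential and Gaussian forms use the common practical-range convention with a factor of 3, and the parameterization in ArcGIS may differ. The circular model is omitted.

```python
import numpy as np

def _finish(h, g):
    # By definition the semivariogram is zero at zero separation.
    return np.where(h == 0, 0.0, g)

def spherical(h, nugget, psill, a):
    """Spherical model: reaches nugget + psill exactly at the range a."""
    h = np.asarray(h, dtype=float)
    g = nugget + psill * (1.5 * h / a - 0.5 * (h / a) ** 3)
    g = np.where(h >= a, nugget + psill, g)
    return _finish(h, g)

def exponential(h, nugget, psill, a):
    """Exponential model: approaches the sill asymptotically
    (about 95% of psill at h = a under the assumed convention)."""
    h = np.asarray(h, dtype=float)
    return _finish(h, nugget + psill * (1.0 - np.exp(-3.0 * h / a)))

def gaussian(h, nugget, psill, a):
    """Gaussian model: parabolic near the origin, asymptotic sill."""
    h = np.asarray(h, dtype=float)
    return _finish(h, nugget + psill * (1.0 - np.exp(-3.0 * (h / a) ** 2)))
```

Fitting one of these curves to the experimental semivariogram, by trial and error or by least squares, yields the nugget, sill, and range values that are later compared between elements.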

Kriging models can be classified into various types. For this study, ordinary kriging (OK) and simple kriging (SK) models have been applied to show the spatial variation of trace elements and to compare with the inverse distance weighting (IDW) model. The OK model is calculated by Eq. (2), and to ensure that the estimates are unbiased, the sum of the weights (λ_i) must be equal to 1 (Ghanbarpour et al. 2013).

$$ \widehat{z}\left({x}_o\right)={\sum}_{i=1}^n{\lambda}_i z\left({x}_i\right) $$
(2)

where ẑ(x_o) is the estimated value at the unsampled point x_o, z(x_i) is the observed value at point x_i, λ_i is the weight assigned to that point, and n is the number of sampled points used for the estimation (Webster and Oliver 2001). The OK model estimates the local constant mean (Goovaerts 1997).
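
The ordinary kriging estimate of Eq. 2 can be sketched in Python by solving the kriging system with a Lagrange multiplier, which enforces the condition that the weights sum to 1. The function name and the `gamma` argument (any semivariogram function that returns 0 at zero distance) are assumptions for illustration; the study's maps were produced in ArcGIS, not with this code.

```python
import numpy as np

def ordinary_kriging(coords, values, target, gamma):
    """Ordinary kriging estimate at `target` (Eq. 2).

    Solves  [Gamma 1; 1^T 0] [lambda; mu] = [gamma_0; 1],
    where Gamma holds semivariances between samples and gamma_0 those
    between the samples and the target. Returns (estimate, weights).
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(values)

    # Pairwise distances between sample points
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)

    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma(d)
    A[n, n] = 0.0

    b = np.ones(n + 1)
    b[:n] = gamma(np.linalg.norm(coords - np.asarray(target, dtype=float), axis=1))

    sol = np.linalg.solve(A, b)
    weights = sol[:n]                  # sol[n] is the Lagrange multiplier
    return float(weights @ values), weights
```

With a hypothetical linear semivariogram `gamma = lambda h: h` and two samples placed symmetrically around the target, both weights come out as 0.5, so the estimate is the average of the two observations.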

The SK model is determined by Eq. (3):

$$ \widehat{z}\left({x}_o\right)= m+{\sum}_{i=1}^n{\lambda}_i\left[ z\left({x}_i\right)- m\right] $$
(3)

where m is the mean and all other variables are defined as in Eq. 2.

For the IDW model, the weight (λ_i) depends only on the distance between the sample location and the prediction point; unlike the kriging models, it does not use the overall spatial autocorrelation structure. The weighting is controlled by the power parameter: the greater the power, the more strongly nearby points dominate the prediction relative to distant ones (Goovaerts 2000). The weight decreases with increasing distance from the prediction point in the IDW model. The sum of the weights (λ_i) is equal to 1 in IDW. The formula is presented in Eq. 4:

$$ {\lambda}_i=\frac{d_{i0}^{-p}}{{\sum}_{j=1}^n d_{j0}^{-p}} $$
(4)

where d_i0 denotes the distance between sample location i and the prediction point. As the distance becomes larger, the weight decreases as an inverse power of the distance, at a rate set by the power parameter p. Therefore, the IDW model produces a relatively rough surface, which depends on the distance between sample points (Burrough and McDonnell 1998). In this study, power parameters (p) of 1 and 2 are used to provide a basis for comparing the effect of the power parameter. Out of these three models, the ordinary kriging model is applied for constructing a spatial distribution map of the variables because of its simplicity and prediction accuracy for the study (Gorai and Kumar 2013).
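
The weighting in Eq. 4 is simple enough to sketch directly in Python. The function names are hypothetical, and the handling of a prediction point that coincides with a sample (returning that sample's value) is an assumption added to avoid division by zero.

```python
import numpy as np

def idw_weights(coords, target, p=2):
    """Normalized inverse distance weights (Eq. 4); they sum to 1."""
    d = np.linalg.norm(
        np.asarray(coords, dtype=float) - np.asarray(target, dtype=float), axis=1)
    if np.any(d == 0):
        # Assumption: a coincident sample takes all the weight.
        w = (d == 0).astype(float)
    else:
        w = d ** (-float(p))
    return w / w.sum()

def idw_predict(coords, values, target, p=2):
    """IDW prediction: weighted average of the observed values."""
    return float(idw_weights(coords, target, p) @ np.asarray(values, dtype=float))
```

For two samples at distances 1 and 2 from the prediction point, the weights are 2/3 and 1/3 with p = 1 and 0.8 and 0.2 with p = 2, which shows how a larger power shifts influence toward the nearer sample.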

A cross validation test is used to compare the interpolation techniques and find an optimal model for each trace element (Isaaks and Srivastava 1989). Semivariogram models are tested and validated for each trace metal variable to select the best one. The selected interpolation model is employed to provide the best predictions of spatial variability for each data set. Cross validation removes each data point, one at a time, and predicts the associated value using the remaining data points, so that the observed and predicted values can be compared for all data points. The mean error (ME), mean standardized error (MSE), root mean square error (RMSE), average standard error (ASE), and root mean square standardized error (RMSSE) values are assessed to establish the best fit model performances. Hu et al. (2004) discussed several criteria for using error measurements to judge the validation of interpolation models. The model with the minimum ME and RMSE is taken to give the most accurate prediction results. These errors are estimated by Eqs. 5 to 9:

$$ \mathrm{ME}=\frac{1}{n}{\sum}_{i=1}^n\left({p}_i-{o}_i\right) $$
(5)
$$ \mathrm{RMSE}=\sqrt{\frac{1}{n}{\sum}_{i=1}^n{\left({p}_i-{o}_i\right)}^2} $$
(6)
$$ \mathrm{MSE}=\frac{1}{n}{\sum}_{i=1}^n\left({ps}_i-{os}_i\right) $$
(7)
$$ \mathrm{RMSSE}=\sqrt{\left[\frac{1}{n}{\sum}_{i=1}^n\left({ps}_i-{os}_i\right){}^2\right]} $$
(8)
$$ \mathrm{ASE}=\sqrt{\frac{1}{n}{\sum}_{i=1}^n{\left({p}_i-\frac{1}{n}{\sum}_{j=1}^n{p}_j\right)}^2} $$
(9)

where n is the number of observed points, o_i and p_i are the observed and predicted values at location i, os_i is the standardized observed value, and ps_i is the standardized predicted value. After completing the cross validation process, the geostatistical model offers a graphical representation of the distribution of the groundwater trace elements. ArcGIS (version 10.2) has been used for this interpolation modeling.
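
The leave-one-out procedure and the error measures of Eqs. 5 to 9 can be sketched in Python as follows. The function names are hypothetical. The text does not state how values are standardized, so the sketch assumes that predicted and observed values are both divided by a per-point prediction standard error `se`, which makes ps_i − os_i equal to (p_i − o_i)/se_i; ASE follows Eq. 9 as written.

```python
import numpy as np

def leave_one_out(coords, values, predict):
    """Remove each point in turn and predict it from the remaining points.

    `predict(train_coords, train_values, target)` is any interpolator
    returning a single value.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    preds = np.empty(len(values))
    for i in range(len(values)):
        keep = np.arange(len(values)) != i
        preds[i] = predict(coords[keep], values[keep], coords[i])
    return preds

def cv_errors(pred, obs, se):
    """Cross validation statistics following Eqs. 5 to 9."""
    p = np.asarray(pred, dtype=float)
    o = np.asarray(obs, dtype=float)
    err = p - o
    z = err / np.asarray(se, dtype=float)   # assumed standardization
    return {
        "ME": err.mean(),                                # Eq. 5
        "RMSE": np.sqrt(np.mean(err ** 2)),              # Eq. 6
        "MSE": z.mean(),                                 # Eq. 7
        "RMSSE": np.sqrt(np.mean(z ** 2)),               # Eq. 8
        "ASE": np.sqrt(np.mean((p - p.mean()) ** 2)),    # Eq. 9 as written
    }
```

Running `leave_one_out` once per candidate interpolator and comparing the resulting dictionaries mirrors the model comparison described above.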

Results and discussion

Descriptive statistics of trace elements

The descriptive statistics of trace elements for the groundwater samples are summarized in Table 1. According to the results, Fe has the highest mean concentration in the groundwater, followed by Mn, Ba, Zn, Al, and As. This order of trace element abundance is similar to those of Bodrud-Doza et al. (2016) and Bhuiyan et al. (2016) but differs from that of Reza et al. (2010). The mean values of Fe (7726.46 ± 6559.34 μg/l) and Mn (684.48 ± 754.74 μg/l) are higher than the drinking water quality standards set by DoE (1997) and WHO (2011) at most of the sample locations in the southeastern parts of the study area (Pirganj and Pirgacha Upazila). The concentrations of As, Fe, Mn, Zn, Al, and Ba range from 0.5 to 42.8, 47 to 22,400, 85 to 4960, 6 to 234, 10 to 160, and 6 to 176 μg/l, with mean values of 8.80 ± 10.01, 7726.46 ± 6559.34, 684.48 ± 754.74, 33.29 ± 38.99, 27.44 ± 28.62, and 44.55 ± 28.60 μg/l, respectively (Table 1). The standard skewness should be within the range ±2; otherwise, the distribution is regarded as extreme (Reimann et al. 2008). Mn, Zn, Al, and Ba concentrations are highly positively skewed and are considered extreme in the groundwater samples. Similarly, in terms of kurtosis, these trace element concentrations are leptokurtic (values >3), whereas the Fe concentration is platykurtic. The high Fe concentration in groundwater is a potential human health hazard since the water is used for drinking and cooking. Both Fe and Mn concentrations in groundwater are detrimental because of their effect on the form and taste of the water and their ability to cause staining. Islam et al. (2015) reported similar findings in assessing health hazards of metal concentrations in groundwater of Bangladesh. Merrill et al. (2012) reported on iron consumption by women in rural northwestern Bangladesh, where potentially unpleasant organoleptic qualities of high iron contents in the groundwater system are observed. The Ba concentration is higher than the prescribed drinking water quality standards (DoE 1997) in the central and eastern parts, such as the Mithapukur and Kaunia Upazila sampling locations (Fig. 1). However, the mean concentrations of As, Al, and Zn are within acceptable limits for drinking purposes. High As, low Fe, and low Mn in groundwater of the Meghna floodplain in the southeastern region of Bangladesh are reported by Reza et al. (2010), but these results differ from the observations in the study area, where very high concentrations of Fe, Mn, and Ba are observed, with health and environmental implications for the people of Rangpur district, Bangladesh.

Table 1 Descriptive statistics and drinking standards of selected trace element concentrations in the study area

Box plots show As, Fe, and Mn concentrations for trace elements of groundwater samples (Fig. 2). Assessment of mean, median, maximum, and minimum concentrations of Fe and Mn indicate that these high concentrations have exceeded the permissible limit for drinking waters in most of the sampling sites. On the other hand, a box plot of arsenic (As) distribution is characterized by lower arsenic concentration below the standard limit. It is observed from Fig. 2 that the length of the boxes in the case of Fe is relatively large in comparison to that for the remaining arsenic (As) and Mn concentrations, which reveals large spatial variation. It may be attributed to very high concentrations of Fe generally available in the groundwater system at most sampling sites over the study area. Figure 3 shows the concentration of trace metals in groundwater with respect to depth. Most of the metals exist in the shallow aquifer at depth ranges from 19 to 32 m. The trace metal concentrations in the groundwater system resulting from the oxidation process of sulfide minerals reported here are likely to be of concern to human health and the environment (Bhuiyan et al. 2010). For example, the oxidation process of sulfide minerals such as pyrite is given in the following equation (Garrels and Thompson 1960):

$$ \mathrm{FeS}_2\left(\mathrm{s}\right)+\mathrm{H}_2\mathrm{O}+\frac{7}{2}\mathrm{O}_2\left(\mathrm{aq}\right)\to \mathrm{Fe}^{2+}+2\mathrm{SO}_4^{2-}+2\mathrm{H}^{+} $$

Fig. 2 Box plots show the mean, median, maximum, and minimum values of As, Fe, and Mn distributions in the study area

Fig. 3 Relative concentration of trace elements in groundwater, depth-wise in the studied sample (ae). All concentrations are in units of micrograms per liter

As another example, the mineral sphalerite (ZnS) in contact with groundwater releases Zn into the environment by oxidation through the reaction below:

$$ \mathrm{ZnS}\left(\mathrm{s}\right)+2\mathrm{O}_2\left(\mathrm{aq}\right)\to \mathrm{Zn}^{2+}+\mathrm{SO}_4^{2-} $$

Likewise, the elevated Fe, Mn, and Ba concentrations released into the groundwater are probably due to the oxidation of pyrite, siderite, and barite (Sakurovs et al. 2007). Fe and Mn are generally found in rocks and minerals in an insoluble form. The high concentration of Fe in water is due to geogenic origin and occurs naturally in soils, rocks, and minerals. It could be originated from a rock-water interaction or weathering of an iron-rich rock (Bloundi et al. 2009). It is also noted that a high concentration of Fe has been found in the shallow aquifer at depth ranges from 18 to 32 m (Fig. 3a). About 97% samples of the study area at depth ranges from 20 to 35 m are found to be contaminated by Mn as they have exceeded the WHO’s permissible limit (Fig. 3b). Groundwater might be polluted by Mn due to mobilization of either geogenic or enriched anthropogenic sources in the chemical weathering of soils and minerals (Buragohain et al. 2010). It is assumed that Mn and Fe have enhanced the development of iron bacteria which obtain their energy for growth from the chemical reaction when Mn reacts with dissolved oxygen. Water comes in contact with the solid materials dissolving Fe and Mn, releasing their constituents to the groundwater system (Haloi and Sarma 2012). Generally, a high concentration of Ba has been found in deeper aquifers that are sulfate depleted. This is because of natural dissolution of barite minerals in aquifer where sulfate has been bacterial reduced (Baldi et al. 1996). However, a high Ba concentration has occurred in the shallow aquifer at depth ranges from 19 to 32 m which are more vulnerable to contamination of groundwater from geogenic sources (Fig. 3c). The Zn concentration in groundwater is relatively lower as a result of the leaching of Zn from piping and fittings (Nriagu 1980). All the sampling points in the study area fall below the international standard for Zn and As contents in groundwater at depth ranges from 18 to 35 m (Fig. 3e). 
Arsenic (As) originates from the arsenic-rich bedrock of the aquifer and enters the groundwater that is pumped to the surface through tube wells. The low concentration of arsenic (0.5 to 42.8 μg/l) in groundwater in the northern region contrasts with findings from the southwestern region, where high concentrations of arsenic in groundwater are reported (Hossain et al. 2007). The Al content of groundwater at sampling points S-11 (19.8 m) and S-20 (26.8 m) is relatively high, exceeding the limit defined by DoE (1997) (Fig. 3f). The relatively high concentration of Al in groundwater is due to dissolution of Al from clay minerals and other alumino-silicate minerals found in soils and rocks (Buragohain et al. 2010).

Correlation matrix analysis

In the study, Pearson’s correlation matrix is generated in order to find possible sources, origin, and covariance of trace elements of groundwater samples (Table 2). A correlation matrix depicts inter-elemental relationships as well as new associations between the elements. Strong (p < 0.01) and significant correlations (p < 0.05) are observed in the trace elements of groundwater samples in the study area. Arsenic exhibits a weak significant positive correlation with Fe (r = 0.494, p < 0.01), Zn (r = 0.399, p < 0.01), and Ba (r = 0.322, p < 0.05) where arsenic shows an insignificant correlation with Al (r = 0.102) and Mn (r = 0.254). These associations reveal amalgam common sources of either geogenic or anthropogenic origin. Fe shows a moderately significant positive correlation with Ba (r = 0.543) at p < 0.01 while Fe demonstrates an insignificant correlation with Mn (r = 0.024), Zn (r = 0.145) and Al (r = 0.118). These correlations indicate similar sources of geogenic origin and mobility (Haloi and Sarma 2012). These results may be attributed to geogenic sources from the parent rocks, and other anthropogenic sources are limited to small-scale industries, domestic sewage, and agricultural activities in the study area. Similar observations are reported by Chapagain et al. (2010) in the deep groundwater quality in Kathmandu, Nepal, where trace metal occurrences are possibly influenced by redox levels and the nature of the underlying sediment of groundwater. Mn reveals an insignificant negative correlation with Zn (r = −0.005) where Mn depicts also an insignificant positive correlation with Al (r = 0.017) and Ba (r = 0.046). These metals indicate that a variety of sources are involved with groundwater pollutants in the study area. Zn exhibits a weak positive significant correlation with Al (r = 0.294, p < 0.05) and Ba (r = 0.169). 
Ba shows a weak significant positive correlation with Al (r = 0.438, p < 0.01), indicating an anthropogenic origin of groundwater contamination. Since large-scale industries do not exist in the study area, small-scale industries, agricultural fertilizer, and stagnant water may contribute as the major sources of this groundwater chemical alteration. It is hypothesized that trace metals with a relatively significant positive correlation possibly originated from the same pollution sources (Mansouri et al. 2012).

Table 2 Pearson correlation matrix among trace element concentrations of groundwater samples

Multiple regression model

A multiple regression technique is applied to develop suitable models relating a given trace element to a set of statistically most significant independent variables. Multiple regression models for predicting As, Fe, Mn, Zn, Al, and Ba concentrations from known concentrations of the other trace elements in the groundwater are presented in Table 3. Six independent variables are observed to have a significant effect (t test for the partial regression coefficients at the 5% significance level) on the resultant dependent variables. The prediction of arsenic (As), the dependent variable, from all independent variables (Fe, Mn, Zn, Al, and Ba) is fairly good. For example, the multiple r² value of 0.419 for arsenic (As) indicates that 41.9% of the variation in arsenic (As) concentration could be explained by variability in the other five independent variables used in the model. In predicting arsenic (As), the independent variables Fe, Mn, and Zn have a statistically significant effect at the 5% significance level (Table 3). The Ba model, with a multiple r² value of 0.44, shows that 44% of the variability in Ba concentration could be explained by the combined effects of As, Fe, Mn, Zn, and Al concentrations. Ba concentration is mostly dependent on Fe and Al concentrations, which have a statistically significant effect (Table 3). The regression model of Mn shows that only 9.5% of the variation in Mn concentration can be explained by this model. Furthermore, the models for Fe, Al, and Zn concentrations account for 43.3, 27.3, and 24.6% of the variability, respectively. For predicting Fe concentration, the independent variables arsenic (As) and Ba are statistically significant. Similar multiple regression models have been applied to predict groundwater quality parameters in various regions of the world (e.g., Adhikari et al. 2009; Kumar et al. 2011; Sundari et al. 2013).

Table 3 Multiple regression models for trace elements in the study area

Geostatistical modeling

The spatial distribution of trace element concentrations is analyzed by various interpolation techniques (SK, OK, and IDW) over the study area. To extract the spatial distribution of each trace element, the most accurate interpolation technique is selected using cross validation. The predicted and observed values are compared for each model using the accuracy measures (Eqs. 5 to 9) to test the robustness of the predicted models (Table 4). Table 4 presents the comparison of interpolation models for the trace elements, based on these accuracy measures. The choice of the best semivariogram model is based on the ME, MSE, RMSE, RMSSE, and ASE criteria. A model is considered robust and accurate when the ME and MSE are close to zero, the RMSE and ASE are as small as possible, and the RMSSE is close to 1 (Adhikary et al. 2010). Cross validation identifies, for each trace metal, the model that gives the most accurate spatial distribution for the study area. As can be seen in Table 4, for arsenic concentration, the SK technique with a spherical semivariogram model is the optimal interpolation model in comparison with the other two models (OK and IDW), based on an overall comparison of RMSE. In addition, for Mn and Al concentrations, the SK technique with the circular semivariogram model is the optimal interpolation model. Conversely, in the case of Fe concentration, the OK technique with the Gaussian semivariogram model is the most accurate interpolation model, while for Ba concentration, the IDW technique with a power parameter of 1 is the best interpolation model because of its minimum root mean square error (RMSE). As a result, kriging interpolation techniques (OK and SK) with various semivariogram models give better performances for each trace element except Ba (Table 4). The SK technique is thus identified as the most accurate interpolation model for As, Mn, Zn, and Al concentrations when compared with the OK and IDW models. 
However, IDW prediction is slightly less precise than kriging interpolation techniques (OK and SK), in terms of RMSE.

Table 4 Comparison between interpolation models for trace elements of groundwater using cross validation results

The semivariogram γ(h) models are calculated and the scatter plot of γ(h) vs. h (distance) is prepared using ArcGIS (version 10.2). Different theoretical semivariogram models are fitted to the estimated values, and the model with the best fit and the smallest nugget value is chosen (Goovaerts 1997). The nugget, sill, lag size, nugget/sill ratio, and range of the best fitted semivariogram models are shown in Table 5. Three classes are used to interpret the nugget/sill ratio: if the ratio is less than 25%, it shows strong spatial dependence; if the ratio is between 25 and 75%, it indicates moderate spatial dependence; and if the ratio is more than 75%, it represents weak spatial dependence (Shi et al. 2007). Figure 4 shows the experimental semivariogram (binned sign) around the omnidirectional semivariogram model (blue line) and the average of the semivariogram (plus sign). Kriging models (OK or SK) produce the most accurate spatial distribution maps for all elements except Ba (Table 5). The Gaussian semivariogram model is observed to be the best fit model for Fe and Zn concentrations, while the circular semivariogram model fits best for Mn and Al concentrations. In contrast, the spherical semivariogram model fits well for As concentration (Table 5). The IDW model is identified as the optimal model for Ba concentration, for which no spatial autocorrelation can be shown using semivariogram analysis. The major ranges vary from 7.44 to 32.68 km, with the greatest range measured for Fe (32.68 km) and the smallest for Mn (7.44 km). The Gaussian semivariogram model shows high spatial variability for the higher range of Fe concentration, while the circular semivariogram model represents less spatial structure for the lower range of Mn concentration. 
The ranges vary due to topographic and geometric factors of groundwater, while the large distances and variation of trace element concentrations could mostly be affected by precipitation, runoff, and fertilizer application. In this study, the results show that Fe and Al concentrations exhibit a strong spatial dependence (Fig. 4c, e) while As, Zn, and Mn concentrations exhibit a moderate spatial dependence in the semivariogram shapes (Fig. 4a, b, d). The moderate to strong spatial dependence of trace elements is reflected in the small nugget effect in the semivariogram shape and is attributed to the low topographic variability of groundwater across the residential, agricultural, and industrial areas. Similar observations were reported in the most recent study conducted in the central region of Bangladesh by Bodrud-Doza et al. (2016), but these findings differ from those of Bhuiyan et al. (2016), who reported weak spatial dependence, with a large nugget effect in the semivariogram shape, in the southeastern part of Bangladesh.
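
The nugget/sill classification and two of the theoretical models named above can be sketched in a few lines. This is an illustration under stated assumptions, not the ArcGIS implementation: `sill` is taken to be the total sill (nugget plus partial sill), the Gaussian model uses the practical-range convention with a factor of 3, ratios of exactly 25% or 75% are treated as moderate, and all parameter values are hypothetical.

```python
import numpy as np

def spherical(h, nugget, partial_sill, a):
    """Spherical semivariogram; reaches the total sill at range a."""
    h = np.asarray(h, dtype=float)
    g = nugget + partial_sill * (1.5 * h / a - 0.5 * (h / a) ** 3)
    g = np.where(h >= a, nugget + partial_sill, g)
    return np.where(h == 0, 0.0, g)  # gamma(0) = 0 by definition

def gaussian(h, nugget, partial_sill, a):
    """Gaussian semivariogram with practical range a (assumed convention)."""
    h = np.asarray(h, dtype=float)
    g = nugget + partial_sill * (1.0 - np.exp(-3.0 * (h / a) ** 2))
    return np.where(h == 0, 0.0, g)

def spatial_dependence(nugget, sill):
    """Classify the nugget/sill ratio (%) following the three classes above."""
    ratio = 100.0 * nugget / sill
    if ratio < 25.0:
        return "strong"
    if ratio <= 75.0:
        return "moderate"
    return "weak"

# Hypothetical parameters: nugget 0.1, partial sill 0.9, range 5 km.
lags_km = np.array([0.0, 2.5, 5.0, 10.0])
print(spherical(lags_km, 0.1, 0.9, 5.0))
print(gaussian(lags_km, 0.1, 0.9, 5.0))
print(spatial_dependence(nugget=0.1, sill=1.0))
```

A small nugget relative to the sill means that most of the variance is spatially structured, which is why low ratios are read as strong spatial dependence.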

Table 5 The optimal interpolation models and their best fit semivariogram models for each trace element
Fig. 4 Best fitted semivariogram models of trace elements in the study area. a As concentration. b Fe concentration. c Mn concentration. d Zn concentration. e Al concentration

Spatial distribution maps of trace elements

The optimal interpolation model is applied to prepare the spatial distribution map of each trace element in the groundwater samples of the study area (Fig. 5). Trace element concentrations exhibit complex distribution patterns with a decreasing trend from southeast to northwest in the study area. High concentrations of Fe, Mn, and Zn are observed in the south and southeastern parts, in Pirganj and Pirgacha Upazila of the sampling area, while low concentrations of Al, Ba, and As are identified in the northern part, in Taragonj, Badargonj, and Gangachara Upazila of Rangpur district, Bangladesh, indicating heterogeneous point sources of groundwater pollution (Fig. 5). This strongly indicates different sources and processes of pollution, which is also confirmed by the low negative correlation between Mn and Zn reported in the earlier part of the paper. Extremely high concentrations of As, Fe, and Zn are observed in the south, central, and southeastern parts, such as Pirganj, Rangpur Sadar, and Pirgacha Upazila of Rangpur district, suggesting the existence of combined sources of pollution (Fig. 5a, b, d). High mean concentrations of Fe and Zn are identified in the central to southeastern parts, such as Rangpur Sadar, Mithapukur, Pirganj, and Pirgacha Upazila of Rangpur district, which might be attributed to the combined effect of contamination from domestic sewage, small industries, and runoff from fertilized agricultural fields (Dash et al. 2010). The urban area in the central part (Rangpur Sadar) of the sampling site, through which a highway passes, and some evidence of waste dumping and brick fields support this statement (Fig. 1). However, higher values of Fe, As, and Ba in the southeastern part are also supported by a moderately positive significant correlation of Fe with As and Ba (Table 2). The higher concentrations of these trace elements are an alarming situation for human health.
In fact, the high concentration of Fe in the eastern part of Pirgacha Upazila of the sampling site is alarming, as most households depend on groundwater for domestic uses. The irregularities identified at the northern boundary of Badargonj Upazila in the Mn distribution map correspond to the highest Mn values, which restrict the use of this groundwater for drinking purposes (Fig. 5c). The Mn distribution pattern reveals an irregular and complex trend in different directions. It shows an increasing trend from the central part to the northwest and a decreasing trend from the southeast toward the central part (Rangpur Sadar Upazila) of the study area. Mn and Zn are likely to have different origins, as they show different distribution patterns (Fig. 5) in comparison to other elements. The low negative correlation (Table 2) also confirms that different processes involve Mn and Zn concentrations. The distribution map of Al differs slightly from the Ba distribution map, as it shows an increasing trend from the north to the central parts (Gangachara to Rangpur Sadar Upazila) of the study area (Fig. 5e). This peripheral difference is confirmed by a weakly positive correlation (Table 2) between Mn and Al. The Ba distribution map depicts relatively complex spatial variability and trends in the north to south direction (Fig. 5f). However, the spatial distribution of Ba indicates that the higher values of Ba are widespread and not localized in any specific part of the study area. In addition, there is no spatial autocorrelation between the samples based on semivariogram models (Table 5). The IDW model has been validated as the optimal spatial model for the Ba spatial distribution map. Therefore, proper measures should be taken on an urgent basis to monitor the groundwater trace element quality in the study area. However, the spatial analysis has been conducted on a comparatively small set of sampling points (47 groundwater samples).
More sampling points are required for a better understanding of the geogenic sources of trace element contamination in the groundwater system.

Fig. 5 Spatial distribution maps of trace elements by using optimal interpolation models. a As concentration. b Fe concentration. c Mn concentration. d Zn concentration. e Al concentration. f Ba concentration

Conclusions

This paper presents a set of statistical approaches, namely descriptive statistics, correlation, multiple regression, and geostatistical modeling, which are employed to evaluate the trace elements of groundwater and their spatial distribution in Rangpur district, Bangladesh. Trace elements show dominance in the following order: Fe > Mn > Ba > Zn > Al > As. The descriptive statistics show that the mean concentrations of Fe, Mn, and Ba exceed the permissible limits in some places of the southeastern part, such as Pirganj and Pirgacha Upazila of Rangpur district; those concentrations can aggravate the risk to human health. In addition, Mn, Zn, Al, and Ba concentrations are highly positively skewed and are regarded as extreme in the groundwater samples. The correlation matrix shows that some of the elements have a moderately positive significant correlation, while others show a low negative insignificant correlation with each other. The multiple regression models can predict groundwater trace metal quality at the 5% significance level in Rangpur district, Bangladesh.

The geostatistical models (kriging models) are effective tools to support initial decision making for groundwater trace metal assessment in Rangpur district, Bangladesh. The models are able to find trends of higher trace element concentrations, which indicate hot spots in the study area. These hot spots of Fe, Mn, and Ba concentrations could have originated from geogenic sources. These geogenic pollutants are highly alarming for groundwater consumption in the study area. However, anthropogenic sources are also a concern. Further studies are required to gain adequate knowledge of the uncertainty at the sampling sites. The spatial distribution maps show that the trace elements have different spatial distribution patterns. The correlation matrix and multiple regression model results also support the spatial distribution outcomes for the elements in the groundwater samples. This study provides sufficient background information on trace elements, their possible sources, the controlling factors of groundwater pollutants, and their spatial distribution in the study area. This paper is expected to help decision makers take adaptive measures for site-specific groundwater trace element monitoring in Rangpur district, Bangladesh.