Comparison of ordinary kriging and artificial neural network for spatial mapping of arsenic contamination of groundwater

Chowdhury, Mohammad; Alouani, Ali; Hossain, Faisal

doi:10.1007/s00477-008-0296-5

Comparison of ordinary kriging and artificial neural network for spatial mapping of arsenic contamination of groundwater

Technical Note
Published: 12 December 2008

Volume 24, pages 1–7, (2010)
Cite this article

Download PDF

Access provided by CONRICYT-eBooks

Stochastic Environmental Research and Risk Assessment Aims and scope Submit manuscript

Comparison of ordinary kriging and artificial neural network for spatial mapping of arsenic contamination of groundwater

Download PDF

Mohammad Chowdhury¹,
Ali Alouani¹ &
Faisal Hossain²

1408 Accesses
53 Citations
Explore all metrics

Abstract

In this technical note, we investigate the hypothesis that ‘non-linearity matters in the spatial mapping of complex patterns of groundwater arsenic contamination’. The spatial mapping pertained to data-driven techniques of spatial interpolation based on sampling data at finite locations. Using the well known example of extensive groundwater contamination by arsenic in Bangladesh, we find that the use of a highly non-linear pattern learning technique in the form of an artificial neural network (ANN) can yield more accurate results under the same set of constraints when compared to the ordinary kriging method. One ANN and a variogram model were used to represent the spatial structure of arsenic contamination for the whole country. The probability for successful detection of a well as safe or unsafe was found to be atleast 15% larger than that by kriging under the country-wide scenario. The probability of false hopes, which is a serious issue in public health monitoring was found to be significantly lower (by more than 10%) than that by kriging.

A Comparison of Some Interpolation Techniques for Determining Spatial Distribution of Nitrogen Compounds in Groundwater

Article Open access 05 June 2019

Comparison of interpolation methods for the estimation of groundwater contamination in Andimeshk-Shush Plain, Southwest of Iran

Article 07 October 2015

Assessment of the Potential Contamination Risk of Nitrate in Groundwater Using Indicator Kriging (in Amol–Babol Plain, Iran)

Discover the latest articles, news and stories from top researchers in related subjects.

Artificial Intelligence

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Extensive groundwater contamination by arsenic is observed in many alluvial aquifers of the world today. Some examples are: West Bengal (Mazumder et al. 1998), Taiwan (Tseng et al. 1968), Vietnam (Berg et al. 2001), Bangladesh (Nahar et al. 2008) and Mexico (Del Razo et al. 1990). The problem is also widespread in many parts of the United States (US). A recent risk assessment study by Twarakavi and Kaluarachchi (2006) has shown that rural regions in southern California, Arizona, Florida, Washington states and a few others scattered throughout the US currently face a high risk from arsenic exposure through untreated ground water consumption. However, the case of Bangladesh stands out because of the seriousness of contamination. Estimates show that about 103 million (70% of the Bangladesh rural population) depend on shallow wells that have been excavated at a depth of less than 150 m (Ahmed 2003), and more than half of the Bangladesh population may be at risk due to the high levels of arsenic in these groundwaters (Yu et al. 2003).

High levels of arsenic in the groundwater can cause numerous health concerns among the public if left untreated after extraction from the aquifer and prior to public consumption. Soluble arsenic compounds are generally rapidly absorbed into the body from the gastro-intestinal tract (Hindmarsh and McCurdy 1986). Studies have shown that twenty years of sustained consumption of contaminated water exceeding 50 μg/l of arsenic can cause internal cancers (lung, kidney, liver and bladder) and affect 10% of all exposed (Tseng et al. 1968; Schwartz 1997).

Thus, detection of groundwater arsenic contamination can prevent widespread diseases which could otherwise be very costly to treat. Spatial mapping of arsenic contamination on the basis of sparse in situ sampling data can be considered one such cost-effective and non-structural method of contamination detection at non-sampled locations. In particular, rural regions with isolated ground water supply systems lacking central treatment facilities (i.e., parts of the US or most regions of developing countries) stand to benefit tremendously because of the high level of cost-effectiveness offered by spatial mapping schemes. Because each one of the vast number of drinking wells (exceeding 10 million for Bangladesh) cannot be frequently and directly tested, a spatial mapping scheme based on limited field measurements can bridge the gap between lack of adequate testing resources and the urgency of quarantining unsafe zones or identifying regions of the aquifer unaffected by contamination.

Conventional methods for spatial mapping of groundwater contamination based on linear geostatistical theory (such as kriging) can however have high uncertainty at non-sampled locations (Hossain et al. 2007). In these geostatistical methods, the spatial characteristics of the ‘random’ function (contaminant concentration) are described at most by the second order moment (covariance). Hence, these conventional approaches are optimum for representing multi-Gaussian random fields. In the presence of complex subsurface heterogeneities and given the non-linearity of the contaminant transport process, it is highly unlikely that the contaminant distribution will exhibit a multi-Gaussian characteristic (Faybishenko 2002).

For the case of groundwater contamination in resource-limited settings, a high level of uncertainty can hamper cost-effective management and remediation efforts. As an example, if the unbiased estimation uncertainty of kriging (comprising only random error) for a contaminant variable is observed to be greater than the Environmental Protection Agency (EPA)-specified safe limit, then it is likely that many non-sampled regions could be predicted by kriging as uncontaminated and hence, safe for public consumption. Sustainable water resources management in resource-poor settings (such as developing countries) therefore requires us to explore alternative approaches that seek to minimize these well-known limitations of linear geostatistical techniques such as kriging.

In this technical note, the objective of our study is to explore the validity of the hypothesis that ‘non-linearity matters in the spatial mapping of complex patterns of groundwater arsenic contamination’. Herein, spatial mapping refers exclusively to the data-driven technique of spatial interpolation based on sampling data at finite locations. Using the well known example of extensive groundwater contamination by arsenic in Bangladesh, we compare the results obtained from spatial mapping using two common mapping techniques. The essential algorithmic distinction between these two techniques is in non-linearity of the spatial behavior that can or cannot be modeled. The two methods are: (1) ordinary kriging (a best linear unbiased estimator) and (2) artificial neural network (ANN; a learning technique capable of generalizing non-linear spatial pattern for non-sampled locations).

2 Study region, data and mapping tools

2.1 Study region and data

Our study region was Bangladesh excluding the dense forests in the Southwest and Southeast (Fig. 1). Geologically, the region is made up of mainly old oxic Pleistocene deposits in the north and relatively young anoxic Holocene deposits in the south. Arsenic data were obtained from the British Geological Survey (BGS) which, in collaboration with local authorities in Bangladesh, surveyed randomly selected wells from 1998 to 2000. Hereafter, this survey is referred to as the BGS-DPHE (2001) survey. The main dataset comprised 3,534 wells and is freely available from the internet at http://www.bgs.ac.uk/arsenic/. The wells sampled were constructed between 1962 and 1999. Measurement of arsenic was taken at a single depth close to the screen for each well, wherein the depths varied from 10–300 ft below the surface. In our study we excluded data pertaining to well depths larger than 100 ft as the arsenic problem is mostly confined to depths less than 100 ft (Hossain et al. 2007).

In the overall scheme of our investigation, the BGS-DPHE (2001) survey currently represents the most quality-controlled database of arsenic measurements available for any kind of country-wide management analyses (Kinniburg and Smedley 2001). Arsenic measurements of BGS-DPHE (2001) survey were based on the Atomic Absorption Spectro-photometric (AAS) method, which can be considered a very reliable method for arsenic testing (Rahman et al. 2002). Some distinct patterns of arsenic contamination are apparent with this data. The highest levels of contamination usually occur in the Southern regions comprising Holocene deposits, while the lowest contaminations are observed in the Northern regions of Pleistocene deposits (Fig. 1).

2.2 Spatial mapping tools

Our focus was on the investigation of two spatial mapping tools based on distinct spatial interpolation methodologies. The first mapping tool is the commonly known geo-statistical ordinary kriging (OK) technique; based on the linear paradigm of mapping. The second mapping tool is the ANN based on the non-linear paradigm of mapping. ANNs work in a similar fashion as the human brain learns from a trained pattern. For each mapping technique, the goal was to spatially interpolate the arsenic concentration at a non-sampled well location using sampled data from wells in the study region. The spatial coordinates and arsenic concentration of wells were considered as the input and output of each mapping technique, respectively. In the following, we provide a brief description of each mapping tool to highlight the distinct linear or non-linear aspect associated with each method.

2.2.1 Ordinary kriging (OK)

Ordinary kriging is a spatial interpolation estimator $ \hat{Z}(x_{0} ) $ used to find the best linear unbiased estimate (at non-sampled location) of a second-order stationary random field with an unknown constant mean as follows:

$$ \hat{Z}(x_{{\mathbf{0}}} ) = \sum\limits_{i = 1}^{n} {\lambda_{i} Z(x_{i} )} $$

(1)

where $ \hat{Z}(x_{0} ) $= kriging estimate at non-sampled location x ₀; Z(x _i) = sampled value at location x _i; and λ_i = weighting factor for Z(x _i).

The estimation error is

$$ \hat{Z}(x_{0} ) - Z(x_{0} ) = R(x_{0} ) = \sum\limits_{i = 1}^{n} {\lambda_{i} Z(x_{i} ) - Z(x_{0} )} $$

(2)

where Z(x ₀) = unknown true value at x ₀; and R(x ₀) = estimation error. For an unbiased estimator, the mean of the estimation error must equal zero. Therefore,

$$ E\left\{ {R(x_{0} )} \right\} = 0 $$

(3)

$$ {\text{and}}\quad \sum\limits_{i = 1}^{n} {\lambda_{i} } = 1 $$

(4)

Minimum variance of estimation error is required for solving the interpolation problem by kriging. The minimization of the estimation error variance under the constraint of unbiasedness leads to a set of equations for the weighting factors, λ_i, which can be solved by an optimization routine. For details on the ordinary kriging technique and its application to ground water or related problems, the reader is referred to works of Goovaerts et al. (2005). For specific details on the software routines used in this study, the reader is referred to the Geostatistical Software Library (GSLIB) and Users Guide (Deutsch and Journel 1998).

2.2.2 Artificial neural networks

Artificial neural networks (ANN) are made up of interconnected artificial neurons which are an imitation of biological neurons (Govindaraju and Rao 2000). Given sufficient training data, ANNs can learn, to a high degree of accuracy, any complex mapping between the input and output data. ANNs are also capable of generalization which makes them very attractive when it comes to limited measurement data. Contrary, to linear estimation techniques ANN does not impose any constraint on the statistical properties of the process to be modeled. ANNs are useful when one does not have an idea about the complexity nor the structure of the input/output map. In this study, the multi-layer supervised back-propagation (MLBP) architecture was used. Two hidden layers were used to adjust the weights of neuron to achieve our target output (Fig. 2).The weights were trained using the back propagation (BP) algorithm. The essential feature of the ANN in contrast to kriging was in its ability to decipher and generalize highly non-linear patterns in the complex dataset as input–output mapping functions.

The ANN model mapped the arsenic contamination (output) at a given well location with the coordinates of the well location (input). As a result, the ANN was able to ‘learn’ the dependence of the geographical location on the arsenic concentration. The output of ANN was categorized into one of the following three classes: Class (1) predicted concentration is less than 10 parts per billion (ppb, or μg/l); Class (2) predicted concentration is between 10 and 50 ppb; and Class (3) predicted concentration is higher than 50 ppb. Note that the 10 and 50 ppb are the safe limits prescribed by the World Health Organization (WHO) and Bangladesh Government, respectively.

It is important to note at this stage, that classification of output by ANN is commonly addressed in literature by using the self organizing map (SOM) method. Due to the nature of the data (discussed in Sect. 3), we found it difficult to adopt the SOM approach. While this deviation may raise concerns, which are understandable, we believe that such potential limitations alone should not hamper our ability to investigate the hypothesis that ‘non-linearity matters in the spatial mapping of complex patterns of groundwater arsenic contamination’.

2.2.2.1 Algorithms for ANN training

In this study, we used the Levenberg–Marquardt (LM) algorithm for training of ANN (Nocedal and Wright 1999). The LM algorithm, which is a blend of gradient descent and Gauss–Newton iteration, is probably the most widely used optimization method. It takes lager steps down the gradient at location where the gradient is small and conversely, takes smaller steps when the gradient is large, so as not to miss the minima. This algorithm is a trust region based method with hyper-spherical trust region that has proved to be a better solution in searching for the minima. The algorithm employs damped Gauss–Newton method utilizing a damping parameter μ. We provide further details on the selection of the damping parameter μ and other algorithmic features in Appendix.

3 Data preprocessing, calibration and training

In order to elicit the essential features of the spatial pattern of arsenic data and thereby facilitate the modeling equally for each mapping tool, data preprocessing was performed. For both the ANN and the kriging method, this preprocessing step may be considered analogous to data quality assessment to reduce noise in the data. Although, the BGS-DPHE (2001) data was of the best quality available, the data preprocessing (described below) helped generalize the complexity in the data that could otherwise undermine the performance of a technique. In the un-preprocessed format, the spatial nature of arsenic data is known to be highly irregular in the southern and south central regions of Bangladesh (Hossain et al. 2007). Since the discovery of arsenic contamination in Bangladesh during the early nineties, it has been noted that two adjacent wells can occasionally contradict each other in terms of exceeding the safe limit (Rahman et al. 2002) due to ‘hard-to-resolve’ micro-level differences in factors such as sediment type, geology, well depth and geochemistry of the aquifer. To reduce this irregularity and to maximize the effectiveness of mapping accuracy equally for both techniques we performed filtering of data which is described next.

Data from each well was grouped in 5 × 5 km grids. Each well in a grid was converted to a single class-based numeric value for management as already described in Sect. 2. Three classes were used as follows. Class One (Safe): 0–10 ppb; Class Two (Unsafe according to WHO limit): 10–50 ppb; and Class Three (Unsafe according to Bangladesh safe limit): >50 ppb (note: ppb is equivalent to μg/l; WHO and Bangladesh safe limits are 10 and 50 ppb, respectively). To arrive at a single class-based numeric value for all wells in a 5 × 5 km grid, the absolute values of arsenic data were converted to a class value and then filtered to create a representative value for the grid. In each gridbox, higher frequency of occurrence of any class was assigned as the class arsenic of all the wells inside the gridbox. In case when the frequencies of two classes were similar, the higher numerical value of the class was assigned as the ‘winner’. This filtering resulted in a more generalizable spatial pattern of arsenic concentration for both kriging and ANN (see Fig. 3 with respect to Fig. 1).

As mentioned earlier, for the ANN technique, a 2 hidden layers (with two input nodes for spatial coordinates) network was set up and trained using the back propagation algorithm. For the kriging technique, empirical variograms were modeled using the exponential variogram function (see Hossain et al. 2007 for an example). Training of ANN and derivation of kriging variograms was performed on two-thirds data randomly selected. The remaining one-third was used for an independent validation. The convergence criteria for terminating training of the ANN was set at 10⁻² (root mean squared error). In the topology of the ANN, 50 and 25 neurons were used in the first and second (hidden) layers, respectively.

4 Comparision of ANN versus ordinary kriging

A fair competition was set up between ANN and kriging to test the validity of our hypothesis. Both schemes were given the task of predicting arsenic concentration at non-sampled locations using as input, only the spatial coordinates of the location of wells. The calibration of the kriging scheme and the training of the ANN scheme used equal amounts and exactly the same data (i.e., spatial coordinates of the wells and the corresponding arsenic concentration-class value). The variogram modeled for kriging is shown in Fig. 4.

For assessing the accuracy of each method for spatial interpolation of arsenic concentration at non-sampled locations, the following three metrics were used:

1.
Probability of successful detection: This is the probability that the predicted class value matches with the in-situ class value of a non-sampled well.
2.
Probability of false hope: This is the probability that the predicted class value is underestimated significantly leading to an unsafe well being predicted wrongly as safe for a non-sampled well.
3.
Probability of false alarm: This is the probability that the predicted class value is overestimated significantly leading to a safe well being predicted wrongly as unsafe for a non-sampled well.

Because our focus was on assessing spatial mapping for management, the performance metrics were computed against class values of each well in the validation set (and not against actual arsenic concentrations). Table 1 shows the comparative performance of the ANN versus ordinary kriging. A single ANN model and a variogram was used for the entire country. We clearly observe that ANN, by virtue of its ability to generalize the spatial pattern using a highly non-linear network, shows considerably more accuracy when compared to ordinary kriging subject to the same breadth and constraints in data. The probability for successful detection is at least 15% higher than that by kriging for the country as a whole. This is a significant improvement considering that more than 80% of the Bangladesh population is projected to be at risk from arsenic contamination from drinking well water. More importantly, the probability of false hopes, which is a serious issue in public health monitoring (Nahar et al. 2008), is clearly lower (by about 10%) than that by kriging for most regions. Finally, the fact that one ANN model is conveniently able to demonstrate clear improvements over the kriging method for the whole country is a clear testament to accuracy of the global-scale generalization of learning the complex patterns.

Table 1 Comparative performance of ANN and kriging for spatial mapping of arsenic concentration in Bangladesh. Comparison is shown for independent data that was not used in calibration or training of the techniques

Full size table

5 Conclusion

Our study provided evidence in support of the hypothesis that ‘non-linearity matters in the spatial mapping of complex patterns of groundwater arsenic contamination.’ Although the use of learning-type tools for spatial mapping is not new (Besaw and Rizzo 2007), successful applications reported in literature so far pertain mostly to the mapping of geophysical parameters. These parameters exhibit spatially much smoother patterns in the timescales of interest (such as hydraulic conductivity, soil porosity etc.). Consequently, the mapping (spatial interpolation) is made relatively easier by a technique. Our study demonstrated that ANNs can also be used to map with noticeably higher accuracy than kriging the complex and seemingly erratic spatial pattern of groundwater contamination provided that reasonable data preprocessing and exploratory data analysis are performed.

A natural extension of our work is now to explore ways to leverage knowledge of the physics of the contamination process in non-linear mapping schemes like ANNs. Traditional ANNs are black-box tools and are often criticized as lacking in the ability to provide or ingest physical insights (ASCE Task Committee 2000). Using the theory of chaos, we have recently demonstrated that the spatial randomness of arsenic in Bangladesh may indeed be deterministic (Hossain and Sivakumar 2006) and therefore has promise to be deterministically modeled (Hill et al. 2008). The challenge now is to find practical ways to leverage the information gained from chaos analysis towards the robust design of ANN-type mapping schemes that can build upon conventional kriging methods. Such an effort can potentially blend the recently acquired knowledge on the physical factors governing contamination and act as a bridge between the data-based spatial mapping community and the process-based contamination community. So far, both communities have advanced their fields rather independently and we believe it is now time to explore a merger to minimize mapping uncertainty in resource-poor settings.

Notes

This algorithm is not very sensitive to the choice of τ, but a rule of thumb, a small value is used, e.g. τ = 10⁻⁶ if x ₀ is believed to be a good approximation of x ^*.

References

Ahmed MF (2003) Arsenic contamination: Bangladesh perspective. ITN, BUET, Dhaka, Bangladesh
Google Scholar
ASCE Task Committee (2000) Artificial neural networks in hydrology. II: hydrologic applications. ASCE J Hydrol Eng 5(2):124–136
Article Google Scholar
Berg M, Tran HC, Nguyen TC, Pham TV, Schertenleib R, Giger W (2001) Arsenic contamination of ground water and drinking water in Vietnam: a human health threat. Environ Sci Technol 35(13):2621–2626
Article CAS Google Scholar
Besaw LE, Rizzo DM (2007) Stochastic simulation and spatial estimation with multiple data types using artificial neural networks. Water Resour Res 43:W11409. doi:10.1029/2006WR005509
Article Google Scholar
BGS-DPHE (2001). Arsenic contamination of groundwater in Bangladesh. Kinnburgh DG, Smedley PL British geological survey report WC/00/19, vol 1–4, British Geological Survey, Keyworth, UK. Available at: http://www.bgs.ac.uk/arsenic/. Accessed 1 November 2008
Del Razo LM, Rellano MA, Cebrian ME (1990) The oxidation states arsenic in well-water from a chronic arsenicism area of Mexico. Environ Pollut 64:143–153
Article CAS Google Scholar
Deutsch C, Journel A (1998) GSLIB: geostatistical software library and user’s guide. Oxford University Press, UK, p 340
Google Scholar
Faybishenko B (2002) Chaotic dynamics in flow through unsaturated fractured media. Adv Wat Resour 25(7):793–816
Article Google Scholar
Goovaerts P, AvRuskin G, Meliker J, Slotnick M, Jacquez G, Nriagu J (2005) Geostatistical modeling of the spatial variability of arsenic in groundwater of southeast Michigan. Wat Resour Res 41(W07013). doi:10.1029/2004WR003705
Govindaraju RS, Rao AR (2000) Artificial neural networks in hydrology. Kluwer Academic Publishers, Amsterdam
Google Scholar
Hill AJ, Hossain F, Sivakumar B (2008) Is correlation dimension a reliable proxy for the number of influencing variables required to model risk of arsenic contamination in groundwater? Stoch Env Res Risk A 22(1):47–55. doi:10.1007/s00477-006-0098-6
Article Google Scholar
Hindmarsh JT, McCurdy RF (1986) Clinical and environmental aspects of arsenic toxicity. Crit Rev Clin Lab Sci 23:315–347
Article CAS Google Scholar
Hossain F, Sivakumar B (2006) Spatial pattern of arsenic contamination in shallow tubewells of bangladesh: regional geology and non-linear dynamics. Stoch Env Res Risk A 20(1–2):66–76. doi:10.1007/s00477-005-0012-7
Article Google Scholar
Hossain F, Hill AJ, Bagtzoglou AC (2007) Geostatistically-based management of arsenic contaminated ground water in shallow wells of Bangladesh. Wat Resour Manage 21:1245–1261. doi:10.1007/s11269-006-9079-2
Article Google Scholar
Kinniburg DG, Smedley PL (2001) Editors. Arsenic contamination of groundwater in Bangladesh, Ministry of Local Government, Rural Government and Cooperatives, Government of Bangladesh; BGS Technical Report WC/00/19, vol 1
Mazumder GDN, Haque R, Ghosh N, De BK, Santra A, Chakraborti D, Smith AH (1998) Arsenic levels in drinking water and the prevalence if skin lesions in West Bengal, India. Int J Epidemiol 27(5):871–877
Article Google Scholar
Nahar N, Hossain F, Hossain MD (2008) Health and socio-economic effects of groundwater arsenic contamination in rural Bangladesh: evidence from field surveys, international perspectives. J Environ Health 70(9):42–47
CAS Google Scholar
Nocedal J, Wright SJ (1999) Numerical optimization. Springer, New York, p 636
Book Google Scholar
Rahman MM, Mukherjee D, Sengupta MN, Chowdury UK, Lodh D, Chanda CN, Roy S, Selim M, Quamruzzaman Q, Milton AH, Shadullah SM, Rahman MT, Chakraborti D (2002) Effectiveness and reliability of arsenic field testing kits: are the million dollar screening projects effective or not? Environ Sci Technol 36:5385–5394
Article CAS Google Scholar
Rahman S, Hossain F (2008) A forensic look at groundwater arsenic contamination in Bangladesh. Environ Forensics 9(4). doi:10.1080/15275920801888400
Schwartz RA (1997) Arsenic and the skin. Int J Dermatol 36:241–250
Article CAS Google Scholar
Tseng T, Babazono A, Yamamoto E, Kurumatani N, Mino Y, Ogawa T, Kishi Y, Aoyama H (1968) Ingested arsenic and internal cancer in an endemic area of chronic arsenicism in Taiwan. J Natl Cancer Inst 40:453–463
CAS Google Scholar
Twarakavi NKC, Kaluarachchi J (2006) Arsenic in the shallow ground waters of conterminous United States: assessment, health risks, and costs for MCL compliance. J Am Wat Resour Assoc 42(2):275–294. doi:10.1111/j.1752-1688.2006.tb03838.x
Article Google Scholar
Yu WH, Harvey CM, Harvey CF (2003) Arsenic groundwater in Bangladesh: a geostatistical and epidemiological framework for evaluating health effects and potential remedies. Water Resour Res 39(6):1146. doi:10.1029/2002WR001327
Article Google Scholar

Download references

Acknowledgments

The authors wish to acknowledge the constructive reviews received from two anonymous reviewers. The first author was supported by the Center of Management, Protection and Utilization of Water Resources at Tennessee Technological University and the Ivanhoe Foundation.

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, Tennessee Technological University, Cookeville, TN, 38505-0001, USA
Mohammad Chowdhury & Ali Alouani
Department of Civil and Environmental Engineering, Tennessee Technological University, Cookeville, TN, 38505-0001, USA
Faisal Hossain

Authors

Mohammad Chowdhury
View author publications
You can also search for this author in PubMed Google Scholar
Ali Alouani
View author publications
You can also search for this author in PubMed Google Scholar
Faisal Hossain
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Faisal Hossain.

Appendix

1.1 Training of artifical neural network

The algorithm for training of ANN employs damped Gauss–Newton method utilizing a damping parameter μ as follows,

$$ (J^{T} J + \mu I)h_{\text{lm}} = - g $$

(5)

$$ {\text{with}}\,g = J^{T} f\,{\text{and}}\,\mu \ge 0 $$

(6)

where, J ∈ R is a Jacobian Matrix, which contains the first partial derivatives of the function component $ \left\| {f(x)} \right\|, $ that need to be minimized,

$$ \begin{aligned}& (J(x))_{i,j} = \frac{{\partial f_{i} }}{{\partial x_{j} }}(x) \hfill \\ &{\text{and}}\;F(x) = \frac{1}{2}\sum\limits_{i = 1}^{m} {(f_{i} (x))^{2} = \frac{1}{2}\left\| {f(x)} \right\|^{2} = \frac{1}{2}f(x)^{T} f(x)} \hfill \\ \end{aligned} $$

(7)

f has continuous second partial derivatives, that can be written from Taylor expression as follows,

$$ f(x + h) = f(x) + J(x)h + O(\left\| h \right\|^{2} ) $$

Here, h _lm is the step used in this LM method and g is a variable which depends on step size (h _lm) and damping parameter (μ).

Also, J = J(x) and f = f(x). The damping parameter μ has several effects as follows:

For all μ > 0 the coefficient matrix is positive definite. This ensures that h _lm is a direction downhill. For large values of μ we get, $ h_{\text{lm}} \approx - \frac{1}{\mu }g = - \frac{1}{\mu }F^{\prime}(x) $ i.e. a short step in the steepest descent direction. This is good if the current iteration is far from the solution. If μ is very small, then h _lm ≈ h _gn; h _gn = Gauss–Newton step. This is beneficial in the final stages of the iteration, when x is close to x ^*. If F(x ^*) = 0 (or very small), then we can get (almost) quadratic final convergence.

The damping parameter μ influences both the direction and the size of the step. The choice of initial μ value should be related to the size of the elements in A ₀ = J(x ₀)^T J(x ₀), e.g. by letting μ ₀ = τ.max_i {a ⁽⁰⁾_ii }, where τ is chosen by user.^{Footnote 1}

Updating of the damping parameter μ is controlled by the gain ratio,

$$ \xi = \frac{{F(x) - F(x + h_{\text{lm}} )}}{{L(0) - L(h_{\text{lm}} )}} $$

(8)

Here,

$$ f(x + h) \approx \ell (h) \equiv f(x) + J(x)h $$

(9)

Inserting Eq. 9 in Eq. 7 we find that

$$ \begin{aligned} F(x + h) \approx L(h) & \equiv \frac{1}{2}\ell (h)^{T} \ell (h) \\ & = \frac{1}{2}f^{T} f + h^{T} J^{T} f + \frac{1}{2}h^{T} J^{T} Jh \\ & = F(x) + h^{T} J^{T} f + \frac{1}{2}h^{T} J^{T} Jh \\ \end{aligned} $$

(10)

The denominator of Eq. 8 is the gain predicted by the linear model of Eq. 10 as follows,

$$ \begin{aligned} L(0) - L(h_{\text{lm}} )\; & = - h_{\text{lm}}^{T} J^{T} f - \frac{1}{2}h_{\text{lm}}^{T} J^{T} Jh_{\text{lm}} \\ & = - \frac{1}{2}h_{\text{lm}}^{T} (2g + (J^{T} J + \mu I - \mu I)h_{\text{lm}} ) \\ & = \frac{1}{2}h_{\text{lm}}^{T} (\mu h_{\text{lm}} - g) \\ \end{aligned} $$

(11)

Both h ^T_lm h _lm and −h ^T_lm g are positive definite, so L(0) − L(h _lm) is guaranteed to be positive.

A large value of gain ratio, ξ indicates that L(h _lm) is a good approximation to F(x + h _lm), and we can decrease μ so that the next Levenberg–Marquardt step is closer to the Gauss–Newton step. If ξ is small (may be even negative), then L(h _lm) is a poor approximation, and we should increase μ with the twofold aim of getting closer to the steepest descent direction and reducing the step length.

The stopping criteria for the algorithm should reflect that at a global minimizer we have F′(x ^*) = g(x ^*) = 0 so we can use,

$$ \left\| g \right\|_{\infty } \le \varepsilon_{1} $$

Another relevant criterion is to stop if the change in x is small,

$$ \left\| {x_{\text{new}} - x} \right\| \le \varepsilon_{2} (\left\| x \right\| + \varepsilon_{2} ). $$

This expression gives a gradual change from relative step size ε ₂ when $ \left\| x \right\| $ is large to absolute step size ε ²₂ if x is close to 0. Finally, as in all iterative processes we need a safeguard against an infinite loop,

$$ k \ge k_{\max } .$$

Also ε ₂ and k _max are chosen by the user.

The last two criteria come into effect e.g. if ε ₁ is chosen so small that effects of rounding errors have large influence. This will typically reveal itself in a poor accordance between the actual gain in F and the gain predicted by the linear model and will result in μ grows fast, resulting in small $ \left\| {h_{\text{lm}} } \right\| $ and the process will be stopped by following the stopping criterion.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Chowdhury, M., Alouani, A. & Hossain, F. Comparison of ordinary kriging and artificial neural network for spatial mapping of arsenic contamination of groundwater. Stoch Environ Res Risk Assess 24, 1–7 (2010). https://doi.org/10.1007/s00477-008-0296-5

Download citation

Published: 12 December 2008
Issue Date: January 2010
DOI: https://doi.org/10.1007/s00477-008-0296-5

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Comparison of ordinary kriging and artificial neural network for spatial mapping of arsenic contamination of groundwater

Abstract

Similar content being viewed by others

A Comparison of Some Interpolation Techniques for Determining Spatial Distribution of Nitrogen Compounds in Groundwater

Comparison of interpolation methods for the estimation of groundwater contamination in Andimeshk-Shush Plain, Southwest of Iran

Assessment of the Potential Contamination Risk of Nitrate in Groundwater Using Indicator Kriging (in Amol–Babol Plain, Iran)

1 Introduction