Introduction

A study is presented to explore risks to aquifers from aggregated geogenic and anthropogenic contaminants, building on the methodology presented by Nadiri et al. (2018a), which introduces soft modelling and risk cells. These are extended further by introducing Total Information Management (TIM), which applies five dimensions to the aquifer risk aggregation problem under sparse data, as depicted in Fig. 1 and outlined as follows: (i) a perceptual model by taking an overview of existing data and extracting relevant information from past studies; (ii) a conceptual model by studying contaminants using both traditional and state-of-the-art techniques (statistical, graphical, multivariate analysis and geological survey); (iii) risk cells by delineating the zone of influence of each contaminant; (iv) ‘soft modelling’ by firming up the learning from convergences and/or divergences of the above conceptual techniques; and (v) investigating the processes in each risk cell through the framework of Origins, Sources, Pathways, Receptors and Consequence (OSPRC). The paper also uses Human Health Risk Assessment (HHRA), introduced by USEPA (1989), as the baseline in terms of existing risks to encourage further research. As the TIM capability considers a wide range of processes, a critical procedure is built in to examine its reliability by different techniques and not by just one technique. Subsequently, the firmed-up information is studied through the OSPRC framework and not just by sampled data.

Fig. 1 Five dimensions of Total Information Management for aquifer risk aggregation problems

Available approaches for identifying the origins and source processes of contamination are fragmented, as summarised in Table 1, since they include discrete techniques, each of which handles a certain aspect. The major features of the ongoing research activities are presented in the table without dealing with detailed information about their studies. The table is a gap analysis and implies that (i) knowledge integration is not quite feasible and (ii) risk aggregation is yet to be developed as a structured methodology. Currently, the tendency is to identify single/multiple contaminants without aggregating the associated risks. For example, Grassi et al. (2014), Nakaya et al. (2018) and Javadi et al. (2020) describe the state-of-the-art as identifying the origin of contaminations in groundwater and using the results in managing contamination.

Table 1 Review of published research works

Until now, risk to aquifers exposed to multiple contaminants is carried out by HHRA given by USEPA (1989), which relies broadly on sampling data and prescribed parametric values. The USEPA methodology is now a standard approach, but the authors are incrementally developing the TIM capability to fill the gaps on aggregating risks from multiple contaminants using the above 5 dimensions, within which knowledge integration is intrinsic. The concept of risk aggregation was introduced by the authors to aggregate risk indices from geogenic origins like fluoride and arsenic and from anthropogenic origins like nitrate (Nadiri et al. 2017, 2018c; Sadeghfam et al. 2018). Nadiri et al. (2017) employed qualitatively the concept of risk cells; Nadiri et al. (2018a) developed the concept to quantitative risk index from diffused sources of geogenic arsenic and anthropogenic nitrate; and Sadeghfam et al. (2018) quantified risk index for an aquifer in response to point-source and diffused contaminants.

The study identifies the above gaps and treats them by employing the TIM framework through the five dimensions as per Fig. 1. Although these dimensions have been discussed by the authors in varying detail (see Nadiri et al. 2017; Sadeghfam et al. 2018), these applications are in their infancy, especially studies dealing with the last three dimensions. TIM provides a capability to aggregate risk indices of aquifers exposed to multiple contaminants. The paper makes the five dimensions of TIM explicit, unlike the authors’ past studies, in which they are implicit. The justification for the five TIM dimensions is provided in due course, but reference may be made to the above studies for further details.

A global perspective on risks to aquifers provides a better understanding of the aquifer risk problems that fall within the scope of the study, through two basic developments since World War II, as follows: (i) the significant increase in livestock and crop production by the green revolution since the 1950s affected water and soil quality through unmanaged consumption of nitrate-based fertilisers (Tilman et al. 2001; Galloway et al. 2008) and (ii) past ad hoc management procedures were ineffective in controlling impacts, while policy-driven planning systems capable of getting a grip on the impacts of the green revolution are yet to penetrate globally. Policy-driven planning systems accompanied by participatory decision-making, e.g. the Aarhus convention, as adopted in Europe (https://ec.europa.eu/environment/aarhus/legislation.htm), are yet to be taken up globally.

The Aharchay basin, located in the East Azerbaijan province, northwest Iran, is investigated. Notably, planning and management practices are poor in the study area and the available data are sparse. In the absence of a monitoring programme for the aquifer, the study was inevitably carried out based on recent samples taken by the authors. The paper aims to clarify the impacts of major, minor and trace ions and to take advantage of the information derived from the sparse data to formulate perceptual and conceptual models and to aggregate risk indices for arsenic and nitrate anomalies of geogenic and anthropogenic origins.

Study area

Geographical and geomorphological context

Ahar plain, located in the East Azerbaijan province, covers approx. 1000 km² and the elevation in the study area varies from 1220 to 3123 m above mean sea level. The plain is drained by Aharchay (or Ahar Chayi, the River Ahar), which is impounded by Sattarkhan dam, located to the west of the historic city of Ahar, the main city in the study area, 110 km northeast of Tabriz. The mean annual rainfall in the basin is 294 mm for the period of 1991–2016.

Sedimentation of alluvial materials along the plain has not been fully established, as the aquifer is not a wide and deep single continuum but a series of patchy aquifer units with significant contributions to water resources. The general slope of the plain varies from 0 to 8% and increases from the thalweg of the river towards its upper lands. Since the 1980s, the region has been subjected to considerable anthropogenic activities, including mining (e.g. Sungun mines at the northwest of the study area), agriculture, animal farming and livestock, and an embankment reservoir at Sattarkhan dam, all with impacts on the Ahar basin. There is little recorded data on hydrological or environmental factors to relate the current status of the basin to its past baseline.

Hydrology and hydrogeology of study area

Aharchay, the main river in the study area (see Fig. 2), flows in the direction from west to east and drains the area to the Caspian Sea, as a tributary of Qarasu, which in turn is a tributary of the River Araz and that of the River Kur estuary at the Caspian Sea. The river rises from Pir Shafa Mount and is mainly rain-fed and snow-fed with known overflows in springs. The river has numerous tributaries including Mustafachay and Kashanchay and is a permanent stream but now dries up in summer seasons due to abstractions in its lower reaches.

Fig. 2 The study area and locations of withdrawal wells and water resource

The aquifer system in the study area is characterised as patchy, as it is composed of many smaller aquifers scattered throughout the plain. The aquifer along the main river of Aharchay is more extensive than those of its tributaries. The data obtained from the excavation of 24 exploratory and observation wells indicate that the overall aquifer is unconfined, and the maximum and minimum values of groundwater elevation in the 24 observation wells are 1744 and 1241 m above mean sea level, respectively.

Abstraction of groundwater in the study area is through wells, springs and qanat systems. The alluvium is thickest in the south of the study area, at approx. 60 m, and thins to 30 m away from the river. The thickness of the alluvium in the eastern half of Aharchay is less than in its western parts, where the minimum thickness is on the outcrops of the formations of the region and is less than 10 m. The alluvium of the region is a mixture of gravel, sand, silt and clay, where fine grains are located mostly on the banks and floodplain, and especially near the city of Ahar. In Ahar plain, the highest transmissivity occurs at the western and central parts of the plain and along Aharchay, at approx. 250 m²/d. The transmissivity decreases gradually towards the outer part of the plain and, due to the low thickness of the aquifer, reaches less than 100 m²/d near the outcrops around the plain. Within the city of Ahar, the estimated transmissivity is 150 m²/d. The estimated specific yield is 3–5%, as per the grain size of the alluvium in the area.

Geology

Ahar sub-basin is located in the Azerbaijan-Alborz type of tectonics unit. The oldest deposits in the study area belong to the upper Cretaceous era. The recent alluvial deposits of Quaternary are found as glacial deposits in most parts and in the vicinity of Aharchay, as well as recent alluvial sediments such as travertine and other alluvial deposits throughout the study area. Geologically, the lithological units that outcrop in the study area mainly include igneous rocks, where their water potentials depend on the aperture of fissures due to fragmentation and rock solution. Their effects on water resources are often explained in terms of structure, texture and basic lithological properties. Formations and lithological units that outcrop in the study area include limestone and Cretaceous marl to Quaternary igneous rocks. According to the geological map of 1: 250,000 Ahar (Fig. 3), the main geological units in the region are as follows, in order of age:

Fig. 3 Geology map of the study area

Cretaceous

The Cretaceous sequence includes basic volcanic rocks, borderline to acidic, and sedimentary deposits that have a relatively limited number of outcrops in the region. The Cretaceous sequence in this region includes lithological units belonging to the Upper Cretaceous.

Palaeocene-Lower Eocene

The outcrops of this sequence mostly include the outer igneous rocks associated with the continental and the shallow marine environments.

Miocene-Oligocene

Of the five lithological units of the Oligocene-Miocene sequence, only two outcrop in the study area, one of which includes monzonite, granite and aplite.

Pliocene

The Pliocene sequence consists of four lithological units with only two units having outcrops in the study area. One of the units consists of a conglomerate with poor roundness with the siltstone unit and is widely found in the northern and southern highlands overlooking Aharchay, especially in the upper reaches of the city of Ahar. This formation has a relatively smoother morphology than the other units in these areas. The lithology of the other unit comprises ignimbrite with limited outcrops at the western and southwestern ends of the study area.

Quaternary

The lithological units and deposits of Quaternary include volcanic rocks, altered hydrothermal units, and discontinuous deposits. Quaternary volcanic rocks include basalt and alkaline andesite. The discontinuous deposits, seen at the foot of the heights in the form of the long alluvial terrace and alluvial fans, are present on both sides of Aharchay. New alluvial terraces are located at lower levels, which include sediments containing cobbles spread over a large part of Ahar plain. Additionally, there are also river sediments in riverbeds and riversides.

Data availability

According to the East Azerbaijan Regional Water Authority (EARWA), there are 19 deep wells, 623 semi-deep wells, 99 springs and 41 qanats in the study area. A total of 27 samples were taken from springs, qanats, rivers and wells in the study area, distributed over the entire region as shown in Fig. 4a, which maps pie diagrams of the samples over the study area and provides first-hand evidence for contaminants. The samples were taken in 2018 to measure water quality for the study area.

Fig. 4 Measured ions to form a conceptual model for the study area using the 27 samples: a pie charts of the ions, b OSPRC cells for nitrate minor ions and c OSPRC cells for trace ions

As can be seen from Fig. 4a, the pie charts show the distribution of groundwater chemistry in the plain, with the size of each circle scaled to the quantity of total dissolved solids (TDS). Further results are displayed in Table 2 and Fig. 4b–c, which show that the study area is impacted by two sets of contaminants: (i) nitrate: there are approximately four hotspots of nitrate pollution, but most of the other observation wells also suffer from excessive concentrations with respect to the permissible values set by WHO (2004), and hence this risk is distributed system-wide with variable concentrations; (ii) arsenic, lead and iron: there are four hotspots of these trace element concentrations, but each is surrounded by rather low values, and hence this risk is currently viewed as local (but under further investigation).

Table 2 Statistical summary of hydrogeochemical characteristics in samples

Methodology

Human Health Risk Assessment (HHRA) by USEPA (1989) is outlined to serve as a benchmark to argue the need for exploring further developments through Total Information Management (TIM). This section presents both and outlines 5 dimensions of TIM, which lays down the approach for aggregating risks to aquifers from multiple contaminants.

EPA Human Health Risk Assessment

Human Health Risk Assessment (HHRA), developed by USEPA (1989), combines the concepts of human health and risk to investigate the degree of harm to the human body from exposure to carcinogenic or non-carcinogenic ions through inhalation, oral, dermal or other similar ‘pathways’, where the term pathways here represents quite a different continuum from that in the OSPRC framework. Four steps are required to implement HHRA, as follows:

  (i) Hazard identification identifies the pollutants that exceed the allowable contamination level, whether they are carcinogenic or non-carcinogenic.

  (ii) Dose-response assessment establishes a relationship between the degree of exposure and adverse health responses. The dose-response relationships are described in terms of cancer slope factor (CSF) and reference dose (RfD) for carcinogenic and non-carcinogenic impacts, respectively. Notably, CSF and RfD values for different ions are prescribed by USEPA (2004).

  (iii) Exposure assessment measures human exposure to a pollutant by considering intensity, time and frequency. Exposure assessment is carried out by considering two pathways, oral and dermal. Notably, Chronic Daily Intake (CDI) and Dermal Absorbed Dose (DAD) quantify exposures in the oral and dermal pathways, respectively. CDI and DAD calculations require a set of prescribed parameters as detailed in USEPA (1991, 2004).

  (iv) Risk characterisation quantifies the probability of harmful impacts on humans exposed to a characteristic pollutant in the form of carcinogenic and non-carcinogenic risks (USEPA 1989):

$$ NCHQ_{Oral}=\frac{CDI}{RfD},\quad NCHQ_{Dermal}=\frac{DAD}{RfD},\quad NCHQ=NCHQ_{Oral}+NCHQ_{Dermal} $$
(1)
$$ Risk_{Oral}=CDI\times CSF,\quad Risk_{Dermal}=DAD\times CSF,\quad Risk=Risk_{Oral}+Risk_{Dermal} $$
(2)

where NCHQ is the non-carcinogenic hazard quotient or non-carcinogenic risk and Risk is the carcinogenic risk. Notably, when NCHQ and Risk exceed 1 and \(10^{-4}\), respectively, impacts on the health of exposed individuals are likely (Zhang et al. 2019). Risk and NCHQ can be calculated for different ‘receptors’, e.g. adults, children and infants, where the term receptor here refers to quite a different continuum from that in the OSPRC framework.
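The calculation in Eqs. (1) and (2) can be sketched in a few lines of Python. This is an illustrative sketch only and not the programmable platform used in the study: the function names are hypothetical, a single RfD and a single CSF are applied to both pathways exactly as the equations are written, and the helper for CDI assumes the commonly used USEPA drinking-water form (concentration × ingestion rate × exposure frequency × exposure duration, divided by body weight × averaging time), which is not spelled out in this paper. All parameter values in the example call are placeholders.

```python
NCHQ_LIMIT = 1.0    # non-carcinogenic threshold quoted in the text
RISK_LIMIT = 1e-4   # carcinogenic threshold quoted in the text


def chronic_daily_intake(conc_mg_l, intake_l_day, freq_days_yr,
                         duration_yr, body_weight_kg, avg_time_days):
    """Oral CDI in mg/kg/day (assumed standard USEPA form, see lead-in)."""
    return (conc_mg_l * intake_l_day * freq_days_yr * duration_yr) / (
        body_weight_kg * avg_time_days)


def nchq(cdi, dad, rfd):
    """Eq. (1): sum of the oral and dermal hazard quotients."""
    return cdi / rfd + dad / rfd


def cancer_risk(cdi, dad, csf):
    """Eq. (2): sum of the oral and dermal carcinogenic risks."""
    return cdi * csf + dad * csf


if __name__ == "__main__":
    # Placeholder exposure parameters, for illustration only.
    cdi = chronic_daily_intake(conc_mg_l=0.12, intake_l_day=2.0,
                               freq_days_yr=365, duration_yr=30,
                               body_weight_kg=70.0, avg_time_days=30 * 365)
    hq = nchq(cdi, dad=0.0, rfd=3e-4)          # rfd is a placeholder value
    risk = cancer_risk(cdi, dad=0.0, csf=1.5)  # csf is a placeholder value
    print(f"CDI={cdi:.5f} mg/kg/day, NCHQ={hq:.2f}, Risk={risk:.2e}")
    print("NCHQ exceeds limit:", hq > NCHQ_LIMIT)
    print("Risk exceeds limit:", risk > RISK_LIMIT)
```

The same functions can be called once per sample and per receptor (adult, child, infant) by changing the exposure parameters, which is how the share of samples exceeding each limit would be obtained.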

Total Information Management (TIM) under sparse data

Dimension 1: perceptual model

The term perceptual model is used widely by practitioners, although it does not refer to any explicit procedure. The generic features of the processes include the identification of past desktop studies, gathering observations during site visits, past desktop risk assessment exercises, collecting general-purpose data including geological formations of the site and any other relevant information. A perceptual model may identify primarily building blocks of the new study but without producing evidence as the aim is to formulate a starting point for the risk aggregation problem, which may include details of aquifers and their connections, potential pollutants to highlight anthropogenic hazards, land use, mineral composition to anticipate geogenic processes. Perceptual models systematise experts’ initial opinions.

Dimension 2: conceptual model

Conceptual models refer to each of the techniques used in hydrogeochemical studies, which are diverse and include statistical techniques, graphical techniques, multivariate analysis and geological survey. Normally, all of these techniques are available to practitioners, who select those needed to gain sufficient insight into a particular problem. Researchers are often focussed on the state-of-the-art techniques, e.g. multivariate analysis. However, the authors promote TIM practices through soft modelling, as outlined in the ‘Dimension 4: soft modelling’ section. Conceptual models provide evidence for identifying risk cells for a study area, as outlined below.

Dimension 3: delineating risk cells

A risk cell defines a complete domain, where appropriate OSPRC processes take place towards risk exposures and this builds up on the works by Nadiri et al. (2017, 2018c) and Sadeghfam et al. (2018). A study area is broken down into as many risk cells as required, each of which occupies a spatial layout and may partially coincide with other risk cells but each allows the passage of information on an individual risk through the process of OSPRC. One analogy to this is telephone lines, through which different communication lines are transmitted without interference.

Dimension 4: soft modelling

The trend in existing hydrogeochemical studies is to focus on the state-of-the-art techniques by investigating samples of concentrations of ions (statistical analysis, graphical methods, multivariate analysis or isotope analysis) without taking the benefit from the full range of techniques. The problem is that the results of different techniques may partially converge, but their divergences are also a real possibility. Soft modelling, introduced by Nadiri et al. (2018a), uses collectively existing techniques, where the term ‘soft modelling’ is an adaptation for hydrogeochemical studies from the analogy with soft systems by Checkland and Scholes (1999). Each technique studies aspects of the problem, e.g. dissolutions identify chemical processes (reduction or oxidation) by ion exchange, reverse ion exchange to detect their origins. Knowledge integration in hydrogeochemistry is not topical, but any inherent barriers can be removed by soft modelling.

Knowledge integration by soft modelling, as introduced by Nadiri et al. (2018a), is based on classifying the available techniques into increasing levels of complexity, similar to the basic idea by Khatibi (2012). Thus, models are sequentially firmed up as outlined in Appendix Table 8, the summary of which is as follows: Level 0, data are parsed out and the basis for perceptual/conceptual models are formulated; Level 1, statistical analysis is carried out using the available sample data, which provides information on learning inherent processes and chemical dissolutions; Level 2, graphical diagrams (Hounslow 1995) are constructed from the data, which are driven by a degree of top-down knowledgebase and their results identify types and sources of the ions in the dissolution in aquifers; Level 3, more sophisticated mathematical techniques are used, e.g. multivariate analysis (Cloutier et al. 2008; Delgado-Outeiriño et al. 2009) and produce a bottom-up approach to learn from pollutant data. Research papers often suffice to the Level 3 techniques or higher, but the paper promotes the full hydrogeochemical techniques.

Dimension 5: OSPRC risk cells

The OSPRC framework was introduced recently to groundwater contamination studies by Nadiri et al. (2017, 2018c) and Sadeghfam et al. (2018), where a framework refers to the consensual use of each of the dimensions, as there is no theoretical or empirical basis for the choice of each dimension. The ‘Origin’ dimension is their suggestion for the generalisation of the existing SPRC framework, which Khatibi (2008) reviewed in detail, suggesting its suitability for the aggregation of multiple flood risks. OSPRC is the key to unifying the study of multiple risk processes in risk aggregation problems, and the authors’ published works offer a proof-of-concept for aggregating risk from geogenic arsenic anomalies contaminating a series of patchy aquifers, where the risk was local but transformed into a system-wide risk by an impounding reservoir (see Nadiri et al. 2017).

The use of OSPRC risk cells requires a knowledgebase similar to expert systems to study the processes for each contaminant. No such system exists yet, but the authors have put together one such knowledgebase in tabular form, using their experience, for the contaminants identified in this study, as given below in Table 3. Risk cells serve as a way to delineate the domains for each of the contaminants, whereas the OSPRC framework takes on board the processes to study the inherent processes coherently. The SPRC frameworks are normally used as descriptive tools for linking hazard to the consequence of a particular risk, and more so the risk of floods (e.g. see Nathanail et al. 2005; Thorne et al. 2007; Khatibi 2008). The SPR dimensions refer to the physical processes, but consequence is a matter of societal values. It is important to note that Origin is not analogous to hazard, as these two terms refer to quite different concepts.

Table 3 OSPRC Framework for minor and trace contaminants

Descriptive and quantitative risk aggregation

If conditions are right, risks from multiple sources over a study with multiple aquifer types can be aggregated by quantitative approaches, as follows. The OSPRC framework can be applied in each risk cell by the further step of dividing each risk cell to grids. This enables the study of vulnerability at each grid cell to both anthropogenic contaminants using, say, the DRASTIC framework and geogenic contaminants using, say, the SPECTR framework (e.g. for As, Pb and Fe), similar to Nadiri et al. (2018b) and Sadeghfam et al. (2018). These can be transformed into risk mapping tasks by appropriate changes in the mathematical formulations, the proof-of-concept for which has been given by Nadiri et al. (2018b) and Sadeghfam et al. (2018). However, the quantitative approaches require the continuum to be capable of a system-wide diffusion of the contaminant. When the latter conditions cannot be guaranteed, a descriptive TIM is still informative.

Results for the study area

Two sets of results are presented: preliminary results bring together the analysis at Level 0 through the contribution of Dimensions 1, 2 and 3 to facilitate the next stage; detailed results by using the information from the latter four dimensions with the outcome of identifying risk cells and decisions on risk aggregation.

Results using USEPA

The HHRA framework was implemented through a programmable platform, and the summary of results is presented in Table 4. The health risk values were calculated for 27 samples within the study area and include (i) non-carcinogenic risk (NCHQ) for nitrate in oral and dermal pathways, (ii) carcinogenic risk (Risk) for arsenic in oral and dermal pathways, and (iii) carcinogenic risk (Risk) for lead in oral pathway. Notably, the following values are not available: CSF for lead in the dermal pathway and CSF and RfD for iron. The prescribed values are taken from the EPA publications, particularly USEPA (1989), USEPA (1991) and USEPA (2004) (Zhang et al. (2019)).

Table 4 shows that the mean values of carcinogenic and non-carcinogenic risks for the 27 samples exceed the allowable limits (\(10^{-4}\) for carcinogenic risk and 1 for non-carcinogenic risk), with 70, 52 and 37% of the samples exceeding the allowable risk limits for nitrate, arsenic and lead, respectively. Figure 5 illustrates the spatial distributions of estimated health risk within the study area. As the study area is a series of patchy aquifers, mapping the results as a system-wide distribution carries uncertainty, but the maps may be considered good enough to provide a visual representation of the spatial pattern of health risk values.

Table 4 Summary of HHRA results
Fig. 5 Spatial distribution of health risks: a non-carcinogenic risk by nitrate, b carcinogenic risk by arsenic and c carcinogenic risk by lead

In spite of the flags above on the accuracy of the results in Fig. 5, they should be fit for indicative observations, as follows. Risks from individual pollutants are likely to be adverse near their sources but distributed over larger areas, and the sources of each pollutant are at different locations. Also, aggregated risks would be quite significant over the study area, but this would be a simple numerical aggregation of these values, which is unlikely to be defensible. This justifies the need for TIM.

Preliminary results using TIM

Dimension 1—contextualisation of the study area by perceptual model

The basis for the perceptual model is the main characterisation of the study area, presented in the ‘Data availability’ section, where the aquifer system is found to be composed of a series of patchy unconfined aquifers often isolated from one another.

Relevant to the perceptual model of the study area is the preliminary knowledge on its baseline, and as such the basin was in a rural agrarian region up to the 1970s with well-tested balance between land use and its rural agrarian economy with a sustainable way of life. However, water records show that there is some decline in water table in recent years due to increased pumpage from the aquifers, and this has given rise to some subsidence within the plain. There are no known hydrogeochemical investigations on the study area prior to recent years, and no data is available on the history of geogenic contaminations. However, there is no notable settlement in the study area to be exposed to health problems, but the area was thriving, though badly neglected.

Dimensions 2 and 3—results of conceptual model and risk cells

A conceptual model (Dimension 2) of the study area draws on from the data at the 27 samples measured in 2018, as displayed in Fig. 4 with the map of the pie diagram of the samples as well as in Table 2. The assessment of data quality is specified in Table 2, which summarises statistical parameters, including maximum permissible concentrations for natural water (WHO (2004)).

The above results are indicative of four main contaminants in the study area, which are nitrate often from anthropogenic origins, and arsenic, lead and iron, often of geogenic origins, but these need to be identified. The salient findings from Table 2 include (i) most of the sampled major ions exceed their permissible values for drinking water standard (WHO 2004) by a moderate margin but they are not alarmingly high yet; (ii) hotspots exposed to nitrate pollutions reach as much as 17 times the maximum allowed by the World Health Organisation (WHO 2004) standards at 50 mg/L; (iii) hotspots exposed to arsenic contamination reach 0.12 mg/L, which is as much as 12 times greater than the World Health Organisation (WHO 2004) at the permissible limit of (0.01 mg/L); (iv) exposures to iron concentration exceed 3 times the maximum allowed by the World Health Organisation (WHO 2004) standards at 0.3 mg/L; (v) lead contamination is approx. 1.5 times the maximum allowed by WHO standards at 0.01 mg/L; and (vi) the emerging information is essential to delineate risk cells.
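The exceedance factors quoted above are simple ratios of a measured concentration to its permissible limit. A minimal Python sketch of that arithmetic follows; the limits are the WHO (2004) values quoted in the text, the arsenic concentration of 0.12 mg/L is the hotspot value quoted in the text, and the dictionary and function names are hypothetical.

```python
# Permissible limits in mg/L, as quoted in the text from WHO (2004).
WHO_LIMITS_MG_L = {
    "nitrate": 50.0,
    "arsenic": 0.01,
    "iron": 0.3,
    "lead": 0.01,
}


def exceedance_ratio(ion, conc_mg_l):
    """Ratio of a measured concentration to its permissible limit."""
    return conc_mg_l / WHO_LIMITS_MG_L[ion]


def exceeds_limit(ion, conc_mg_l):
    """True when the measured concentration is above the permissible limit."""
    return conc_mg_l > WHO_LIMITS_MG_L[ion]


if __name__ == "__main__":
    # Arsenic hotspot value quoted in the text.
    ratio = exceedance_ratio("arsenic", 0.12)
    print(f"arsenic exceedance factor: {ratio:.1f}")
    print("arsenic above limit:", exceeds_limit("arsenic", 0.12))
```

Applying the same ratio to each of the 27 samples gives the per-sample exceedance factors that underlie the hotspot descriptions.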

The results further show that the electrical conductivity (EC) of the samples ranges from 373 to 4500 μS cm⁻¹, with high values occurring notably in the central parts of the plain, where they are associated with fine-grained particles that have a considerable impact on residence time. The pH values range from 6.23 to 8.5, which indicates that the water in the aquifer changes from near-neutral to basic.

Delineating risk cells

A closer view of the data presented in Table 2 and Fig. 4 indicates that arsenic, iron and lead concentrations are seemingly local and not overly diffused in the environment. This is a discontinuity and attributable to the following: (i) the aquifer system is patchy, (ii) the contaminants are triggered recently or (iii) contaminants are of local significance even if they are of an old anomaly due to peculiar formation characteristics. A side effect of this key finding is that a quantitative application of the TIM capability to the study area is not feasible at this stage until further samples are taken to explain the discontinuity in the diffusion of the sample data. Therefore, the paper suffices to a descriptive application of the TIM capability.

The above are sufficient to delineate risk cells in the basin using geological formations, and these are depicted in Fig. 4b–c, which comprise four broad risk cells, as follows: (i) Risk Cell 1 (N1 and T1): exposed to high risk from nitrate, arsenic and iron; (ii) Risk Cell 2 (N2 and T2): exposed to low risk from nitrate, arsenic and iron but exposures to lead are moderate; (iii) Risk Cell 3 (N3 and T3): exposed to high risk from lead and moderate risk from nitrate, arsenic and iron; and (iv) Risk Cell 4 (N4 and T4): exposed to low risk from lead and iron and moderate risk from nitrate and arsenic. Notably, N refers to nitrate and T to trace elements.
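For bookkeeping, the descriptive risk levels of the four broad risk cells can be held in a small lookup table. The Python sketch below is only one possible encoding and is not part of the authors' method; the names are hypothetical, and the lead exposure of Risk Cell 1 is recorded as not stated because the description above does not give it.

```python
NOT_STATED = "not stated"

# Descriptive risk levels per risk cell, transcribed from the text above.
RISK_CELLS = {
    1: {"nitrate": "high", "arsenic": "high", "iron": "high",
        "lead": NOT_STATED},
    2: {"nitrate": "low", "arsenic": "low", "iron": "low",
        "lead": "moderate"},
    3: {"nitrate": "moderate", "arsenic": "moderate", "iron": "moderate",
        "lead": "high"},
    4: {"nitrate": "moderate", "arsenic": "moderate", "iron": "low",
        "lead": "low"},
}


def cells_with(contaminant, level):
    """Sorted list of risk cells where the contaminant has the given level."""
    return sorted(cell for cell, levels in RISK_CELLS.items()
                  if levels.get(contaminant) == level)


def cell_contaminant_pairs():
    """All (cell, contaminant) combinations, one per contaminant and cell."""
    return [(cell, ion) for cell, levels in RISK_CELLS.items()
            for ion in levels]


if __name__ == "__main__":
    print("high lead risk in cells:", cells_with("lead", "high"))
    print("moderate nitrate risk in cells:", cells_with("nitrate", "moderate"))
    print("cell-contaminant combinations:", len(cell_contaminant_pairs()))
```

Treating each cell-contaminant pair as its own entry keeps the risks separate, in line with the idea that each risk cell carries information on an individual risk.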

Detailed results using TIM

Overview

The parsing of the data in the ‘Data availability’ section facilitated the delineation of broad risk cells in Fig. 4b–c, which are the basis to integrate knowledge and study Dimensions 2, 3 and 4 (see Table 5). In reality, there are 16 risk cells, four for each contaminant.

Table 5 Integration of the results by soft modelling at each risk cell

The information integrated by Table 5 is justified through the results presented below.

Levels 1–2

Due to the discontinuity in the diffusion of arsenic, iron and lead concentrations, the TIM capability is applied broadly to contaminants in each risk cell and outlined below.

Techniques at Level 1: Statistical Analysis

At this level, the Pearson correlation coefficient (r) and scatter diagrams of binary ions are employed. Bivariate correlation analysis between pairs of hydrochemical parameters has been used to measure the r-values of ions (see Table 6). These are presented in four designated bands associated with the strength of the r-values. Overall, a positive r-value between two ions is suggestive of their common origins, whereas negative or near-zero values suggest differing origins and processes.
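As an illustration of the bivariate analysis described above, the Pearson r-value between two hydrochemical parameters can be computed as in the following Python sketch. The function names are hypothetical, the sample values are invented for demonstration and are not the study data, and the four band thresholds are assumptions, since the band limits used in Table 6 are not reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def strength_band(r):
    """Assign |r| to one of four bands (thresholds are assumptions)."""
    a = abs(r)
    if a >= 0.9:
        return "very strong"
    if a >= 0.7:
        return "strong"
    if a >= 0.5:
        return "moderate"
    return "weak"


if __name__ == "__main__":
    # Invented EC (uS/cm) and chloride (mg/L) values, for demonstration only.
    ec = [400.0, 900.0, 1500.0, 2600.0, 4300.0]
    cl = [20.0, 75.0, 160.0, 310.0, 560.0]
    r = pearson_r(ec, cl)
    print(f"r(EC, Cl) = {r:.2f} ({strength_band(r)})")
```

Looping this function over every pair of measured parameters yields a correlation matrix of the kind summarised in Table 6.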

Table 6 Correlation matrix of the 16 chemical variables

As per Subba (2002), strong correlations of EC with Cl⁻ (0.94), Ca²⁺ (0.83), Na⁺ (0.93) and Mg²⁺ (0.94) indicate the trend for chemical activities, and this may be explained by (i) a common trend of groundwater through the flow direction due to water-rock interactions; (ii) the decline of the water table in the basin due to over-abstraction and a gradual loss of dilution; and (iii) an increase in the residence time of groundwater and water-rock interactions in areas of low hydraulic conductivity. As per Drever (1997) and Mahlknecht (2003), the r-value of 0.93 of sodium with chloride indicates that the source of sodium is halite dissolution in groundwater, and the r-values of 0.71 and 0.87 of calcium and magnesium with chloride indicate that the aquifer system encourages a possible ion-exchange process.

Scatter Diagram of Binary Ions: As can be seen in Fig. 6a and 6b, four hydrochemical processes are detected: (i) dissolution of calcite and dolomite from limestone formations (Fisher and Mulican, 1997) and dissolution of anhydrite or gypsum from marl and siltstone formations with interbedded evaporites (Kumar et al. 2006), referred to as the simple dissolution process (Venugopal et al., 2009); (ii) reverse ion exchange in fine-grained sediments; (iii) ion exchange in fine-grained sediments; and (iv) halite dissolution from the Pliocene Formation. However, approximately 80% of the water samples have Na+/Cl- ratios significantly greater or lower than 0.5, indicating the existence of another source of sodium and chloride (see Fig. 6b).

Fig. 6
figure 6

a Level 1: scatter diagram of binary ions (Ca2+ + Mg2+ versus \( {\mathrm{HCO}}_3^{-}+{\mathrm{SO}}_4^{2-} \)). b Level 1: scatter diagram of binary ions (Na+ versus Cl-)
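
The reading of the Ca2+ + Mg2+ versus \( {\mathrm{HCO}}_3^{-}+{\mathrm{SO}}_4^{2-} \) diagram can be sketched as a simple rule. The sketch assumes the interpretation common in the hydrochemical literature: points near the 1:1 line (in meq/L) reflect simple dissolution, an excess of Ca2+ + Mg2+ reflects reverse ion exchange, and a deficit reflects ion exchange. The tolerance `tol` and the sample values are hypothetical, and the paper's own zone boundaries in Fig. 6a may differ.

```python
# Equivalent weights in g/eq (molar mass divided by ionic charge)
EQ_WEIGHT = {
    "Ca": 20.04, "Mg": 12.15, "Na": 22.99,
    "Cl": 35.45, "HCO3": 61.02, "SO4": 48.03,
}


def to_meq(sample_mg_l):
    """Convert ion concentrations from mg/L to meq/L."""
    return {ion: conc / EQ_WEIGHT[ion] for ion, conc in sample_mg_l.items()}


def classify(sample_mg_l, tol=0.1):
    """Classify a sample by its position relative to the 1:1 line.

    tol is an assumed relative tolerance around the 1:1 line.
    """
    meq = to_meq(sample_mg_l)
    ratio = (meq["Ca"] + meq["Mg"]) / (meq["HCO3"] + meq["SO4"])
    if ratio > 1 + tol:
        return "reverse ion exchange"
    if ratio < 1 - tol:
        return "ion exchange"
    return "simple dissolution"


def na_cl_ratio(sample_mg_l):
    """Na+/Cl- ratio in equivalent units, for plots such as Fig. 6b."""
    meq = to_meq(sample_mg_l)
    return meq["Na"] / meq["Cl"]


# Hypothetical sample in mg/L, not taken from the study
sample = {"Ca": 80.0, "Mg": 24.3, "Na": 23.0, "Cl": 35.45,
          "HCO3": 305.1, "SO4": 48.03}
print(classify(sample), round(na_cl_ratio(sample), 2))
```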

Techniques at Level 2: graphical analysis

Piper diagram

As can be seen from Fig. 7a, the hydrogeochemical types of groundwater from qanats, springs, abstraction wells and the river are analysed with the Piper diagram. To distinguish different types of groundwater, the diamond plot is divided into five zones (A, B, C, D and E), each associated with particular anions and cations and hence with a particular water type (see Fig. 7a). The zones represented in the study area are A (temporary hardness) and E (mixing zone).

Fig. 7
figure 7

a Level 2: Piper diagrams to identify types and sources of groundwater. b Level 2: Stiff diagram to identify types and sources of groundwater. c Level 2: Durov diagram to identify types and sources of groundwater

Stiff diagram

Based on Hounslow (1995), the Stiff diagram (Fig. 7b) indicates that groundwater samples of this study area are from six diverse origins: (i) Class C1: reverse ion exchange; (ii) Class C2: evaporite formation; (iii) Class C3: dolostone or mafic rock formation; (iv) Class C4: limestone origin; (v) Class C5: acidic rock; and (vi) Class C6: mixing origin. According to the above results and those in Fig. 7, a gradual decrease in the quality of groundwater is apparent, which is reflected by high EC values in the eastern part of the Aharchay valley and the associated increase in residence time.

Expanded Durov diagram

This diagram shows three processes (Fig. 7c): (i) general processes of groundwater, which indicate simple solution (water-rock interaction) or mixing of groundwater with different origins; (ii) ion exchange; and (iii) reverse ion exchange process.

Level 3: hierarchical cluster analysis—HCA

One multivariate technique employed by the paper is hierarchical cluster analysis (HCA), which groups similar samples into convenient classes (Reghunath et al. 2002). HCA uses 27 samples in terms of the concentrations of Ca2+, Mg2+, Na+, K+, Cl-, \( {\mathrm{SO}}_4^{2-} \), \( {\mathrm{HCO}}_3^{-} \), \( {\mathrm{NO}}_3^{-} \), F-, As, Fe, Cu and Pb, as well as the chemical properties EC, TDS and pH. It scales the data by z-transformation and groups the samples by Ward’s method (Ward 1963), with similarity among the samples measured by Euclidean distance (Deza and Deza 2009). Figure 8 gives the dendrogram of the HCA results, in which a threshold value of 6.5 is adopted for the linkage distance on the basis of expert opinion. This produces four clusters, and their analysis shows the influence of EC values on the classification, which is beyond the capability of the graphical methods. The salient features of each cluster are outlined as follows:

  1. Cluster (I).

    corresponds to 4% of all samples and comprises Sample 6, in which EC exceeds 4000 μS/cm and Cl- has its highest value. There is a strong association between Cluster I and nitrate and arsenic concentrations.

  2. Cluster (II).

    corresponds to 7% of all samples and comprises Samples 4 and 5, in which chloride is at a high level and EC lies between 2000 and 3000 μS/cm. There is also a strong association between Cluster II and nitrate and arsenic concentrations.

  3. Cluster (III).

    corresponds to 67% of all samples and comprises Samples 1, 2, 3, 7, 8, 9, 10, 11, 12, 14, 16, 17, 18, 19, 20, 21, 24 and 26, in which EC varies in the range from 1000 to 2000 μS/cm. There is a strong association between Cluster III and nitrate and arsenic concentrations.

  4. Cluster (IV).

    corresponds to 22% of all samples and comprises Samples 13, 15, 22, 23, 25 and 27, in which EC values are less than 1000 μS/cm with no nitrate or arsenic contamination.
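
The cluster shares quoted above follow directly from the sample memberships. A short check, using only the sample numbers listed for Clusters I to IV, confirms that the four clusters partition the 27 samples and reproduces the rounded percentages:

```python
# Sample memberships as listed for Clusters I-IV
clusters = {
    "I": [6],
    "II": [4, 5],
    "III": [1, 2, 3, 7, 8, 9, 10, 11, 12, 14, 16, 17, 18, 19, 20, 21, 24, 26],
    "IV": [13, 15, 22, 23, 25, 27],
}

# Every sample from 1 to 27 should appear exactly once
all_samples = sorted(s for members in clusters.values() for s in members)
total = len(all_samples)

# Share of each cluster, rounded to the nearest whole percent
shares = {name: round(100 * len(members) / total)
          for name, members in clusters.items()}
print(total, shares)
```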

Fig. 8
figure 8

Level 3 Outputs: Clusters I-IV identified by HCA
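
The HCA procedure (z-transformation, Ward’s method on Euclidean distances, and a cut of the dendrogram at a linkage-distance threshold) can be sketched in plain Python as below. This is an illustrative sketch under stated assumptions, not the software used in the study: the function names and the two-variable data are hypothetical, and the linkage height is scaled so that the distance between two single samples equals their Euclidean distance. Whether the paper's threshold of 6.5 refers to this scaling is not stated.

```python
import math


def zscore_columns(rows):
    """Standardise each variable (column) to zero mean and unit sample std."""
    scaled_cols = []
    for col in zip(*rows):
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        if sd == 0:
            raise ValueError("constant variable cannot be standardised")
        scaled_cols.append([(v - mean) / sd for v in col])
    return [list(r) for r in zip(*scaled_cols)]


def ward_height(a, b):
    """Ward linkage distance between two clusters given as lists of points."""
    na, nb = len(a), len(b)
    ca = [sum(c) / na for c in zip(*a)]   # centroid of cluster a
    cb = [sum(c) / nb for c in zip(*b)]   # centroid of cluster b
    d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return math.sqrt(2.0 * na * nb / (na + nb) * d2)


def ward_clusters(rows, threshold):
    """Merge clusters by Ward's criterion until the next merge exceeds threshold."""
    z = zscore_columns(rows)
    clusters = [[i] for i in range(len(z))]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                h = ward_height([z[k] for k in clusters[i]],
                                [z[k] for k in clusters[j]])
                if best is None or h < best[0]:
                    best = (h, i, j)
        h, i, j = best
        if h > threshold:      # Ward heights never decrease, so stop here
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical (EC, Cl-) pairs for six samples, not the study's data
demo_rows = [[900, 50], [950, 55], [1500, 120],
             [1550, 125], [4200, 600], [4100, 590]]
print(ward_clusters(demo_rows, threshold=1.0))
```

With the study's 27 samples and 16 variables, `rows` would hold one row per sample and the threshold would be chosen by expert judgement, as described for Fig. 8.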

Level 3: factor analysis—FA

Factor analysis (FA) seeks effective factors to study hydrogeochemical impacts and the significance of correlation between factors and data variables. Table 7 presents the loadings of the principal components and the variance that each represents. The rotated factors are identified by high positive and negative loadings and near-zero loadings. Following Davis (1986) and Selvam et al. (2020), maximum variance of the factors is extracted by the highest range of the positive or negative loadings. The four factors together explain 71% of the variance.

Table 7 Results of factor analysis for the study area (*loading > 0.5)

  1. Factor (I)

    This factor, given in Table 7, is associated with high positive loadings of Cl-, K+, Ca2+, Na+, \( {\mathrm{NO}}_3^{-} \) and EC. These are suggestive of (i) water-rock interactions; (ii) a general trend of dissolution in groundwater in the study area; and (iii) an increase in nitrate of anthropogenic origin alongside natural dissolution. Nitrate concentration increases in the groundwater system through leaching from fertilisers on agricultural land: the infiltrating surface water interacts with geological formations (i.e. shale, marl, etc.) and causes an increase in groundwater EC. This affects groundwater quality and creates contamination stemming from anthropogenic activities, largely related to agriculture with minor contributions from domestic sewage. An examination of the detailed results shows that water-rock interactions through the above ions control approximately 34% of the groundwater chemical processes.

  2. Factor (II)

    This factor, given in Table 7, is associated with high positive factor loadings of \( {\mathrm{SO}}_4^{2-} \), As and Fe. Arguably, the presence of these three constituents in Factor II is indicative of a sulphate-mineral origin for the high concentrations of As and Fe, through the dissolution of sulphate minerals contained in igneous and volcanic rocks. It controls approximately 18% of the data variance.

  3. Factors III and IV

    These factors, given in Table 7, control approximately 9.6 and 9.1% of the data variance, respectively. Factor III is associated with high positive factor loadings of F- and Factor IV with high positive factor loadings of Pb. The origin of high fluoride and lead is not associated with other ions and has special characteristics that need further investigation.
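
As a consistency check on the percentages quoted for Factors I to IV, the individual variance shares add up to the stated total of about 71%:

\[ 34\% + 18\% + 9.6\% + 9.1\% = 70.7\% \approx 71\% \]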

Overview of mechanism to disperse contaminants

An important focus of the results at Level 2 (graphical methods) is that groundwater samples in the Ahar basin are largely located in Zone A, which is associated with ‘temporary’ hardness and good quality. This is explained as follows: (i) the Ahar aquifer is not deep and has a low groundwater residence time; (ii) the recharge areas of the aquifer comprise hard rock formations with minimal water-rock interaction for the infiltrated water; (iii) high-quality surface water flowing through the plain interacts with groundwater, where the watercourses contribute to recharging the aquifer in the eastern, northern and southern parts of the plain, whereas the aquifer drains to the watercourses in its western parts, which reduces its residence time.

Nitrate pollution

The highest nitrate concentration in groundwater occurs at Samples 5 and 6 in Cell N2, with moderate concentrations in Cells N1 and N3 and low concentrations in Cell N4. The distribution of nitrate is explained as follows: (i) as agriculture is the main occupation in the study area, fertilisers give rise to widespread nitrate at the ground surface; (ii) water at the surface of the plain is likely to percolate uniformly, so that infiltrating water washes high nitrate concentrations downwards and acts as a diffuse source. Notably, the final concentration values are outcomes of the following intrinsic factors: (i) the nitrate concentration at the ground surface; (ii) the amount of groundwater recharge from the surface; and (iii) the characteristics of the aquifer media.

Arsenic and iron contamination

Based on the results of multivariate analysis, the origin of arsenic and iron contaminants is geogenic and attributable to porphyry copper deposits, and thus differs from that of nitrate. These are found widely in the Ahar basin and can spread through joints and faults. They are also activated by hydrothermal activities. Arsenic loads are likely to be found at joints and faults in the northern parts of the study area (see OSPRC Cells T1, T3 and T4). The samples with high arsenic concentration in Cell T1 (Samples 5 and 6) show high nitrate, high non-carcinogenic and carcinogenic health risks and the highest EC and TDS. The study identifies arsenic concentration hotspots, which do not seem to be diffused widely in the study area.

Lead contamination

Based on the results of multivariate analysis, the origin of the lead contaminant is geogenic. The highest lead concentrations are in Cells T1, T2 and T3. Lead loads show random behaviour, but their hotspots do not seem to be diffused widely in the study area.

Dimension 5: OSPRC view of risk cells

The authors have produced proof-of-evidence for the quantitative risk aggregation problem by using the DRASTIC framework (for nitrate) and the SPECTR framework (for As, Pb and Fe), similar to Nadiri et al. (2018b) and Sadeghfam et al. (2018). The basic assumption is that contaminants can diffuse into the risk cell and/or basin. However, the paper identified a discontinuity in the spatial distribution of geogenic contaminants (As, Fe and Pb), and therefore a quantitative study of potential risk exposures is not technically feasible until the domains of the contaminants are fully definable. Nonetheless, the paper presents a descriptive OSPRC assessment of risk for nitrate, arsenic, iron and lead contaminants for the study area. This is presented in Table 5, but in reality it can be carried out for each risk cell.

Discussion

The initial objective of this research work was to compare the risk to health from contaminants in the study area by the USEPA procedure (USEPA 1989) with risk mapping by Total Information Management (TIM) using the five dimensions depicted in Fig. 1. However, the preliminary perceptual and conceptual models revealed some discontinuity in the diffusion of contaminant concentrations, and consequently quantitative risk mapping was not possible. The authors aim to collect further samples to explain the nature of the diffusion and so enable risk mapping by TIM. At this stage, the paper is confined to a descriptive risk aggregation problem.

The emerging insight from the study area within the global context is that, although risks to aquifers have been amplified since the green revolution in return for increased food availability (Vitousek et al. 1997; Agren and Bosatta 1988; Galloway et al. 2008; Bui et al. 2020), planning systems have been put in place to control impacts successfully (Sheikhipour et al. 2018; Nerantzis et al. 2020). As the uptake of planning systems is not global, the amount of nitrogen transported into the oceans by the world's rivers has roughly doubled since the nineteenth century, and rates of nitrogen transport from developed areas have increased 10- to 50-fold (Meybeck 1982). Arsenic contamination has been noted since the 1980s; as per RGS (2008), arsenic in drinking water was recognised as a serious problem in Argentina, Chile and Taiwan circa the early 1980s. Research has also focussed on other trace elements, e.g. the impacts of Fe on human health through long-term ingestion of high Fe doses causing haemochromatosis, where the Fe stems from geogenic or natural origins (Blarasin et al. 1999; Zhang et al. 2020). Similarly, exposure to Pb, a toxic ion, is related to both geogenic and anthropogenic origins and has been noted in different countries (Nicholson et al. 2003; Ju et al. 2007; Siegle 1979).

Risk exposures in the study area stem from (i) chemical and organic fertilisers; (ii) uncontrolled or ineffectively controlled mining practices; and (iii) hydrogeochemical processes encouraging the release of geogenic contaminants such as arsenic, iron and lead. Prior to the arrival of mechanisation in the area, traditional subsistence approaches were sustainable, and local anthropogenic activities did not pose any regional risks. No sampling data were available prior to those commissioned by this study, and therefore the baseline is unknown, but the risks from nitrate, arsenic, iron and lead are now real. Table 5 suggests a basic mitigation measure.

The tailing dam of the Sungun Mines, one of the largest copper mines in the Middle East, is located in the Ahar basin, yet the study does not seem to show contamination directly from the mining activities. This sounds peculiar, as mining activities invariably pollute. Although the tailing dam is located in the Aharchay basin, the mine and the mining activities are outside this basin, where there are no tailing dams, and therefore their impacts ought to be sought in another tributary of the River Araz.

Management issues

A study of the impacts on the recipients and of the consequences of contamination is outside the remit of this paper. The reported aquifer contamination problems have not yet been translated into a remediation project, and the response of the appropriate authorities remains to be seen. This research project is only one stepping stone towards possible future studies to develop an action plan. Sadeghfam et al. (2019) discuss a management perspective with particular reference to the special procedure for contaminated land in developed countries. For instance, in the UK, the procedure is to designate the site with the special status of ‘contaminated land’ and take tiered steps to reduce the risk to an acceptable level, as follows: Tier 1: Preliminary Risk Assessment; Tier 2: Generic Quantitative Risk Assessment; and Tier 3: Detailed Quantitative Risk Assessment. Mitigation projects would be followed by a requirement for verification, in which information management is key, with the aim of continually reducing uncertainty, but these are outside the remit of this paper.

The descriptive risk aggregation problem in the paper is planned to be transformed into a quantitative risk aggregation model in the next phase. To this end, further detailed data sampling will be required to explain the inherent discontinuity in the diffusion of the contaminants. Also, social data will be needed to study the consequences of the nitrate pollution and the arsenic, iron and lead contamination, although currently this may not be likely. Gathering more geological data will also help to focus on the distribution of trace elements.

Conclusion

Groundwater in the study area serves as the main water resource for the 128,000 inhabitants of the Ahar basin for drinking and agriculture, as well as for industry and mining. The aquifer is now distressed owing to the absence of an effective planning system: the water table is declining, and the paper presents evidence for anthropogenic nitrate contamination as well as arsenic, lead and iron contamination. The research is driven by academic goals to understand the scale and scope of the problem and to produce tools that can help planners to manage risks in time. The paper used the USEPA approach for Human Health Risk Assessment (HHRA) and provides evidence that there are both carcinogenic and non-carcinogenic risks to human health in the study area. However, a greater insight was sought by exploring the Total Information Management (TIM) capability for aggregating risks. The TIM capability integrates several topical research activities for defensible modelling results. The capability was applied to the study area but, owing to some discontinuity in the degree of diffusion of contaminants, quantitative risk mapping was not possible.

A descriptive application of the TIM capability identified 8 risk cells and provided the following insights into the study area: (i) the baseline prior to 1970 was likely to have been good-quality water; (ii) with respect to major ions, the groundwater quality of the study area remains acceptable, but variations in EC signify active hydrochemical processes; (iii) with respect to minor ions, nitrate pollution originates from intensive agricultural activities and exposes the porous media and the population of the study area to unacceptable risks from diffuse sources of anthropogenic origin; (iv) with respect to trace ions, arsenic, iron and lead contaminants are likely to originate from geogenic processes and expose the porous media and the population of the study area to unacceptable risks.