Introduction

Water security is a major challenge for the long-term prosperity of global society. As a result of inadequate water resource management under environmental pressures, developing countries today face large population growth, rapid urbanisation and insufficient water-sector services (Yousif 2015). The north-west coast of Egypt is particularly sensitive in terms of future sustainable development, which depends primarily on the occurrence and maintenance of water resources (Ali et al. 2007). Owing to climate change, precipitation on the north-western coast of Egypt has been insufficient for agricultural activity, and residents have started drilling water wells (Solomon et al. 2007). In such instances, groundwater is the only source of good-quality freshwater, and it is widely used for residential, agricultural and industrial purposes. Owing to population increase and rapid development, several places have suffered over-exploitation and uncontrolled usage of groundwater resources (Adimalla and Wu 2019). As a result, the availability of fresh groundwater has decreased, and the quality of groundwater in some locations has deteriorated. The majority of surface water is found in streams, rivers, springs, ponds, lakes and reservoirs. Surface water is gathered from rain in watershed areas, flows through streams and rivers, and occasionally settles in ponds and lakes (Manahan 2010). In the study area, surface waters are relatively inadequate, uniformly scattered or unsuited to human consumption in many coastal locations (Yidana and Yidana 2010). Further research has found that the direct dumping of polluted materials from domestic, agricultural and industrial wastewater into the remaining surface water bodies eventually pollutes them (Edokpayi et al. 2017). Surface water is also seriously exposed to salinisation by seawater backflow in coastal zones (Vijay et al. 2011).
Surface water in the coastal zone is quite limited because precipitation falls mainly in winter (El Bastawesy et al. 2008). Agricultural production depends on several factors, of which water is the most important (Oweis and Hachum 2003). Water can naturally convey various heavy metals across diverse geological formations (Mohankumar et al. 2016). Understanding the hydrochemistry of a water body helps to establish its suitability for a variety of uses (Alexakis 2011; Gamvroula et al. 2013). A variety of geochemical processes, both natural and anthropogenic, can affect the quality of groundwater. Weathering and dissolution of rocks, soil leaching and biological activity are all natural processes that cause variations in groundwater quality (Khatri and Tyagi 2015; Rao et al. 2020). In addition, the interaction of contaminated surface water with groundwater could threaten groundwater resources (Brindha et al. 2014). In the study area, groundwater quality may be affected by a sewage treatment plant with a capacity of 25,000 m3/day, built at an elevation of around 60 m above sea level. The pollution index of groundwater (PIG) is used to investigate and understand the status of water quality in a water body (Horton 1965; Brown et al. 1970). It expresses the water quality of a water body as a single value, which helps in understanding the suitability of the water for various purposes (Smita et al. 2018). Multivariate statistical tools support the consistent management of water resources (Bora and Goswami 2017).
Other approaches, such as cluster analysis (CA) and factor analysis (FA), help to assess the spatial and temporal variations of water quality and to identify potential factors affecting it (Gamble and Babbar-Sebens 2012). The current investigation was carried out with the following objectives: (1) to evaluate the groundwater and surface water; (2) to assess their suitability for drinking and irrigation; (3) to identify the state of the aquifers with respect to the intrusion/freshening phases that occur over time, using the hydrochemical facies evolution diagram.

The study area

The study area is about 10 km away from Matrouh city. It is constrained by longitudes of 27°15′ and 27°25′ E, latitudes of 31°8′ and 31°25′ N with a total area of approximately 400 km2.

Three drainage basins were chosen for the present study; these basins from the northwest to the southeast are Wadi Samla, Wadi Khair and Wadi Naghamsh with an area of 26, 36 and 116 km2, respectively (Fig. 1a). The chosen basins are important because of their agricultural operations, which require more water supplies to be sustainable and not transitory. The sewage treatment plant in the study area is at an altitude of approximately 60 m from sea level with a capacity of 25,000 m3/day. The plant has 14 oxidation basins used in aerobic and anaerobic treatment. The wastewater is treated and directed to three untreated dirt reservoirs used for water storage and tree forest irrigation. The wastewater treatment plant provides 9 million m3/year of water to irrigate 4 km2 of tree plantations. Groundwater interaction with polluted surface water can potentially put groundwater sources at risk. Different geochemical processes may affect ground water quality, including natural and anthropogenic activity. The study area is considered as a part of the south Mediterranean region where it is characterised by a moderate climate. The typical monthly maximum air temperature is 29.6 °C in August and 8.9 °C in January, with an average annual air temperature of 19.4 °C. The month of July has the highest relative humidity (73%) and the month of March has the lowest (63%) (Barseem et al. 2013). The evaporation rate records its maximum peaks in summer time (June–August). Wind speed ranges from 15.01 km/h in October to 22.04 km/h in January, resulting in 2420 mm of pitch evaporation per year (Masoud 2000). Rainfall starts from October to March, while the summer season is almost dry. The average annual rainfall varies from 64 to 412 mm (CLAC 2015), with an average annual cumulative precipitation of 155 mm. Many authors have researched the geomorphology, hydrology and geology of the coastal area of the northwest Mediterranean (Raslan 1995; Masoud 2000; Barseem 2006; Mohamed et al. 2011).

Fig. 1

a Location map of the study area. b Geological map of the study area with water point of groundwater samples

Geomorphologically, the northwestern Mediterranean coast is divided into three units: the coastal plain, the Piedmont plain and the structural plateau (tableland) (Raslan 1995) (Fig. 1a). The elevation of the coastal and Piedmont plains ranges from sea level to approximately 100 m. The coastal plain contains coastal dunes, ridges and sand dunes. The Piedmont plain extends along the foot of the structural plateau. It has thick fine deposits of calcareous soils derived from the alluvial deposits of various wadis. The structural plateau (tableland) is the main catchment region of the drainage lines. The elevation of the plateau ranges from 100 to 175 m from the south to the north side of the Piedmont.

The geology of the study area has a considerable impact on the occurrence and quality of groundwater. The study area is covered with Tertiary and Quaternary sediments. The former include sediments from the Middle Miocene (Marmarica Formation) to the Pliocene. The Middle Miocene layers consist mainly of fossiliferous chalky and dolomitic limestone with marl and clay intercalations (El Shazly 1964; Yousif et al. 2014). In the study area, Pliocene sediments have a limited distribution (El Shazly 1964; Hammad 1966). The Quaternary sediments comprise Pleistocene and Holocene deposits. The Pleistocene deposits are made up of oolitic limestone, which is composed of oolitic grains together with quartz sands and shell fragments cemented by fine calcium carbonate. The Holocene deposits consist of a variety of unconsolidated deposits, such as alluvial, aeolian and sabkha deposits. The alluvial deposits are made up of muddy sands, silt and clay, and are rich in carbonate grains, rock fragments and gravel. Quartz sands add to the carbonaceous composition of the coastal dunes (Fig. 1b). According to Hammad (1966, 1972), the fractured Marmarica limestone comprises carbonate minerals, including dolomite and calcite, with minor clay and silicate minerals. The Quaternary oolitic limestone of marine origin was formed during the transgression and regression of the Mediterranean shoreline (Zeuner 1959; Butzer 1959; El Shazly 1964; Hammad 1966).

Hydrogeologically, the Pleistocene and Middle Miocene aquifers are the major productive aquifers in the study area (Raslan 1995). The Pleistocene aquifer is composed of oolitic calcareous sand with shale interbeds. Groundwater in the oolitic aquifer occurs in porous media under unconfined conditions. The main recharge of the aquifer comes from local annual rainfall and from runoff water from the upstream tableland plateau on the southern side (Eissaa et al. 2018). The Middle Miocene aquifer (Marmarica limestone) is intercalated with clay beds, and its groundwater occurs in fractured media where recharging rainwater percolates through joints and fissures (Mustafa et al. 2016). Hydrogeological cross-section A–A′ reveals that groundwater in the Pleistocene aquifer flows from the upstream southern side towards the Mediterranean on the northern downstream side. The locations of the two hydrogeological cross-sections in the study area help clarify local and regional landforms. The X–X′ cross-section illustrates the various stages of the tableland and the scarp with its foot slope. Terraces were reported in the A–A′ cross-section, which depicts a valley running through the tableland (Fig. 2) (Yousif et al. 2014). These aquifers are recharged by direct rainfall infiltration and/or surface discharge (Sewidan 1978; NARSS 2005). In the Pleistocene aquifer, the total depth of the wells ranges between 8 and 25 m, the depth to water ranges between 2 and 10 m, and TDS concentrations range between 8590 and 19,410 mg/l. In the Miocene aquifer, by contrast, the total depth of the wells ranges between 23 and 42 m, the depth to water varies between 12 and 35 m, and TDS concentrations range between 4290 and 7920 mg/l (Table 3).

Fig. 2

Hydrogeological cross-section of the study area clarifies local and regional landforms with the locations of the two cross sections. b X–X′ cross-section shows the different stages of tableland and the scarp with its footslope. c A–A′ Hydrogeological cross-section showing the piezometric level in the oolitic and Miocene fractured aquifer in the study area (modified after Al-Sayed et al. 2016)

Materials and methods

Fieldwork began in June 2019 with an inventory survey of 31 existing water points. The water samples were collected in 250-ml polyethylene bottles pre-washed with deionised water. The samples were kept at 4 °C in the laboratory to prevent microbial changes in water chemistry and were geo-referenced using GPS (Trimble, Juno S-3 model). They comprise 6 surface water samples and 25 groundwater samples (Fig. 3a). Samples were analysed at the hydrogeochemistry department of the Desert Research Center (DRC) according to the methods adopted by the United States Geological Survey (Rainwater and Thatcher 1960; Fishman and Friedman 1989) and the American Society for Testing and Materials (ASTM 2002). Sodium (Na+) and potassium (K+) were determined by flame photometer. Calcium (Ca2+) and magnesium (Mg2+) were determined by complexometric titration against Na2EDTA. Carbonate (CO32−) and bicarbonate (HCO3−) were determined by titration against sulphuric acid using the neutralisation method. Chloride (Cl−) was determined volumetrically by titration against silver nitrate. Sulphate (SO42−) was determined by the turbidity method using a double-beam spectrophotometer. Trace element contents (Al3+, Cd2+, Cr3+, Cu+, Fe2+, Mo2+, Mn2+, Ni2+, Pb2+, V5+ and Zn2+) of the water samples were determined using inductively coupled argon plasma (ICP). The obtained chemical data are expressed in milligrams per litre (mg/l) (Tables 1, 2, 3 and 4). Some parameters, including the depth to water, total well depth, pH, temperature, EC, CO32−, HCO3−, TOC, COD and NO3−, were determined in situ: pH and EC with a pH/EC meter (3510 Jenway, UK); CO32− and HCO3− titrimetrically against sulphuric acid by neutralisation; and TOC, COD and NO3− with a compact filter photometer PF-12Plus (Macherey–Nagel GmbH & Co. KG). Each sample was run twice, and standards were checked for each sample to ensure analytical quality control.

Fig. 3

a Map of sampling points of the area under investigation. b Heavy metal pollution index classification map of groundwater samples in the area under investigation. c Pollution index classification map for the groundwater samples in the study area. d Classification map of ecological risk index of the studied groundwater in Pleistocene and Miocene aquifer

Table 1 Unit weightage (Wi) and standard permissible value (Si) in mg/l according to Egypt Guidelines according to the Minister of Health decree number 108 for 1995 and 458 for 2007
Table 2 Assigned weight (wi) and relative weight (Wi) of water quality parameters according to WHO (2017)
Table 3 Physicochemical parameters for groundwater and surface water samples
Table 4 Heavy metal analysis for groundwater and surface water samples, all concentrations (mg/l)

The ionic balance error, error % = 100 × [∑ Cations − ∑ Anions]/[∑ Cations + ∑ Anions], with concentrations in meq/l, was less than 5%.
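The ionic balance check can be illustrated with a short sketch. It assumes that the cation and anion concentrations have already been converted to meq/l; the function name and the sample values are hypothetical and are not taken from Table 3.

```python
def ion_balance_error(cations_meq, anions_meq):
    """Return the ionic balance error in percent.

    Both arguments are iterables of concentrations in meq/l.
    """
    total_cations = sum(cations_meq)
    total_anions = sum(anions_meq)
    return 100.0 * (total_cations - total_anions) / (total_cations + total_anions)


# Hypothetical sample in meq/l (illustrative values only).
cations = [80.0, 2.0, 12.0, 15.0]  # Na+, K+, Ca2+, Mg2+
anions = [90.0, 14.0, 3.0]         # Cl-, SO4 2-, HCO3-

error = ion_balance_error(cations, anions)
print(f"ionic balance error: {error:.2f} %")
print("acceptable" if abs(error) < 5.0 else "re-analyse sample")
```

An analysis would normally be accepted when the absolute value of the error is below the 5% threshold quoted above.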

Multivariate statistical analysis

All statistical calculations were carried out using SPSS version 16.0; the data sets were log-transformed to accommodate the wide range of parameter values (Matiatos et al. 2014).

Cluster analysis

The objective of cluster analysis is to group objects so that those in the same group (called a cluster) are more similar to one another than to objects in other groups (Otto et al. 1998).

Principal component analysis/factor analysis

Factor analysis (FA) is a method used to describe the variability among observed, correlated variables in terms of a potentially smaller number of unobserved variables termed factors (Shrestha and Kazama 2007). The correlations between the physico-chemical characteristics and the sample locations were investigated using principal component analysis (PCA). The suitability of the data for PCA was assessed with Bartlett’s test and the Kaiser–Meyer–Olkin (KMO) test. For optimum variable participation, the varimax rotation technique was also used (Matiatos et al. 2014). The principal components (PCs) connect the parameters and sample locations in terms of factor loadings and factor scores. A component is kept in the model only if its eigenvalue is greater than 1 (Kaiser 1974).
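The eigenvalue criterion can be sketched as follows. The sketch assumes that the eigenvalues of the correlation matrix are already available (for instance from a statistics package), so that their sum equals the number of variables; the function name and the eigenvalues are hypothetical.

```python
def kaiser_retained(eigenvalues):
    """Return (component number, eigenvalue, % variance) for eigenvalues > 1."""
    total = sum(eigenvalues)
    return [
        (number, value, 100.0 * value / total)
        for number, value in enumerate(eigenvalues, start=1)
        if value > 1.0
    ]


# Hypothetical eigenvalues for five standardised variables (they sum to 5).
eigenvalues = [2.8, 1.3, 0.5, 0.3, 0.1]
retained = kaiser_retained(eigenvalues)

for number, value, percent in retained:
    print(f"PC{number}: eigenvalue = {value:.2f}, variance = {percent:.1f} %")
```

In this illustration only the first two components pass the criterion; the remaining ones would be discarded before rotation.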

Pollution indices

Heavy metal pollution index

Heavy metal pollution index (HPI) is a comprehensive tool used for overall water quality determination, according to calculated weights of each metal; HPI was calculated according to Horton (1965) and Mohan et al. (1996). HPI is classified into five classes as follows: excellent, extended from 0 to 25; good, ranged from 26 to 50; poor, ranged from 51 to 75; very poor, ranged from 76 to 100; unsuitable, more than 100. It was calculated according to the following equation:

$$\mathrm{HPI}=\frac{\sum_{i=1}^{n}{W}_{i}{Q}_{i}}{\sum_{i=1}^{n}{W}_{i}}$$
(1)

Wi is the unit weightage of the heavy metal i (Table 1), n is the number of heavy metals and Qi is the sub-index of the heavy metal.

$${W}_{i}=k/{S}_{i}$$
(2)

Here, k is the proportionality constant; Si is the standard permissible limit of the heavy metal.

$$k=1/\sum {S}_{i}$$
(3)

S1, S2, S3 and Si represent standards for different heavy metals in the groundwater samples (Table 1).

$${Q}_{i}=100\times \left({V}_{i}/{S}_{i}\right)$$
(4)

Vi is the monitored value of the ith parameter (mg/l).
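A minimal sketch of the HPI calculation is given here for illustration. It sets k = 1, which is an assumption made for simplicity: because k is the same for every metal, it cancels between the numerator and denominator of Eq. 1. The standards below are placeholders rather than the Table 1 values, and assigning boundary values to the lower class is also an assumption.

```python
def hpi(measured, standards, k=1.0):
    """Heavy metal pollution index following Eqs. 1, 2 and 4.

    measured, standards: dicts mapping metal name -> concentration (mg/l).
    """
    weighted_sum = 0.0
    weight_sum = 0.0
    for metal, value in measured.items():
        standard = standards[metal]
        weight = k / standard                  # Eq. 2: unit weightage Wi
        sub_index = 100.0 * value / standard   # Eq. 4: sub-index Qi
        weighted_sum += weight * sub_index
        weight_sum += weight
    return weighted_sum / weight_sum           # Eq. 1


def hpi_class(value):
    """Map an HPI value to the five classes listed in the text."""
    if value <= 25:
        return "excellent"
    if value <= 50:
        return "good"
    if value <= 75:
        return "poor"
    if value <= 100:
        return "very poor"
    return "unsuitable"


# Placeholder standards in mg/l (hypothetical, not the Table 1 values).
standards = {"Cd": 0.003, "Pb": 0.01, "Mn": 0.4}
sample = {"Cd": 0.006, "Pb": 0.005, "Mn": 0.2}

value = hpi(sample, standards)
print(f"HPI = {value:.1f} ({hpi_class(value)})")
```

A sample in which every metal sits exactly at its permissible limit gives HPI = 100, which is why values above 100 are treated as unsuitable.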

Nitrate pollution index

Nitrate sources in groundwater are classified into point sources, such as irrigation of land with sewage effluents, and nonpoint sources, such as sanitation in densely populated areas and intensive farming practices (McLay et al. 2001). The nitrate pollution index (NPI) for the water samples was determined following Obeidat et al. (2012). Water quality was classified into five types according to NPI values: clean (unpolluted) (NPI < 0); light pollution (0 < NPI < 1); moderate pollution (1 < NPI < 2); significant pollution (2 < NPI < 3); very significant pollution (NPI > 3). The NPI was calculated using the following relation:

$$\mathrm{NPI}=\left({C}_{s}-\mathrm{HAV}\right)/\mathrm{HAV}$$

where Cs is the analytical concentration of nitrate, and HAV is the threshold value of anthropogenic source (human affected value) taken as 50 mg/l.
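The NPI relation and its classification can be sketched as below. The class limits in the text are strict inequalities, so the handling of values that fall exactly on a limit (assigned here to the higher class) is an assumption; the function names are hypothetical.

```python
HAV = 50.0  # human affected value for nitrate in mg/l, as given in the text


def npi(nitrate_mg_l, hav=HAV):
    """Nitrate pollution index: (Cs - HAV) / HAV."""
    return (nitrate_mg_l - hav) / hav


def npi_class(value):
    """Classify an NPI value; boundary values go to the higher class (assumption)."""
    if value < 0:
        return "clean (unpolluted)"
    if value < 1:
        return "light pollution"
    if value < 2:
        return "moderate pollution"
    if value < 3:
        return "significant pollution"
    return "very significant pollution"


# Hypothetical nitrate concentrations in mg/l.
for concentration in (25.0, 75.0, 160.0):
    index = npi(concentration)
    print(f"NO3 = {concentration} mg/l -> NPI = {index:.2f}, {npi_class(index)}")
```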

Pollution index of groundwater

Drinking water quality can be assessed with the pollution index of groundwater (PIG) (Rao 2012), which has been used successfully in several locations to monitor and evaluate variations in drinking water quality (Rao et al. 2018; Rao and Chaudhary 2019). In the current study, the PIG value of each water sample was calculated using the standard limits prescribed by the World Health Organization (WHO 2017) for safe drinking water (Table 2). All the chemical variables measured in a groundwater sample enter its PIG value, so the index reflects the combined effect of chemical pollution on the aquifer system. The PIG calculation involves four steps, following Rao (2012):

$${W}_{p}={R}_{w}/\sum {R}_{w}$$
(5)
$${S}_{c}=\frac{C}{{D}_{s}}$$
(6)
$${O}_{w}={W}_{p}\times {S}_{c}$$
(7)
$$\mathrm{PIG}=\sum {O}_{w}$$
(8)
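The four steps can be sketched in a few lines. The sketch assumes the usual reading of Rao (2012), in which Rw is the relative weight assigned to a parameter, C is its measured concentration and Ds is its drinking water quality standard; the weights and standards below are hypothetical placeholders, not the Table 2 values.

```python
def pig(concentrations, standards, relative_weights):
    """Pollution index of groundwater following Eqs. 5 to 8.

    All three arguments are dicts keyed by parameter name.
    """
    total_weight = sum(relative_weights[name] for name in concentrations)
    index = 0.0
    for name, concentration in concentrations.items():
        wp = relative_weights[name] / total_weight   # Eq. 5: weight parameter
        sc = concentration / standards[name]         # Eq. 6: status of concentration
        index += wp * sc                             # Eqs. 7 and 8: sum of Ow
    return index


# Hypothetical placeholders (mg/l for standards and sample).
standards = {"TDS": 500.0, "Cl": 250.0, "SO4": 250.0}
weights = {"TDS": 5, "Cl": 4, "SO4": 5}
sample = {"TDS": 1000.0, "Cl": 500.0, "SO4": 250.0}

print(f"PIG = {pig(sample, standards, weights):.3f}")
```

Because the weight parameters sum to one, a sample in which every parameter equals its standard gives PIG = 1.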

Ecological risk index

Taking into account the pollution and toxic-response factors, the potential ecological risk index (ERI) of the analysed heavy metals was assessed quantitatively. In the present study, the ERI for each groundwater sample was calculated using Eqs. 9 and 10:

$$\mathrm{ERI}=\sum \mathrm{RI}=\sum {T}_{i}\times \mathrm{PI}$$
(9)
$$\mathrm{PI}=\frac{{C}_{s}}{{C}_{b}}$$
(10)

where RI is the potential ecological risk factor of each heavy metal, Ti is the toxic-response factor of heavy metal, PI is the pollution index, Cs is the concentration of heavy metals in the sample and Cb is the corresponding background values. The toxic-response factor of heavy metals is given as Cd = 30; Co, Cu, Ni and Pb = 5; Fe, Cr, Zn and Mn = 1 (Bhutiani et al. 2017; Adimalla and Wang 2018; Taiwo et al. 2019).
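Eqs. 9 and 10 can be illustrated with the following sketch. The toxic-response factors are those listed above; the background and sample concentrations are hypothetical, and the function name is an assumption.

```python
# Toxic-response factors as listed in the text.
TOXIC_RESPONSE = {
    "Cd": 30,
    "Co": 5, "Cu": 5, "Ni": 5, "Pb": 5,
    "Fe": 1, "Cr": 1, "Zn": 1, "Mn": 1,
}


def eri(sample, background):
    """Ecological risk index: sum over metals of Ti * (Cs / Cb)."""
    total = 0.0
    for metal, concentration in sample.items():
        pollution_index = concentration / background[metal]  # Eq. 10
        total += TOXIC_RESPONSE[metal] * pollution_index      # Eq. 9
    return total


# Hypothetical background and sample concentrations in mg/l.
background = {"Cd": 0.5, "Pb": 1.0, "Mn": 2.0}
sample = {"Cd": 1.0, "Pb": 1.0, "Mn": 1.0}

print(f"ERI = {eri(sample, background):.1f}")
```

The large factor for Cd means that a modest cadmium enrichment dominates the index, while Fe, Cr, Zn and Mn contribute only in proportion to their pollution index.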

Hydrochemical facies evolution of groundwater

The hydrochemical facies evolution diagram proposed by Giménez-Forcada (2010) offers a convenient way of recognising the state of an aquifer with respect to intrusion/freshening phases over time, which are identified through the distribution of anion and cation percentages in the square diagram (Fig. 4a). Three heteropic facies are identified in this plot: Na–Cl (sea water), Ca–HCO3 (fresh water) and Ca–Cl (water salinized with direct bases exchange).

Fig. 4

a Hydrochemical facies evolution diagram (HFE-D) (Gimenez-Forcada 2010). b HFE diagram in groundwater of the Pleistocene aquifer and c HFE diagram in groundwater of Miocene aquifer

Water quality for irrigation purpose

Water quality assessment for irrigation is essential for sustainable development because it provides key information for water management. Irrigation water was classified using the sodium percentage (Na %) and the sodium adsorption ratio (SAR) (Richards 1954; Wilcox 1955). Sodium percentage is calculated by dividing the sum of Na+ and K+ concentrations by the total cations (Eq. 11; Raghunath 1987). SAR is calculated as the ratio between Na+ and the square root of the average of the Ca2+ and Mg2+ concentrations (Eq. 12; Richards 1954). For both calculations, ion concentrations in mEq/l are used.

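$$\mathrm{Na}\,\%=\frac{\left({\mathrm{Na}}^{+}+{\mathrm{K}}^{+}\right)\times 100}{{\mathrm{Ca}}^{2+}+{\mathrm{Mg}}^{2+}+{\mathrm{Na}}^{+}+{\mathrm{K}}^{+}}$$
(11)
$$\mathrm{SAR}=\frac{{\mathrm{Na}}^{+}}{\sqrt{\left({\mathrm{Ca}}^{2+}+{\mathrm{Mg}}^{2+}\right)/2}}$$
(12)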
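Both irrigation indices can be computed directly from the major cations, as in the sketch below. It follows the verbal definitions given above (the ratio of Na+ to the square root of the mean of Ca2+ and Mg2+ for SAR) and assumes that all inputs are already in mEq/l; the sample values are hypothetical.

```python
import math


def sodium_percentage(na, k, ca, mg):
    """Na % = (Na + K) * 100 / (Ca + Mg + Na + K), all in mEq/l."""
    return (na + k) * 100.0 / (ca + mg + na + k)


def sodium_adsorption_ratio(na, ca, mg):
    """SAR = Na / sqrt((Ca + Mg) / 2), all in mEq/l."""
    return na / math.sqrt((ca + mg) / 2.0)


# Hypothetical cation concentrations in mEq/l.
na, k, ca, mg = 30.0, 2.0, 8.0, 10.0

print(f"Na % = {sodium_percentage(na, k, ca, mg):.1f}")
print(f"SAR  = {sodium_adsorption_ratio(na, ca, mg):.1f}")
```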

Results and discussion

Hydrogeochemistry

The hydrochemical data provide an overview of the physical–chemical parameters measured in the groundwater and surface water samples. Figure 5a, b shows box plots of the water parameters, which indicate the variation in the analysed parameter values. The pH values ranged from 7.2 to 8.1 in the Pleistocene aquifer, from 7.1 to 7.6 in the Miocene aquifer and from 7.1 to 8.3 in surface water, indicating that the groundwater and surface water samples are neutral to slightly alkaline. The electrical conductivity (EC) values represent the dissolved salt content of the water, and higher values typically reflect higher concentrations of ions (Prasanth et al. 2012). The EC values ranged from 8520 to 19,410 μS/cm in the Pleistocene groundwater, from 4290 to 7920 μS/cm in the Miocene groundwater and from 3970 to 64,410 μS/cm in the surface water. The differences in EC are attributable to the composition of the aquifer rocks. The total dissolved solids (TDS) values ranged from 5041 to 11,741 mg/l in the Pleistocene aquifer, from 2360 to 4742 mg/l in the Miocene groundwater and from 2455 to 41,958 mg/l in the surface water. Among the major ions, Na+ and Cl− are the dominant cation and anion, respectively, in both the groundwater and surface water samples. The concentration of Na+ varies from 1450 to 3500 mg/l in the Pleistocene groundwater, from 580 to 1380 mg/l in the Miocene groundwater and from 740 to 13,400 mg/l in the surface water. The concentration of Cl− is the highest among all the ions: it ranges from 2211 to 6076 mg/l in the Pleistocene groundwater, from 951 to 2262 mg/l in the Miocene groundwater and from 1054 to 20,567 mg/l in the surface water.
Both the surface water and the groundwater display markedly high Na+ and Cl− concentrations, indicating that seawater most likely affects the water quality in the area under investigation and that groundwater mixes with the matrix of marine aquifers. The concentration of K+ shows the least variation, ranging from 52 to 100 mg/l in the Pleistocene groundwater, from 32 to 55 mg/l in the Miocene groundwater and from 28 to 600 mg/l in the surface water. In the Pleistocene groundwater, Ca2+ and Mg2+ concentrations range between 131–767 mg/l and 152–270 mg/l, respectively. In the Miocene groundwater, Ca2+ and Mg2+ concentrations range between 129–306 mg/l and 113–164 mg/l, respectively. Similarly, the concentrations of SO42− and alkalinity vary from 344 to 2650 mg/l and from 50 to 205 mg/l, respectively, in the Pleistocene groundwater, while in the Miocene groundwater they range from 140 to 1340 mg/l and from 135 to 275 mg/l, respectively. The high salinity of the groundwater results from the dissolution of marine sediments (Table 3). To determine the hydrogeochemical characteristics of the study area, the analytical values were plotted on a Piper diagram (Piper 1944). The Piper trilinear diagram has two triangles, one for cations and the other for anions, as well as a diamond-shaped area for cations and anions together. Chemical analysis data (expressed in meq/l) of both the major cations (Ca2+, Mg2+ and Na+ + K+) and the major anions (Cl−, SO42− and CO32− + HCO3−) are plotted in the diamond shape. Because different types of groundwater plot in different sub-areas of the diamond, they are easy to identify. The chemical analysis data of all groundwater samples are plotted on one diagram. Figure 6 shows that the groundwater samples of the Pleistocene and Miocene aquifers fall in sub-area 7, where non-carbonate alkali (primary salinity) exceeds 50% and the groundwater is dominated by alkalis and strong acids.
This indicates that groundwater salinisation is mainly attributed to leaching and dissolution of the mineral-rich aquifer matrix. The graphical method described by Stiff (1951) makes it possible to illustrate this evolution (Fig. 7). A Stiff diagram maps a polygon whose geometry is based on the hydrochemical element content and provides an estimate of the dominant species for each well. The three axes of the diagram are, from top to bottom: (1) Na+–K+–Cl−; (2) Ca2+–HCO3−–CO32−; (3) Mg2+–SO42−. Stiff diagrams allow the chemical classification of waters according to their anion and cation facies. As shown in Fig. 7, higher Cl− and Na+ contents were measured in groundwater from wells G4 to G16 in the Pleistocene aquifer and from G1 to G26 in the Miocene aquifer. This suggests considerable mixing of different water types in the study region, caused by marine intrusion into groundwater resources and by the mixing of groundwater with the matrix of marine aquifers (Guesdon et al. 2016). The dominant water type of the groundwater sampled in the wells was Na+–K+–Mg2+–Cl−, except for well G1, where the water type is Na+–K+–Ca2+–SO42−. It can also be concluded that the mixing of diverse water types in the study area is severe and is caused by saltwater intrusion due to excessive groundwater abstraction (Sherif et al. 2006). A multi-rectangular hydrochemical facies evolution diagram (HFE-D) can be employed to determine the dynamics of marine intrusion, considering the percentages of major ions and showing the intrusion and freshening phases. Figure 4b and c shows that the majority of samples fall in the intrusion phase. The Na–Cl facies signifies that the state of the aquifer is probably controlled by water–rock interaction.
The samples shown in HFE-D confirm the hypotheses regarding salinisation with the exception of one sample (G1), which is scattered in the field of freshening. The methodology of classification proposed in the present study takes into account that of Giménez-Forcada (2010).

Fig. 5

Box plots of TDS, Ca, Mg, Na + K, CO3 + HCO3, SO4 and Cl in the Pleistocene aquifer (a) and the Miocene aquifer (b). All concentrations are given in milligrams per litre (mg/l)

Fig. 6

Piper diagram of the groundwater samples

Fig. 7

Stiff diagram for the groundwater types

Fig. 8

Dendrogram based on agglomerative hierarchical clustering for Pleistocene aquifer and Miocene aquifer

For proper management of an aquatic environment, a water quality guideline must be defined. When settling on a water quality goal, the proposed use of the water is taken into account. Table 3 lists the main physicochemical parameters measured, together with their WHO (2017) permissible limits for drinking purposes. The values of TDS, Cl−, SO42−, TH, TOC, PO43−, BOD, COD and Na+ were found to be above the WHO (2017) permissible limits in the groundwater and surface water. The high levels of contamination shown by the physicochemical parameters point to the mixing of groundwater with the matrix of marine aquifers, municipal sewage discharge from the waste treatment plant in the study area, and land runoff. According to the World Health Organization (WHO 2017), high levels of PO43− are responsible for nutrient enrichment of water bodies and are contributed by detergent-containing residential wastewater and fertiliser land runoff. The COD and TOC values indicate that both oxidisable organic and inorganic pollutants have contaminated the water body (Mohamed et al. 2015). The BOD values indicate poor water quality, which can be linked to waste discharges containing high levels of organic and nutrient content, as well as increased microbial activity owing to the breakdown of organic matter. The concentrations of the heavy metals analysed, together with the permissible limits established by WHO (2017), are given in Table 4. The cadmium (Cd2+) contents were found to be above the WHO permissible limit at eight Pleistocene sites, four Miocene sites and four surface water sites. They range from ND to 9.434 mg/l, which could be attributed to wastewater from the treatment plant. Chromium (Cr3+), aluminium (Al3+), molybdenum (Mo2+), vanadium (V5+), iron (Fe2+) and zinc (Zn2+) contents in all groundwater samples were within permissible limits. Copper (Cu+) and lead (Pb2+) concentrations, on the other hand, exceeded the permissible limits at some sample locations.
The copper (Cu+) contents range from ND to 3.369 mg/l in groundwater. The lead (Pb2+) contents range from ND to 0.3053 mg/l in groundwater. Manganese (Mn2+) is a secondary water pollutant, and the concentration of Mn2+ in groundwater samples was found to be over the permissible limit. It ranges from ND to 12.38 mg/l. High amounts of Mn2+ can cause water to become black or brown and have a bitter metallic taste (Dutta et al. 2018). Heavy metal concentrations seen in the research region could be due to municipal sewage water and wastewater from a waste treatment plant.

Statistical analysis

Correlation of water parameters

Using a database of 13 variables (TDS, TH, Cl−, SO42−, Na+, PO43−, BOD, TOC, COD, Cd2+, Cu+, Pb2+ and Mn2+), we examined how the sewage treatment plant affects groundwater. Table 5A shows the Pearson correlation matrix for the Pleistocene aquifer. Total dissolved solids have a strong positive correlation with Cl−, Na+, TOC, COD, Cu+, Pb2+ and Mn2+. Total hardness showed a strong positive correlation with SO42−, PO43− and BOD. Chloride showed a positive correlation with Na+, TOC, COD, Cd2+, Cu+ and Mn2+. Sulphate showed a positive correlation with PO43−. Sodium showed a positive correlation with TOC, COD, Cd2+, Cu+, Pb2+ and Mn2+. The Pearson correlation matrix for the Miocene aquifer is presented in Table 5B. A strong positive correlation was found between total dissolved solids and TH, Cl−, SO42−, Na+, Cu+, Pb2+ and COD. Total hardness showed a strong positive correlation with Cl−, SO42−, Na+, COD, Cu+ and Pb2+. Chloride showed a positive correlation with SO42−, Na+, COD, Cu+ and Pb2+. Sulphate showed a positive correlation with Na+, COD, Cu+ and Pb2+. Sodium showed a positive correlation with COD, Cu+ and Pb2+. The strong positive correlation of total hardness with Cl−, SO42− and Na+ indicates that Ca2+ and Mg2+ are present predominantly in the form of chloride and sulphate salts. There was also a strong positive correlation between sodium and chloride. Sodium was generally present in the form of sodium chloride, mostly as a result of mixing with intruding seawater. Cu+, Pb2+, Cd2+, Mn2+, COD, TOC and BOD all exhibit a significant positive correlation, indicating that these contaminants may have common origins such as industrial effluents and municipal wastewater discharge.

Table 5 (A, B) Correlation matrix, Kaiser–Meyer–Olkin (KMO) and Bartlett’s test results for 13 physico-chemical variables shortlisted for multivariate statistical analysis

Principal component analysis/factor analysis

The influence of sewage treatment plant on groundwater was determined using the same database, which comprised 13 variables. The eigenvalue for each principal component is depicted in the scree plot in Fig. 9. This was utilised to determine which of the principal components should be kept in order to comprehend the basic data structure (Dutta et al. 2018). In the scree plot, there is a noticeable change in slope. Significant principal components are those with eigenvalues greater than unity and the first after unity (Hair et al. 2006).

Fig. 9

Scree plots of Pleistocene and Miocene aquifer

The factor loadings of the retained principal components are shown in Table 6. Factor loadings reflect the correlation between the original variables and the principal components, and they aid in understanding the fundamental character of a component (Vega et al. 1998). In various studies, different minimum criteria for factor loadings have been employed to determine which variables are important (Gamble and Babbar-Sebens 2012). In this investigation, factor loadings greater than 0.5 were deemed to contribute significantly to the related factor. Accordingly, factor loadings are classified as ‘strong,’ ‘moderate’ and ‘weak,’ with absolute loading values of > 0.75, 0.75–0.50 and 0.50–0.30, respectively (Liu et al. 2003).
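The loading classes can be expressed as a small helper, sketched here for illustration. The thresholds are those quoted from Liu et al. (2003); the label for loadings below 0.30, the treatment of values exactly on a limit and the example loadings are assumptions.

```python
def loading_strength(loading):
    """Classify a factor loading by its absolute value."""
    magnitude = abs(loading)
    if magnitude > 0.75:
        return "strong"
    if magnitude >= 0.50:
        return "moderate"
    if magnitude >= 0.30:
        return "weak"
    return "not significant"  # label for |loading| < 0.30 is an assumption


# Hypothetical loadings on one component (not the Table 6 values).
loadings = {"TDS": 0.92, "BOD": -0.61, "PO4": 0.35, "TOC": 0.12}

for variable, loading in loadings.items():
    print(f"{variable}: {loading:+.2f} -> {loading_strength(loading)}")
```

The sign of a loading is ignored, since a strongly negative loading is as informative as a strongly positive one.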

Table 6 Loadings of the experimental variables on the significant principal components (PCs)

In the Pleistocene aquifer, two principal components were retained. PC1 explains 57.285% of the variance and receives significant contributions from Pb2+, Na+, Cl, Cu+, Cd2+, TDS, Mn2+, BOD, COD and TOC; the correlation matrix showed these variables to be correlated. PC2 explains 26.199% of the variance and receives significant contributions from PO43−, SO42− and TH. In the Miocene aquifer, four principal components were retained. PC1 explains 55.918% of the variance and receives significant contributions from TDS, Cl, Na+, TH, SO42−, Cu+, COD and Pb2+, which the correlation matrix also showed to be linked. PC2 explains 14.673% of the variance and receives significant contributions from Pb2+ and BOD. PC3 explains 9.954% of the variance, with PO43− as the only significant positive loading. Finally, PC4 explains 9.258% of the total variance and receives a significant contribution from TOC only. The unrotated principal components each combine many seemingly unrelated variables, which makes hydrochemical and biological interpretation difficult. The principal components were therefore rotated to obtain a simpler and more meaningful representation of the underlying factors by emphasising the more significant variables. Rotation modifies the variance explained by each factor (Singh et al. 2013). Tables 7 and 8 show the factor loadings of the varimax-rotated components (called varifactors). In the Pleistocene aquifer, there are two varifactors. Varifactor 1 explains 47.603% of the total variance and is loaded by TDS, Cl, Na+, COD, Cu+, Mn2+, TOC and Pb2+; it can be interpreted as mineral and nutrient contamination of the water body. Varifactor 2 explains 35.881% of the variance and is strongly loaded by BOD, Cd2+ and Pb2+. It can be interpreted as heavy metal and biological pollution of the water body resulting from industrial and municipal wastewater discharges.
These heavy metals are strongly correlated with each other, as seen in the correlation table. In the Miocene aquifer, there are four varifactors. Varifactor 1 explains 52.398% of the variance and is loaded by SO42−, TDS, Na+, Cl, Cu+, TH, COD and Pb2+. Varifactor 2 explains 13.716% of the variance and is loaded chiefly by TH, BOD and PO43−. Varifactor 3 explains 13.403% of the variance and includes TOC and SO42−. Finally, varifactor 4 explains 10.286% of the total variance and receives a significant contribution from TOC only. Varifactors 3 and 4 represent organic contamination, which results from the regular flow of residential wastewater into groundwater.

Table 7 Loadings of the experimental variables on the rotated significant principal components
Table 8 The overall quality (Ow) of the groundwater samples

The Kaiser–Meyer–Olkin (KMO) measure and Bartlett’s test of sphericity were used to check that the dataset was suitable for principal component analysis (PCA) and factor analysis (FA). KMO is a measure of sampling adequacy that indicates the proportion of variance that may be attributed to underlying principal components (PCs) (Mitra et al. 2018). Generally, KMO values below 0.5 are undesirable, values from 0.5 to 0.7 are considered sufficient and values above 0.7 are exceptionally good (Ustaoğlu et al. 2020). The current study obtained KMO values of 0.577 and 0.444 for the Pleistocene and Miocene aquifers, respectively; by these thresholds the Pleistocene value is sufficient, whereas the Miocene value falls below 0.5 and its PCA results should be interpreted with caution. Bartlett’s test examines whether the correlation matrix is an identity matrix. If it is, all variables are unrelated and dimensionality reduction is not feasible, which makes PCA and FA inapplicable. Significance levels below 0.050 in Bartlett’s test are favourable, indicating substantial correlations between the variables (Tripathi and Singal 2019). In the current case, Bartlett’s significance level is 0.000 for both aquifers, which supports the use of principal component analysis and factor analysis (Banda and Kumarasamy 2020).
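For readers who want to reproduce the sphericity check, the following sketch computes Bartlett's test statistic from a correlation matrix using its standard textbook form, chi-square = −(n − 1 − (2p + 5)/6) ln|R| with p(p − 1)/2 degrees of freedom, where n is the number of samples and p the number of variables. It is not taken from the authors' workflow; the function names and the small example matrix are hypothetical, and the p-value lookup is left out to keep the example dependency-free.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def bartlett_sphericity(corr, n_samples):
    """Return (chi_square, degrees_of_freedom) for a p x p correlation matrix."""
    p = len(corr)
    det = determinant(corr)
    if det <= 0:
        raise ValueError("correlation matrix must be positive definite")
    chi_square = -(n_samples - 1 - (2 * p + 5) / 6.0) * math.log(det)
    return chi_square, p * (p - 1) // 2


# Hypothetical 2 x 2 correlation matrix for 10 samples; NOT data from the study.
chi_square, dof = bartlett_sphericity([[1.0, 0.5], [0.5, 1.0]], n_samples=10)
print(round(chi_square, 3), dof)
```

An identity matrix has determinant 1 and therefore a statistic of zero, which is the "no correlation" case in which PCA and FA would not be applicable.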

Cluster analysis

Based on the database of 13 variables (TDS, TH, Cl, SO42−, Na+, PO43−, BOD, TOC, COD, Cd2+, Cu+, Pb2+ and Mn2+), the water samples were grouped into clusters according to their similarities. Hierarchical cluster analysis (HCA) was carried out on the surface water and groundwater samples (Pleistocene and Miocene), grouping them so that samples within a class are similar and samples in different classes are dissimilar. HCA classified the 31 water points from the Pleistocene aquifer, the Miocene aquifer and the surface water into four sets of clusters (Fig. 8a–d), according to observations and variables. In the dendrogram according to observations (Fig. 8a), three distinct clusters (C1, C2 and C3) are formed. The first cluster (C1) comprises G2, G9, G5, G16, G17, G6, treatment plant 2 and S10. The second cluster (C2) comprises G4, G23, G24, G8, G11, G21, G13, G14, G15, G19, G18, S28, S22 and treatment plant 1. These two clusters show low levels of heavy metal pollution and moderate levels of faecal and organic pollution. The third cluster (C3) comprises two surface water samples, S12 and S20, and shows high levels of heavy metal and faecal contamination. As shown in Fig. 8b, two distinct clusters are formed according to variables. C1 and C2 correspond to the upstream and downstream sections of the study area that received wastewater from the water treatment plant, which identifies the influence of surface water and the treatment plant on groundwater. In the dendrogram in Fig. 8c, three distinct clusters are again formed according to observations. The first cluster (C1) comprises G1, G7, G26, S10, G3, G27, G29, G25 and treatment plant 2 and shows a low level of pollution. The second cluster (C2) comprises S25 and treatment plant 1 and shows a high level of pollution. The third cluster (C3) comprises S12 and S20 and shows a moderate level of pollution.
The first cluster reflects the effect of the treatment plant on groundwater. In the dendrogram in Fig. 8d, four distinct clusters are formed according to variables. Of these, clusters C1 and C2 show high levels of heavy metal and faecal contamination. HCA indicated that the likely pollution sources for the most polluted waters were sources such as the water treatment plant and surface runoff, with high contributions of PO43−, BOD, TOC, COD, Cd2+, Cu+, Pb2+ and Mn2+ that exceed WHO norms.
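The grouping idea behind HCA can be sketched in a few lines. The example below is a generic agglomerative procedure using Euclidean distance and average linkage; the distance measure and linkage rule used in this study are not stated in this section, so both are assumptions, and the function names and sample points are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(cluster_a, cluster_b, points):
    """Mean pairwise distance between the members of two clusters."""
    distances = [
        euclidean(points[i], points[j]) for i in cluster_a for j in cluster_b
    ]
    return sum(distances) / len(distances)


def agglomerative(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], points)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical two-variable samples; NOT data from the study.
samples = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0), (50.0, 50.0)]
print(agglomerative(samples, n_clusters=3))
```

In practice the variables would usually be standardised first so that parameters with large numerical ranges, such as TDS, do not dominate the distances.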

Pollution indices

Heavy metal pollution index

The concentrations of heavy metals in groundwater (Fe2+, Mn2+, Pb2+, Cu+, Cd2+, Ni2+, Cr3+, Co2+, Mo2+, V5+ and Zn2+) are listed in Table 5. The concentrations of Fe2+, Ni2+, Cr3+, Co2+, Mo2+, V5+ and Zn2+ were well below the permitted limits established by WHO (2017) for drinking water. The concentrations of Mn2+, Cd2+, Cu+ and Pb2+ exceeded the desirable limit of the drinking water standard at many locations in both aquifers. The mean concentrations were used to estimate the heavy metal pollution index (HPI) (Panigrahy et al. 2015). The calculated index values and unit weightage values are listed in Table 1. The HPI for the study area was computed by integrating the mean concentrations of the measured heavy metals; the details of the calculation are shown in Table 2B. HPI is categorised into five classes: excellent (0–25), good (26–50), poor (51–75), very poor (76–100) and unsuitable (> 100). In the Pleistocene aquifer, 77% of the samples are unsuitable for drinking, 6% are very poor and the remaining 17% are good; in the Miocene aquifer, 72% of the samples are unsuitable for drinking and the remaining 28% are good. In both aquifers, the HPI therefore exceeds 100 in the majority of the samples, which shows that the wells are contaminated by heavy metals. The study area appears to be affected by heavy metal leakage from the water treatment plant, as shown in Fig. 3b, which suggests that the plant does not remove inorganic constituents, especially heavy metals.
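The HPI calculation can be sketched as follows, assuming the widely used weighted-average form HPI = Σ(W_i Q_i) / ΣW_i with unit weight W_i = 1/S_i and sub-index Q_i = 100 |M_i − I_i| / (S_i − I_i), where M_i is the measured concentration, S_i the standard limit and I_i the ideal value. Whether the authors used exactly this form is an assumption here; the function names and input values are hypothetical, and the class boundaries are applied as continuous upper limits.

```python
def heavy_metal_pollution_index(metals):
    """HPI from a list of (measured, standard, ideal) tuples in the same units."""
    weighted_sum = 0.0
    weight_total = 0.0
    for measured, standard, ideal in metals:
        weight = 1.0 / standard  # unit weight W_i = 1 / S_i
        sub_index = 100.0 * abs(measured - ideal) / (standard - ideal)  # Q_i
        weighted_sum += weight * sub_index
        weight_total += weight
    return weighted_sum / weight_total


def classify_hpi(hpi):
    """Map an HPI value to the five classes listed in the text."""
    if hpi <= 25:
        return "excellent"
    if hpi <= 50:
        return "good"
    if hpi <= 75:
        return "poor"
    if hpi <= 100:
        return "very poor"
    return "unsuitable"


# Hypothetical (measured, standard, ideal) values in mg/l; NOT study data.
example = [(0.02, 0.01, 0.0), (1.0, 2.0, 0.0)]
hpi = heavy_metal_pollution_index(example)
print(round(hpi, 1), classify_hpi(hpi))
```

Because the weights are the reciprocals of the standards, metals with very low permissible limits dominate the index, which is why a modest exceedance of such a limit can push a sample above 100.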

Nitrate pollution index

Nitrate levels in the study area ranged from 1 to 13.2 mg/l, with an average of 4.03 mg/l. Nitrate concentrations were grouped into three classes: (1) low (< 20 mg/l), (2) medium (≥ 20 to < 50 mg/l) and (3) high (≥ 50 mg/l). The nitrate concentration in all samples from the study area is below 20 mg/l. Five classes of water are distinguished according to the nitrate pollution index (NPI): clean, light pollution, moderate pollution, significant pollution and highly significant pollution, with NPI values of < 0, 0–1, 1–2, 2–3 and > 3, respectively. NPI is below zero for all groundwater samples, which places them in the clean class (Table 10).
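The NPI result can be checked with a short calculation. The sketch assumes the commonly used definition NPI = (C − HAV) / HAV with a human-affected threshold value (HAV) of 20 mg/l, which matches the 20 mg/l boundary of the "low" class above but is not stated explicitly in this section. The function names are hypothetical; the nitrate values are the minimum, mean and maximum quoted in the text.

```python
HAV_MG_L = 20.0  # assumed human-affected threshold value for nitrate


def nitrate_pollution_index(nitrate_mg_l, hav=HAV_MG_L):
    """NPI = (C - HAV) / HAV, with C the nitrate concentration in mg/l."""
    return (nitrate_mg_l - hav) / hav


def classify_npi(npi):
    """Map an NPI value to the five classes listed in the text."""
    if npi < 0:
        return "clean"
    if npi <= 1:
        return "light pollution"
    if npi <= 2:
        return "moderate pollution"
    if npi <= 3:
        return "significant pollution"
    return "highly significant pollution"


# Minimum, mean and maximum nitrate concentrations reported in the text (mg/l).
for concentration in (1.0, 4.03, 13.2):
    npi = nitrate_pollution_index(concentration)
    print(concentration, round(npi, 3), classify_npi(npi))
```

Under this assumption any sample below 20 mg/l has a negative NPI, so the "clean" classification follows directly from the reported concentration range.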

Pollution index of groundwater

The relative contribution of the pollutants in each groundwater sample was evaluated using the pollution index of groundwater (PIG). The overall quality (Ow) values of pH and NO3 are below 0.1 (Table 8), which indicates that they have a low impact on groundwater contamination in the current study. Based on the data in Table 8, Na+, Cl−, Ca2+, Mg2+, SO42−, Cd2+, Cu+, Mn2+ and Pb2+ had the most significant effect on sample water quality, as is evident from the Ow and PIG values obtained. In the present study, the final PIG values ranged between about 2.6 and 57.7. The level of drinking water pollution is divided into five categories: PIG < 1.0 indicates insignificant pollution; 1.0–1.5, low pollution; 1.5–2.0, moderate pollution; 2.0–2.5, high pollution; and PIG > 2.5, very high pollution (Table 9) (Rao 2012; Rao et al. 2018; Rao and Chaudhary 2019). Based on this classification, all groundwater samples are very highly polluted and therefore unsuitable for drinking purposes. Unfit-to-drink samples are found in both the north and the south of the research area, which indicates a substantial anthropogenic influence on the water supply (Fig. 3c).
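The relationship between Ow and PIG can be illustrated with a short sketch. It assumes the usual PIG procedure in which each parameter receives a relative weight Rw, the weight parameter is Wp = Rw / ΣRw, the status of concentration is Sc = C / Ds (Ds being the drinking water standard), the overall quality is Ow = Wp × Sc and PIG is the sum of the Ow values. The function names, weights, concentrations and standards below are hypothetical and are not taken from Table 8.

```python
def pollution_index_of_groundwater(parameters):
    """Return (PIG, list of Ow) from (relative_weight, concentration, standard)."""
    total_weight = sum(rw for rw, _, _ in parameters)
    overall_qualities = []
    for rw, concentration, standard in parameters:
        weight_parameter = rw / total_weight  # Wp
        status = concentration / standard  # Sc
        overall_qualities.append(weight_parameter * status)  # Ow
    return sum(overall_qualities), overall_qualities


def classify_pig(pig):
    """Map a PIG value to the five pollution classes listed in the text."""
    if pig < 1.0:
        return "insignificant pollution"
    if pig < 1.5:
        return "low pollution"
    if pig < 2.0:
        return "moderate pollution"
    if pig <= 2.5:
        return "high pollution"
    return "very high pollution"


# Hypothetical (relative weight, concentration, standard) triples; NOT study data.
pig, ows = pollution_index_of_groundwater(
    [(5, 500.0, 250.0), (5, 800.0, 200.0)]
)
print(pig, ows, classify_pig(pig))
```

Both ends of the PIG range reported in the text (about 2.6 and 57.7) fall in the "very high pollution" class under these thresholds.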

Table 9 Groundwater classification based on the PIG and ERI

Ecological risk index

In the ERI evaluation, the potential ecological risk (RI) was first determined for each heavy metal (Cd2+, Cu+, Mn2+ and Pb2+) in each water sample (Table 9). According to Bhutiani et al. (2017), Adimalla and Wang (2018) and Taiwo et al. (2019), RI is divided into five classes that reflect the impact of each heavy metal on sample quality: RI < 40 (low potential risk), 40 ≤ RI < 80 (moderate potential risk), 80 ≤ RI < 160 (considerable potential risk), 160 ≤ RI < 320 (high potential risk) and RI ≥ 320 (very high potential risk). On this classification, Cu+ and Mn2+ pose a low potential ecological risk in the current study. However, Cd2+ poses a high potential ecological risk in samples G2, G9, G16, G19, G21, G7, G26 and G27, while Pb2+ poses a moderate ecological risk in samples G6, G8, G13, G19, G23, G1, G26, G27 and G29. Table 10 lists the ecological risk (Er) due to individual metals and the ERI by location, while Fig. 3d shows the spatial variation of ecological risk due to heavy metals in the study area. The final ERI values obtained from this analysis ranged from 10.9 to 387.5 (Table 10). The groundwater can be divided into four categories based on the ERI values: ERI < 150 (low ecological risk), 150 < ERI < 300 (moderate ecological risk), 300 < ERI < 600 (considerable ecological risk) and ERI > 600 (very high ecological risk) (Adimalla and Wang 2018; Taiwo et al. 2019). According to this classification scheme (Table 9), 66.7% of the Pleistocene aquifer samples have low ecological risk, 22.2% have moderate risk and 11% have considerable risk. In the Miocene aquifer, 42.8% of the samples have low ecological risk, 42.8% have moderate risk and 14.4% have considerable risk.
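The two classification schemes above can be written as simple threshold functions. This is a minimal sketch with hypothetical function names; values that fall exactly on an ERI boundary (150, 300 or 600) are assigned to the higher class here, which is a choice made for the example because the text leaves those boundaries open.

```python
def classify_metal_risk(ri):
    """Potential ecological risk class for a single metal (five classes)."""
    if ri < 40:
        return "low potential risk"
    if ri < 80:
        return "moderate potential risk"
    if ri < 160:
        return "considerable potential risk"
    if ri < 320:
        return "high potential risk"
    return "very high potential risk"


def classify_eri(eri):
    """Ecological risk class for the summed index (four classes)."""
    if eri < 150:
        return "low ecological risk"
    if eri < 300:
        return "moderate ecological risk"
    if eri < 600:
        return "considerable ecological risk"
    return "very high ecological risk"


# Minimum and maximum ERI values reported in the text.
for value in (10.9, 387.5):
    print(value, classify_eri(value))
```

The reported ERI range therefore spans the low to considerable classes, in line with the percentages given for the two aquifers.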

Table 10 The potential ecological risks (RI) and heavy metal pollution index (HPI) of the groundwater samples

Irrigation water qualities

The physical and chemical qualities of water, especially its dissolved salts, determine its suitability for irrigation. As irrigation water evaporates, it leaves its dissolved salts in the soil complex. Salt therefore accumulates gradually in the soil over the years (Srinivasamoorthy et al. 2014), resulting in toxicity and salinity hazards. The indices used to determine the suitability of irrigation water are explained in the following sections.

Sodium percentage (Na %)

All concentrations are given in milliequivalents per litre (meq/l) (Table 11). Water is categorised as safe or harmful based on its sodium content. For agricultural activities, a Na% greater than 60 is regarded as unsafe, whereas a Na% of less than 60 is considered safe (Eaton 1950; Ravikumar et al. 2011). The Na% of the study area is given in Table 9. In the Pleistocene aquifer, 88.2% of the groundwater was doubtful and 11.8% was unsuitable, while in the Miocene aquifer 28.6% of the groundwater was permissible and 71.4% was doubtful. EC and Na% are plotted in Fig. 10, which shows that most of the groundwater samples were doubtful for agriculture. Irrigation water is classified according to its soluble sodium level because irrigation water with a higher sodium content reduces soil permeability. Increased sodium and salinity hazards cause the quality of irrigation water to deteriorate.
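A minimal sketch of the sodium percentage calculation follows, assuming the commonly used form Na% = (Na+ + K+) × 100 / (Ca2+ + Mg2+ + Na+ + K+) with all concentrations in meq/l; the exact expression used by the authors is given elsewhere in the paper, so this form is an assumption. The function names and ion concentrations are hypothetical, and a value of exactly 60 is treated as safe here because the text does not assign it to either side.

```python
def sodium_percentage(na, k, ca, mg):
    """Na% from ion concentrations in meq/l (assumed Wilcox-type formula)."""
    total = ca + mg + na + k
    if total <= 0:
        raise ValueError("total cation concentration must be positive")
    return (na + k) * 100.0 / total


def is_safe_for_irrigation(na_percent):
    """Apply the 60% threshold quoted in the text."""
    return na_percent <= 60.0


# Hypothetical ion concentrations in meq/l; NOT data from the study.
na_percent = sodium_percentage(na=40.0, k=5.0, ca=3.0, mg=2.0)
print(na_percent, is_safe_for_irrigation(na_percent))
```

The doubtful and unsuitable classes shown in Fig. 10 also depend on EC, so this threshold check is only the first step of the diagram-based assessment.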

Table 11 Suitability for irrigation based on sodium percent (Na%)
Fig. 10 Suitability of groundwater for irrigation purposes based on sodium percentage (%)

Sodium adsorption ratio (SAR)

High SAR values (Table 10) indicate a tendency for irrigation water to replace adsorbed Ca2+ and Mg2+ with Na+, which lowers irrigation quality and harms soil structure. This also reduces the infiltration and permeability of the soil to water, leading to problems with crop production. According to the U.S. salinity classification (Table 12), water is divided into four classes on the basis of salinity (C1, C2, C3 and C4) and four classes on the basis of SAR (S1, S2, S3 and S4) (Richards 1954). Groundwater samples of the Pleistocene aquifer in the study area fall in the fields C4-S2 (17.6%), C4-S3 (41.2%) and C4-S4 (41.2%), while those of the Miocene aquifer fall in the fields C4-S1 (42.8%), C4-S2 (28.6%) and C4-S3 (28.6%). Groundwater from both aquifers is not suitable for irrigation under ordinary conditions, but may be used occasionally under special conditions, provided the soils are permeable. The use of such water for irrigation may damage the soil structure, which in turn affects the water infiltration capacity and permeability of the soil (Prasanth et al. 2012).
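The SAR fields quoted above can be reproduced with the standard expression SAR = Na+ / sqrt((Ca2+ + Mg2+) / 2), with concentrations in meq/l. The class limits in this sketch (S1 < 10, S2 10–18, S3 18–26, S4 > 26; C1 < 250, C2 250–750, C3 750–2250 and C4 > 2250 µS/cm) are the commonly quoted U.S. Salinity Laboratory limits and are an assumption here, since Table 12 is not reproduced in this section. The function names and sample values are hypothetical.

```python
import math


def sodium_adsorption_ratio(na, ca, mg):
    """SAR from Na+, Ca2+ and Mg2+ concentrations in meq/l."""
    return na / math.sqrt((ca + mg) / 2.0)


def sodium_hazard_class(sar):
    """Sodium hazard class S1..S4 (assumed limits 10, 18 and 26)."""
    if sar < 10:
        return "S1"
    if sar < 18:
        return "S2"
    if sar < 26:
        return "S3"
    return "S4"


def salinity_hazard_class(ec_us_cm):
    """Salinity hazard class C1..C4 from EC in microsiemens per cm."""
    if ec_us_cm < 250:
        return "C1"
    if ec_us_cm < 750:
        return "C2"
    if ec_us_cm < 2250:
        return "C3"
    return "C4"


# Hypothetical sample: Na+, Ca2+, Mg2+ in meq/l and EC in uS/cm; NOT study data.
sar = sodium_adsorption_ratio(na=40.0, ca=4.0, mg=4.0)
field = f"{salinity_hazard_class(3000.0)}-{sodium_hazard_class(sar)}"
print(sar, field)
```

Combining the two classes gives the field label used in the text, for example C4-S3 for a very saline sample with a high sodium hazard.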

Table 12 The water quality classes according to the U.S. Salinity Laboratory Staff (Richards 1954)

Conclusion

In the present study, hydrogeochemical, statistical and pollution index analyses were employed to better understand the water quality. The influence of surface water on groundwater was identified. The concentrations of Cl and Na+ in the study wells were elevated owing to marine intrusion into the groundwater and to mixing of the groundwater with the aquifer matrix, which is of marine origin. The high salinity of the groundwater in both aquifers could be attributed to the geomorphologic and geologic settings of the study area. The values of TDS, Cl, SO42−, TH and Na+ in the groundwater were above the permissible limits for drinking purposes, so this water is unsuitable for human consumption. Na+, Cl, Ca2+, Mg2+, SO42−, Cd2+, Cu+, Mn2+ and Pb2+ had the most significant effect on sample water quality. The final pollution index of groundwater (PIG) values ranged between about 2.6 and 57.7, so all groundwater samples were very highly polluted and unfit for human use. Multivariate statistics were successfully applied to evaluate the variation in groundwater quality and to identify the factors responsible for pollution in the study area. Principal component analysis (PCA) identified several variables and varifactors that could be given a hydrochemical meaning. In the Pleistocene aquifer, mineral and nutrient pollution was identified for varifactor 1, and heavy metal and biological pollution for varifactor 2. In the Miocene aquifer, mineral and nutrient pollution was identified for varifactor 1 and biological pollution for varifactor 2, while varifactors 3 and 4 represent organic contamination, which results from the regular flow of residential wastewater into groundwater. To determine the impact of surface water and the treatment plant on groundwater, water samples were classified into clusters.
Water quality in the research area is influenced by both natural and man-made environmental factors. The HCA found three water pollution clusters: low, moderate and high. HCA indicated that sources such as the water treatment plant and surface runoff were the likely contamination sources for the most polluted waters. The ecological risk index (ERI) of most groundwater samples showed low to moderate ecological risk due to metal contamination. With regard to sodium percentage (Na%) and sodium adsorption ratio (SAR), the groundwater is unsuitable for irrigation purposes.

Recommendations (suggested solutions)

This research recommends treating contaminated groundwater before human consumption. In addition, groundwater protection strategies should be implemented because the aquifers are rather shallow. Chemical analyses of the groundwater must be carried out periodically to detect any changes in water quality. Seepage from the contaminated drains into groundwater in this area should be monitored. A treatment plant should also be developed with a triple-stage filtration and disinfection process, using chlorine gas for sanitation before and after filtration to ensure the elimination of viruses, bacteria and worms, so that the water meets the standard set by the Ministry of Health.