Introduction

Groundwater is one of the most essential natural resources, particularly in arid and semi-arid regions, owing to the inadequacy of surface water (Dube et al. 2020; Re et al. 2017; Sajedi-Hosseini et al. 2018). In many countries around the globe, groundwater serves as the main source of freshwater for domestic, agricultural, and industrial uses (Alharbi and El-sorogy, 2021; Zhang et al. 2020a, b). However, aquifers may be degraded by a variety of factors and processes, comprising hydrogeochemical processes and human activities including agricultural practices, industrial activities, mining, and urbanization (Dugga et al. 2020; Emenike et al., 2020; Zhang et al. 2021). These factors have progressively endangered the quality of groundwater, which was hitherto considered a clean and safe source of water (Mostaza-Colado et al. 2018; Wali et al. 2019). Hence, the aquifers of many countries around the world are vulnerable to increasing land use activities (Liu et al. 2021; UNESCO, 2018; WHO, Unicef, 2017).

In recent years, many studies worldwide have focused on hydrogeochemical characterization and on ascertaining the sources of groundwater pollutants, with a view to understanding the relationships between natural water and the environment and evaluating the effect of human activities on water quality (Aliyu et al. 2020; Rashid et al. 2019; Yang et al. 2016; Zhang et al. 2020a, b). In Nigeria, groundwater quality has been degraded by hydrogeochemical processes and by anthropogenic sources of subsurface water pollutants such as inorganic fertilizer (phosphate and nitrate), industrial effluents, landfill leachates, oil and gas exploration, and the production and refining of petroleum products (Ighalo and Adeniyi, 2020; Turajo et al. 2019). These pollutants originate from agricultural, domestic, or industrial sources, which are otherwise broadly classified as point and non-point sources. Agriculture is the single largest user of freshwater resources, and it is both a cause and a victim of water pollution. It is a cause through the discharge of pollutants and sediments to surface water and groundwater, and it is a victim through the use of wastewater, which contaminates crops and transmits diseases to consumers and farm workers. In the Galma sub-watershed, irrigation and rain-fed agriculture are practiced in the dry and wet seasons respectively, with a wide range of crops. This may contribute pollutants, mainly nutrients from the excessive application of fertilizer to the crops. Similarly, agricultural activities have been shown to produce high nitrate and phosphate concentrations at many agricultural sites around the world. Farmers apply nutrients such as nitrogen, phosphorus, and potassium in the form of chemical fertilizers, manure, and sludge, and they may also grow legumes and leave crop residues to enhance production.
These nutrients may be washed off and find their way, as non-point source pollution, into nearby surface and subsurface water. Basically, phosphate and nitrate pollution indicates intensive agricultural activities, whereas chloride, potassium, and sulfate pollution indicates industrial, mining, and sewage discharges. Households release a range of pollutants, including detergents, oil and grease, and solid waste, which have negative effects on both surface water and groundwater. Of particular concern is domestic sewage, a combination of grey water and municipal waste that has a constant strength and polluting ability and may contain pathogens. Industries consume large quantities of water and discharge considerable quantities of wastewater during their production processes. Industrial waste has increased sharply with the rapid growth of industrial development in urban centers, where large volumes of effluent are generated by the production of multiple goods. Some industrial waste contains poisonous chemicals that may be dangerous to both surface water and groundwater, and most of this waste contains heavy metals. Heavy metals are among the most toxic pollutants present in marine water, groundwater, and industrial wastewater.

In the Galma sub-watershed of North-Western Nigeria, the aquifers are mostly unconfined and permeable, and are therefore more vulnerable to surface contamination. Fertilizers used excessively in both rain-fed and irrigation farming, together with other anthropogenic sources such as household leachates, sewage effluent, and animal manure, may leach into the aquifer and eventually contribute to groundwater quality degradation. Interaction between the groundwater and aquifer minerals, such as mineral dissolution, may also influence the groundwater chemistry as hydrological and climatic conditions in the area change. Hence, hydrochemical facies are essential for understanding past and current trends in groundwater quality.

Classic chemical methods (Piper, Durov, Schoeller diagrams, etc.) have been used to understand the hydrogeological processes that determine groundwater quality (Ismail et al. 2020; Mustapha et al. 2019; Wisitthammasri et al. 2020). Moreover, multivariate statistical analyses, including principal component analysis (PCA) and hierarchical cluster analysis (HCA), have been successfully applied to assess groundwater chemistry with regard to the influence of natural and anthropogenic factors (Bouteraa et al. 2019; Liu et al. 2017; Njuguna et al. 2020; Zhang et al. 2020a, b). Hence, multivariate statistical methods have gained popularity in recent decades in environmental investigations that evaluate groundwater contamination (Machiwal et al. 2018; Nyam et al. 2020; Scheiber et al. 2020; Tripathi and Singal, 2019). However, the use of modern and advanced statistical techniques for groundwater quality evaluation, especially in a tropical Savanna climate such as that of the Galma sub-watershed, North-Western Nigeria, remains rudimentary. Moreover, logical and scientific approaches such as graphical methods for groundwater quality assessment have received little attention. This is in spite of the apparent vulnerability of groundwater in the study area to contamination, largely driven by increased agricultural activities.

Hence, this study attempts to fill these gaps by employing a robust and integrated technique that combines classic chemical and statistical approaches for groundwater quality assessment. The main objectives of the current study are to ascertain the hydrochemical characteristics of a tropical savanna groundwater and to identify its pollution sources. The study is expected to contribute to knowledge regarding the efficacy of advanced statistical and graphical techniques for groundwater quality appraisal, in support of sustainable development and effective groundwater management.

Materials and methods

Overview of the study area

Location and climate

The Galma sub-watershed is located in the North-Western part of Nigeria and covers an area of approximately 6778.375 km2 (Fig. 1). The area has a tropical humid climate with distinct wet and dry seasons. The wet season spans April to October, and the mean annual rainfall is approximately 1000 mm (Badamasi et al. 2016).

Fig. 1 Galma sub-watershed showing the sampling points

The mean monthly maximum temperature varies between 28 and 40 °C in January and April respectively. The relative humidity decreases during the dry season (November–March), rises from April (40.5%) to a peak in August (85.0%), and then drops again to about 57.1% in October (Sawa and Buhari, 2011). Storms are usually short, although of moderately high intensity. The dry season occurs between November and March and is associated with harmattan dust due to the low level of anti-cyclones, particularly from December to January. During this period, the North-East Trade Winds blow southwards into the country from the Sahara region. As a result, visibility is at times restricted by airborne dust, and humidity is low (Badamasi et al. 2016).

Geological and hydrogeological setting

Geologically, the Galma sub-watershed is underlain by bedrock of the Precambrian crystalline basement complex (Nassef and Olugboye, 1979). In the basement complex, the rock types are predominantly granite gneiss and migmatite, with a small portion of coarse-grained porphyritic biotite and granite (Fig. 2). Although the porosity of the rocks in the area is very low, fracturing and weathering have enhanced the porosity and permeability of these rocks to varying degrees in different parts within meters of the ground surface. In the field, the foliation is mostly oriented in the North–South direction of the sub-watershed and is manifested by a sub-parallel, extended alignment of closely packed feldspar phenocrysts (Offodile, 2002).

Fig. 2 Lineament and geological formation of Galma sub-watershed

The groundwater flow systems, storage, and hydraulic conductivity in the Galma sub-watershed are influenced by the development of secondary structural features, including fractures and an overburden of weathered materials. Fractured zones occur at greater depth and may contain a substantial amount of groundwater, while zones of weathered material are mostly at shallow depth and contain a small amount of groundwater (Ebenezer and Martins, 2017). Groundwater in the area mainly occurs in three major water-bearing zones within the basement complex. These are zones of compositional change in highly weathered areas along veins or dykes, inter-granular permeability in moderately decomposed rock, and fractures in the poorly decomposed basement complex (Offodile, 2002). Annual precipitation serves as the major source of recharge to the unconfined aquifers of the study area. Groundwater flow follows the slope of the terrain, running from the southeast to the northwest of the sub-watershed. The drainage pattern is dendritic, and the streams are all subject to seasonal water level fluctuations (FMI, 2000).

Soil and vegetation

The soil of the area is composed of highly leached ferruginous soils formed on weathered regolith and overlain by a thin deposit of silt. The soil aggregates are very small, unstable, and probably compact in wet conditions. Therefore, the physical condition of the soil is poor (Kowal and Kassam, 1978). The sub-watershed lies in the Northern Guinea Savannah, which is mainly characterized by grasses and herbs with few deciduous trees. The grasses and herbs are evergreen along the rivers and streams. However, the natural vegetation of the area has been tampered with through poor management practices and other anthropogenic activities, including intensive cultivation, animal grazing, fuelwood harvesting, and annual bush burning (Ugumanim et al. 2015).

Land use/land cover settings

The dominant land use in the study area is agricultural land, subjected to intensive rain-fed farming. Irrigated cultivation is mostly practiced along the river Galma and its tributaries (Aliyu et al. 2020). Cereals (maize, sorghum, and millet) are the main crops grown in the watershed, and the grassland of the area supports the rearing of animals such as cattle. Thus, agricultural activities such as rain-fed and irrigation farming are the prominent land use in the Galma sub-watershed (Ugumanim et al. 2015). The cultivation practices comprise subsistence and large-scale commercial farming. The other land use/land cover classes in the area are settlements, bare soil, water bodies, and vegetation such as shrubs, grasses, and scattered trees (Fig. 3). Excessive application of fertilizer and herbicides is therefore regarded as the major cause of groundwater degradation in the area.

Fig. 3 Galma sub-watershed showing the land use/land cover pattern

Data collection and hydrochemical investigation

In this study, groundwater samples were collected from 57 borehole sampling points in the Galma sub-watershed from June to July 2019. The water samples were collected in triplicate to ensure the accuracy and integrity of the hydrochemical investigation. Hence, a total of 171 samples (the average value of each triplicate was documented) were collected from unconfined aquifers, with depths ranging from 35 to 70 m. At the point of collection, a handheld GPS device (Garmin GPSMAP 64sx) was used to record the location of each borehole. Prior to collection, all bottles were rinsed 2–3 times in the field with representative groundwater as a quality control measure. Groundwater samples were collected in 0.75-L plastic containers. Each sample was taken after 5–10 min of pumping to ensure that it was representative of the aquifer. Each bottle was labelled with the sampling site, and all of the samples were preserved in a cooler for 24 h before being analyzed. Portable instruments were used to measure the in situ parameters, including pH, turbidity (Tur), electrical conductivity (EC), temperature (Temp), static water level (SWL), total dissolved solids (TDS), and dissolved oxygen (DO). The major cations (Na+, K+, Ca2+, Mg2+) and anions (Cl−, SO42−, NO3−) were analyzed using an atomic absorption spectrometer and ion chromatography respectively. The titrimetric method was used to determine total alkalinity (AT), total hardness (TH), and bicarbonate (HCO3−). The physicochemical parameters were determined using standard techniques as prescribed by the American Public Health Association (APHA, 2018).

Methods

Classic chemical method

Classic chemical methods, namely the Stiff diagram (Stiff, 1951), Piper trilinear diagram (Piper, 1944), Gibbs diagram (Gibbs, 1970), and Durov diagram (Durov, 1948), were used for hydrogeological characterization. The classic chemical method is primarily concerned with the logical display of different water types on a diagram (Kura et al. 2018). The Piper trilinear diagram is the most widely used of the graphical techniques for evaluating water types and is essential for understanding the controlling factors that influence groundwater chemistry. The hydrochemical facies of groundwater are classified based on the relative abundance of the major ions (Butaciu et al. 2017). Many researchers have successfully employed the Piper plot for the characterization of groundwater chemistry (Ismail et al. 2020; Mustapha et al. 2019; Sheikhy Narany et al. 2018). In this study, the different water types were presented on the Piper plot using AqQA (version 1.5) (Fig. 4).

Fig. 4 Piper diagram of the groundwater samples for Galma sub-watershed
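
The facies assignment behind a Piper plot can be illustrated with a minimal sketch. It is not the AqQA procedure used in this study. It assumes a simplified rule in which an ion group names the water type only when it exceeds 50% of the cation or anion total in meq/L, and "mixed" is returned otherwise. The function names, the rounded equivalent weights, and the sample concentrations are illustrative assumptions, not measured values.

```python
# Equivalent weights in mg per meq (molar mass / |charge|), rounded standard values.
EQ_WEIGHT = {"Ca": 20.04, "Mg": 12.15, "Na": 22.99, "K": 39.10,
             "Cl": 35.45, "SO4": 48.03, "HCO3": 61.02}


def to_meq(sample_mg_l):
    """Convert ion concentrations from mg/L to meq/L."""
    return {ion: sample_mg_l[ion] / EQ_WEIGHT[ion] for ion in EQ_WEIGHT}


def percent_shares(groups):
    """Express each ion group as a percentage of the group total."""
    total = sum(groups.values())
    return {name: 100.0 * value / total for name, value in groups.items()}


def dominant(shares, threshold=50.0):
    """Return the group above the threshold, or 'mixed' if none exceeds it."""
    name, share = max(shares.items(), key=lambda item: item[1])
    return name if share > threshold else "mixed"


def facies(sample_mg_l):
    """Simplified water-type label from the cation and anion triangles."""
    meq = to_meq(sample_mg_l)
    # Na and K share one corner of the cation triangle in a Piper diagram.
    cations = percent_shares(
        {"Ca": meq["Ca"], "Mg": meq["Mg"], "Na": meq["Na"] + meq["K"]})
    anions = percent_shares(
        {"Cl": meq["Cl"], "SO4": meq["SO4"], "HCO3": meq["HCO3"]})
    return f"{dominant(cations)}-{dominant(anions)}"


# Hypothetical borehole sample in mg/L (illustrative values only).
sample = {"Ca": 40.0, "Mg": 12.0, "Na": 10.0, "K": 4.0,
          "Cl": 60.0, "SO4": 10.0, "HCO3": 30.0}
print(facies(sample))
```

Working in meq/L rather than mg/L matters here because the Piper triangles compare charge equivalents, so a heavy ion such as sulfate is not over-weighted relative to chloride.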

Multivariate statistical analysis

Multivariate statistical analysis is widely used to evaluate the sources of pollution and the factors influencing groundwater quality in aquifers around the globe (Njuguna et al. 2020; Rashid et al. 2019). HCA and PCA are vital techniques for extracting the variables associated with the natural and anthropogenic processes that control groundwater chemistry (Vasanthakumari Sivasankara Pillai et al. 2020). In this study, multivariate statistical analysis was performed using Microsoft Excel for Windows and IBM SPSS (Version 22) statistical software. HCA (dendrogram plot) and PCA (scree and loading plots) were used (Bouteraa et al. 2019; Sunkari et al. 2020).

Cluster analysis (CA)

CA is a statistical method with two modes, Q and R. Q-mode cluster analysis defines the spatial relationships between sample points, while R-mode cluster analysis classifies parameters based on their similarities with each other (Soltani et al. 2017). A dendrogram is used to determine the number of clusters into which hydrochemical variables of similar levels are grouped. The dendrogram offers a visual description of the clustering process, illustrating the clusters and their proximity, with a drastic reduction in the dimension of the original data (Aliyu et al. 2020). Therefore, hierarchical cluster analysis (HCA) is widely used to categorize water samples into groups of chemical parameters, characterize individual aquifers, and identify the sources of groundwater contamination (Bouteraa et al. 2019; Wisitthammasri et al. 2020). In this study, HCA was carried out in Q-mode using Ward’s linkage method, with Euclidean distance as the measure of similarity between physicochemical parameters (Butaciu et al. 2017).
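
Agglomerative clustering with Ward's linkage and Euclidean distance can be sketched as follows. This is a minimal standard-library illustration, not the SPSS routine used in this study. The z-score standardization step is an assumption (the text does not state how variables were scaled), and the function names and the toy points are hypothetical. Each row is whichever object is being grouped: a sample in Q-mode or a parameter in R-mode.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column to zero mean and unit variance (assumed step)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def ward_clusters(points, n_clusters):
    """Merge points bottom-up with Ward's criterion until n_clusters remain."""
    clusters = {i: [i] for i in range(len(points))}
    # Ward's method works on squared Euclidean distances between clusters.
    dist = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dist[(i, j)] = sum((a - b) ** 2
                               for a, b in zip(points[i], points[j]))
    next_id = len(points)
    while len(clusters) > n_clusters:
        (i, j), d_ij = min(dist.items(), key=lambda item: item[1])
        n_i, n_j = len(clusters[i]), len(clusters[j])
        updated = {}
        for k in clusters:
            if k in (i, j):
                continue
            n_k = len(clusters[k])
            d_ki = dist[(min(k, i), max(k, i))]
            d_kj = dist[(min(k, j), max(k, j))]
            # Lance-Williams update for Ward's linkage.
            updated[(k, next_id)] = ((n_i + n_k) * d_ki + (n_j + n_k) * d_kj
                                     - n_k * d_ij) / (n_i + n_j + n_k)
        dist = {pair: d for pair, d in dist.items()
                if i not in pair and j not in pair}
        dist.update(updated)
        clusters[next_id] = clusters.pop(i) + clusters.pop(j)
        next_id += 1
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical two-variable observations (illustrative values only).
toy_points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0], [50.0, 50.0]]
print(ward_clusters(toy_points, 3))
```

Standardizing first keeps a parameter with large numerical values, such as EC in µS/cm, from dominating the Euclidean distance over parameters measured in mg/L.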

Principal component analysis (PCA)

PCA is a valuable technique for determining the relationships between the hydrochemical parameters analyzed in water samples and for understanding the sources of pollution. It is a useful tool for reducing data into components (Elumalai et al. 2019; Li et al. 2020). In PCA, groundwater data are interpreted using the scree plot and the loadings of the first few principal components (Nyam et al. 2020). The Kaiser–Meyer–Olkin (KMO) measure is used to evaluate the suitability of the data set for PCA (Kaiser, 1974). Principal components (PCs) with an eigenvalue greater than 1.0 were retained, as they contain most of the variability of the original data set. The main idea of PCA is to reduce the dimension of a data set comprising a large number of interrelated variables with minimum loss of the original information (Zhang et al. 2020a, b). To make the variables easier to interpret in terms of the hydrochemical or anthropogenic processes that regulate groundwater quality, Varimax rotation was performed on these PCs (Bouteraa et al. 2019). The PC loadings were redistributed and polarized by rotation of the factor axes, and the new variables constructed were called varifactors (VFs).
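
The extraction step described above (eigenvalues of the correlation matrix, retention of components with eigenvalue greater than 1.0, and unrotated loadings) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the SPSS implementation used in this study. The eigen-decomposition uses cyclic Jacobi rotations as a stand-in solver, Varimax rotation is not included, and the function names and the 2 × 2 correlation matrix are hypothetical.

```python
import math


def jacobi_eigen(sym, sweeps=50, tol=1e-18):
    """Eigenvalues and eigenvectors (as columns) of a symmetric matrix."""
    n = len(sym)
    a = [[float(x) for x in row] for row in sym]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes the (p, q) off-diagonal entry.
                theta = (math.pi / 4 if diff == 0.0
                         else 0.5 * math.atan(2.0 * a[p][q] / diff))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # accumulate eigenvectors
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v


def pca_from_correlation(corr):
    """Components sorted by eigenvalue, with explained variance and loadings."""
    eigvals, eigvecs = jacobi_eigen(corr)
    n = len(corr)
    order = sorted(range(n), key=lambda i: -eigvals[i])
    components = []
    for i in order:
        lam = eigvals[i]
        scale = math.sqrt(max(lam, 0.0))
        components.append({
            "eigenvalue": lam,
            "explained_pct": 100.0 * lam / n,  # trace of corr equals n
            "loadings": [eigvecs[k][i] * scale for k in range(n)],
        })
    return components


def kaiser_retained(components, cutoff=1.0):
    """Keep components whose eigenvalue exceeds the Kaiser cutoff."""
    return [c for c in components if c["eigenvalue"] > cutoff]


# Hypothetical correlation matrix for two variables (illustrative only).
corr = [[1.0, 0.6], [0.6, 1.0]]
for comp in kaiser_retained(pca_from_correlation(corr)):
    print(round(comp["eigenvalue"], 3), round(comp["explained_pct"], 1))
```

Because the eigenvalues of a correlation matrix sum to the number of variables, dividing an eigenvalue by that number gives the share of total variance reported for each component.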

Results

Descriptive statistics

The groundwater samples were analyzed for 18 parameters, which are expressed in mg/L for the cations and anions, except pH (unit), turbidity (NTU), EC (µS/cm), temperature (°C), and static water level (m). The pH values of the samples range between 4.80 and 6.80, which signifies a weakly acidic condition (Table 1). The EC values of the samples vary from 86.00 to 432.00 µS/cm, with a mean value of 207.69 µS/cm.

Table 1 Descriptive statistics of groundwater quality parameters in the study area (n = 57)

Groundwater chemistry

Hydrochemical facies

The Piper trilinear diagram (Piper, 1944) was used to infer the hydrochemical facies of the groundwater; it classifies groundwater types based on ionic composition. Based on the hydrochemical analysis as plotted on the central diamond of the Piper diagram, two dominant hydrochemical facies were identified. The first is the mixed Ca–Mg–Cl water type, which means that no cation or anion exceeds 50%. The second dominant water type is Ca–Cl. In addition, the Mg–HCO3 water type was found in BH 9 and the Na–Cl water type in BH 29 (Fig. 4). Essentially, the hydrochemical analysis revealed the order of abundance of the ions, as shown by their mean concentrations, and the Piper diagram simplified the interpretation of the results. The inferred mechanism of groundwater evolution is that fresh groundwater interacts with Ca–Mg–Cl-rich minerals within the weathered regolith, where calcium and magnesium are precipitated after certain reactions. Chloride, in contrast, usually reaches the groundwater via leaching from chemical fertilizer on agricultural soils or from wastewater discharged onto the land surface, and its concentration in the groundwater can thus exceed the permissible limit. The Ca–Cl water type originates from the same dissolution process and from anthropogenic activities. Thus, the chemistry of the groundwater is influenced by the geology. Water chemistry in the basement is influenced by geogenic activities, while in other formations it is influenced by both geogenic and anthropogenic activities.

Identification of water pollution sources using HCA and PCA

In this study, HCA and PCA were considered appropriate for substantially reducing the dimension of the data in order to classify the sources of contamination in the aquifer system (Li et al. 2020). The data sets were analyzed to gain accurate information on groundwater quality for the effective management of groundwater resources in the entire area.

Hierarchical cluster analysis (HCA)

In this study, hierarchical cluster analysis (HCA) was performed to identify groups that are internally similar and dissimilar to other groups, and the result is presented as a dendrogram. The groundwater samples were investigated mainly to identify the sources of pollution. Using Ward’s linkage method and Euclidean distance, HCA yielded an optimum of five distinct clusters (Cs) of groundwater variables, as presented in the dendrogram. The first cluster comprises pH, Temp, SWL, K+, Mg2+, NO3−, Fe, DO, and SO42−. The second cluster consists of AT, Na+, Ca2+, Cl−, and HCO3−. The third cluster contains Tur, the fourth cluster contains TDS and TH, and the fifth cluster contains EC alone (Fig. 5).

Fig. 5 The dendrogram for groundwater samples of Galma sub-watershed

Principal component analysis (PCA)

PCA is a data reduction technique that converts a large number of potentially correlated variables into a smaller number of uncorrelated variables known as principal components (PCs). The retained components were then rotated using Varimax rotation with Kaiser normalization.

Extraction of components

In this study, PCA was applied to the groundwater samples to identify the principal components of groundwater pollution sources. PCA requires the Kaiser–Meyer–Olkin (KMO) measure of sample adequacy (MSA) to exceed 0.50, both for each variable and for the set of variables as a whole (Kaiser, 1974). Moreover, a component loading of more than 0.50 is regarded as significant (Liu et al. 2003). The anti-image correlation matrices (AICM) for the groundwater parameters are presented in Table 2. The KMO-MSA values for HCO3−, Cl−, and pH in column A were less than 0.50. These variables were excluded for not satisfying the requirement of PCA, and PCA was then re-run. The KMO-MSA values for Na+, SO42−, Temp, DO, Tur, and SWL in column B and for TDS, TH, and Fe in column C were likewise below the threshold, so all of these variables were removed. Finally, the groundwater variables Ca2+, Mg2+, K+, EC, NO3−, and AT in column D had KMO-MSA values above 0.50 and satisfied the requirement of PCA. The overall KMO-MSA improved from 0.625 to 0.730, which is adequate for PCA (Kaiser, 1974). Hence, the main objective of PCA was achieved by reducing the number of observed variables to a relatively smaller number of components or factors without compromising the interpretation of the data.
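
The screening described above can be sketched as follows: compute the overall and per-variable KMO-MSA from a correlation matrix, drop the variables below 0.50, and repeat. This is a minimal standard-library illustration, not the SPSS output summarized in Table 2. Partial correlations are taken from the inverse correlation matrix, and the function names, the variable labels, and the 4 × 4 matrix are hypothetical.

```python
import math


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(matrix)
    aug = [[float(x) for x in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [x / pv for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def kmo_msa(corr):
    """Return (overall KMO, per-variable MSA) for a correlation matrix."""
    n = len(corr)
    inv = invert(corr)
    # Squared simple (r2) and partial (q2) correlations, off-diagonal only.
    r2 = [[corr[i][j] ** 2 if i != j else 0.0 for j in range(n)]
          for i in range(n)]
    q2 = [[(inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])) ** 2
           if i != j else 0.0 for j in range(n)] for i in range(n)]

    def ratio(num, den):
        return num / den if den else 0.0

    per_variable = [ratio(sum(r2[i]), sum(r2[i]) + sum(q2[i]))
                    for i in range(n)]
    total_r2, total_q2 = sum(map(sum, r2)), sum(map(sum, q2))
    return ratio(total_r2, total_r2 + total_q2), per_variable


def screen_variables(corr, names, threshold=0.5):
    """Repeatedly drop variables whose MSA falls below the threshold."""
    keep = list(range(len(names)))
    while len(keep) > 2:
        sub = [[corr[i][j] for j in keep] for i in keep]
        _, per_variable = kmo_msa(sub)
        low = {keep[k] for k, m in enumerate(per_variable) if m < threshold}
        if not low:
            break
        keep = [i for i in keep if i not in low]
    return [names[i] for i in keep]


# Hypothetical correlations: three related variables and one unrelated.
names = ["Ca", "Mg", "K", "pH"]
corr = [[1.0, 0.5, 0.5, 0.0],
        [0.5, 1.0, 0.5, 0.0],
        [0.5, 0.5, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0]]
print(screen_variables(corr, names))
```

A variable that shares little variance with the others contributes almost nothing to the numerator of its MSA, which is why poorly correlated parameters are the first to be screened out.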

Table 2 Anti-image correlation and Kaiser–Meyer–Olkin (KMO) measure of sample adequacy (MSA)

Fig. 6 Scree plot of principal components

Fig. 7 Loading plot showing the principal components

The first component explained 33.608% of the total variance, with strong positive loadings on EC, AT, and Ca2+. The second component accounted for 30.808% of the total variance, with strong positive loadings on Mg2+, K+, NO3−, and AT.

Discussion

The hydrochemical characteristics of a tropical Savanna groundwater and its pollution sources have been assessed using multifaceted hydrochemical analytical and graphical tools. The use of modern and advanced statistical techniques for groundwater quality evaluation is imperative, especially in a tropical Savanna climate such as that of the Galma sub-watershed, North-Western Nigeria, where logical and scientific approaches such as graphical methods for groundwater quality assessment have received little attention. Additionally, aquifers may be degraded by a variety of factors and processes, comprising hydrogeochemical processes and human activities including agricultural practices, industrial activities, mining, and urbanization. Thus, the evaluation of hydrochemical characteristics and the identification of groundwater pollution sources require an integrated and robust method for the appropriate planning and management of groundwater resources.

The results of this study reflect the influence of natural processes and human activities, especially agricultural practices, on the quality of groundwater. The groundwater of the Galma sub-watershed is weakly acidic, with a mean pH of 6.30. Most of the mean pH values reported in previous studies were above 7.5 (Butaciu et al. 2017; Zhang et al. 2020a, b). The low mean pH in the groundwater of the current investigation is probably attributable to an increase in H+ concentration resulting from the oxidation of pyrite in sediments. Furthermore, the dissolution of clay minerals may serve as an H+ buffer and influence the pH of the groundwater. The acidity of the groundwater may also suggest an effect of infiltration of acidic rainwater. The pH value is vital for regulating the solubility, alkalinity, and complexation of many ions in groundwater.

The high amounts of the cation Ca2+ and the anion HCO3− in the study area (Table 1) were caused by rock mineralization and dilution. The amount of HCO3− in groundwater, for example, could be linked to the weathering of silicate rocks and minerals (feldspars) that react with carbonic acid. Thus, in the carbonic equilibrium, HCO3− in the aquifer can be derived from the dissociation of H2CO3.

The Piper trilinear plot is an effective tool that is frequently used to understand the hydrochemical regime and classification of groundwater (Mostaza-Colado et al. 2018). The hydrochemical facies of this study suggest that the Ca–Mg–Cl and Ca–Cl groundwater types (Fig. 4) are associated with migmatite, granite gneiss, and porphyritic granite, among others, which is consistent with the geology of the study area (Fig. 2). Thus, the high content of dissolved minerals in the groundwater may be related to the natural dissolution of rock and soil sediments in the sub-watershed. Similar groundwater types were reported in the Tripura district of the Northeastern part of India (Paul et al. 2019).

Based on the result of the hierarchical cluster analysis (Fig. 5), the first cluster reflects multiple processes influencing groundwater chemistry. These include the high concentration of NO3−, which indicates anthropogenic pollution, particularly from synthetic fertilizer application and leaching of agricultural waste, and which has been established as the main indicator of the impact of human activities on both soil and groundwater chemistry. DO is related to groundwater contamination, especially from landfill leachate and sewage runoff, and shallow aquifers are vulnerable to pollution from built-up areas and irrigated lands. In contrast, pH, Temp, and SWL are directly linked to climatic conditions and aquifer recharge. For instance, higher temperature increases microbial and chemical activity in groundwater, and an increase in water level is responsible for changes in groundwater temperature. Mg2+, SO42−, K+, and Fe reflect natural geochemical processes (Mustapha et al. 2019).

The second cluster is more closely related to natural processes. Previous studies have indicated that Na+ and Cl− are derived from rock-water interaction involving groundwater recharge in the permeable zone over the underlying rocks (Panno et al. 2006; Raiber et al. 2012). Furthermore, the presence of chloride in the aquifers suggests a low hydraulic gradient and fine-grained sediments in the area (Barzegar et al. 2019). However, the findings of the current investigation suggest that the presence of Na+ and Cl− in the groundwater of the study area may also be attributed to agricultural fertilizer, animal waste, septic effluent, and landfill leachate, although these anthropogenic factors may be less significant than geogenic factors. The HCO3− and Ca2+ contents in the groundwater of the area may be due to the weathering of carbonate minerals from the unsaturated zone, associated with the flushing of CO2-rich water formed by the decomposition of organic matter and the degradation of silicate minerals. The dissolution of albite minerals associated with biotite and hornblende may also be responsible for the HCO3−.

The third cluster contains Tur, which is related to dissolved substances, including fine-grained soil particles, owing to accelerated groundwater recharge, particularly in the shallow aquifers. The fourth cluster comprises TDS and TH, which may originate from both natural and anthropogenic processes in the area. For instance, TDS of geogenic origin is probably responsible for the hardness, taste, and corrosive property of the groundwater. Anthropogenic sources of TDS comprise agricultural and urban contaminants leached into the aquifers (Wali et al. 2020). The fifth cluster may signify geogenic processes. EC may reflect groundwater salinity when salts reach the subsurface water through infiltrating recharge water in the area.

Regarding the results of the principal component analysis, the first component suggests the influence of natural processes, namely water–rock interaction, on groundwater chemistry. On this note, the groundwater quality in some of the aquifers of the Galma sub-watershed is mainly controlled by the dissolution of rock minerals. It is interesting to note that the classification by HCA and the grouping by PCA are interrelated. For instance, the first component reflects the fifth cluster with additional variables, so the grouping of variables by PCA further supports the classification by HCA. The strong positive loading of electrical conductivity may be evidence of increased groundwater salinity due to the irrigation agriculture practiced in the Galma sub-watershed (Barzegar et al. 2017). The strong positive loading on Ca2+ is essentially due to mineral dissolution reactions in which an ion is released into groundwater by dissolving aquifer minerals; for example, calcium is released when calcite in limestone dissolves (Rajesh et al. 2012).

The second component reflects the effect of anthropogenic activities, especially agricultural practices, together with natural processes. The soil and geological composition of the area may be responsible for the high Mg2+ loading (Table 3), especially where the groundwater comes into contact with certain rocks and minerals, particularly limestone and gypsum; magnesium is released when these dissolve. The strong loading on K+ may reflect contamination derived from wastewater or common fertilizers (Wisitthammasri et al. 2020). The positive loading on NO3− suggests the influence of human activities, particularly the cultivation of crops. This indicates that excessive application of synthetic fertilizers and animal wastes may contribute to groundwater contamination (Goni et al. 2019; Spalding et al. 2019). Hence, nitrate contamination of subsurface water due to agricultural activities may constitute a significant threat to the groundwater system of the watershed.

Table 3 Factor loadings after varimax rotation

Previously, most methods of determining the effects of hydrogeochemical processes and anthropogenic activities on groundwater quality in the region remained rudimentary. For instance, the study conducted by Olukosi et al. (2017) focused mainly on the assessment of a few physicochemical parameters of water quality in the sub-watershed and did not specifically demonstrate the controlling factors that influence the groundwater chemistry of the area. Another study attempted to characterize the groundwater quality in the study area (Yakubu, 2013) but did not determine the hydrochemical facies and ionic compositions in the drinking water wells. The current investigation has filled these gaps by employing a robust and integrated technique that combines classic chemical and statistical applications for groundwater quality evaluation in the Galma sub-watershed. The method offers a comprehensive approach for the evaluation of hydrochemical characteristics and the identification of groundwater pollution sources. It may be employed in any area with similar characteristics to evaluate the influence of hydrogeochemical processes and human activities on groundwater quality.

Conclusion

Using a variety of hydrochemical analytical and graphical tools, the hydrochemical properties of a tropical Savanna groundwater and its contamination sources were examined. This reflects the importance of using current and advanced statistical approaches to assess groundwater quality, particularly in tropical Savanna climates such as that of the Galma sub-watershed in north-western Nigeria.

This study demonstrates the reliability of an integrated approach involving multivariate statistical analysis and the classic chemical method for assessing the hydrochemical characteristics and identifying the sources of groundwater pollution in the Galma sub-watershed of the tropical savanna. The hydrochemical composition of the groundwater was influenced by the weathering of basement rocks. A mixed Ca–Mg–Cl type and a Ca–Cl type were confirmed as the dominant water types, essentially controlled by rock-water interaction. Moreover, the ionic concentrations in the groundwater samples indicate the importance of geological influence on the hydrochemistry of the aquifer water of the area. The weak acidity of the groundwater, with a low mean pH of 6.30, may be linked to an increase in H+ concentration resulting from the oxidation of pyrite in sediments.

The cations in the groundwater samples were primarily in the following order of dominance: Ca2+ > Na+ > Mg2+ > K+. The anions, on the other hand, followed the order HCO3− > Cl− > SO42− > NO3−. The hydrochemical facies defined the dominant groundwater type of the sub-watershed as mixed Ca–Mg–Cl, which means that no cation or anion exceeds 50%. Ca–Cl was the second most common water type. The Mg–HCO3 water type was found in BH 9 of the study area, while the Na–Cl water type was found in BH 29, as indicated in the Piper diagram. The quantities of these ions in the groundwater of the sub-watershed were caused by weathering of the basement rocks.

Moreover, the application of HCA and PCA to assess the groundwater chemistry of the area simplified the identification of the dominant groundwater types and pollution sources. This shows that adopting robust integrated methods for groundwater investigation increases the precision of the results and improves the understanding of the hydrochemical characteristics and the sources of aquifer contamination in the area.