Introduction

Oil and natural gas are considered indispensable natural energy sources for modern life. Most oil production in Turkey occurs within the Southeastern Anatolian Region. The extracted oil is transferred to the Dörtyol terminals through the Batman-Iskenderun pipeline. The majority of the associated environmental problems occur during oil production, storage, and transport. Oil that empties into water sources or leaks into the soil causes environmental pollution. This pollution takes many years to clean up, and causes damage that is difficult to remediate (Haspolat et al. 2013).

A large amount of liquid waste is produced during oil and natural gas production. Worldwide, nearly 80 million barrels of oil are produced daily, but at the same time, some 250 million barrels of liquid waste are also generated (Sahoo and Baruah 2013). This water, which is known as “oilfield wastewater” or “production water,” contains various organic and inorganic components. The inorganic substances found are anions, cations, heavy metals, and radioactive substances (Igunnu and Chen 2014). Oil production water contains phenol, polycyclic aromatic hydrocarbons (PAH), various heavy metals such as silver (Ag), cadmium (Cd), chromium (Cr), copper (Cu), mercury (Hg), nickel (Ni), lead (Pb), and also zinc (Zn), which are above the permissible limit values, and in addition, it also has a quite high salt content (Veil et al. 2004). Heavy metal concentrations in the production water depend on the age of the oil well and the formation geology (Utvik 2003). The salinity of the production water in the oilfields of Dibrugarh and Tinsukia, in the Assam region, is between 1000 and 3000 mg/L (Patel et al. 2003).

The main sources of heavy metals are local mining activities, industrial waste, oil exploration, untreated domestic wastewater, and agricultural activities (Demir Yetis et al. 2021a; Yetis et al. 2021; Ilhan et al. 2021; Varol and Sünbül 2020; Yetiş et al. 2019; Bing et al. 2019). Surface water, groundwater, soil, and the environment can become contaminated during the discharge of oil production water, which is generated in considerable quantities as a result of oilfield production (Fakhru’l-Razi et al. 2009) and then injected underground. The potential for this production water to contaminate the groundwater is an issue that needs to be studied, as both surface water and groundwater are quite sensitive to this type of pollution (Asia et al. 2007).

Metals and hydrocarbons, which result from the extraction processes of oil platforms, are highly toxic for the ecosystem and for marine and aquatic life. Heavy metals also present a significant threat to both human health and the natural ecosystem (Demir Yetis et al. 2021b; Demir Yetis and Ozguven 2020; Derin et al. 2020; Ezemonye et al. 2019; Vrhovnik et al. 2013). Heavy metals can accumulate in aquatic environments after entering them through anthropogenic sources, and can damage human and animal health should they enter the food chain (Bayhan et al. 2020; Yetis et al. 2018; Selek and Demir Yetis 2017; Demir Yetis et al. 2014). When heavy metal exposure exceeds established threshold values for humans and animals, health issues such as cephalalgia (headaches), neurological problems, and liver and kidney diseases may occur (Farmer et al. 2011). Studies have shown that when heavy metals are absorbed into the body, they cause inflammation that can result in difficulties in swallowing, as well as respiratory, skin, lung, and heart diseases, in addition to damage at the DNA level (Xu et al. 2016). High concentrations of Pb in the blood can inhibit enzyme activities, and traces of Cr, Cu, and Zn can cause nerve damage, headaches, and liver diseases. Ni is also closely linked to cardiovascular and respiratory diseases, whilst Hg, which can accumulate in fat, can damage the central nervous system (Xu et al. 2018). The discharge of heavy metals has become a significant problem, because these contaminants can cause various long-term problems due to their accumulation in biological organisms (USEPA 1997). For these reasons, the presence of heavy metals should be constantly monitored and kept under strict control with regard to environmental factors.

Almost all (99.5%) oil production in Turkey takes place in the Southeastern Anatolian Region, and Diyarbakir is the most significant oil-producing province in Turkey after Batman. Oil production is carried out intensively in Diyarbakır; hence, research is required to ascertain what effect this level of production activity has on the drinking water supply surrounding the production area, especially in terms of heavy metal content. In a study conducted by Turkish Engineering News (2001), heavy metal contamination was investigated at two points in the groundwater of the Beykan oilfield, where extraction activities take place. However, the study was very limited, and it was suggested that the region should be tested to a greater extent to also encompass the Kurkan and Sahaban regions. In this context, the current study was planned with the following objectives: (1) to compare the temporal and spatial variation of heavy metals using multivariate statistical techniques and drinking water standard values, (2) to explore the degree of heavy metal contamination in the groundwater using the heavy metal pollution index (HPI), the heavy metal evaluation index (HEI), and the contamination index (Cd), and (3) to identify and assess the pollutant effect and toxicity risks that such metals pose as a result of oil exploration and production activities.

Materials and methods

Study area

The area of study is located in the north of the Upper Mesopotamian Basin, within the Diyarbakır provincial borders. It is known as an area where all stages (systems) of oil production coexist, including exploration, production, transmission, storage and purification. The study area covers approximately 3000 km2 (see Fig. 1). Hazro, Kayakoy, Kirtepe, Kurkan, Sahaban, Kartaltepe, Kastel, Kayayolu (Derdere), Kayayolu (Hazro-F4), Saricak, Yenikoy (Derdere), Yenikoy (Sabunsuyu), Mehmetdere, Beykan areas and their surroundings constitute the working area. With almost all Turkish oil production carried out in Southeastern Anatolia, Diyarbakır’s daily crude oil production represents the second largest crude oil production operation after Batman. There are a total of 42 oilfields operated by different companies in Diyarbakır, with in excess of 260 oil wells (Aba and Kavak 2019; Kara 2018).

Fig. 1 Map of study area and view of sampling stations

Geological formations in the Diyarbakır Plain, from the bottom to the top, respectively, are the Mardin Formation from the Cretaceous–Paleocene age, the Gercüş Formation from the Lower Eocene age, the Midyat Formation from the Middle Eocene age, the Germik Formation from the Lower Miocene age, the Şelmo Formation from the Upper Miocene to Pliocene age, and a basalt unit from the Pliocene age. Together with the alluvium units, the formations were handled in seven groups. The oldest unit, the Mardin Formation, was reached from the deep oil wells of the study area. The Mardin Formation, which gives rise to the best outcrops in the Mardin and Mazıdağı area, consists of conglomerate, limestone and dolomitic limestone, and forms the reservoir of the oil extracted in the region. The Gercüş Formation consists of alternating sandstone, conglomerate, marl, limestone and shale, and also contains gypsum bands in some places. Midyat limestones, which outcrop in the south, are generally low sloping, thin and smooth bedded, and have karstic cavities. The Germik Formation begins with a light red base pebble, and is represented by pinkish-white, soft-clayed limestone towards the top. The Şelmo Formation generally consists of alternations of claystone, silty sandstone, conglomerate, and marl layers (Öztürk and Çelik 2008).

The surface water potential of the region is 6905 hm3, with 6520 hm3 from the Dicle Basin and 385 hm3 from the Fırat Basin. The groundwater potential is 350 hm3, whilst the surface and groundwater potential is 7255 hm3 in total. The total water flow is 7328 hm3, based on 7128 hm3 from the Tigris River and 200 hm3 from the Sinek Stream (Kara 2013).

Due to climate, topography and significant differences in parent material, various large soil groups have formed in the Diyarbakır province, in addition to land types lacking ground cover. Most of the agricultural land consists of alluvial valley bases. Soils are mostly clayey in the plains, apart from small plains such as the Mermer, Hani, and Lice plains. In the Tigris Valley, medium and light soils can be found in places. Although the lime rate is generally considered to be good, all or part of the upper lime has been leached out. The soil structure is clayey loam and clayey silt, and is generally deep in terms of limestone, has good permeability and low organic matter content. The Lice and Hani plains are on a limestone base, and their soils belong to the group of brown forest soils (TMEU 2019).

A harsh continental and subtropical plateau climate prevails throughout the study area. The winter season is rainy and cool, and the summer season is dry and hot with little rainfall. Winters are not considered very harsh, as the southeast Taurus Mountains block cold air from the north (Kara 2013). The summers are very hot under a dry, hot low-pressure system, and the average annual temperature is 15.8 °C. The highest temperature on record stands at 44.8 °C and the lowest at −23.4 °C. From an annual precipitation of approximately 500 mm, some 400 mm falls during the winter and spring seasons alone (TMEU 2019).

Sample collection and analysis

Groundwater samples from the oil exploration and production area in Diyarbakir province were collected seasonally from 18 different sampling sites between December 2012 and October 2013, and were analyzed for a set of 19 parameters. The geographic information and a map locating the sampling stations are provided in Table 1 and Fig. 1, respectively. Water samples were taken using one-litre polypropylene bottles. HNO3 was then added to lower the pH to ≤ 2, which slows down microbial activity and limits chemical reactions in the samples, and the samples were stored under refrigeration at 4 °C until the analyses had been completed. The water samples were taken based on the conditions specified in the American Society for Testing and Materials (2001, 2005) and TSE (Turkish Standards Institute 1997a, b) standards. The groundwater samples were analyzed in an accredited laboratory that was certified according to the ISO 17025 quality system. The pH and electrical conductivity (EC) of each groundwater sample were measured in situ with a pH probe and a conductivity probe, respectively, using a WTW multimeter device (WTW Multimeter 3630 IDS; Date 2010). The total dissolved solids (TDS) were analyzed according to the “drying at 180 °C/weighing” standard method (American Public Health Association 1998). All reagents and solutions were prepared using deionized water with a conductivity of 0.055 µS/cm. The chemicals used in the preliminary preparation and extraction processes were obtained from Sigma Aldrich and Merck companies as extra pure. After the ICP–OES device (Perkin Elmer 7000) had been calibrated with ICP standards (Al, As, B, Ba, Cd, Co, Cr, Cu, Fe, Hg, Li, Mn, Ni, Pb, Se, Zn; 1000 mg/L, Merck), the concentration of each heavy metal was measured.
The accuracy of the analysis was checked using certified reference materials (NIST Standard Reference Material 1640a for Trace elements in natural water). The relative standard deviation (RSD) of the sample measurements ranged between 5 and 10%, and was below 5% for the standard solutions. Each device reading was based on three replicates. All the water parameter levels were compared to both the Turkish Standards (TSE 2005) and those of the World Health Organization (WHO 2017) for drinking water.

Table 1 Geographical information for groundwater samples in Diyarbakir oil exploration–production area

Statistical analysis

Multivariate statistical techniques were employed to measure and explain the relationships between multiple variables (Sojka et al. 2008). The Pearson correlation matrix, principal component analysis/factor analysis (PCA/FA), and cluster analysis (CA) for heavy metal concentrations were calculated using IBM’s SPSS-25 software (Statistical Package for the Social Sciences). PCA/FA is essentially used to reduce large, complex data matrices so as to provide meaningful information on the important parameters and a better interpretation of the variables (Barakat et al. 2016; Filik Iscen et al. 2008). PCA is a method that takes the eigenvalues and eigenvectors from the covariance matrix of the original variables. Components were obtained using the Varimax rotation method, and only components with eigenvalues > 1 were accepted for interpreting the results (Kaiser 1960; Shrestha and Kazama 2007). CA was employed to classify the heavy metals on the basis of their chemical property similarities (Radu et al. 2020; Li et al. 2018). For the purpose of identifying sources of pollution, an exploratory hierarchical cluster analysis was performed based on heavy metal concentrations together with the monitored chemical and physical parameters. Hierarchical agglomerative clustering identifies similarity relationships between any one sample and the rest of the dataset, and presents them in a dendrogram that visualizes the clustering process. The clusters were formed using Ward’s linkage method, with similarity measured by the squared Euclidean distance on standardized raw data (z transformation) (Rajkumar et al. 2020; Makokha et al. 2016).
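The distance measure that feeds the cluster analysis can be illustrated with a short sketch. The code below is not the SPSS procedure used in the study; it only shows, under stated assumptions, how a z transformation followed by the squared Euclidean distance behaves. The concentration profiles are hypothetical, and the sample standard deviation (n − 1 denominator) is assumed for the z transformation.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize a series to zero mean and unit sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def squared_euclidean(a, b):
    """Squared Euclidean distance between two equally long vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Hypothetical concentrations of three parameters at four stations
# (illustrative numbers only, not data from the study).
profiles = {
    "Fe": [120.0, 340.0, 95.0, 410.0],
    "Mn": [30.0, 85.0, 22.0, 101.0],
    "Cr": [4.0, 1.5, 5.2, 1.1],
}

z = {name: z_scores(values) for name, values in profiles.items()}

# Parameters that vary together across stations end up close to each other.
d_fe_mn = squared_euclidean(z["Fe"], z["Mn"])
d_fe_cr = squared_euclidean(z["Fe"], z["Cr"])
print(f"d(Fe, Mn) = {d_fe_mn:.3f}, d(Fe, Cr) = {d_fe_cr:.3f}")
```

In this hypothetical case Fe and Mn rise and fall together, so their distance is small, whereas Cr follows the opposite pattern and lies far from Fe. Ward’s method would therefore merge Fe and Mn first.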

Pollution indices

Three classical index approaches, namely HPI (Mohan et al. 1996), HEI (Edet and Offiong 2002), and the contamination index (Cd) (Backman et al. 1998), were used to assess the degree of heavy metal contamination in the groundwater.

Heavy metal pollution index

HPI is a very useful tool to identify the general effect of metals contamination in groundwater. HPI depends on numerous factors such as unit weightage of a metal (Wi) and the prescribed standard permissible limits (Si) for each metal (Horton 1965). HPI is calculated using Eq. (1):

$$\mathrm{HPI}=\frac{\sum_{i=1}^{n}{W}_{i}{Q}_{i}}{\sum_{i=1}^{n}{W}_{i}},$$
(1)

where Qi is the sub-index of the i-th parameter, Wi is the unit weight of the i-th parameter, and n is the number of parameters considered (Mohan et al. 1996). The critical pollution index is taken as 100 in this indexing approach (Prasad and Bose 2001). The sub-index Qi is calculated using Eq. (2):

$${Q}_{i}=\sum_{i=1}^{n}\frac{\left|{M}_{i}-{I}_{i}\right|}{{S}_{i}-{I}_{i}}\times 100,$$
(2)

where Mi is the concentration of the i-th heavy metal, Ii is the highest desirable limit of the i-th heavy metal, and Si is its standard permissible limit; the numerator is the numerical difference between Mi and Ii, taken without regard to sign (Mohan et al. 1996). The Si, Ii, MAC, and Wi values are listed in Table 2 for HPI and HEI.

Table 2 Data used to calculate HPI and HEI values
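A minimal sketch of the HPI calculation in Eqs. (1) and (2) is given below. It evaluates Qi separately for each metal, as the definition of the sub-index in Eq. (1) implies. The metal concentrations, limits, and weights are hypothetical placeholders rather than the Table 2 values, and setting Wi = 1/Si is an assumption made only for this illustration.

```python
def sub_index(m, s, i):
    """Sub-index Q_i: numerical difference |M_i - I_i| scaled by (S_i - I_i), in percent."""
    return abs(m - i) / (s - i) * 100.0

def hpi(metals):
    """Weighted mean of the sub-indices.

    `metals` is a list of (M_i, S_i, I_i, W_i) tuples.
    """
    numerator = sum(w * sub_index(m, s, i) for m, s, i, w in metals)
    denominator = sum(w for _, _, _, w in metals)
    return numerator / denominator

# Hypothetical inputs (not taken from Table 2): (M, S, I, W) with W = 1/S assumed.
example = [
    (8.0, 10.0, 0.0, 0.1),            # a Pb-like metal
    (1000.0, 5000.0, 3000.0, 0.0002),  # a Zn-like metal
]

value = hpi(example)
print(f"HPI = {value:.2f}; below critical value of 100: {value < 100}")
```

Because the weights are large for metals with low permissible limits, the index in this example is dominated by the Pb-like metal, which is the intended behaviour of a weighting scheme that emphasizes the more restrictive limits.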

Heavy metal evaluation index

According to Edet and Offiong (2002), HEI offers information on overall water quality with respect to heavy metals, and is computed using Eq. (3):

$$\mathrm{HEI}=\sum_{i=1}^{n}\frac{{H}_{\mathrm{C}}}{{H}_{\mathrm{MAC}}},$$
(3)

where Hc is the monitored value, and HMAC is the maximum admissible concentration (MAC) of the i-th parameter (Edet and Offiong 2002).
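Equation (3) can be sketched as a simple sum of ratios. The concentrations and MAC values below are hypothetical and serve only to show the arithmetic; they are not the values used in the study.

```python
def hei(monitored, mac):
    """Heavy metal evaluation index: sum of H_c / H_MAC over all parameters."""
    return sum(monitored[name] / mac[name] for name in monitored)

# Hypothetical monitored values and maximum admissible concentrations (same units).
monitored = {"Fe": 400.0, "Mn": 100.0, "Pb": 5.0}
mac = {"Fe": 200.0, "Mn": 50.0, "Pb": 10.0}

hei_value = hei(monitored, mac)
print(f"HEI = {hei_value:.2f}")
```

Each ratio above 1 marks a parameter that exceeds its MAC, so the index grows both with the number of exceeding metals and with the size of each exceedance.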

Contamination index

The Cd summarizes the combined effects of several water quality parameters that are considered harmful within domestic water (Backman et al. 1998). Therefore, Cd is the summation of all contamination factors that exceed the upper permissible values, as shown in Eq. (4).

$${C}_{\mathrm{d}}=\sum_{i=1}^{n}{C}_{\mathrm{fi}},$$
(4)
$${C}_{\mathrm{fi}}=\frac{{C}_{\mathrm{Ai}}}{{C}_{\mathrm{Ni}}}-1,$$
(5)

where Cfi is the contamination factor (Eq. 5), and CAi is the monitored value of the i-th component and CNi is the upper permissible concentration of the i-th component (Backman et al. 1998; Edet and Offiong 2002).
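Equations (4) and (5) can be sketched as follows. In line with the description above, only components whose monitored value exceeds the upper permissible concentration contribute to the sum; the input values are hypothetical.

```python
def contamination_factor(c_a, c_n):
    """Contamination factor C_fi = C_Ai / C_Ni - 1 for one component."""
    return c_a / c_n - 1.0

def contamination_index(monitored, permissible):
    """C_d: sum of contamination factors of the components above their limit."""
    return sum(
        contamination_factor(monitored[name], permissible[name])
        for name in monitored
        if monitored[name] > permissible[name]
    )

# Hypothetical monitored values and upper permissible concentrations (same units).
monitored = {"Fe": 400.0, "Mn": 100.0, "Pb": 5.0}
permissible = {"Fe": 200.0, "Mn": 50.0, "Pb": 10.0}

cd_value = contamination_index(monitored, permissible)
print(f"Cd = {cd_value:.2f}")
```

Here the Pb-like component stays below its limit and is therefore left out, so the index reflects only the two exceeding components.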

Results and discussion

Groundwater quality and classification

The physical and heavy metal composition of the groundwater around the oil production area in Diyarbakir was assessed using a set of 19 parameters obtained seasonally from a total of 18 sampling stations between December 2012 and October 2013. Table 3 presents a statistical summary [mean, min, max, and standard deviation (SD)] of the physical parameters and heavy metal values of the groundwater samples, together with the percentages of samples that exceeded the permissible and desirable limits stated by the Turkish National Standard (TSE 2005) and the International Standard (WHO 2017) for drinking water. These percentages were calculated as the number of exceeding samples relative to the total number of values measured across all seasons. For the seasonal groundwater samples from the 18 stations, the pH ranged between 7.01 and 8.59, and the EC ranged between 314 and 950 µS/cm. The pH and EC values exceeded the established guideline values (WHO 2017) and permissible limits (TSE 2005) in 6.94% and 13.89% of the samples, respectively. The TDS value exceeded the guideline value (WHO 2017) in 1.38% of the tested samples. Ranked by the share of samples exceeding the limit values, the heavy metals followed the decreasing order: Cd > Mn > Fe > Pb > B > As > Zn > Al > Se > Hg. The analyses showed the Co, Cr, Cu, Ba, Li, and Ni concentrations to be within the permissible limits and guideline values for drinking water (WHO 2017; TSE 2005). Cd concentrations exceeded the guideline values in 100% of the samples according to WHO (2017), and exceeded the desirable limits in 59.72% of the samples according to TSE (2005). Mn exceeded the desirable limits in 100% of the samples and the permissible limits in 11.11% of the samples according to the TSE (2005), and exceeded the guideline value in 4.17% of the samples according to the WHO (2017).
Fe exceeded the desirable limits in 90.27% of the samples and the permissible limits in 30.55% of the samples according to the TSE (2005), and exceeded the guideline values in 8.33% of the samples according to the WHO (2017). Pb exceeded the limit value of 10 µg/L in 76.39% of the samples, and B exceeded the limit value of 1 µg/L in 75% of the samples. The limit values were exceeded in 19.44% of the samples for As and in 18.06% of the samples for Zn. Al exceeded the guideline values in 15.28% of the samples, and exceeded the desirable limits in 4.17% of the samples. The permissible limits were exceeded in 4.17% of the samples for Se and in 1.38% of the samples for Hg. While the limit values for Cr, Cu, Ba, and Ni were not exceeded, the standards contain no limit value for either Co or Li. In summary, Fe, Pb, B, As, Zn, Al, Se, and Hg were found above the permissible limits, and Mn and Cd were found above the desirable limits in the tested groundwater samples.
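The exceedance percentages can be reproduced with a simple count. The sketch below assumes 72 measurements per parameter (18 stations sampled in 4 seasons), which is an inference from the sampling design rather than a figure stated in the text, and it uses a synthetic series rather than the measured data.

```python
def exceedance_percent(values, limit):
    """Percentage of measurements that are strictly above the given limit."""
    n_over = sum(1 for v in values if v > limit)
    return 100.0 * n_over / len(values)

# Synthetic series: 55 of 72 values above a limit of 10 (illustrative only).
synthetic = [12.0] * 55 + [5.0] * 17

share = exceedance_percent(synthetic, limit=10.0)
print(f"{share:.2f}% of {len(synthetic)} samples exceed the limit")
```

With 55 exceedances out of 72 values the sketch gives 76.39%, the share reported for Pb, which suggests that the percentages in this section are fractions of 72 measurements.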

Table 3 Descriptive statistics of physical and heavy metals parameters and regulatory limit values

These findings may be due especially to the oil production activities (Kara 2013), as related to the geological formation of the area, and the effect of settlements and agricultural activities (Demir Yetiş 2019; Kumar et al. 2019; Ozguven and Demir Yetis 2020). Parameters such as Pb and Cd, which are known as the basic parameters in oil production, together with macro elements such as Al, Fe, and Mn, which exceeded the limit values, were among the heavy metals expected to show high values in bodies of water by moving into surface waters or leaking into the groundwater (Demir Yetis and Akyuz 2021; Aba and Kavak 2019; Asia et al. 2007; Fakhru’l-Razi et al. 2009).

To classify these water samples, the method of Ficklin et al. (1992), as modified by Caboi et al. (1999), was applied using the metal load (mg/L) and water pH values (see Table 4). In the current study, the metal load was calculated as Al + As + B + Ba + Cd + Co + Cr + Cu + Fe + Hg + Li + Mn + Ni + Pb + Se + Zn, using the seasonal averages of all metals. Seasonal average values of pH were likewise used for Fig. 2, which shows the relationship between the pH and the total metal content of the analyzed samples. The results indicate that the majority of samples (88.89%, 16 of 18 samples) were classified as “near neutral–low metal,” while the other two samples (11.11%) showed a “near neutral–high metal” classification (G10: Kayakoy and G17: Sahaban). G10 and G17 are sampling points close to where the extracted oil is collected, separated, and then transferred. High metal values at both of these stations can be considered as posing serious health threats to those who consume drinking water sourced from that area (Rezaei et al. 2019).
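The classification step can be sketched as a sum followed by two threshold tests. The class boundaries used below (pH 5.5 and a metal load of 1 mg/L) are placeholders chosen for illustration; they are assumptions, not values quoted in this article, and should be replaced by the boundaries of the diagram actually applied. The concentrations are hypothetical as well.

```python
def metal_load(concentrations_mg_per_l):
    """Total metal load: sum of all metal concentrations, in mg/L."""
    return sum(concentrations_mg_per_l.values())

def classify(ph, load, ph_boundary=5.5, load_boundary=1.0):
    """Two-axis classification; both boundaries are assumed placeholder values."""
    acidity = "near neutral" if ph > ph_boundary else "acid"
    metal = "high metal" if load >= load_boundary else "low metal"
    return f"{acidity}–{metal}"

# Hypothetical seasonal-average concentrations (mg/L) at one station.
station = {"Fe": 0.25, "Mn": 0.06, "Zn": 0.08, "B": 0.01}

label = classify(ph=7.6, load=metal_load(station))
print(label)
```

Keeping the boundaries as parameters makes the assumption explicit and lets the same function be reused with whichever version of the diagram is adopted.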

Table 4 Physicochemical and heavy metal analysis of groundwater
Fig. 2 Classification of groundwater samples based on diagram of water pH and metal load

Evaluation of pollution indices

The heavy metal pollution index values were computed for all sampling stations and are reported in Table 5. The mean HPI across all the sampling points was 71.175, which is below the critical index value of 100. The classification of the groundwater samples based on HPI, HEI, and Cd is reported in Table 6. While the HPI values of sampling points G1, G12, G16, and G18 were below 70, those of the other points exceeded 70. Sampling point G1, where the HPI value was low, was originally drilled to extract oil; however, it was then abandoned due to excessive water outflow. In addition, the fact that this sampling point has a depth of approximately 1500 m minimizes the possibility of heavy metal contamination from oil. Another sampling point with a low HPI value was G12, which is a considerable distance away from the oil production field. The fact that less oil is produced in this field than in other fields is also thought to contribute to the low value registered at this sampling point. At sampling point G16, a spring with a high flowrate, the low HPI value may reflect a dilution effect. Sampling point G18 showed a low HPI value due to its interaction with surface water.

Table 5 Pollution index values for all sampling points around the oil exploration-production area
Table 6 Classification of groundwater samples based on HPI, HEI and Cd

According to Mohan et al. (1996), the HPI values of the 18 sampling points in the current study were all classified as “medium.” While the HEI values at sampling points G14, G16, and G18 were below 10, the values at the other 15 sampling points exceeded 10. Of these three sampling points with low HEI values, G16 and G18 were below 10 for the reasons previously mentioned. For sampling point G14, a dilution effect was evident due to the high flowrate of spring water.

According to Edet and Offiong (2002), the calculated HEI values of the 18 sampling points included in the current study were classified as “low” for three of the sampling points (G14, G16, and G18), and “medium” for the other 15 sampling points. While the Cd values of sampling points G14, G15, G16, and G18 were below 10, those of the other 14 sampling points exceeded 10. Sampling point G15, which had a low Cd value, is a borehole with a depth of 50 m; it can therefore be stated that this point has a high flowrate and a considerable dilution effect when compared to the other sampling points. According to Backman et al. (1998) and Edet and Offiong (2002), the calculated Cd values at the 18 sampling points in the current study were classified as “high.”

Potential heavy metal sources

Pearson correlation matrix, PCA, and CA statistical analyses were used to identify the different pollution sources affecting heavy metal parameters (Rezaei et al. 2019). In groundwater studies, correlation analysis is a common technique used to evaluate the correlation between measured parameters. In this study, it was used to measure the degree of relationship among the heavy metal results measured in groundwater samples around the oil production area, and between these and the other analyzed variables. Accordingly, it was seen that most of the heavy metal parameters had statistically significant correlations with each other, which indicated close relationships.

Data from the 18 sampling stations were combined in order to calculate the correlation matrix of the 16 heavy metal parameters with pH, EC and TDS, and the correlations among the variables were assessed using correlation analysis (see Table 7). Among the heavy metals, pH correlated most strongly with Cr (r = 0.630), whereas EC and TDS correlated most strongly with Zn (r = 0.392). A strong positive correlation was observed between Li and Co (r = 0.961), and between Co and Cd (r = 0.813). Also, a strong positive correlation was observed between Se and B (r = 0.786), and between Li and Cd (r = 0.773). The Pb value was found to be positively correlated with Cd (r = 0.596) and with Cu (r = 0.413). Se was negatively correlated with Li (r = −0.611) and with Co (r = −0.490). A positive correlation was observed between Mn and Fe (r = 0.607), which shows that if a body of water contains iron, manganese is often found in that same environment (Demir Yetiş 2019). Also, a positive correlation was observed between Cr and As (r = 0.541).

Table 7 Correlation matrix of heavy metals present in groundwater of oil exploration–production area
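For reference, each entry of such a correlation matrix is a Pearson coefficient between two measured series. The sketch below computes it from first principles; the study itself used SPSS, and the paired values here are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical paired concentrations at five stations (illustrative only).
li = [2.0, 4.0, 6.0, 8.0, 10.0]
co = [1.1, 1.9, 3.2, 3.9, 5.1]

r_li_co = pearson_r(li, co)
print(f"r(Li, Co) = {r_li_co:.3f}")
```

Values close to +1 or −1 indicate that two parameters vary together or in opposition across the stations, which is the basis for reading shared sources from the matrix.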

The strong correlations between heavy metals found in the research area may be largely due to oil exploration and production processes. In addition, pesticides and chemical fertilizers are used extensively in agricultural activities carried out near all sampling points in the region. Mining activities (such as a stone pit) are also present near some points. Domestic wastes have an effect in areas close to residential regions. Finally, industrial activities (such as a cement factory) also have an impact. All of these pollution sources may therefore contribute to the high heavy metal values. The contribution of geogenic sources should also be taken into account (Egbueri and Enyigwe 2020).

PCA was used to indicate the expected sources of 19 physicochemical and heavy metal parameters found within the Diyarbakir oil exploration and production area, as shown in Table 8. Varimax rotation was applied to the principal component analysis in the current study. Loadings with an absolute value of > 0.75 are considered as having “strong significance,” those between 0.50 and 0.75 as having “medium significance,” and those between 0.30 and 0.50 as having “weak significance” in terms of PCA interpretation (Liu et al. 2003).
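The loading classes attributed to Liu et al. (2003) above can be written as a small helper. The label returned for loadings below 0.30 and the treatment of values falling exactly on a boundary are assumptions, since the text does not specify them.

```python
def loading_significance(loading):
    """Classify a rotated factor loading by its absolute value."""
    magnitude = abs(loading)
    if magnitude > 0.75:
        return "strong"
    if magnitude >= 0.50:
        return "medium"
    if magnitude >= 0.30:
        return "weak"
    return "not significant"  # assumed label for loadings below 0.30

# Hypothetical loadings on one component (illustrative only).
for name, value in {"Cd": 0.91, "Pb": 0.62, "As": -0.35, "Ni": 0.12}.items():
    print(name, loading_significance(value))
```

Using the absolute value means that negative loadings, such as the one shown for the As-like parameter, are classified by strength while their sign is interpreted separately.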

Table 8 Varimax rotated factor analysis for physicochemical and heavy metal parameters in study area

The principal factors were extracted from the variables with eigenvalues > 1. According to the initial eigenvalues and the scree plot, seven principal components were extracted, which together explained 79.25% of the total variance. The percentages given below for the individual factors are cumulative. The first factor (PC1) accounted for 18.17% of the total variance, where Cd, Co, and Li had a strong positive loading, while Pb had a moderate positive loading. Also, As and Se had a negative weak loading. The second factor (PC2) raised the cumulative explained variance to 30.28%, where EC and TDS had a strong positive loading, and Zn had a weak positive loading. The third factor (PC3) raised it to 42.00%, where pH and Cr had a strong positive loading. While As had a moderate positive loading, Al had a positive weak loading. The fourth factor (PC4) raised it to 53.62%, where Cu, Pb, Fe, and Zn each had a positive moderate loading, and Mn had a positive weak loading. The fifth factor (PC5) raised it to 63.29%, in which Se and B had a positive strong loading, and Mn had a negative weak loading. The sixth factor (PC6) raised it to 72.77%, in which Ba had a positive strong loading, whilst Hg had a negative moderate loading, Cu had a negative weak loading, and both Al and Fe had a positive weak loading. The seventh factor (PC7) raised it to 79.25%, in which Ni had a positive strong loading, and both Al and Mn had a positive weak loading.
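The percentages quoted for PC1 to PC7 increase monotonically up to the 79.25% total, so they read as cumulative values. Under that reading, the share of variance explained by each individual component can be recovered by differencing, as in this short sketch (the variable names are illustrative).

```python
# Cumulative explained variance (%) for PC1..PC7, as quoted in the text.
cumulative = [18.17, 30.28, 42.00, 53.62, 63.29, 72.77, 79.25]

# Individual share of each component: difference between successive cumulative values.
previous = [0.0] + cumulative[:-1]
individual = [round(c - p, 2) for p, c in zip(previous, cumulative)]

for number, share in enumerate(individual, start=1):
    print(f"PC{number}: {share:.2f}%")
```

The resulting shares decline from PC1 to PC7, as expected for components ordered by explained variance, and sum back to the reported total.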

To further investigate the potential sources of the heavy metals deteriorating the groundwater quality, cluster analysis was conducted. Figure 3 shows that three major clusters were identified. The first cluster comprised a group of heavy metals (Fe, Mn, Cu, Pb, Zn, Hg, and Ni) together with EC and TDS. Values for both EC and TDS can be expected to be higher, especially at points close to oil exploration–production sites (Fakhru’l-Razi et al. 2009; Patel et al. 2003). In a study conducted by Turkish Engineering News (2001), two drilling wells were opened close to the Beykan field, where oil was extracted, and the groundwater in both wells was found to be contaminated with heavy metals. There is also the possibility of geological origin, and the same can be said for heavy metals. In addition, they may originate from anthropogenic sources, including industrial discharge and domestic waste discharge at points near to the larger settlements. Nonpoint sources such as agricultural activities and surface run-off could also be considered as a source of heavy metals (Makokha et al. 2016). Finally, existing mining activities in the region, such as the copper deposits near to Eğil, may also have an effect (Kara 2013; WHO 2017).

Fig. 3 Dendrogram using Ward’s linkage method for characterization of heavy metals

The second cluster comprised Co, Li, and Cd. This class of heavy metals is of anthropogenic and geogenic origin. From this group of heavy metals, Li can often be found in the form of Li compounds in high-salt waters, in areas where oil production takes place (Akgok and Sahiner 2017). Likewise, both Cd and Co may also originate from either oil production or mining activities (Celebi 2018; Edet and Offiong 2002). Human activities appear to be more dominant than natural processes in bringing heavy metals such as Fe, Pb, B, As, Zn, Al, Se, and Hg into the groundwater (Egbueri and Enyigwe 2020).

The third cluster comprised pH, Cr, As, B, Se, Al, and Ba, which are attributable to both anthropogenic and geogenic origins. Although Cr is highly attributable to anthropogenic sources, its source in groundwater depends on the dissolution of chromium-bearing minerals such as chromite and mica (Barzegar et al. 2019) found in mud rock. Likewise, there may have been an effect from the phosphate deposits near to Silvan as an anthropogenic source (Kara 2013; Yeşilnacar et al. 2016).

Conclusions

Almost all (99.5%) oil production in Turkey takes place in the Southeastern Anatolian Region, with Diyarbakir being the most significant oil-producing province after Batman. Groundwater samples were collected from 18 potentially contaminated sampling stations at sites near to the oil exploration–production area to analyze the groundwater for physicochemical and heavy metal parameters.

The study compared heavy metal values according to the limit values of the Turkish National Standard (TSE 2005) and the International Standard (WHO 2017). In addition, the heavy metal pollution indices of HPI, HEI, and Cd were used to determine the heavy metal contamination found in the groundwater near to the Diyarbakir oil production–exploration area. To further investigate the potential sources of the heavy metals deteriorating the groundwater quality, multivariate statistical techniques such as correlation matrix, principal component analysis, and cluster analysis were conducted.

The testing results showed that the limit values for Cr, Cu, Ba, and Ni were not exceeded. However, the Fe, Pb, B, As, Zn, Al, Se, and Hg concentrations were found to be above the permissible limits recommended for drinking water according to the TSE (2005), and the Mn and Cd concentrations were above the desirable limits recommended for drinking water by the WHO (2017). Ranked by the share of samples exceeding the limit values, the heavy metals followed the decreasing order: Cd > Mn > Fe > Pb > B > As > Zn > Al > Se > Hg. This may be especially due to oil exploration–production activities related to the geological formation of the area, in addition to the effect of settlements and agricultural activities. Parameters such as Pb and Cd, which are known as the basic parameters in oil production, together with macro elements such as Al, Fe, and Mn, all of which exceeded the recommended limit values, were among the heavy metals expected to show high values in water bodies by moving into the surface waters or leaking into the groundwater.

According to the diagram of metal load and pH, the majority of the samples (88.89%, 16 of 18 samples) were classified as “near neutral–low metal,” whilst two of the samples (11.11%) showed “near neutral–high metal” (G10: Egil-Kayakoy and G17: Egil-Sahaban, which are sampling points located close to where the extracted oils are collected, separated, and then transferred).

The heavy metal pollution index calculated with the mean values from all 18 sampling points was 71.175, which is below the permissible or critical index value of 100. The HPI value was classified as “medium” for all 18 sampling points. The HEI value was classified as “low” for three of the sampling points (G14, G16, and G18), and “medium” for the other 15 sampling points. The Cd value was classified as “high” for all of the 18 sampling points. A strong positive correlation was observed between Li and Co (r = 0.961) and between Co and Cd (r = 0.813). Principal component analysis of the water quality data produced seven principal components with eigenvalues > 1, which accounted for 79.2% of the total variance.

In the hierarchical cluster analysis, a three-cluster dendrogram was produced. The factors obtained from PCA and the clusters obtained from hierarchical CA indicate that the parameters responsible for the heavy metal contamination were mostly related to oil exploration–production activities, to other anthropogenic origins (such as mining, industrial and domestic waste discharge, and agricultural activities), and to geogenic origin. The results of the current study, which was performed for the first time in this area, show that heavy metal contamination exists in the groundwater near to the Diyarbakir oil exploration and production area. In this context, the relevant authorities should take the necessary precautions to protect the groundwater and surface waters around the oil production fields as a matter of vital importance for the sustainability of the water resources.